**2008**

- Maria Cutumisu, Duane Szafron, Michael H. Bowling, Richard S. Sutton: Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. AIIDE 2008
- David Silver, Richard S. Sutton, Martin Müller: Sample-based learning and search with permanent and transient memories. ICML 2008: 968-975
- Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei: A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616
- Elliot A. Ludvig, Richard S. Sutton, Eric Verbeek, E. James Kehoe: A computational model of hippocampal function in trace conditioning. NIPS 2008: 993-1000
- Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling: Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536

**2007**

- Richard S. Sutton, Anna Koop, David Silver: On the role of tracking in stationary environments. ICML 2007: 871-878
- David Silver, Richard S. Sutton, Martin Müller: Reinforcement Learning of Local Shape in the Game of Go. IJCAI 2007: 1053-1058

**2006**

- Alborz Geramifard, Michael H. Bowling, Richard S. Sutton: Incremental Least-Squares Temporal Difference Learning. AAAI 2006
- Alborz Geramifard, Michael H. Bowling, Martin Zinkevich, Richard S. Sutton: iLSTD: Eligibility Traces and Convergence Analysis. NIPS 2006: 441-448

**2005**

- Brian Tanner, Richard S. Sutton: TD(λ) networks: temporal-difference networks with eligibility traces. ICML 2005: 888-895
- Eddie J. Rafols, Mark B. Ring, Richard S. Sutton, Brian Tanner: Using Predictive Representations to Improve Generalization in Reinforcement Learning. IJCAI 2005: 835-840
- Brian Tanner, Richard S. Sutton: Temporal-Difference Networks with History. IJCAI 2005: 865-870
- Doina Precup, Richard S. Sutton, Cosmin Paduraru, Anna Koop, Satinder P. Singh: Off-policy Learning with Options and Recognizers. NIPS 2005
- Richard S. Sutton, Eddie J. Rafols, Anna Koop: Temporal Abstraction in Temporal-difference Networks. NIPS 2005

**2004**

- Richard S. Sutton, Brian Tanner: Temporal-Difference Networks. NIPS 2004

**2001**

- Doina Precup, Richard S. Sutton, Sanjoy Dasgupta: Off-Policy Temporal Difference Learning with Function Approximation. ICML 2001: 417-424
- Peter Stone, Richard S. Sutton: Scaling Reinforcement Learning toward RoboCup Soccer. ICML 2001: 537-544
- Michael L. Littman, Richard S. Sutton, Satinder P. Singh: Predictive Representations of State. NIPS 2001: 1555-1561
- Peter Stone, Richard S. Sutton: Keepaway Soccer: A Machine Learning Testbed. RoboCup 2001: 214-223

**2000**

- Doina Precup, Richard S. Sutton, Satinder P. Singh: Eligibility Traces for Off-Policy Policy Evaluation. ICML 2000: 759-766
- Peter Stone, Richard S. Sutton, Satinder P. Singh: Reinforcement Learning for 3 vs. 2 Keepaway. RoboCup 2000: 249-258

**1999**

- Richard S. Sutton: Open Theoretical Questions in Reinforcement Learning. EuroCOLT 1999: 11-17
- Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour: Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS 1999: 1057-1063
- Richard S. Sutton, Doina Precup, Satinder P. Singh: Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artif. Intell. 112(1-2): 181-211 (1999)

**1998**

- Doina Precup, Richard S. Sutton, Satinder P. Singh: Theoretical Results on Reinforcement Learning with Temporally Abstract Options. ECML 1998: 382-393
- Richard S. Sutton, Doina Precup, Satinder P. Singh: Intra-Option Learning about Temporally Abstract Actions. ICML 1998: 556-564
- Robert Moll, Andrew G. Barto, Theodore J. Perkins, Richard S. Sutton: Learning Instance-Independent Value Functions to Enhance Local Search. NIPS 1998: 1017-1023
- Richard S. Sutton, Satinder P. Singh, Doina Precup, Balaraman Ravindran: Improved Switching among Temporally Abstract Actions. NIPS 1998: 1066-1072
- Richard S. Sutton: Reinforcement Learning: Past, Present and Future. SEAL 1998: 195-197
- Richard S. Sutton, Andrew G. Barto: Reinforcement Learning: An Introduction. IEEE Transactions on Neural Networks 9(5): 1054-1054 (1998)

**1997**

- Richard S. Sutton: On the Significance of Markov Decision Processes. ICANN 1997: 273-282
- Doina Precup, Richard S. Sutton: Exponentiated Gradient Methods for Reinforcement Learning. ICML 1997: 272-277
- Doina Precup, Richard S. Sutton: Multi-time Models for Temporally Abstract Planning. NIPS 1997

**1996**

- Satinder P. Singh, Richard S. Sutton: Reinforcement Learning with Replacing Eligibility Traces. Machine Learning 22(1-3): 123-158 (1996)

**1995**

- Richard S. Sutton: TD Models: Modeling the World at a Mixture of Time Scales. ICML 1995: 531-539
- Richard S. Sutton: Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. NIPS 1995: 1038-1044

**1993**

- Richard S. Sutton, Steven D. Whitehead: Online Learning with Random Representations. ICML 1993: 314-321

**1992**

- Richard S. Sutton: Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta. AAAI 1992: 171-176

**1991**

- Richard S. Sutton, Christopher J. Matheus: Learning Polynomial Functions by Feature Construction. ML 1991: 208-212
- Richard S. Sutton: Planning by Incremental Dynamic Programming. ML 1991: 353-357
- Terence D. Sanger, Richard S. Sutton, Christopher J. Matheus: Iterative Construction of Sparse Polynomial Approximations. NIPS 1991: 1064-1071
- Richard S. Sutton: Dyna, an Integrated Architecture for Learning, Planning, and Reacting. SIGART Bulletin 2(4): 160-163 (1991)

**1990**

- Richard S. Sutton: Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. ML 1990: 216-224
- Richard S. Sutton: Integrated Modeling and Control Based on Reinforcement Learning. NIPS 1990: 471-478

**1989**

- Andrew G. Barto, Richard S. Sutton, Christopher J. C. H. Watkins: Sequential Decision Problems and Neural Networks. NIPS 1989: 686-693

**1988**

- Richard S. Sutton: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3: 9-44 (1988)

**1985**

- Oliver G. Selfridge, Richard S. Sutton, Andrew G. Barto: Training and Tracking in Robotics. IJCAI 1985: 670-672