2008 |
47 | | Maria Cutumisu,
Duane Szafron,
Michael H. Bowling,
Richard S. Sutton:
Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games.
AIIDE 2008 |
46 | EE | David Silver,
Richard S. Sutton,
Martin Müller:
Sample-based learning and search with permanent and transient memories.
ICML 2008: 968-975 |
45 | EE | Richard S. Sutton,
Csaba Szepesvári,
Hamid Reza Maei:
A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation.
NIPS 2008: 1609-1616 |
44 | EE | Elliot A. Ludvig,
Richard S. Sutton,
Eric Verbeek,
E. James Kehoe:
A computational model of hippocampal function in trace conditioning.
NIPS 2008: 993-1000 |
43 | EE | Richard S. Sutton,
Csaba Szepesvári,
Alborz Geramifard,
Michael H. Bowling:
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping.
UAI 2008: 528-536 |
2007 |
42 | EE | Richard S. Sutton,
Anna Koop,
David Silver:
On the role of tracking in stationary environments.
ICML 2007: 871-878 |
41 | EE | David Silver,
Richard S. Sutton,
Martin Müller:
Reinforcement Learning of Local Shape in the Game of Go.
IJCAI 2007: 1053-1058 |
2006 |
40 | | Alborz Geramifard,
Michael H. Bowling,
Richard S. Sutton:
Incremental Least-Squares Temporal Difference Learning.
AAAI 2006 |
39 | EE | Alborz Geramifard,
Michael H. Bowling,
Martin Zinkevich,
Richard S. Sutton:
iLSTD: Eligibility Traces and Convergence Analysis.
NIPS 2006: 441-448 |
2005 |
38 | EE | Brian Tanner,
Richard S. Sutton:
TD(lambda) networks: temporal-difference networks with eligibility traces.
ICML 2005: 888-895 |
37 | EE | Eddie J. Rafols,
Mark B. Ring,
Richard S. Sutton,
Brian Tanner:
Using Predictive Representations to Improve Generalization in Reinforcement Learning.
IJCAI 2005: 835-840 |
36 | EE | Brian Tanner,
Richard S. Sutton:
Temporal-Difference Networks with History.
IJCAI 2005: 865-870 |
35 | EE | Doina Precup,
Richard S. Sutton,
Cosmin Paduraru,
Anna Koop,
Satinder P. Singh:
Off-policy Learning with Options and Recognizers.
NIPS 2005 |
34 | EE | Richard S. Sutton,
Eddie J. Rafols,
Anna Koop:
Temporal Abstraction in Temporal-difference Networks.
NIPS 2005 |
2004 |
33 | EE | Richard S. Sutton,
Brian Tanner:
Temporal-Difference Networks.
NIPS 2004 |
2001 |
32 | | Doina Precup,
Richard S. Sutton,
Sanjoy Dasgupta:
Off-Policy Temporal Difference Learning with Function Approximation.
ICML 2001: 417-424 |
31 | | Peter Stone,
Richard S. Sutton:
Scaling Reinforcement Learning toward RoboCup Soccer.
ICML 2001: 537-544 |
30 | EE | Michael L. Littman,
Richard S. Sutton,
Satinder P. Singh:
Predictive Representations of State.
NIPS 2001: 1555-1561 |
29 | EE | Peter Stone,
Richard S. Sutton:
Keepaway Soccer: A Machine Learning Testbed.
RoboCup 2001: 214-223 |
2000 |
28 | | Doina Precup,
Richard S. Sutton,
Satinder P. Singh:
Eligibility Traces for Off-Policy Policy Evaluation.
ICML 2000: 759-766 |
27 | EE | Peter Stone,
Richard S. Sutton,
Satinder P. Singh:
Reinforcement Learning for 3 vs. 2 Keepaway.
RoboCup 2000: 249-258 |
1999 |
26 | EE | Richard S. Sutton:
Open Theoretical Questions in Reinforcement Learning.
EuroCOLT 1999: 11-17 |
25 | EE | Richard S. Sutton,
David A. McAllester,
Satinder P. Singh,
Yishay Mansour:
Policy Gradient Methods for Reinforcement Learning with Function Approximation.
NIPS 1999: 1057-1063 |
24 | EE | Richard S. Sutton,
Doina Precup,
Satinder P. Singh:
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
Artif. Intell. 112(1-2): 181-211 (1999) |
1998 |
23 | | Doina Precup,
Richard S. Sutton,
Satinder P. Singh:
Theoretical Results on Reinforcement Learning with Temporally Abstract Options.
ECML 1998: 382-393 |
22 | | Richard S. Sutton,
Doina Precup,
Satinder P. Singh:
Intra-Option Learning about Temporally Abstract Actions.
ICML 1998: 556-564 |
21 | EE | Robert Moll,
Andrew G. Barto,
Theodore J. Perkins,
Richard S. Sutton:
Learning Instance-Independent Value Functions to Enhance Local Search.
NIPS 1998: 1017-1023 |
20 | EE | Richard S. Sutton,
Satinder P. Singh,
Doina Precup,
Balaraman Ravindran:
Improved Switching among Temporally Abstract Actions.
NIPS 1998: 1066-1072 |
19 | EE | Richard S. Sutton:
Reinforcement Learning: Past, Present and Future.
SEAL 1998: 195-197 |
18 | EE | Richard S. Sutton,
Andrew G. Barto:
Reinforcement Learning: An Introduction.
IEEE Transactions on Neural Networks 9(5): 1054-1054 (1998) |
1997 |
17 | | Richard S. Sutton:
On the Significance of Markov Decision Processes.
ICANN 1997: 273-282 |
16 | | Doina Precup,
Richard S. Sutton:
Exponentiated Gradient Methods for Reinforcement Learning.
ICML 1997: 272-277 |
15 | | Doina Precup,
Richard S. Sutton:
Multi-time Models for Temporally Abstract Planning.
NIPS 1997 |
1996 |
14 | | Satinder P. Singh,
Richard S. Sutton:
Reinforcement Learning with Replacing Eligibility Traces.
Machine Learning 22(1-3): 123-158 (1996) |
1995 |
13 | | Richard S. Sutton:
TD Models: Modeling the World at a Mixture of Time Scales.
ICML 1995: 531-539 |
12 | EE | Richard S. Sutton:
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding.
NIPS 1995: 1038-1044 |
1993 |
11 | | Richard S. Sutton,
Steven D. Whitehead:
Online Learning with Random Representations.
ICML 1993: 314-321 |
1992 |
10 | | Richard S. Sutton:
Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta.
AAAI 1992: 171-176 |
1991 |
9 | | Richard S. Sutton,
Christopher J. Matheus:
Learning Polynomial Functions by Feature Construction.
ML 1991: 208-212 |
8 | | Richard S. Sutton:
Planning by Incremental Dynamic Programming.
ML 1991: 353-357 |
7 | EE | Terence D. Sanger,
Richard S. Sutton,
Christopher J. Matheus:
Iterative Construction of Sparse Polynomial Approximations.
NIPS 1991: 1064-1071 |
6 | | Richard S. Sutton:
Dyna, an Integrated Architecture for Learning, Planning, and Reacting.
SIGART Bulletin 2(4): 160-163 (1991) |
1990 |
5 | | Richard S. Sutton:
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming.
ML 1990: 216-224 |
4 | EE | Richard S. Sutton:
Integrated Modeling and Control Based on Reinforcement Learning.
NIPS 1990: 471-478 |
1989 |
3 | EE | Andrew G. Barto,
Richard S. Sutton,
Christopher J. C. H. Watkins:
Sequential Decision Problems and Neural Networks.
NIPS 1989: 686-693 |
1988 |
2 | | Richard S. Sutton:
Learning to Predict by the Methods of Temporal Differences.
Machine Learning 3: 9-44 (1988) |
1985 |
1 | | Oliver G. Selfridge,
Richard S. Sutton,
Andrew G. Barto:
Training and Tracking in Robotics.
IJCAI 1985: 670-672 |