Richard S. Sutton

List of publications from the DBLP Bibliography Server

2008
47 Maria Cutumisu, Duane Szafron, Michael H. Bowling, Richard S. Sutton: Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. AIIDE 2008
46 David Silver, Richard S. Sutton, Martin Müller: Sample-based learning and search with permanent and transient memories. ICML 2008: 968-975
45 Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei: A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616
44 Elliot A. Ludvig, Richard S. Sutton, Eric Verbeek, E. James Kehoe: A computational model of hippocampal function in trace conditioning. NIPS 2008: 993-1000
43 Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling: Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536
2007
42 Richard S. Sutton, Anna Koop, David Silver: On the role of tracking in stationary environments. ICML 2007: 871-878
41 David Silver, Richard S. Sutton, Martin Müller: Reinforcement Learning of Local Shape in the Game of Go. IJCAI 2007: 1053-1058
2006
40 Alborz Geramifard, Michael H. Bowling, Richard S. Sutton: Incremental Least-Squares Temporal Difference Learning. AAAI 2006
39 Alborz Geramifard, Michael H. Bowling, Martin Zinkevich, Richard S. Sutton: iLSTD: Eligibility Traces and Convergence Analysis. NIPS 2006: 441-448
2005
38 Brian Tanner, Richard S. Sutton: TD(lambda) networks: temporal-difference networks with eligibility traces. ICML 2005: 888-895
37 Eddie J. Rafols, Mark B. Ring, Richard S. Sutton, Brian Tanner: Using Predictive Representations to Improve Generalization in Reinforcement Learning. IJCAI 2005: 835-840
36 Brian Tanner, Richard S. Sutton: Temporal-Difference Networks with History. IJCAI 2005: 865-870
35 Doina Precup, Richard S. Sutton, Cosmin Paduraru, Anna Koop, Satinder P. Singh: Off-policy Learning with Options and Recognizers. NIPS 2005
34 Richard S. Sutton, Eddie J. Rafols, Anna Koop: Temporal Abstraction in Temporal-difference Networks. NIPS 2005
2004
33 Richard S. Sutton, Brian Tanner: Temporal-Difference Networks. NIPS 2004
2001
32 Doina Precup, Richard S. Sutton, Sanjoy Dasgupta: Off-Policy Temporal Difference Learning with Function Approximation. ICML 2001: 417-424
31 Peter Stone, Richard S. Sutton: Scaling Reinforcement Learning toward RoboCup Soccer. ICML 2001: 537-544
30 Michael L. Littman, Richard S. Sutton, Satinder P. Singh: Predictive Representations of State. NIPS 2001: 1555-1561
29 Peter Stone, Richard S. Sutton: Keepaway Soccer: A Machine Learning Testbed. RoboCup 2001: 214-223
2000
28 Doina Precup, Richard S. Sutton, Satinder P. Singh: Eligibility Traces for Off-Policy Policy Evaluation. ICML 2000: 759-766
27 Peter Stone, Richard S. Sutton, Satinder P. Singh: Reinforcement Learning for 3 vs. 2 Keepaway. RoboCup 2000: 249-258
1999
26 Richard S. Sutton: Open Theoretical Questions in Reinforcement Learning. EuroCOLT 1999: 11-17
25 Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour: Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS 1999: 1057-1063
24 Richard S. Sutton, Doina Precup, Satinder P. Singh: Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artif. Intell. 112(1-2): 181-211 (1999)
1998
23 Doina Precup, Richard S. Sutton, Satinder P. Singh: Theoretical Results on Reinforcement Learning with Temporally Abstract Options. ECML 1998: 382-393
22 Richard S. Sutton, Doina Precup, Satinder P. Singh: Intra-Option Learning about Temporally Abstract Actions. ICML 1998: 556-564
21 Robert Moll, Andrew G. Barto, Theodore J. Perkins, Richard S. Sutton: Learning Instance-Independent Value Functions to Enhance Local Search. NIPS 1998: 1017-1023
20 Richard S. Sutton, Satinder P. Singh, Doina Precup, Balaraman Ravindran: Improved Switching among Temporally Abstract Actions. NIPS 1998: 1066-1072
19 Richard S. Sutton: Reinforcement Learning: Past, Present and Future. SEAL 1998: 195-197
18 Richard S. Sutton, Andrew G. Barto: Reinforcement Learning: An Introduction. IEEE Transactions on Neural Networks 9(5): 1054-1054 (1998)
1997
17 Richard S. Sutton: On the Significance of Markov Decision Processes. ICANN 1997: 273-282
16 Doina Precup, Richard S. Sutton: Exponentiated Gradient Methods for Reinforcement Learning. ICML 1997: 272-277
15 Doina Precup, Richard S. Sutton: Multi-time Models for Temporally Abstract Planning. NIPS 1997
1996
14 Satinder P. Singh, Richard S. Sutton: Reinforcement Learning with Replacing Eligibility Traces. Machine Learning 22(1-3): 123-158 (1996)
1995
13 Richard S. Sutton: TD Models: Modeling the World at a Mixture of Time Scales. ICML 1995: 531-539
12 Richard S. Sutton: Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. NIPS 1995: 1038-1044
1993
11 Richard S. Sutton, Steven D. Whitehead: Online Learning with Random Representations. ICML 1993: 314-321
1992
10 Richard S. Sutton: Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta. AAAI 1992: 171-176
1991
9 Richard S. Sutton, Christopher J. Matheus: Learning Polynomial Functions by Feature Construction. ML 1991: 208-212
8 Richard S. Sutton: Planning by Incremental Dynamic Programming. ML 1991: 353-357
7 Terence D. Sanger, Richard S. Sutton, Christopher J. Matheus: Iterative Construction of Sparse Polynomial Approximations. NIPS 1991: 1064-1071
6 Richard S. Sutton: Dyna, an Integrated Architecture for Learning, Planning, and Reacting. SIGART Bulletin 2(4): 160-163 (1991)
1990
5 Richard S. Sutton: Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. ML 1990: 216-224
4 Richard S. Sutton: Integrated Modeling and Control Based on Reinforcement Learning. NIPS 1990: 471-478
1989
3 Andrew G. Barto, Richard S. Sutton, Christopher J. C. H. Watkins: Sequential Decision Problems and Neural Networks. NIPS 1989: 686-693
1988
2 Richard S. Sutton: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3: 9-44 (1988)
1985
1 Oliver G. Selfridge, Richard S. Sutton, Andrew G. Barto: Training and Tracking in Robotics. IJCAI 1985: 670-672

Coauthor Index

1 Andrew G. Barto [1] [3] [18] [21]
2 Michael H. Bowling [39] [40] [43] [47]
3 Maria Cutumisu [47]
4 Sanjoy Dasgupta [32]
5 Alborz Geramifard [39] [40] [43]
6 E. James Kehoe [44]
7 Anna Koop [34] [35] [42]
8 Michael L. Littman [30]
9 Elliot A. Ludvig [44]
10 Hamid Reza Maei [45]
11 Yishay Mansour [25]
12 Christopher J. Matheus [7] [9]
13 David A. McAllester [25]
14 Robert Moll (Robert N. Moll) [21]
15 Martin Müller [41] [46]
16 Cosmin Paduraru [35]
17 Theodore J. Perkins [21]
18 Doina Precup [15] [16] [20] [22] [23] [24] [28] [32] [35]
19 Eddie J. Rafols [34] [37]
20 Balaraman Ravindran [20]
21 Mark B. Ring [37]
22 Terence D. Sanger [7]
23 Oliver G. Selfridge [1]
24 David Silver [41] [42] [46]
25 Satinder P. Singh [14] [20] [22] [23] [24] [25] [27] [28] [30] [35]
26 Peter Stone [27] [29] [31]
27 Duane Szafron [47]
28 Csaba Szepesvári [43] [45]
29 Brian Tanner [33] [36] [37] [38]
30 H. M. W. (Eric) Verbeek (H. M. W. Verbeek, Eric Verbeek) [44]
31 Christopher J. C. H. Watkins [3]
32 Steven D. Whitehead [11]
33 Martin Zinkevich [39]

Copyright © Sun May 17 03:24:02 2009 by Michael Ley (ley@uni-trier.de)