2008

91. Britton Wolfe, Michael R. James, Satinder P. Singh: Approximate predictive state representations. AAMAS (1) 2008: 363-370
90. David Wingate, Satinder P. Singh: Efficiently learning linear-linear exponential family predictive representations of state. ICML 2008: 1176-1183

2007

89. Vishal Soni, Satinder P. Singh: Abstraction in Predictive State Representations. AAAI 2007: 639-644
88. David Wingate, Satinder P. Singh: On discovery and learning of models with predictive representations of state for agents with continuous actions and observations. AAMAS 2007: 187
87. Vishal Soni, Satinder P. Singh, Michael P. Wellman: Constraint satisfaction algorithms for graphical games. AAMAS 2007: 67
86. David Wingate, Vishal Soni, Britton Wolfe, Satinder P. Singh: Relational Knowledge with Predictive State Representations. IJCAI 2007: 2035-2040
85. Yevgeniy Vorobeychik, Michael P. Wellman, Satinder P. Singh: Learning payoff functions in infinite games. Machine Learning 67(1-2): 145-168 (2007)

2006

84. David Wingate, Satinder P. Singh: Mixtures of Predictive Linear Gaussian Models for Nonlinear, Stochastic Dynamical Systems. AAAI 2006
83. Vishal Soni, Satinder P. Singh: Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains. AAAI 2006
82. David Wingate, Satinder P. Singh: Kernel Predictive Linear Gaussian models for nonlinear stochastic dynamical systems. ICML 2006: 1017-1024
81. Britton Wolfe, Satinder P. Singh: Predictive state representations with options. ICML 2006: 1025-1032
80. Matthew R. Rudary, Satinder P. Singh: Predictive linear-Gaussian models of controlled stochastic dynamical systems. ICML 2006: 777-784
79. Ruggiero Cavallo, David C. Parkes, Satinder P. Singh: Optimal Coordinated Planning Amongst Self-Interested Agents with Private State. UAI 2006
78. Charles Lee Isbell Jr., Michael J. Kearns, Satinder P. Singh, Christian R. Shelton, Peter Stone, David P. Kormann: Cobot in LambdaMOO: An Adaptive Social Statistics Agent. Autonomous Agents and Multi-Agent Systems 13(3): 327-354 (2006)

2005

77. Michael R. James, Satinder P. Singh: Planning in Models that Combine Memory with Predictive Representations of State. AAAI 2005: 987-992
76. Britton Wolfe, Michael R. James, Satinder P. Singh: Learning predictive state representations in dynamical systems without reset. ICML 2005: 980-987
75. Michael R. James, Britton Wolfe, Satinder P. Singh: Combining Memory and Landmarks with Predictive State Representations. IJCAI 2005: 734-739
74. Yevgeniy Vorobeychik, Michael P. Wellman, Satinder P. Singh: Learning Payoff Functions in Infinite Games. IJCAI 2005: 977-982
73. Doina Precup, Richard S. Sutton, Cosmin Paduraru, Anna Koop, Satinder P. Singh: Off-policy Learning with Options and Recognizers. NIPS 2005
72. Matthew R. Rudary, Satinder P. Singh, David Wingate: Predictive Linear-Gaussian Models of Stochastic Dynamical Systems. UAI 2005: 501-508
71. Nicholas L. Cassimatis, Sean Luke, Simon D. Levy, Ross Gayler, Pentti Kanerva, Chris Eliasmith, Timothy W. Bickmore, Alan C. Schultz, Randall Davis, James A. Landay, Robert C. Miller, Eric Saund, Thomas F. Stahovich, Michael L. Littman, Satinder P. Singh, Shlomo Argamon, Shlomo Dubnov: Reports on the 2004 AAAI Fall Symposia. AI Magazine 26(1): 98-102 (2005)

2004

70. Satinder P. Singh, Vishal Soni, Michael P. Wellman: Computing approximate Bayes-Nash equilibria in tree-games of incomplete information. ACM Conference on Electronic Commerce 2004: 81-90
69. Joshua Estelle, Yevgeniy Vorobeychik, Michael P. Wellman, Satinder P. Singh, Christopher Kiekintveld, Vishal Soni: Strategic Interactions in the TAC 2003 Supply Chain Tournament. Computers and Games 2004: 316-331
68. Christopher Kiekintveld, Michael P. Wellman, Satinder P. Singh, Joshua Estelle, Yevgeniy Vorobeychik, Vishal Soni, Matthew R. Rudary: Distributed Feedback Control for Decision Making on Supply Chains. ICAPS 2004: 384-392
67. Matthew R. Rudary, Satinder P. Singh, Martha E. Pollack: Adaptive cognitive orthotics: combining reinforcement learning and constraint-based temporal reasoning. ICML 2004
66. Michael R. James, Satinder P. Singh: Learning and discovery of predictive state representations in dynamical systems with reset. ICML 2004
65. David C. Parkes, Satinder P. Singh, Dimah Yanovsky: Approximately Efficient Online Mechanism Design. NIPS 2004
64. Satinder P. Singh, Andrew G. Barto, Nuttapong Chentanez: Intrinsically Motivated Reinforcement Learning. NIPS 2004
63. Satinder P. Singh, Michael R. James, Matthew R. Rudary: Predictive State Representations: A New Theory for Modeling Dynamical Systems. UAI 2004: 512-518
62. Christopher Kiekintveld, Michael P. Wellman, Satinder P. Singh, Vishal Soni: Value-driven procurement in the TAC supply chain game. SIGecom Exchanges 4(3): 9-18 (2004)

2003

61. Satinder P. Singh, Michael L. Littman, Nicholas K. Jong, David Pardoe, Peter Stone: Learning Predictive State Representations. ICML 2003: 712-719
60. Matthew R. Rudary, Satinder P. Singh: A Nonlinear Predictive State Representation. NIPS 2003
59. David C. Parkes, Satinder P. Singh: An MDP-Based Approach to Online Mechanism Design. NIPS 2003

2002

58. Michael J. Kearns, Charles Lee Isbell Jr., Satinder P. Singh, Diane J. Litman, Jessica Howe: CobotDS: A Spoken Dialogue System for Chat. AAAI/IAAI 2002: 425-430
57. Satinder P. Singh, Diane J. Litman, Michael J. Kearns, Marilyn A. Walker: Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System. J. Artif. Intell. Res. (JAIR) 16: 105-133 (2002)
56. Satinder P. Singh: Introduction. Machine Learning 49(2-3): 107-109 (2002)
55. Michael J. Kearns, Satinder P. Singh: Near-Optimal Reinforcement Learning in Polynomial Time. Machine Learning 49(2-3): 209-232 (2002)

2001

54. Peter Stone, Michael L. Littman, Satinder P. Singh, Michael J. Kearns: ATTac-2000: an adaptive autonomous bidding agent. Agents 2001: 238-245
53. Charles Lee Isbell Jr., Christian R. Shelton, Michael J. Kearns, Satinder P. Singh, Peter Stone: A social reinforcement learning agent. Agents 2001: 377-384
52. Charles Lee Isbell Jr., Christian R. Shelton, Michael J. Kearns, Satinder P. Singh, Peter Stone: Cobot: A Social Reinforcement Learning Agent. NIPS 2001: 1393-1400
51. Michael L. Littman, Richard S. Sutton, Satinder P. Singh: Predictive Representations of State. NIPS 2001: 1555-1561
50. Michael L. Littman, Michael J. Kearns, Satinder P. Singh: An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games. NIPS 2001: 817-823
49. Michael J. Kearns, Michael L. Littman, Satinder P. Singh: Graphical Models for Game Theory. UAI 2001: 253-260
48. János A. Csirik, Michael L. Littman, Satinder P. Singh, Peter Stone: FAucS: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents. WELCOM 2001: 139-151
47. Peter Stone, Michael L. Littman, Satinder P. Singh, Michael J. Kearns: ATTac-2000: An Adaptive Autonomous Bidding Agent. J. Artif. Intell. Res. (JAIR) 15: 189-206 (2001)

2000

46. Charles Lee Isbell Jr., Michael J. Kearns, David P. Kormann, Satinder P. Singh, Peter Stone: Cobot in LambdaMOO: A Social Statistics Agent. AAAI/IAAI 2000: 36-41
45. Satinder P. Singh, Michael J. Kearns, Diane J. Litman, Marilyn A. Walker: Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System. AAAI/IAAI 2000: 645-651
44. Diane J. Litman, Michael J. Kearns, Satinder P. Singh, Marilyn A. Walker: Automatic Optimization of Dialogue Management. COLING 2000: 502-508
43. Michael J. Kearns, Satinder P. Singh: Bias-Variance Error Bounds for Temporal Difference Updates. COLT 2000: 142-147
42. Kary Myers, Michael J. Kearns, Satinder P. Singh, Marilyn A. Walker: A Boosting Approach to Topic Spotting on Subdialogues. ICML 2000: 655-662
41. Doina Precup, Richard S. Sutton, Satinder P. Singh: Eligibility Traces for Off-Policy Policy Evaluation. ICML 2000: 759-766
40. Peter Stone, Richard S. Sutton, Satinder P. Singh: Reinforcement Learning for 3 vs. 2 Keepaway. RoboCup 2000: 249-258
39. Michael J. Kearns, Yishay Mansour, Satinder P. Singh: Fast Planning in Stochastic Games. UAI 2000: 309-316
38. Satinder P. Singh, Michael J. Kearns, Yishay Mansour: Nash Convergence of Gradient Dynamics in General-Sum Games. UAI 2000: 541-548
37. Satinder P. Singh, Tommi Jaakkola, Michael L. Littman, Csaba Szepesvári: Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. Machine Learning 38(3): 287-308 (2000)

1999

36. Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour: Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS 1999: 1057-1063
35. Satinder P. Singh, Michael J. Kearns, Diane J. Litman, Marilyn A. Walker: Reinforcement Learning for Spoken Dialogue Systems. NIPS 1999: 956-962
34. Yishay Mansour, Satinder P. Singh: On the Complexity of Policy Iteration. UAI 1999: 401-408
33. David A. McAllester, Satinder P. Singh: Approximate Planning for Factored POMDPs using Belief State Simplification. UAI 1999: 409-416
32. Richard S. Sutton, Doina Precup, Satinder P. Singh: Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artif. Intell. 112(1-2): 181-211 (1999)

1998

31. Doina Precup, Richard S. Sutton, Satinder P. Singh: Theoretical Results on Reinforcement Learning with Temporally Abstract Options. ECML 1998: 382-393
30. Michael J. Kearns, Satinder P. Singh: Near-Optimal Reinforcement Learning in Polynomial Time. ICML 1998: 260-268
29. John Loch, Satinder P. Singh: Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes. ICML 1998: 323-331
28. Richard S. Sutton, Doina Precup, Satinder P. Singh: Intra-Option Learning about Temporally Abstract Actions. ICML 1998: 556-564
27. Richard S. Sutton, Satinder P. Singh, Doina Precup, Balaraman Ravindran: Improved Switching among Temporally Abstract Actions. NIPS 1998: 1066-1072
26. John K. Williams, Satinder P. Singh: Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes. NIPS 1998: 1073-1080
25. Timothy X. Brown, Hui Tong, Satinder P. Singh: Optimizing Admission Control while Ensuring Quality of Service in Multimedia Networks via Reinforcement Learning. NIPS 1998: 982-988
24. Michael J. Kearns, Satinder P. Singh: Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms. NIPS 1998: 996-1002
23. Satinder P. Singh, Peter Dayan: Analytical Mean Squared Error Curves for Temporal Difference Learning. Machine Learning 32(1): 5-40 (1998)

1997

22. Satinder P. Singh, David Cohn: How to Dynamically Merge Markov Decision Processes. NIPS 1997

1996

21. Lawrence K. Saul, Satinder P. Singh: Learning Curve Bounds for a Markov Decision Process with Undiscounted Rewards. COLT 1996: 147-156
20. Satinder P. Singh, Peter Dayan: Analytical Mean Squared Error Curves in Temporal Difference Learning. NIPS 1996: 1054-1060
19. David A. Cohn, Satinder P. Singh: Predicting Lifetimes in Dynamically Allocated Memory. NIPS 1996: 939-945
18. Satinder P. Singh, Dimitri P. Bertsekas: Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems. NIPS 1996: 974-980
17. Satinder P. Singh, Richard S. Sutton: Reinforcement Learning with Replacing Eligibility Traces. Machine Learning 22(1-3): 123-158 (1996)

1995

16. Lawrence K. Saul, Satinder P. Singh: Markov Decision Processes in Large State Spaces. COLT 1995: 281-288
15. Peter Dayan, Satinder P. Singh: Improving Policies without Measuring Merits. NIPS 1995: 1059-1065
14. Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh: Learning to Act Using Real-Time Dynamic Programming. Artif. Intell. 72(1-2): 81-138 (1995)

1994

13. Satinder P. Singh: Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes. AAAI 1994: 700-705
12. Satinder P. Singh, Tommi Jaakkola, Michael I. Jordan: Learning Without State-Estimation in Partially Observable Markovian Decision Processes. ICML 1994: 284-292
11. Tommi Jaakkola, Satinder P. Singh, Michael I. Jordan: Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems. NIPS 1994: 345-352
10. Satinder P. Singh, Tommi Jaakkola, Michael I. Jordan: Reinforcement Learning with Soft State Aggregation. NIPS 1994: 361-368
9. Satinder P. Singh, Richard C. Yee: An Upper Bound on the Loss from Approximate Optimal-Value Functions. Machine Learning 16(3): 227-233 (1994)

1993

8. Satinder P. Singh, Andrew G. Barto, Roderic A. Grupen, Christopher I. Connolly: Robust Reinforcement Learning in Motion Planning. NIPS 1993: 655-662
7. Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh: Convergence of Stochastic Iterative Dynamic Programming Algorithms. NIPS 1993: 703-710

1992

6. Satinder P. Singh: Reinforcement Learning with a Hierarchy of Abstract Models. AAAI 1992: 202-207
5. Satinder P. Singh: Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models. ML 1992: 406-415
4. Satinder P. Singh: Transfer of Learning by Composing Solutions of Elemental Sequential Tasks. Machine Learning 8: 323-339 (1992)

1991

3. Satinder P. Singh: Transfer of Learning Across Compositions of Sequential Tasks. ML 1991: 348-352
2. Satinder P. Singh: The Efficient Learning of Multiple Task Sequences. NIPS 1991: 251-258
1. N. E. Berthier, Satinder P. Singh, Andrew G. Barto, James C. Houk: A Cortico-Cerebellar Model that Learns to Generate Distributed Motor Commands to Control a Kinematic Arm. NIPS 1991: 611-618