2009 |
49 | EE | Jean-Yves Audibert,
Rémi Munos,
Csaba Szepesvári:
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits.
Theor. Comput. Sci. 410(19): 1876-1902 (2009) |
2008 |
48 | EE | András Antos,
Varun Grover,
Csaba Szepesvári:
Active Learning in Multi-armed Bandits.
ALT 2008: 287-302 |
47 | EE | Gábor Bartók,
Csaba Szepesvári,
Sandra Zilles:
Active Learning of Group-Structured Environments.
ALT 2008: 329-343 |
46 | EE | Amir Massoud Farahmand,
Mohammad Ghavamzadeh,
Csaba Szepesvári,
Shie Mannor:
Regularized Fitted Q-Iteration: Application to Planning.
EWRL 2008: 55-68 |
45 | EE | Volodymyr Mnih,
Csaba Szepesvári,
Jean-Yves Audibert:
Empirical Bernstein stopping.
ICML 2008: 672-679 |
44 | EE | Richard S. Sutton,
Csaba Szepesvári,
Hamid Reza Maei:
A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation.
NIPS 2008: 1609-1616 |
43 | EE | Sébastien Bubeck,
Rémi Munos,
Gilles Stoltz,
Csaba Szepesvári:
Online Optimization in X-Armed Bandits.
NIPS 2008: 201-208 |
42 | EE | Amir Massoud Farahmand,
Mohammad Ghavamzadeh,
Csaba Szepesvári,
Shie Mannor:
Regularized Policy Iteration.
NIPS 2008: 441-448 |
41 | EE | Alejandro Isaza,
Csaba Szepesvári,
Vadim Bulitko,
Russell Greiner:
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction.
UAI 2008: 306-314 |
40 | EE | Richard S. Sutton,
Csaba Szepesvári,
Alborz Geramifard,
Michael H. Bowling:
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping.
UAI 2008: 528-536 |
39 | EE | András Antos,
Csaba Szepesvári,
Rémi Munos:
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path.
Machine Learning 71(1): 89-129 (2008) |
2007 |
38 | EE | Jean-Yves Audibert,
Rémi Munos,
Csaba Szepesvári:
Tuning Bandit Algorithms in Stochastic Environments.
ALT 2007: 150-165 |
37 | EE | Peter Auer,
Ronald Ortner,
Csaba Szepesvári:
Improved Rates for the Stochastic Continuum-Armed Bandit Problem.
COLT 2007: 454-468 |
36 | EE | Amir Massoud Farahmand,
Csaba Szepesvári,
Jean-Yves Audibert:
Manifold-adaptive dimension estimation.
ICML 2007: 265-272 |
35 | EE | István Bíró,
Zoltán Szamonek,
Csaba Szepesvári:
Sequence Prediction Exploiting Similarity Information.
IJCAI 2007: 1576-1581 |
34 | EE | András György,
Levente Kocsis,
Ivett Szabó,
Csaba Szepesvári:
Continuous Time Associative Bandit Problems.
IJCAI 2007: 830-835 |
33 | EE | András Antos,
Rémi Munos,
Csaba Szepesvári:
Fitted Q-iteration in continuous action-space MDPs.
NIPS 2007 |
2006 |
32 | EE | Levente Kocsis,
Csaba Szepesvári,
Mark H. M. Winands:
RSPSA: Enhanced Parameter Optimization in Games.
ACG 2006: 39-56 |
31 | EE | András Antos,
Csaba Szepesvári,
Rémi Munos:
Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path.
COLT 2006: 574-588 |
30 | EE | Levente Kocsis,
Csaba Szepesvári:
Bandit Based Monte-Carlo Planning.
ECML 2006: 282-293 |
29 | EE | Péter Torma,
Csaba Szepesvári:
Local Importance Sampling: A Novel Technique to Enhance Particle Filtering.
Journal of Multimedia 1(1): 32-43 (2006) |
28 | EE | Levente Kocsis,
Csaba Szepesvári:
Universal parameter optimisation in games based on SPSA.
Machine Learning 63(3): 249-286 (2006) |
2005 |
27 | EE | Zoltán Szamonek,
Csaba Szepesvári:
X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs When the Number of Mixtures Is Unknown.
ICDM 2005: 434-441 |
26 | EE | Csaba Szepesvári,
Rémi Munos:
Finite time bounds for sampling based fitted value iteration.
ICML 2005: 880-887 |
2004 |
25 | | Csaba Szepesvári:
Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results.
AAAI 2004: 550-555 |
24 | | Csaba Szepesvári,
András Kocsor,
Kornél Kovács:
Kernel Machine Based Feature Extraction Algorithms for Regression Problems.
ECAI 2004: 1091-1092 |
23 | EE | Péter Torma,
Csaba Szepesvári:
Enhancing Particle Filters Using Local Likelihood Sampling.
ECCV (1) 2004: 16-27 |
22 | EE | András Kocsor,
Kornél Kovács,
Csaba Szepesvári:
Margin Maximizing Discriminant Analysis.
ECML 2004: 227-238 |
21 | EE | Csaba Szepesvári,
William D. Smart:
Interpolation-based Q-learning.
ICML 2004 |
2001 |
20 | EE | Csaba Szepesvári:
Efficient approximate planning in continuous space Markovian Decision Problems.
AI Commun. 14(3): 163-176 (2001) |
19 | EE | András Lörincz,
György Hévízi,
Csaba Szepesvári:
Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops.
Int. J. Neural Syst. 11(2): 125-143 (2001) |
2000 |
18 | EE | György Balogh,
Ervin Dobler,
Tamás Gröbler,
Béla Smodics,
Csaba Szepesvári:
FlexVoice: A Parametric Approach to High-Quality Speech Synthesis.
TSD 2000: 189-194 |
17 | EE | Zsolt Kalmár,
Csaba Szepesvári,
András Lörincz:
Modular Reinforcement Learning: A Case Study in a Robot Domain.
Acta Cybern. 14(3): 507-522 (2000) |
16 | | Satinder P. Singh,
Tommi Jaakkola,
Michael L. Littman,
Csaba Szepesvári:
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms.
Machine Learning 38(3): 287-308 (2000) |
1999 |
15 | | Csaba Szepesvári,
Michael L. Littman:
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms.
Neural Computation 11(8): 2017-2060 (1999) |
14 | EE | Zsolt Kalmár,
Zsolt Marczell,
Csaba Szepesvári,
András Lörincz:
Parallel and robust skeletonization built on self-organizing elements.
Neural Networks 12(1): 163-173 (1999) |
13 | | János Murvai,
Kristian Vlahovicek,
Endre Barta,
Csaba Szepesvári,
Cristina Acatrinei,
Sándor Pongor:
The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments.
Nucleic Acids Research 27(1): 257-259 (1999) |
1998 |
12 | | Zoltán Gábor,
Zsolt Kalmár,
Csaba Szepesvári:
Multi-criteria Reinforcement Learning.
ICML 1998: 197-205 |
11 | EE | Csaba Szepesvári:
Non-Markovian Policies in Sequential Decision Problems.
Acta Cybern. 13(3): 305-318 (1998) |
10 | | Zsolt Kalmár,
Csaba Szepesvári,
András Lörincz:
Module-Based Reinforcement Learning: Experiments with a Real Robot.
Auton. Robots 5(3-4): 273-295 (1998) |
9 | | Zsolt Kalmár,
Csaba Szepesvári,
András Lörincz:
Module-Based Reinforcement Learning: Experiments with a Real Robot.
Machine Learning 31(1-3): 55-85 (1998) |
1997 |
8 | | Csaba Szepesvári:
Learning and Exploitation Do Not Conflict Under Minimax Optimality.
ECML 1997: 242-249 |
7 | EE | Zsolt Kalmár,
Csaba Szepesvári,
András Lörincz:
Module Based Reinforcement Learning: An Application to a Real Robot.
EWLR 1997: 29-45 |
6 | | Csaba Szepesvári:
The Asymptotic Convergence-Rate of Q-learning.
NIPS 1997 |
5 | EE | Csaba Szepesvári,
Szabolcs Cimmer,
András Lörincz:
Neurocontroller using dynamic state feedback for compensatory control.
Neural Networks 10(9): 1691-1708 (1997) |
1996 |
4 | | Csaba Szepesvári,
András Lörincz:
Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers.
ICANN 1996: 791-796 |
3 | | Michael L. Littman,
Csaba Szepesvári:
A Generalized Reinforcement-Learning Model: Convergence and Applications.
ICML 1996: 310-318 |
2 | EE | Tibor Fomin,
Tamás Rozgonyi,
Csaba Szepesvári,
András Lörincz:
Self-Organizing Multi-Resolution Grid for Motion Planning and Control.
Int. J. Neural Syst. 7(6): 757- (1996) |
1 | EE | Csaba Szepesvári,
András Lörincz:
Approximate geometry representations and sensory fusion.
Neurocomputing 12(2-3): 267-287 (1996) |