2009 | ||
---|---|---|
49 | EE | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009) |
2008 | ||
48 | EE | András Antos, Varun Grover, Csaba Szepesvári: Active Learning in Multi-armed Bandits. ALT 2008: 287-302 |
47 | EE | Gábor Bartók, Csaba Szepesvári, Sandra Zilles: Active Learning of Group-Structured Environments. ALT 2008: 329-343 |
46 | EE | Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor: Regularized Fitted Q-Iteration: Application to Planning. EWRL 2008: 55-68 |
45 | EE | Volodymyr Mnih, Csaba Szepesvári, Jean-Yves Audibert: Empirical Bernstein stopping. ICML 2008: 672-679 |
44 | EE | Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei: A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616 |
43 | EE | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: Online Optimization in X-Armed Bandits. NIPS 2008: 201-208 |
42 | EE | Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor: Regularized Policy Iteration. NIPS 2008: 441-448 |
41 | EE | Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner: Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction. UAI 2008: 306-314 |
40 | EE | Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling: Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536 |
39 | EE | András Antos, Csaba Szepesvári, Rémi Munos: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1): 89-129 (2008) |
2007 | ||
38 | EE | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165 |
37 | EE | Peter Auer, Ronald Ortner, Csaba Szepesvári: Improved Rates for the Stochastic Continuum-Armed Bandit Problem. COLT 2007: 454-468 |
36 | EE | Amir Massoud Farahmand, Csaba Szepesvári, Jean-Yves Audibert: Manifold-adaptive dimension estimation. ICML 2007: 265-272 |
35 | EE | István Bíró, Zoltán Szamonek, Csaba Szepesvári: Sequence Prediction Exploiting Similarity Information. IJCAI 2007: 1576-1581 |
34 | EE | András György, Levente Kocsis, Ivett Szabó, Csaba Szepesvári: Continuous Time Associative Bandit Problems. IJCAI 2007: 830-835 |
33 | EE | András Antos, Rémi Munos, Csaba Szepesvári: Fitted Q-iteration in continuous action-space MDPs. NIPS 2007 |
2006 | ||
32 | EE | Levente Kocsis, Csaba Szepesvári, Mark H. M. Winands: RSPSA: Enhanced Parameter Optimization in Games. ACG 2006: 39-56 |
31 | EE | András Antos, Csaba Szepesvári, Rémi Munos: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588 |
30 | EE | Levente Kocsis, Csaba Szepesvári: Bandit Based Monte-Carlo Planning. ECML 2006: 282-293 |
29 | EE | Péter Torma, Csaba Szepesvári: Local Importance Sampling: A Novel Technique to Enhance Particle Filtering. Journal of Multimedia 1(1): 32-43 (2006) |
28 | EE | Levente Kocsis, Csaba Szepesvári: Universal parameter optimisation in games based on SPSA. Machine Learning 63(3): 249-286 (2006) |
2005 | ||
27 | EE | Zoltán Szamonek, Csaba Szepesvári: X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs When the Number of Mixtures Is Unknown. ICDM 2005: 434-441 |
26 | EE | Csaba Szepesvári, Rémi Munos: Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887 |
2004 | ||
25 | | Csaba Szepesvári: Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results. AAAI 2004: 550-555 |
24 | | Csaba Szepesvári, András Kocsor, Kornél Kovács: Kernel Machine Based Feature Extraction Algorithms for Regression Problems. ECAI 2004: 1091-1092 |
23 | EE | Péter Torma, Csaba Szepesvári: Enhancing Particle Filters Using Local Likelihood Sampling. ECCV (1) 2004: 16-27 |
22 | EE | András Kocsor, Kornél Kovács, Csaba Szepesvári: Margin Maximizing Discriminant Analysis. ECML 2004: 227-238 |
21 | EE | Csaba Szepesvári, William D. Smart: Interpolation-based Q-learning. ICML 2004 |
2001 | ||
20 | EE | Csaba Szepesvári: Efficient approximate planning in continuous space Markovian Decision Problems. AI Commun. 14(3): 163-176 (2001) |
19 | EE | András Lörincz, György Hévízi, Csaba Szepesvári: Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops. Int. J. Neural Syst. 11(2): 125-143 (2001) |
2000 | ||
18 | EE | György Balogh, Ervin Dobler, Tamás Gröbler, Béla Smodics, Csaba Szepesvári: FlexVoice: A Parametric Approach to High-Quality Speech Synthesis. TSD 2000: 189-194 |
17 | EE | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Modular Reinforcement Learning: A Case Study in a Robot Domain. Acta Cybern. 14(3): 507-522 (2000) |
16 | | Satinder P. Singh, Tommi Jaakkola, Michael L. Littman, Csaba Szepesvári: Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. Machine Learning 38(3): 287-308 (2000) |
1999 | ||
15 | | Csaba Szepesvári, Michael L. Littman: A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms. Neural Computation 11(8): 2017-2060 (1999) |
14 | EE | Zsolt Kalmár, Zsolt Marczell, Csaba Szepesvári, András Lörincz: Parallel and robust skeletonization built on self-organizing elements. Neural Networks 12(1): 163-173 (1999) |
13 | | János Murvai, Kristian Vlahovicek, Endre Barta, Csaba Szepesvári, Cristina Acatrinei, Sándor Pongor: The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments. Nucleic Acids Research 27(1): 257-259 (1999) |
1998 | ||
12 | | Zoltán Gábor, Zsolt Kalmár, Csaba Szepesvári: Multi-criteria Reinforcement Learning. ICML 1998: 197-205 |
11 | EE | Csaba Szepesvári: Non-Markovian Policies in Sequential Decision Problems. Acta Cybern. 13(3): 305-318 (1998) |
10 | | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module-Based Reinforcement Learning: Experiments with a Real Robot. Auton. Robots 5(3-4): 273-295 (1998) |
9 | | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module-Based Reinforcement Learning: Experiments with a Real Robot. Machine Learning 31(1-3): 55-85 (1998) |
1997 | ||
8 | | Csaba Szepesvári: Learning and Exploitation Do Not Conflict Under Minimax Optimality. ECML 1997: 242-249 |
7 | EE | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module Based Reinforcement Learning: An Application to a Real Robot. EWLR 1997: 29-45 |
6 | | Csaba Szepesvári: The Asymptotic Convergence-Rate of Q-learning. NIPS 1997 |
5 | EE | Csaba Szepesvári, Szabolcs Cimmer, András Lörincz: Neurocontroller using dynamic state feedback for compensatory control. Neural Networks 10(9): 1691-1708 (1997) |
1996 | ||
4 | | Csaba Szepesvári, András Lörincz: Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers. ICANN 1996: 791-796 |
3 | | Michael L. Littman, Csaba Szepesvári: A Generalized Reinforcement-Learning Model: Convergence and Applications. ICML 1996: 310-318 |
2 | EE | Tibor Fomin, Tamás Rozgonyi, Csaba Szepesvári, András Lörincz: Self-Organizing Multi-Resolution Grid for Motion Planning and Control. Int. J. Neural Syst. 7(6): 757- (1996) |
1 | EE | Csaba Szepesvári, András Lörincz: Approximate geometry representations and sensory fusion. Neurocomputing 12(2-3): 267-287 (1996) |