1999 | ||
---|---|---|
3 | Daishi Harada: Towards Bounded Optimal Meta-Level Control: A Case Study. AAAI/IAAI 1999: 946 | |
2 | Andrew Y. Ng, Daishi Harada, Stuart J. Russell: Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. ICML 1999: 278-287 | |
1997 | ||
1 | Daishi Harada: Reinforcement Learning with Time. AAAI/IAAI 1997: 577-582 |

# | Coauthor | Publications |
---|---|---|
1 | Andrew Y. Ng | [2] |
2 | Stuart J. Russell | [2] |