EWRL9 (2011) | European Workshops on Reinforcement Learning
The 9th European Workshop on Reinforcement Learning (EWRL-9)
will be co-located with ECML PKDD 2011.
When: September 9–11, 2011
Where: Athens, Greece
Description
The 9th European Workshop on Reinforcement Learning (EWRL-9)
invites reinforcement learning researchers to participate in
the revival of this world-class event. We plan to make this an
exciting event for researchers worldwide, not only for the
presentation of top-quality papers, but also as a forum for
ample discussion of open problems and future research
directions. EWRL-9 will consist of four keynote talks,
contributed paper presentations, and discussion sessions spread
over a three-day period, as well as a poster session with
refreshments provided on day two.
Reinforcement learning is an active field of
research that deals with the problem of sequential decision
making in unknown (and often stochastic and/or partially
observable) environments. Recently there has been a wealth of
impressive empirical results as well as significant
theoretical advances. Both types of advances are of great
importance, and we would like to create a forum to discuss such
interesting results.
The workshop will cover a range of sub-topics including
(but not limited to):
- Exploration/Exploitation
- Function approximation in RL
- Theoretical aspects of RL
- Policy search methods
- Empirical evaluations in RL
- Kernel methods for RL
- Partially observable RL
- Bayesian RL
- Multi-agent RL
- Risk-sensitive RL
- Financial RL
- Knowledge Representation in RL
Keynote Speakers
- Peter Auer – University of Leoben – Leoben, Austria
- Kristian Kersting – Fraunhofer IAIS, University of Bonn – Sankt Augustin, Germany
- Peter Stone – University Of Texas – Austin, USA
- Csaba Szepesvari – University of Alberta – Edmonton, Canada
Paper Submission
We are calling for papers (and posters) from the entire reinforcement
learning spectrum, with the option of either 3-page position
papers (on which open discussion will be held) or longer 12-page
LNAI-format research papers. We welcome a range of
submissions in order to encourage broad discussion. Accepted papers will
be published in the prestigious Springer LNAI proceedings.
Double submissions are allowed; however, if an EWRL paper is accepted to another conference proceedings or journal, copyright restrictions prevent it from being reprinted in the official EWRL Springer LNCS proceedings. Such a paper would still be considered for acceptance and presentation at EWRL regardless of whether it can appear in the official proceedings.
We will offer at least one best-paper prize of EUR 500.
A selection of papers from EWRL-9 is to be published in the
Springer Lecture Notes In Artificial Intelligence (LNAI/LNCS) series.
- Submission deadline: June 17, 2011 (extended from June 10, 2011)
- Page limit: 3 pages for position papers and 12 pages for regular papers.
- Paper format: LNAI Springer: http://www.springer.de/comp/lncs/authors.html
- Paper Submissions: Papers can be submitted here.
Please ensure papers adhere to the Springer Lecture Notes in AI (LNAI) style,
and are at most 12 pages for long papers or 3 pages for short papers.
- All submissions are to be anonymous!
Poster Submission
- Submission deadline: 20th August, 2011
- Submission by email to ewrl_posters@yahoo.com
- Format: 1 page extended abstract outlining what your poster will be about.
- After EWRL, all poster presenters will have the option of submitting a 12-page version of their poster submission for consideration for the EWRL LNCS post-proceedings.
Important Dates
- Paper submissions due: 17 June 2011 (extended from 10 June 2011)
- Notification of acceptance: 12 – July – 2011
- Camera ready due: 19 – July – 2011
- Poster submission due: 20 – August – 2011
- Workshop begins: 9 – September – 2011
- Workshop ends: 11 – September – 2011
Organizing Committee
- Marcus Hutter (General Workshop Chair) – Australian National University – Canberra, Australia
- Matthew Robards (Local Organizing Chair) – Australian National University – Canberra, Australia
- Scott Sanner (Program Committee Chair) – NICTA – Canberra, Australia
- Peter Sunehag (Treasurer) – Australian National University – Canberra, Australia
- Marco Wiering (Miscellaneous) – University Of Groningen – Groningen, Netherlands
Program Committee
- Edwin Bonilla – NICTA- Canberra, Australia
- Emma Brunskill – UC Berkeley – Berkeley, USA
- Peter Dayan – University College London – London, UK
- Carlos Diuk – Princeton University – USA
- Marco Dorigo – Université libre de Bruxelles – Brussels, Belgium
- Alan Fern – Oregon State University – Corvallis, USA
- Fernando Fernandez – Universidad Carlos III de Madrid – Madrid, Spain
- Mohammad Ghavamzadeh – INRIA – Lille, France
- Marcus Hutter – Australian National University – Canberra, Australia
- Kristian Kersting – Fraunhofer IAIS, University of Bonn – Bonn, Germany
- Shie Mannor – The Technion – Haifa, Israel
- Ronald Ortner – Montanuniversität Leoben – Leoben, Austria
- Martijn van Otterlo – Katholieke Universiteit Leuven – Heverlee, Belgium
- Joelle Pineau – McGill University – Montreal, Canada
- Doina Precup – McGill University – Montreal, Canada
- Matthew Robards – Australian National University – Canberra, Australia
- Scott Sanner – NICTA – Canberra, Australia
- Juergen Schmidhuber – IDSIA – Manno-Lugano Switzerland
- Guy Shani – Ben-Gurion University – Israel
- David Silver – University College London – UK
- Peter Sunehag – Australian National University – Canberra, Australia
- Prasad Tadepalli – Oregon State University – Corvallis, USA
- William Uther – NICTA – Sydney, Australia
- Nikos Vlassis – Luxembourg Centre for Systems Biomedicine – Luxembourg
- Thomas Walsh – Arizona State University – USA
- Marco Wiering – University Of Groningen – Groningen, Netherlands
Additional Reviewers
- Mayank Daswani – Australian National University – Canberra, Australia
- Shivaram Kalyanakrishnan – University of Texas, Austin – Austin, TX, USA
- Tor Lattimore – Australian National University – Canberra, Australia
- Phuong Minh Nguyen – Australian National University – Canberra, Australia
- Wén Shào – Australian National University – Canberra, Australia
- Daniel Visentin – Australian National University – Canberra, Australia
- Monica Vroman – Rutgers University – Piscataway, NJ, USA
Keynote Speakers’ Abstracts
Peter Auer – University of Leoben – Leoben, Austria
UCRL and autonomous exploration
After reviewing the main ingredients of the UCRL algorithm and its
analysis for online reinforcement learning – exploration vs.
exploitation, optimism in the face of uncertainty, consistency with
observations and upper confidence bounds, regret analysis – I show how
these techniques can also be used to derive PAC-MDP bounds which match
the best currently available bounds for the discounted and the
undiscounted setting. As typical for reinforcement learning, the
analysis for the undiscounted setting is significantly more involved.
In the second part of my talk I consider a model for autonomous
exploration, where an agent learns about its environment and how to
navigate in it. Whereas evaluating autonomous exploration is typically
difficult, in the presented setting rigorous performance bounds can be
derived. For that we present an algorithm that optimistically explores,
by repeatedly choosing the apparently closest unknown state – as
indicated by an optimistic policy – for further exploration.
This talk is based on joint work with Shiau Hong Lim.
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement 231495 (CompLACS).
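The "optimism in the face of uncertainty" principle underlying UCRL is easiest to see in the simpler multi-armed bandit setting. The following minimal Python sketch is purely illustrative (it is not code from the talk, and the function name and parameters are invented): each arm's value estimate is inflated by a confidence bonus, so under-explored arms look attractive until enough evidence accumulates.

```python
import math
import random

def ucb1(arm_means, horizon=10000, seed=0):
    """Illustrative UCB1 loop: pull the arm with the highest optimistic
    (upper-confidence) estimate of its mean Bernoulli reward."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # times each arm was pulled
    totals = [0.0] * n_arms  # cumulative reward per arm
    reward_sum = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # pull each arm once to initialise estimates
        else:
            # empirical mean + confidence bonus: rarely pulled arms get a
            # large bonus, which is exactly the "optimism" at work
            arm = max(range(n_arms),
                      key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        reward_sum += reward
    return counts, reward_sum

counts, total = ucb1([0.3, 0.5, 0.7])
# With enough rounds, the best arm (index 2) receives most of the pulls,
# which is what keeps the regret logarithmic in the horizon.
```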
Kristian Kersting – Fraunhofer IAIS, University of Bonn – Bonn, Germany
Increasing Representational Power and Scaling Inference in Reinforcement Learning
As robots are starting to perform everyday manipulation tasks,
such as cleaning up, setting a table or preparing simple meals,
they must become much more knowledgeable than they are today.
Natural environments are composed of objects, and the possibilities
to manipulate them are highly structured due to the general
laws governing our relational world. All these need to be
acknowledged when we want to realize thinking robots that efficiently
learn how to accomplish tasks in our relational world.
Triggered by this grand vision, this talk discusses the very promising
perspective on the application of Statistical Relational AI techniques
to reinforcement learning. Specifically, it reviews existing symbolic
dynamic programming and relational RL approaches that exploit the symbolic
structure in the solution of relational and first-order logical Markov
decision processes. They illustrate that Statistical Relational AI may
give new tools for solving the ‘scaling challenge’. It is sometimes
mentioned that scaling RL to real-world scenarios is a core
challenge for robotics and AI in general. While this is true in a trivial
sense, it might be beside the point. Reasoning and learning on appropriate
(e.g. relational) representations leads to another view on the
‘scaling problem’: often we are facing problems with symmetries not
reflected in the structure used by our standard solvers. As additional
evidence for this, the talk concludes by presenting our ongoing work on
the first lifted linear programming solvers for MDPs. Given an MDP, our
approach first constructs a lifted program where each variable represents a
set of original variables that are indistinguishable given the objective
function and constraints. It then runs any standard LP solver on this
program to solve the original program optimally.
This talk is based on joint work with Babak Ahmadi, Kurt Driessens,
Saket Joshi, Roni Khardon, Tobias Lang, Martin Mladenov, Sriraam Natarajan,
Scott Sanner, Jude Shavlik, Prasad Tadepalli, and Marc Toussaint.
Peter Stone – University Of Texas – Austin, USA
PRISM – Practical RL: Representation, Interaction, Synthesis, and Mortality
When scaling up RL to large continuous domains with imperfect
representations and hierarchical structure, we often try applying
algorithms that are proven to converge in small finite domains, and
then just hope for the best. This talk will advocate instead
designing algorithms that adhere to the constraints, and indeed take
advantage of the opportunities, that might come with the problem at
hand. Drawing on several different research threads within the
Learning Agents Research Group at UT Austin, I will discuss four types
of issues that arise from these constraints and opportunities: 1)
Representation – choosing the algorithm for the problem’s
representation and adapting the representation to fit the algorithm;
2) Interaction – with other agents and with human trainers; 3)
Synthesis – of different algorithms for the same problem and of
different concepts in the same algorithm; and 4) Mortality – the
opportunity to improve learning based on past experience and the
constraint that one can’t explore exhaustively.
Csaba Szepesvari – University of Alberta – Edmonton, Canada
Towards robust reinforcement learning algorithms
Most reinforcement learning algorithms assume that the system to be controlled can be accurately approximated given the measurements and the available resources. However, this assumption is overly optimistic for many problems of practical interest: real-world problems are messy. For example, the number of unobserved variables influencing the dynamics can be very large, and the governing dynamics can be highly complicated. How, then, can one ask for near-optimal performance without requiring an enormous amount of data? In this talk we explore an alternative to this standard criterion, based on the concept of regret, borrowed from the online learning literature. Under this alternative criterion, the performance of a learning algorithm is measured by how much total reward it collects compared to the total reward that could have been collected by the best policy from a fixed policy class, the best policy being determined in hindsight. How can we design algorithms that keep the regret small? Do we need to change existing algorithm designs? In this talk, following the initial steps made by Even-Dar et al. and Yu et al., I will discuss some of our new results that shed light on these questions.
The talk is based on joint work with Gergely Neu, Andras Gyorgy and Andras Antos.
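The hindsight-regret criterion described in the abstract can be made concrete with a tiny sketch (illustrative only, not from the talk; the function name and toy data are invented): given a table of per-round rewards for every action and the sequence of actions a learner actually played, regret is the gap to the best single fixed action chosen in hindsight.

```python
def hindsight_regret(reward_table, actions):
    """Regret of a play sequence versus the best fixed action in hindsight.

    reward_table[t][a] is the reward action a would have earned in round t;
    actions[t] is the action the learner actually played in round t.
    """
    # total reward the learner actually collected
    algo_reward = sum(reward_table[t][a] for t, a in enumerate(actions))
    # total reward of the best single action, judged after the fact
    n_actions = len(reward_table[0])
    best_fixed = max(sum(row[a] for row in reward_table)
                     for a in range(n_actions))
    return best_fixed - algo_reward

# Toy example: action 1 is best overall, but the learner starts on action 0.
rewards = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
print(hindsight_regret(rewards, [0, 0, 0, 1]))  # 1.0: best fixed action earns 3.0, the learner earned 2.0
```

A learning algorithm with small regret must eventually match the best fixed comparator, without ever knowing the reward table in advance.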
Accepted Papers
The following is a list of the presentations to be made at EWRL9.
- Phuong Nguyen, Peter Sunehag and Marcus Hutter – Feature Reinforcement Learning in Practice
- Pablo Castro and Doina Precup – Automatic construction of temporally extended actions for MDPs using bisimulation metrics
- Kazuteru Miyazaki and Masaaki Ida – Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning
- Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi – Introduction of Fixed Mode States into Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped Robot
- Matthew Robards and Peter Sunehag – Near Optimal On-Policy Control
- Orly Avner and Shie Mannor – Stochastic Bandits with Pathwise Constraints
- Soumi Ray and Tim Oates – Locking in Returns: Speeding Up Q-Learning by Scaling
- Matthew Robards and Peter Sunehag – Loss Functions For Improved On-Policy Control
- Ioannis Lambrou, Vassilis Vassiliades and Chris Christodoulou – An extension of a hierarchical reinforcement learning algorithm for multiagent settings
- Dimitris Kalles and Panagiotis Kanellopoulos – A Pendulum Effect in Co-evolutionary Learning in Games
- Yuxi Li and Dale Schuurmans – MapReduce for Parallel Reinforcement Learning
- Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman – Finite-Sample Analysis of Lasso-TD
- Francis Maes, Louis Wehenkel and Damien Ernst – Optimized look-ahead tree search policies
- Francis Maes, Louis Wehenkel and Damien Ernst – Automatic discovery of ranking formulas for playing with multi-armed bandits
- Yann-Michaël De Hauwere, Peter Vrancx and Ann Nowé – Future sparse interactions: a MARL approach
- Mauricio Araya-López, Olivier Buffet, Vincent Thomas and François Charpillet – Active Learning of MDP models
- Edouard Klein, Matthieu Geist and Olivier Pietquin – Batch, Off-policy and Model-free Apprenticeship Learning
- Georgios Boutsioukis, Ioannis Partalas and Ioannis Vlahavas – Transfer Learning in Multi-agent Reinforcement Learning Domains
- Christos Dimitrakakis – Robust Bayesian reinforcement learning through tight lower bounds
- Kfir Levy and Nahum Shimkin – Unified Inter and Intra Options Learning Using Policy Gradient Methods
- Bruno Scherrer and Matthieu Geist – Recursive Least-Squares Learning with Eligibility Traces
- Christos Dimitrakakis and Constantin Rothkopf – Bayesian multitask inverse reinforcement learning
- Abdel Rodriguez Abed, Matteo Gagliolo, Peter Vrancx, Ricardo Grau and Ann Nowe – Improving the performance of Continuous Action Reinforcement Learning Automata
- Kyriakos Chatzidimitriou, Ioannis Partalas, Pericles Mitkas and Ioannis Vlahavas – Transferring Evolved Reservoir Features in Reinforcement Learning Tasks
- Nikolaos Tziortziotis and Konstantinos Blekas – Value Function Approximation through Sparse Bayesian Modeling
- Boris Lesner and Bruno Zanuttini – Handling Ambiguous Effects in Action Learning
- Charles Elkan – Reinforcement learning with a bilinear Q function
- Adrien Couetoux and Hassen Doghmen – Adding Double Progressive Widening to Upper Confidence Tree to Cope with Uncertainty in Planning Problems
- Lutz Frommberger – Task Space Tile Coding: In-Task and Cross-Task Generalization in Reinforcement Learning
- Matthijs Snel and Shimon Whiteson – Multi-Task Reinforcement Learning: Shaping and Feature Selection
- Tohgoroh Matsui, Takashi Goto, Kiyoshi Izumi and Yu Chen – Compound Reinforcement Learning: Theory and An Application to Finance
- Cosmin Paduraru, Doina Precup and Joelle Pineau – A Framework for Computing Bounds for the Return of a Policy
- Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos – Regularized Least Squares Temporal Difference learning with nested l2 and l1 penalization
- Matteo Leonetti, Luca Iocchi and Subramanian Ramamoorthy – Learning Finite State Controllers from Simulation
- Kvn Pradyot and Balaraman Ravindran – Beyond Rewards: Learning from richer supervision
- Lewis Fishgold – Towards Online Learning of Noisy Deictic Action Models
- Anestis Fachantidis, Ioannis Partalas, Matthew Taylor and Ioannis Vlahavas – Transfer Learning via Multiple Inter-Task Mappings
- Sylvie Ong, Yuri Grinberg and Joelle Pineau – Goal-Directed Online Learning of Predictive Models
- Munu Sai and Balaraman Ravindran – Options With Exceptions
Registration
We are pleased to announce that registration for EWRL9 is free!
Please simply send the following details to ewrl_registration
- Full Name:
- Email Address:
- Home Institution:
- Country:
- Are you a student (this is simply for our records)?:
- Do you intend to present a poster?:
(Note that poster presentation is not obligatory; however, we encourage all attendees to take the opportunity to present a poster at our fun poster evening, which will include free food and drinks.)
Workshop Venue

Athens Royal Olympic Hotel
EWRL9 is co-located with ECML PKDD 2011. It will be held at the Athens Royal Olympic Hotel, a family-run five-star property in the centre of Athens. The hotel lies just in front of the famous Temple of Zeus and the National Gardens, beneath the Acropolis and only a two-minute walk from the new Athens Acropolis Museum.
After a complete renovation finished in 2009, the Royal Olympic was transformed into an elegantly decorated art hotel, well looked after in every detail. Particular attention was given to making the hotel very personal and as environmentally friendly as possible.
Workshop Schedule
Day 1 – Sept 09:
- Welcome (0900 – 0930)
- Session 1 – Online Learning in RL 1 (0930 – 1030)
- Francis Maes, Louis Wehenkel and Damien Ernst – Automatic discovery of ranking formulas for playing with multi-armed bandits
- Lewis Fishgold – Towards Online Learning of Noisy Deictic Action Models
- Sylvie Ong, Yuri Grinberg and Joelle Pineau – Goal-Directed Online Learning of Predictive Models
- Coffee Break (1030 – 1100)
- Session 2 – Online Learning in RL 2 (1100 – 1140)
- Matthew Robards and Peter Sunehag – Near Optimal On-Policy Control
- Matthew Robards and Peter Sunehag – Loss Functions For Improved On-Policy Control
- Lunch – Not Provided (1140 – 1300)
- Session 3 – Invited Talk (1300 – 1400)
- Csaba Szepesvari – Towards robust reinforcement learning algorithms
- Session 4 – Multi-Agent Reinforcement Learning (1400 – 1520)
- Ioannis Lambrou, Vassilis Vassiliades and Chris Christodoulou – An extension of a hierarchical reinforcement learning algorithm for multiagent settings
- Dimitris Kalles and Panagiotis Kanellopoulos – A Pendulum Effect in Co-evolutionary Learning in Games
- Yann-Michaël De Hauwere, Peter Vrancx and Ann Nowé – Future sparse interactions: a MARL approach
- Georgios Boutsioukis, Ioannis Partalas and Ioannis Vlahavas – Transfer Learning in Multi-agent Reinforcement Learning Domains
- Coffee Break (1520 – 1550)
- Session 5 – Learning And Exploring MDPs (1550 – 1720)
- Phuong Nguyen, Peter Sunehag and Marcus Hutter – Feature Reinforcement Learning in Practice
- Soumi Ray and Tim Oates – Locking in Returns: Speeding Up Q-Learning by Scaling
- Mauricio Araya-López, Olivier Buffet, Vincent Thomas and François Charpillet – Active Learning of MDP models
- Christos Dimitrakakis – Robust Bayesian reinforcement learning through tight lower bounds
- Boris Lesner and Bruno Zanuttini – Handling Ambiguous Effects in Action Learning
Day 2 – Sept 10:
- Session 1 – Invited Talk (0900 – 1000)
- Peter Auer – UCRL and autonomous exploration
- Session 2 – Function Approximation Methods For Reinforcement Learning 1 (1000 – 1040)
- Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman – Finite-Sample Analysis of Lasso-TD
- Bruno Scherrer and Matthieu Geist – Recursive Least-Squares Learning with Eligibility Traces
- Coffee Break (1040 – 1100)
- Session 3 – Function Approximation Methods For Reinforcement Learning 2 (1100 – 1220)
- Abdel Rodriguez Abed, Matteo Gagliolo, Peter Vrancx, Ricardo Grau and Ann Nowe – Improving the performance of Continuous Action Reinforcement Learning Automata
- Nikolaos Tziortziotis and Konstantinos Blekas – Value Function Approximation through Sparse Bayesian Modeling
- Charles Elkan – Reinforcement learning with a bilinear Q function
- Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos – Regularized Least Squares Temporal Difference learning with nested l2 and l1 penalization
- Lunch – Not Provided (1220 – 1400)
- Session 4 – Best Paper Presentation (1400 – 1430)
- Session 5 – Invited Talk (1430 – 1530)
- Kristian Kersting – Increasing Representational Power and Scaling Inference in Reinforcement Learning
- Coffee Break (1530 – 1600)
- Session 6 – Macro-Actions in Reinforcement Learning (1600 – 1700)
- Pablo Castro and Doina Precup – Automatic construction of temporally extended actions for MDPs using bisimulation metrics
- Kfir Levy and Nahum Shimkin – Unified Inter and Intra Options Learning Using Policy Gradient Methods
- Munu Sai and Balaraman Ravindran – Options With Exceptions
- Business Meeting (1730 – 1800)
- Please feel free to stay around and discuss the future of EWRL
- Poster Evening (1830 – 2130)
- Drinks and finger foods will be provided.
Day 3 – Sept 11:
- Session 1 – Invited Talk (0900 – 1000)
- Peter Stone – PRISM – Practical RL: Representation, Interaction, Synthesis, and Mortality
- Coffee Break (1000 – 1030)
- Session 3 – Policy Search Methods 2 (1030 – 1150)
- Adrien Couetoux and Hassen Doghmen – Adding Double Progressive Widening to Upper Confidence Tree to Cope with Uncertainty in Planning Problems
- Francis Maes, Louis Wehenkel and Damien Ernst – Optimized look-ahead tree search policies
- Cosmin Paduraru, Doina Precup and Joelle Pineau – A Framework for Computing Bounds for the Return of a Policy
- Matteo Leonetti, Luca Iocchi and Subramanian Ramamoorthy – Learning Finite State Controllers from Simulation
- Lunch – Not Provided (1150 – 1330)
- Session 4 – Multi-Task and Transfer Learning in RL (1330 – 1450)
- Kyriakos Chatzidimitriou, Ioannis Partalas, Pericles Mitkas and Ioannis Vlahavas – Transferring Evolved Reservoir Features in Reinforcement Learning Tasks
- Lutz Frommberger – Task Space Tile Coding: In-Task and Cross-Task Generalization in Reinforcement Learning
- Matthijs Snel and Shimon Whiteson – Multi-Task Reinforcement Learning: Shaping and Feature Selection
- Anestis Fachantidis, Ioannis Partalas, Matthew Taylor and Ioannis Vlahavas – Transfer Learning via Multiple Inter-Task Mappings
- Coffee Break (1450 – 1520)
- Session 5 – Learning With Supervision (1520 – 1620)
- Edouard Klein, Matthieu Geist and Olivier Pietquin – Batch, Off-policy and Model-free Apprenticeship Learning
- Christos Dimitrakakis and Constantin Rothkopf – Bayesian multitask inverse reinforcement learning
- Kvn Pradyot and Balaraman Ravindran – Beyond Rewards: Learning from richer supervision
- Session 6 – Real World Reinforcement Learning (1620 – 1740)
- Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi – Introduction of Fixed Mode States into Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped Robot
- Kazuteru Miyazaki and Masaaki Ida – Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning
- Yuxi Li and Dale Schuurmans – MapReduce for Parallel Reinforcement Learning
- Tohgoroh Matsui, Takashi Goto, Kiyoshi Izumi and Yu Chen – Compound Reinforcement Learning: Theory and An Application to Finance
- Close (1740 – 1800)
Photos

Keynote speaker Csaba Szepesvari’s world view.

The audience listening in awe.

Keynote speaker Peter Auer’s regret is bounded.

The audience tries to follow his proof.

The organizers are all ears.

Keynote speaker Kristian Kersting indulges in exponential progress.

Keynote speaker Peter Stone’s heavy traffic vision: 12-lanes and 4-way green light.

Such bold vision requires some lighter refreshments at the buffet.


Lively discussion at the poster evening.

Lively discussion at the poster evening.

Lively discussion at the poster evening.
Sponsors
We thank the following sponsors for their generous support, which allowed us to make the workshop accessible to everyone.




