EWRL9 (2011) | European Workshops on Reinforcement Learning
The 9th European Workshop on Reinforcement Learning (EWRL-9)
will be co-located with ECML PKDD 2011.
When: September 9–11, 2011
Where: Athens, Greece
Description
The 9th European Workshop on Reinforcement Learning (EWRL-9)
invites reinforcement learning researchers to participate in
the revival of this world-class event. We plan to make this an
exciting event for researchers worldwide, not only for the
presentation of top-quality papers, but also as a forum for
ample discussion of open problems and future research
directions. EWRL-9 will consist of four keynote talks,
contributed paper presentations, and discussion sessions spread
over a three-day period, as well as a poster session with
refreshments provided on day two.
Reinforcement learning is an active field of
research that deals with the problem of sequential decision
making in unknown (and often stochastic and/or partially
observable) environments. Recently there has been a wealth of
impressive empirical results as well as significant
theoretical advances. Both types of advances are of great
importance, and we would like to create a forum to discuss such
interesting results.
The workshop will cover a range of sub-topics including
(but not limited to):
- Exploration/Exploitation
- Function approximation in RL
- Theoretical aspects of RL
- Policy search methods
- Empirical evaluations in RL
- Kernel methods for RL
- Partially observable RL
- Bayesian RL
- Multi-agent RL
- Risk-sensitive RL
- Financial RL
- Knowledge Representation in RL
Keynote Speakers
- Peter Auer – University of Leoben – Leoben, Austria
- Kristian Kersting – Fraunhofer IAIS, University of Bonn – Sankt Augustin, Germany
- Peter Stone – University Of Texas – Austin, USA
- Csaba Szepesvari – University of Alberta – Edmonton, Canada
Paper Submission
We are calling for papers (and posters) from the entire reinforcement
learning spectrum, with the option of either 3-page position
papers (on which open discussion will be held) or longer 12-page
LNAI-format research papers. We welcome a range of
submissions in order to encourage broad discussion. Accepted papers will
be published in the prestigious Springer LNAI proceedings.
Double submissions are allowed; however, if an EWRL paper is accepted to another conference proceedings or journal, copyright restrictions prevent it from being reprinted in the official EWRL Springer LNCS proceedings. Such a paper would still be considered for acceptance and presentation at EWRL regardless of whether it can appear in the official proceedings.
We will offer at least one best-paper prize of EUR 500.
A selection of papers from EWRL-9 is to be published in the
Springer Lecture Notes In Artificial Intelligence (LNAI/LNCS) series.
- Submission deadline: June 17, 2011 (extended from June 10, 2011)
- Page limit: 3 pages for position papers and 12 pages for regular papers.
- Paper format: LNAI Springer: http://www.springer.de/comp/lncs/authors.html
- Paper Submissions: Papers can be submitted here.
Please ensure papers adhere to the Springer Lecture Notes in AI (LNAI) style,
and are at most 12 pages for long papers or 3 pages for short papers.
- All submissions are to be anonymous!
Poster Submission
- Submission deadline: 20th August, 2011
- Submission by email to ewrl_posters@yahoo.com
- Format: 1 page extended abstract outlining what your poster will be about.
- After EWRL, all poster presenters will have the option of submitting a 12-page version of their poster submission for consideration for the EWRL LNCS post-proceedings.
Important Dates
- Paper submissions due: 17 June 2011 (extended from 10 June 2011)
- Notification of acceptance: 12 – July – 2011
- Camera ready due: 19 – July – 2011
- Poster submission due: 20 – August – 2011
- Workshop begins: 9 – September – 2011
- Workshop ends: 11 – September – 2011
Organizing Committee
- Marcus Hutter (General Workshop Chair) – Australian National University – Canberra, Australia
- Matthew Robards (Local Organizing Chair) – Australian National University – Canberra, Australia
- Scott Sanner (Program Committee Chair) – NICTA – Canberra, Australia
- Peter Sunehag (Treasurer) – Australian National University – Canberra, Australia
- Marco Wiering (Miscellaneous) – University Of Groningen – Groningen, Netherlands
Program Committee
- Edwin Bonilla – NICTA- Canberra, Australia
- Emma Brunskill – UC Berkeley – Berkeley, USA
- Peter Dayan – University College London – London, UK
- Carlos Diuk – Princeton University – USA
- Marco Dorigo – Université libre de Bruxelles – Brussels, Belgium
- Alan Fern – Oregon State University – Corvallis, USA
- Fernando Fernandez – Universidad Carlos III de Madrid – Madrid, Spain
- Mohammad Ghavamzadeh – INRIA – Lille, France
- Marcus Hutter – Australian National University – Canberra, Australia
- Kristian Kersting – Fraunhofer IAIS, University of Bonn – Bonn, Germany
- Shie Mannor – The Technion – Haifa, Israel
- Ronald Ortner – Montanuniversität Leoben – Leoben, Austria
- Martijn van Otterlo – Katholieke Universiteit Leuven – Heverlee, Belgium
- Joelle Pineau – McGill University – Montreal, Canada
- Doina Precup – McGill University – Montreal, Canada
- Matthew Robards – Australian National University – Canberra, Australia
- Scott Sanner – NICTA – Canberra, Australia
- Juergen Schmidhuber – IDSIA – Manno-Lugano Switzerland
- Guy Shani – Ben-Gurion University – Israel
- David Silver – University College London – UK
- Peter Sunehag – Australian National University – Canberra, Australia
- Prasad Tadepalli – Oregon State University – Corvallis, USA
- William Uther – NICTA – Sydney, Australia
- Nikos Vlassis – Luxembourg Centre for Systems Biomedicine – Luxembourg
- Thomas Walsh – Arizona State University – USA
- Marco Wiering – University Of Groningen – Groningen, Netherlands
Additional Reviewers
- Mayank Daswani – Australian National University – Canberra, Australia
- Shivaram Kalyanakrishnan – University of Texas, Austin – Austin, TX, USA
- Tor Lattimore – Australian National University – Canberra, Australia
- Phuong Minh Nguyen – Australian National University – Canberra, Australia
- Wén Shào – Australian National University – Canberra, Australia
- Daniel Visentin – Australian National University – Canberra, Australia
- Monica Vroman – Rutgers University – Piscataway, NJ, USA
Keynote Speakers’ Abstracts
Peter Auer – University of Leoben – Leoben, Austria
UCRL and autonomous exploration
After reviewing the main ingredients of the UCRL algorithm and its
analysis for online reinforcement learning – exploration vs.
exploitation, optimism in the face of uncertainty, consistency with
observations and upper confidence bounds, regret analysis – I show how
these techniques can also be used to derive PAC-MDP bounds which match
the best currently available bounds for the discounted and the
undiscounted setting. As typical for reinforcement learning, the
analysis for the undiscounted setting is significantly more involved.
In the second part of my talk I consider a model for autonomous
exploration, where an agent learns about its environment and how to
navigate in it. Whereas evaluating autonomous exploration is typically
difficult, in the presented setting rigorous performance bounds can be
derived. For that we present an algorithm that optimistically explores,
by repeatedly choosing the apparently closest unknown state – as
indicated by an optimistic policy – for further exploration.
This talk is based on joint work with Shiau Hong Lim.
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement 231495 (CompLACS).
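The "optimism in the face of uncertainty" principle underlying UCRL is easiest to see in the simpler multi-armed bandit setting. The following minimal Python sketch is purely illustrative (it is not code from the talk, and the function name and parameters are invented): each arm's value estimate is inflated by a confidence bonus, so under-explored arms look attractive until enough evidence accumulates.

```python
import math
import random

def ucb1(arm_means, horizon=10000, seed=0):
    """Illustrative UCB1 loop: pull the arm with the highest optimistic
    (upper-confidence) estimate of its mean Bernoulli reward."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # times each arm was pulled
    totals = [0.0] * n_arms  # cumulative reward per arm
    reward_sum = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # pull each arm once to initialise estimates
        else:
            # empirical mean + confidence bonus: rarely pulled arms get a
            # large bonus, which is exactly the "optimism" at work
            arm = max(range(n_arms),
                      key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        reward_sum += reward
    return counts, reward_sum

counts, total = ucb1([0.3, 0.5, 0.7])
# With enough rounds, the best arm (index 2) receives most of the pulls,
# which is what keeps the regret logarithmic in the horizon.
```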
Kristian Kersting – Fraunhofer IAIS, University of Bonn – Bonn, Germany
Increasing Representational Power and Scaling Inference in Reinforcement Learning
As robots are starting to perform everyday manipulation tasks,
such as cleaning up, setting a table or preparing simple meals,
they must become much more knowledgeable than they are today.
Natural environments are composed of objects, and the possibilities
to manipulate them are highly structured due to the general
laws governing our relational world. All these need to be
acknowledged when we want to realize thinking robots that efficiently
learn how to accomplish tasks in our relational world.
Triggered by this grand vision, this talk discusses the very promising
perspective on the application of Statistical Relational AI techniques
to reinforcement learning. Specifically, it reviews existing symbolic
dynamic programming and relational RL approaches that exploit the symbolic
structure in the solution of relational and first-order logical Markov
decision processes. They illustrate that Statistical Relational AI may
give new tools for solving the ‘scaling challenge’. It is sometimes
mentioned that scaling RL to real-world scenarios is a core
challenge for robotics and AI in general. While this is true in a trivial
sense, it might be beside the point. Reasoning and learning on appropriate
(e.g. relational) representations leads to another view on the
‘scaling problem’: often we are facing problems with symmetries not
reflected in the structure used by our standard solvers. As additional
evidence for this, the talk concludes by presenting our ongoing work on
the first lifted linear programming solvers for MDPs. Given an MDP, our
approach first constructs a lifted program where each variable represents a
set of original variables that are indistinguishable given the objective
function and constraints. It then runs any standard LP solver on this
program to solve the original program optimally.
This talk is based on joint work with Babak Ahmadi, Kurt Driessens,
Saket Joshi, Roni Khardon, Tobias Lang, Martin Mladenov, Sriraam Natarajan,
Scott Sanner, Jude Shavlik, Prasad Tadepalli, and Marc Toussaint.
Peter Stone – University Of Texas – Austin, USA
PRISM – Practical RL: Representation, Interaction, Synthesis, and Mortality
When scaling up RL to large continuous domains with imperfect
representations and hierarchical structure, we often try applying
algorithms that are proven to converge in small finite domains, and
then just hope for the best. This talk will advocate instead
designing algorithms that adhere to the constraints, and indeed take
advantage of the opportunities, that might come with the problem at
hand. Drawing on several different research threads within the
Learning Agents Research Group at UT Austin, I will discuss four types
of issues that arise from these constraints and opportunities: 1)
Representation – choosing the algorithm for the problem’s
representation and adapting the representation to fit the algorithm;
2) Interaction – with other agents and with human trainers; 3)
Synthesis – of different algorithms for the same problem and of
different concepts in the same algorithm; and 4) Mortality – the
opportunity to improve learning based on past experience and the
constraint that one can’t explore exhaustively.
Csaba Szepesvari – University of Alberta – Edmonton, Canada
Towards robust reinforcement learning algorithms
Most reinforcement learning algorithms assume that the system to be controlled can be accurately approximated given the measurements and the available resources. However, this assumption is overly optimistic for many problems of practical interest: real-world problems are messy. For example, the number of unobserved variables influencing the dynamics can be very large, and the governing dynamics can be highly complicated. How, then, can one ask for near-optimal performance without requiring an enormous amount of data? In this talk we explore an alternative to this standard criterion, based on the concept of regret, borrowed from the online learning literature. Under this alternative criterion, the performance of a learning algorithm is measured by how much total reward it collects compared to the total reward that could have been collected by the best policy from a fixed policy class, the best policy being determined in hindsight. How can we design algorithms that keep the regret small? Do we need to change existing algorithm designs? In this talk, following the initial steps made by Even-Dar et al. and Yu et al., I will discuss some of our new results that shed light on these questions.
The talk is based on joint work with Gergely Neu, Andras Gyorgy and Andras Antos.
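The hindsight-regret criterion described in the abstract can be made concrete with a tiny sketch (illustrative only, not from the talk; the function name and toy data are invented): given a table of per-round rewards for every action and the sequence of actions a learner actually played, regret is the gap to the best single fixed action chosen in hindsight.

```python
def hindsight_regret(reward_table, actions):
    """Regret of a play sequence versus the best fixed action in hindsight.

    reward_table[t][a] is the reward action a would have earned in round t;
    actions[t] is the action the learner actually played in round t.
    """
    # total reward the learner actually collected
    algo_reward = sum(reward_table[t][a] for t, a in enumerate(actions))
    # total reward of the best single action, judged after the fact
    n_actions = len(reward_table[0])
    best_fixed = max(sum(row[a] for row in reward_table)
                     for a in range(n_actions))
    return best_fixed - algo_reward

# Toy example: action 1 is best overall, but the learner starts on action 0.
rewards = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
print(hindsight_regret(rewards, [0, 0, 0, 1]))  # 1.0: best fixed action earns 3.0, the learner earned 2.0
```

A learning algorithm with small regret must eventually match the best fixed comparator, without ever knowing the reward table in advance.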
Accepted Papers
The following is a list of the presentations to be made at EWRL9.
- Phuong Nguyen, Peter Sunehag and Marcus Hutter – Feature Reinforcement Learning in Practice
- Pablo Castro and Doina Precup – Automatic construction of temporally extended actions for MDPs using bisimulation metrics
- Kazuteru Miyazaki and Masaaki Ida – Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning
- Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi – Introduction of Fixed Mode States into Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped Robot
- Matthew Robards and Peter Sunehag – Near Optimal On-Policy Control
- Orly Avner and Shie Mannor – Stochastic Bandits with Pathwise Constraints
- Soumi Ray and Tim Oates – Locking in Returns: Speeding Up Q-Learning by Scaling
- Matthew Robards and Peter Sunehag – Loss Functions For Improved On-Policy Control
- Ioannis Lambrou, Vassilis Vassiliades and Chris Christodoulou – An extension of a hierarchical reinforcement learning algorithm for multiagent settings
- Dimitris Kalles and Panagiotis Kanellopoulos – A Pendulum Effect in Co-evolutionary Learning in Games
- Yuxi Li and Dale Schuurmans – MapReduce for Parallel Reinforcement Learning
- Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman – Finite-Sample Analysis of Lasso-TD
- Francis Maes, Louis Wehenkel and Damien Ernst – Optimized look-ahead tree search policies
- Francis Maes, Louis Wehenkel and Damien Ernst – Automatic discovery of ranking formulas for playing with multi-armed bandits
- Yann-Michaël De Hauwere, Peter Vrancx and Ann Nowé – Future sparse interactions: a MARL approach
- Mauricio Araya-López, Olivier Buffet, Vincent Thomas and François Charpillet – Active Learning of MDP models
- Edouard Klein, Matthieu Geist and Olivier Pietquin – Batch, Off-policy and Model-free Apprenticeship Learning
- Georgios Boutsioukis, Ioannis Partalas and Ioannis Vlahavas – Transfer Learning in Multi-agent Reinforcement Learning Domains
- Christos Dimitrakakis – Robust Bayesian reinforcement learning through tight lower bounds
- Kfir Levy and Nahum Shimkin – Unified Inter and Intra Options Learning Using Policy Gradient Methods
- Bruno Scherrer and Matthieu Geist – Recursive Least-Squares Learning with Eligibility Traces
- Christos Dimitrakakis and Constantin Rothkopf – Bayesian multitask inverse reinforcement learning
- Abdel Rodriguez Abed, Matteo Gagliolo, Peter Vrancx, Ricardo Grau and Ann Nowe – Improving the performance of Continuous Action Reinforcement Learning Automata
- Kyriakos Chatzidimitriou, Ioannis Partalas, Pericles Mitkas and Ioannis Vlahavas – Transferring Evolved Reservoir Features in Reinforcement Learning Tasks
- Nikolaos Tziortziotis and Konstantinos Blekas – Value Function Approximation through Sparse Bayesian Modeling
- Boris Lesner and Bruno Zanuttini – Handling Ambiguous Effects in Action Learning
- Charles Elkan – Reinforcement learning with a bilinear Q function
- Adrien Couetoux and Hassen Doghmen – Adding Double Progressive Widening to Upper Confidence Tree to Cope with Uncertainty in Planning Problems
- Lutz Frommberger – Task Space Tile Coding: In-Task and Cross-Task Generalization in Reinforcement Learning
- Matthijs Snel and Shimon Whiteson – Multi-Task Reinforcement Learning: Shaping and Feature Selection
- Tohgoroh Matsui, Takashi Goto, Kiyoshi Izumi and Yu Chen – Compound Reinforcement Learning: Theory and An Application to Finance
- Cosmin Paduraru, Doina Precup and Joelle Pineau – A Framework for Computing Bounds for the Return of a Policy
- Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos – Regularized Least Squares Temporal Difference learning with nested l2 and l1 penalization
- Matteo Leonetti, Luca Iocchi and Subramanian Ramamoorthy – Learning Finite State Controllers from Simulation
- Kvn Pradyot and Balaraman Ravindran – Beyond Rewards: Learning from richer supervision
- Lewis Fishgold – Towards Online Learning of Noisy Deictic Action Models
- Anestis Fachantidis, Ioannis Partalas, Matthew Taylor and Ioannis Vlahavas – Transfer Learning via Multiple Inter-Task Mappings
- Sylvie Ong, Yuri Grinberg and Joelle Pineau – Goal-Directed Online Learning of Predictive Models
- Munu Sai and Balaraman Ravindran – Options With Exceptions
Registration
We are pleased to announce that registration for EWRL9 is free!
Please simply send the following details to ewrl_registration
- Full Name:
- Email Address:
- Home Institution:
- Country:
- Are you a student (this is simply for our records)?:
- Do you intend to present a poster?:
(Note that poster presentation is not obligatory; however, we encourage all attendees to take the opportunity to present a poster at our fun poster evening, which will include free food and drinks.)
Workshop Venue

Athens Royal Olympic Hotel
EWRL9 is co-located with ECML PKDD 2011. It will be held at the Athens Royal Olympic Hotel, a family-run five-star property in the centre of Athens. The hotel lies just in front of the famous Temple of Zeus and the National Gardens, beneath the Acropolis and only a two-minute walk from the new Athens Acropolis Museum.
After a complete renovation finished in 2009, the Royal Olympic was transformed into an elegantly decorated art hotel, well looked after in every detail. Particular attention was given to making the hotel very personal and as environmentally friendly as possible.
Workshop Schedule
Day 1 – Sept 09:
- Welcome (0900 – 0930)
- Session 1 – Online Learning in RL 1 (0930 – 1030)
- Francis Maes, Louis Wehenkel and Damien Ernst – Automatic discovery of ranking formulas for playing with multi-armed bandits
- Lewis Fishgold – Towards Online Learning of Noisy Deictic Action Models
- Sylvie Ong, Yuri Grinberg and Joelle Pineau – Goal-Directed Online Learning of Predictive Models
- Coffee Break (1030 – 1100)
- Session 2 – Online Learning in RL 2 (1100 – 1140)
- Matthew Robards and Peter Sunehag – Near Optimal On-Policy Control
- Matthew Robards and Peter Sunehag – Loss Functions For Improved On-Policy Control
- Lunch – Not Provided (1140 – 1300)
- Session 3 – Invited Talk (1300 – 1400)
- Csaba Szepesvari – Towards robust reinforcement learning algorithms
- Session 4 – Multi-Agent Reinforcement Learning (1400 – 1520)
- Ioannis Lambrou, Vassilis Vassiliades and Chris Christodoulou – An extension of a hierarchical reinforcement learning algorithm for multiagent settings
- Dimitris Kalles and Panagiotis Kanellopoulos – A Pendulum Effect in Co-evolutionary Learning in Games
- Yann-Michaël De Hauwere, Peter Vrancx and Ann Nowé – Future sparse interactions: a MARL approach
- Georgios Boutsioukis, Ioannis Partalas and Ioannis Vlahavas – Transfer Learning in Multi-agent Reinforcement Learning Domains
- Coffee Break (1520 – 1550)
- Session 5 – Learning And Exploring MDPs (1550 – 1720)
- Phuong Nguyen, Peter Sunehag and Marcus Hutter – Feature Reinforcement Learning in Practice
- Soumi Ray and Tim Oates – Locking in Returns: Speeding Up Q-Learning by Scaling
- Mauricio Araya-López, Olivier Buffet, Vincent Thomas and François Charpillet – Active Learning of MDP models
- Christos Dimitrakakis – Robust Bayesian reinforcement learning through tight lower bounds
- Boris Lesner and Bruno Zanuttini – Handling Ambiguous Effects in Action Learning
Day 2 – Sept 10:
- Session 1 – Invited Talk (0900 – 1000)
- Peter Auer – UCRL and autonomous exploration
- Session 2 – Function Approximation Methods For Reinforcement Learning 1 (1000 – 1040)
- Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman – Finite-Sample Analysis of Lasso-TD
- Bruno Scherrer and Matthieu Geist – Recursive Least-Squares Learning with Eligibility Traces
- Coffee Break (1040 – 1100)
- Session 3 – Function Approximation Methods For Reinforcement Learning 2 (1100 – 1220)
- Abdel Rodriguez Abed, Matteo Gagliolo, Peter Vrancx, Ricardo Grau and Ann Nowe – Improving the performance of Continuous Action Reinforcement Learning Automata
- Nikolaos Tziortziotis and Konstantinos Blekas – Value Function Approximation through Sparse Bayesian Modeling
- Charles Elkan – Reinforcement learning with a bilinear Q function
- Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos – Regularized Least Squares Temporal Difference learning with nested l2 and l1 penalization
- Lunch – Not Provided (1220 – 1400)
- Session 4 – Best Paper Presentation (1400 – 1430)
- Session 5 – Invited Talk (1430 – 1530)
- Kristian Kersting – Increasing Representational Power and Scaling Inference in Reinforcement Learning
- Coffee Break (1530 – 1600)
- Session 6 – Macro-Actions in Reinforcement Learning (1600 – 1700)
- Pablo Castro and Doina Precup – Automatic construction of temporally extended actions for MDPs using bisimulation metrics
- Kfir Levy and Nahum Shimkin – Unified Inter and Intra Options Learning Using Policy Gradient Methods
- Munu Sai and Balaraman Ravindran – Options With Exceptions
- Business Meeting (1730 – 1800)
- Please feel free to stay around and discuss the future of EWRL
- Poster Evening (1830 – 2130)
- Drinks and finger foods will be provided.
Day 3 – Sept 11:
- Session 1 – Invited Talk (0900 – 1000)
- Peter Stone – PRISM – Practical RL: Representation, Interaction, Synthesis, and Mortality
- Coffee Break (1000 – 1030)
- Session 3 – Policy Search Methods 2 (1030 – 1150)
- Adrien Couetoux and Hassen Doghmen – Adding Double Progressive Widening to Upper Confidence Tree to Cope with Uncertainty in Planning Problems
- Francis Maes, Louis Wehenkel and Damien Ernst – Optimized look-ahead tree search policies
- Cosmin Paduraru, Doina Precup and Joelle Pineau – A Framework for Computing Bounds for the Return of a Policy
- Matteo Leonetti, Luca Iocchi and Subramanian Ramamoorthy – Learning Finite State Controllers from Simulation
- Lunch – Not Provided (1150 – 1330)
- Session 4 – Multi-Task and Transfer Learning in RL (1330 – 1450)
- Kyriakos Chatzidimitriou, Ioannis Partalas, Pericles Mitkas and Ioannis Vlahavas – Transferring Evolved Reservoir Features in Reinforcement Learning Tasks
- Lutz Frommberger – Task Space Tile Coding: In-Task and Cross-Task Generalization in Reinforcement Learning
- Matthijs Snel and Shimon Whiteson – Multi-Task Reinforcement Learning: Shaping and Feature Selection
- Anestis Fachantidis, Ioannis Partalas, Matthew Taylor and Ioannis Vlahavas – Transfer Learning via Multiple Inter-Task Mappings
- Coffee Break (1450 – 1520)
- Session 5 – Learning With Supervision (1520 – 1620)
- Edouard Klein, Matthieu Geist and Olivier Pietquin – Batch, Off-policy and Model-free Apprenticeship Learning
- Christos Dimitrakakis and Constantin Rothkopf – Bayesian multitask inverse reinforcement learning
- Kvn Pradyot and Balaraman Ravindran – Beyond Rewards: Learning from richer supervision
- Session 6 – Real World Reinforcement Learning (1620 – 1740)
- Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi – Introduction of Fixed Mode States into Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped Robot
- Kazuteru Miyazaki and Masaaki Ida – Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning
- Yuxi Li and Dale Schuurmans – MapReduce for Parallel Reinforcement Learning
- Tohgoroh Matsui, Takashi Goto, Kiyoshi Izumi and Yu Chen – Compound Reinforcement Learning: Theory and An Application to Finance
- Close (1740 – 1800)
Photos

Keynote speaker Csaba Szepesvari’s world view.

The audience listening in awe.

Keynote speaker Peter Auer’s regret is bounded.

The audience tries to follow his proof.

The organizers are all ears.

Keynote speaker Kristian Kersting indulges in exponential progress.

Keynote speaker Peter Stone’s heavy traffic vision: 12-lanes and 4-way green light.

Such bold vision requires some lighter refreshments at the buffet.


Lively discussion at the poster evening.

Lively discussion at the poster evening.

Lively discussion at the poster evening.
Sponsors
We thank the following sponsors for their generous support, which allowed us to make the workshop accessible to everyone.




