Julien Perez is the group lead of the machine learning and optimization research team at NAVER LABS Europe and co-organiser of the Robot Learning Workshop at NeurIPS 2020. In this interview Julien shares the themes of the workshop and the need for a paradigm to improve robotics and robotic capabilities through machine learning. He also explains why the organisers want to share everyone’s ‘dirty laundry’ during the event, which takes place on December 11th 2020.
Host: Robotics is a hot topic and there are many scientific research events that include robot learning such as IROS and CoRL. Can you explain why this particular workshop is taking place at a conference on neural information processing systems?
Julien Perez: The robot learning workshop and its organisers want to bring together the latest research in data science applied to robotics. More precisely, what we want to have is all the research related to sequential decision making which, I would say, is a particular branch of machine learning that can be applied to the field of robotics. Today we think there’s a huge potential in this new field as an extension of what’s been done so far in terms of data science, and we believe that the recent successes observed in other supervised learning fields like text understanding, image understanding or text generation are good indicators that the field can now have a certain level of success in robotic tasks.
Host: So, what are the applications of robot learning and sequential decision making and data science?
Julien Perez: We believe there is a huge number of applications. To cite just some of them, there is locomotion and navigation, and what we call ‘upper body control’ – like arms for manipulation tasks – which we also sometimes call ‘contact-rich manipulation’ tasks where, basically, an arm is supposed to use its force to manipulate objects, to clean the table for example, or to place elements on shelves. There’s an infinity of tasks in either locomotion or manipulation, and the handling of uncertainty that machine learning is trying to enable could be very useful in the field of robotics.
Host: So, this is the 3rd annual workshop on robot learning at NeurIPS so you’ve probably seen some changes in the topics since the first one. Can you tell us what we can look forward to most this year and, in particular, the new ideas in robot control?
Julien Perez: We believe that we have not yet found a paradigm that can fully leverage robotics within the context of uncertainty. Optimal control (which has been used for several decades) and control engineering have been able to produce results, such as robots in factories, but we still have issues when it comes to adapting to new scenarios and to adversarial situations, for example when a robot is supposed to accomplish tasks among people. And we believe, once again, that machine learning is able to handle uncertainty in contexts like supervised learning, for example, but that there’s still an open question on how robotics and capabilities in robotics can be improved through machine learning. We still need to find a paradigm, or at least a method or a family of methods, derived, I would say, from machine learning, that would allow robotic systems to accomplish new tasks and overcome these barriers.
Host: In the call for papers in the workshop description, the organisers asked for a ‘dirty laundry’ section. Can you explain what that is?
Julien Perez: That’s an interesting element that was brought forward by the organizing committee this year. To be short, for many reasons the majority of papers, including papers in machine learning and robotics, do not necessarily include, I would say, all the detailed elements of the methods and all the failures that gave rise to the actual method presented in the papers. That’s a challenge because, in the way that we currently write papers, we’re expected to produce reproducible results. But often the thoughts and the paths of trial and error that result in the actual paper are never really expressed and explained and, in a lot of situations, that’s a big problem, because sometimes choices are made because of the limitations of the methods that we use and we don’t necessarily have the space to explain these difficulties. And what’s interesting with robotics is that you deal with the real world, so you end up having challenges that you would never have thought of.
Host: Could you give an example?
Julien Perez: So, I can give you a very simple example to try to illustrate that. If you want to do, for example, reinforcement learning with robotics, you have to produce a set of what we call ‘episodes’ for a given task, and the idea behind reinforcement learning is that you learn from trial and error. This means that you have to do a bunch of episodes of the same task and, the more you do, the more you learn. It works very well if you apply such a paradigm to simulation, for example. But if you do that in the real world, let’s assume you have a very simple task that consists of taking an object and moving it from one place to another. This means that, after every episode, you need somebody to take that object and put it back. So, basically, this fundamental paradigm is not applicable as it stands, and that’s a very simple example of how sometimes we don’t have the space and time to express all these difficulties and limitations that make the protocol or approach unsuitable for robotics – it needs to be improved and adapted. And this year we’re very lucky, as in the previous editions, to have great speakers who are faced with these kinds of difficulties all day long, and we really wanted to give them the opportunity and the space to discuss these difficulties and challenges of doing research in machine learning and robotics, and maybe also give some room for the conversation to go towards finding solutions.
Host: So, it sounds like you really want to have discussions during the workshop about what went wrong in experiments, where people encountered failure and what they learned. How do you plan to actually hold these discussions because, of course, like most events this year, the workshop is going to be a virtual event?
Julien Perez: So, that’s a very good question. The choice we made, which was actually suggested by the NeurIPS organization, was to use the software called GatherTown. GatherTown basically allows participants to embody, in a sense, a small persona in a 2D space, like, I would say, a 90s-type video game. It’s working pretty well and the idea is that we want people to use this tool not just to have access to the posters (because the poster session will be organized with this tool), but also to exchange more. We’ll have a panel session for that which also uses this tool and, once again, the main purpose of this kind of workshop in general is to allow people to gather together, to speak, to exchange experiences, successes and failures, and to discuss their own ideas with respect to the current state of the art. And, although it’s obviously not a normal year, nonetheless, as a learning system, I would say we adapt, and so we’re using the infrastructure that NeurIPS allows us to use to still have those conversations that we normally have in any workshop. I’m very optimistic that we’ll end up having the kind of exchanges that we expect from such a workshop.
Host: The special focus of this year’s workshop is in ‘grounding machine learning development in the real world’. Does this mean that there’s still a wide gap between machine learning in the virtual, digital world of simulation and that of our physical world?
Julien Perez: To give a short answer I would say yes, but I’ll try to develop it a bit because I think it’s an important point. First of all, I’ve had the opportunity to work in the field of deep learning for almost 15 years now, and I started at the time when people basically believed that a neural network could do anything. When deep learning started to grow after its first noticeable successes in speech recognition (with respect to LSTM) and in vision (with respect to CNNs), we started to realise that adopting deep learning (some people call that differentiable programming but, whatever) in one domain of application led to overall improvement in each of the others. Let me give a very specific example that occurred recently.
Deep learning applied to text gives us the transformer. And now we realise that this new kind of ‘convolution’ – I will not go into the details – can actually make a lot of sense not only for text but also for images. And, as we go towards using deep learning for robotics, we see that it doesn’t actually work as well as one could have imagined – but – we’re making progress, we’re starting to better understand why it doesn’t work, what the limitations are and we’re improving the paradigm of deep learning (like we did in the past), but in this case for a new field of application which is robotics.
And, what’s interesting with respect to robotics (as you said) is that, it’s very difficult to simulate. You end up having to deal with constraints that were less present when classifying images for example or even translating text – which are also very challenging tasks of course – but, as we go towards robotics we realise that there are a lot of new challenges we have to deal with for real world applications of machine learning algorithms – in the context of an embodied agent that we call a robot. But one thing that makes me pretty optimistic I would say is that we’re beginning to understand the problems – we’re starting to understand the limitations of the paradigm that we have at hand whether it’s reinforcement learning or deep learning in general.
So, there’s still a lot of work to do before having an autonomous embodied agent in a human-crowded environment but I think we’re going in the right direction and, for simple tasks, we may even very soon start to have embodied agents functioning in environments beyond the factory. I think it’s also a matter of adoption – as we start to have embodied agents in more human-crowded places like a café or a bar, we’ll make progress in our understanding of this difficulty of grounding machine learning in the real world and we’ll get better.
For example, if I take the field of computer vision, the more we started using it, the more people started to adopt the models – not forgetting of course the frameworks like TensorFlow or PyTorch that make adoption easier – and the more people use it the more we learn about the capability but also the limitations of the model and the better the paradigm becomes. So, to conclude, that’s why there is still a lot of ground to cover but we’re going in the right direction.
Host: Can you briefly describe the main topics that will be presented and discussed in the workshop?
Julien Perez: Safety and robustness are particularly prominent. Other ones relate to ‘efficient learning’ because, as we discussed before, robotics is a challenge with respect to trying to learn to accomplish a new task from the smallest possible number of demonstrations. There are also some papers related to visual locomotion – how mobile robots can evolve in an environment, move in an environment using visual sensors, for example. We also have research related to contact-rich manipulation tasks, as we said earlier – how a robot arm can for example assemble a piece of Ikea furniture… this kind of complex task.
Host: Looking at the speakers, the institutions of the authors and even the organisers of the workshop itself (with NAVER LABS, Google, Facebook but also MIT and Stanford), there’s a real mix of industry and academia. Are they both working on the same things in the field or is academia focussing on the more fundamental aspects and industry more on ‘the real world’?
Julien Perez: From the recent research I’ve seen I wouldn’t say so. Regarding the institutions you mention, I would say research is fairly balanced between theoretical elements, like guaranteeing safety or methods for efficient task acquisition, and some, I would say, very concrete tasks. And, by the way, working on concrete tasks can be a very good way to get a better understanding of the methodology you’re developing. So, at least as far as I know (and not only for the field of robotics), I don’t really see such a dichotomy between an industry that would only be interested in short-term, concrete technical achievements and an academia doing more theoretical research.
Host: What about the geographical balance? Is there a more dominant area or is it fairly balanced between the Americas, Europe and Asia?
Julien Perez: In terms of institutions I would say it’s the same kind of balance that we’ve already observed in other fields of machine learning and in AI in general, so we have, I would say, 3 important contributors, which are obviously North America, Asia and Europe. To increase opportunities for underrepresented groups, at the scale of the workshop, we’re financing dozens of registrations for them, and that includes geographically underrepresented groups. We hope that will help to continue to create more diversity, in a similar way to these kinds of movements in the general AI and machine learning community.
Host: I have one last question for you Julien. It’s a fast-moving field, but do you already know what you would like to see at next year’s workshop?
Julien Perez: A continuation in the desire to produce realistic paradigms and research results in the field of robotics. More submissions, more families of applications from an even bigger number of institutions, more datasets and, basically, to continue to build this community of what we now call ‘robotic learning’.
Host: Thank you very much for your time Julien and for sharing all the context. I’d like to thank all the organisers for making this workshop such a popular one. The 3rd Robot Learning workshop will take place on Friday 11th December starting at 7am Pacific time (4pm Central European time). All the details of the schedule can be found on the workshop website robot-learning.ml/2020/.
More information on NAVER LABS Europe at NeurIPS
- SuperLoss: A Generic Loss for Robust Curriculum Learning, Thibault Castells, Philippe Weinzaepfel and Jérôme Revaud
- Hard Negative Mixing for Contrastive Learning, Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel and Diane Larlus
- Deep Transformation-Invariant Clustering, Tom Monnier, Thibault Groueix and Mathieu Aubry
- Transformer-based meta-Imitation learning for robotic manipulation, Julien Perez, Theo Cachet, Seungsu Kim. 3rd Robot Learning Workshop: Grounding Machine Learning Development in the Real World
- Robust active learning strategies for model variability, Jose Mena Roldan, Matthias Galle. Workshop on Human and Model in the Loop Evaluation and Training Strategies (HAMLETS)
This article was first published on the blog of NAVER LABS Europe.