Redefining Robots: Demystifying Next-Generation AI-Enabled Robotics


16 Jul, 2019

Photo by Franck V

This is the first in a series about the impact of robotics and artificial intelligence on various industries and the future of work.

In upcoming articles, we’ll talk about how deep reinforcement learning (DRL) unlocks the potential of robotics, the corresponding challenges and opportunities, and how all these will affect us in terms of productivity, employment, and life. Through these articles, we hope to encourage constructive and thoughtful discussions that can guide people through the AI hype and help us collectively make the right decisions about the world we want to live in during the age of artificial general intelligence (AGI).


When speaking about robots, people tend to imagine a wide range of different machines: Pepper, a social robot from SoftBank; Atlas, a humanoid from Boston Dynamics that can do backflips; the cyborg assassin from the Terminator movies; and the lifelike figures that populate the television series Westworld. People who are not familiar with the industry tend to hold polarized views. Either they have unrealistically high estimates of robots’ ability to mimic human-level intelligence, or they underestimate the potential of new research and technologies.

Over the past year, my friends in the venture, tech, and startup scenes have asked me what’s “actually” going on in deep reinforcement learning and robotics. They wonder: how are AI-enabled robots different from traditional ones? Do they have the potential to revolutionize various industries? What are their capabilities and limitations? These questions tell me how surprisingly challenging it can be to understand the current technological progress and industry landscape, let alone make predictions for the future. I am writing this article in a humble attempt to demystify AI and, in particular, deep reinforcement learning-enabled robotics: topics that we hear a lot about but understand only superficially, if at all. To begin, I’ll answer a basic question: what are AI-enabled robots and what makes them unique?

    “Machine learning addresses a class of questions that were previously ‘hard for computers and easy for people,’ or, perhaps more usefully, ‘hard for people to describe to computers.’” — Benedict Evans, a16z

The most important difference that AI brings to robotics is enabling a move away from automation (hard-programmed) to true autonomy (self-directed). You don’t really see the difference if the robot only does one thing. However, if a robot needs to handle a wide variety of tasks or respond to humans or changes in its environment, it needs a certain level of autonomy. We can borrow the definitions below, originally used to describe autonomous cars, to explain the evolution of robots.

Boston Dynamics Atlas

Level 0 — No Automation: people operate machines and no robots are involved. (Robots are generally defined as programmable machines capable of carrying out complex actions automatically).

Level 1 — Driver Assistance: a single function is automated, but it does not necessarily use information about the environment. This is how robots have traditionally been used in the automotive and manufacturing industries: programmed to repeatedly perform specific tasks with high precision and speed. Until now, most robots in the field have not been capable of sensing or adapting to changes in their environment.

Level 2 — Partial Automation: machine assists with certain functions using sensory input from the environment to make decisions. For example, robots can identify and handle different objects with a vision sensor. However, traditional computer vision requires pre-registration and clear instruction for each object and lacks the ability to deal with changes, surprises, or new objects.

Level 3 — Conditional Autonomy: the machine monitors the environment, but still requires a human’s attention and (instant) intervention.

Level 4 — High Autonomy: fully autonomous in certain situations or defined areas.

Level 5 — Complete Autonomy: fully autonomous in all situations.

Where Are We Now In Terms of Autonomy Level?

Today, most robots used in factories are open-loop, i.e., not feedback-controlled. That means their actions are independent of sensor feedback (level 1). Few robots in the field sense their environment and act on that feedback (level 2). A collaborative robot (cobot) is designed to be more versatile and able to work with humans; however, the trade-off is less power and lower speed, especially when compared to industrial robots. Although a cobot is relatively easier to program, it’s not necessarily autonomous. Human workers need to hand-guide a cobot every time the task or environment changes.
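To make the open-loop vs. feedback-controlled distinction concrete, here is a minimal sketch on a simulated one-dimensional arm. The `Arm` class and its 90% actuation efficiency are invented for illustration; the point is only how the two control styles respond to imperfect actuation.

```python
# Minimal sketch: open-loop (level 1) vs. closed-loop (level 2) control
# of a simulated 1-D arm. All classes and numbers are illustrative only.

class Arm:
    def __init__(self, position=0.0):
        self.position = position

    def move_by(self, delta):
        # Imperfect actuation: the arm achieves only 90% of each command.
        self.position += 0.9 * delta


def open_loop_reach(arm, target):
    """Level 1 style: issue one pre-computed command, never re-check."""
    arm.move_by(target - arm.position)
    return arm.position


def closed_loop_reach(arm, target, tolerance=0.001, max_steps=50):
    """Level 2 style: sense the remaining error each cycle and correct it."""
    for _ in range(max_steps):
        error = target - arm.position      # "sensor" reading
        if abs(error) < tolerance:
            break
        arm.move_by(error)                 # feedback correction
    return arm.position


print(open_loop_reach(Arm(), 1.0))   # undershoots: stops at 0.9
print(closed_loop_reach(Arm(), 1.0)) # converges close to 1.0
```

The open-loop arm simply inherits whatever error its actuation introduces, while the feedback loop drives the error toward zero — the same reason sensing robots can tolerate changes that blind ones cannot.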

We’ve begun to spot pilot projects with AI-enabled robots (level 3/4). Warehouse piece-picking is a good example. In shipping warehouses, human workers need to pick and place millions of different products into boxes based on customer requirements. Traditional computer vision cannot handle such a wide variety of objects, because each item needs to be registered and each robot needs to be programmed beforehand. However, deep learning and reinforcement learning now enable robots to learn to handle various objects with minimal help from humans. There might be some goods that robots have never encountered before, and the machines will need help or demonstrations from human workers (level 3). But the algorithms will improve and get closer to full autonomy as the robots collect more data and learn from trial and error (level 4).
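The trial-and-error idea can be illustrated with a toy sketch: a simple epsilon-greedy learner (a multi-armed bandit, far simpler than a real deep RL system) discovers which grasp strategy works best on an unfamiliar object family. The strategy names and success probabilities below are made up for illustration.

```python
import random

# Toy sketch of learning to grasp by trial and error. The grasp strategies
# and their "true" success rates are invented stand-ins for how well each
# approach works on an object the robot has never seen before.

TRUE_SUCCESS = {"top_grasp": 0.3, "side_grasp": 0.8, "pinch_grasp": 0.5}

def learn_grasp(trials=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    attempts = {g: 0 for g in TRUE_SUCCESS}
    successes = {g: 0 for g in TRUE_SUCCESS}
    for _ in range(trials):
        if rng.random() < epsilon:             # explore: try a random grasp
            grasp = rng.choice(list(TRUE_SUCCESS))
        else:                                  # exploit: best estimate so far
            grasp = max(TRUE_SUCCESS, key=lambda g:
                        successes[g] / attempts[g] if attempts[g] else 1.0)
        attempts[grasp] += 1
        if rng.random() < TRUE_SUCCESS[grasp]: # simulated pick outcome
            successes[grasp] += 1
    # Report the strategy with the best observed success rate.
    return max(TRUE_SUCCESS, key=lambda g:
               successes[g] / attempts[g] if attempts[g] else 0.0)

print(learn_grasp())  # converges on "side_grasp" after enough trials
```

The learner starts out ignorant (level 3 territory, where a human would step in) and improves purely from accumulated outcomes, which is the core dynamic that carries a piece-picking system toward level 4.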

Like the autonomous car industry, robotics startups are taking different approaches. Some believe in a collaborative future between humans and robots and focus on level 3. Others believe in a fully autonomous future, skipping level 3 and focusing on level 4 and eventually level 5. This is one reason why it’s so difficult to assess the actual level of autonomy. A startup could claim that it’s working on level 3 human-centered artificial intelligence (e.g. teleoperation) while the solution is actually a mechanical turk, with humans doing the work behind the scenes. On the other hand, startups targeting level 4/5 cannot achieve desirable results overnight, which could scare early adopters away and make data collection even more difficult in the early stages. I will talk more about my thinking around the different approaches and map startups as examples in the second half of this article.

Robots working at Amazon's warehouse

The Rise of AI-Enabled Robots: Warehouses and Beyond

The bright side is that, unlike cars, robots serve a far wider range of use cases and industries. As a result, level 4 is in some ways more accessible for robots than it is for cars. We will first see AI-enabled robots up and running in warehouses, because a warehouse is a semi-controlled environment and piece-picking is a critical but fault-tolerant task. Autonomous home or surgical robots will arrive much later, because their operating environments contain more uncertainty and some of their tasks are not reversible. We will see AI-enabled robots used across more scenarios and industries as the precision, accuracy, and reliability of the technology improve over time.

Currently, there are only around three million robots in the world, most of them working on handling, welding, and assembly tasks. So far, almost no robot arms are used in warehouses, agriculture, or industries other than automotive and electronics. The main reason is the limitations of traditional robots and computer vision mentioned above. Over the next few decades, we will see explosive growth and a changing industry landscape as deep learning, reinforcement learning, and the cloud unlock the potential of next-generation robots. Not all industries will adopt automation at the same pace, because of the incentives of incumbents and the technical complexities mentioned above.

Next Generation AI-Enabled Robotics Startup Landscape

What are some of the growth opportunities in the AI-enabled robotics sector? And what are the different approaches and business models taken up by startups and incumbents in this market? Below you will find an overview of some example companies in each segment. This is by no means a landscape that includes all companies and I welcome your input and feedback to make it more complete.

Vertical vs. Horizontal

The most interesting finding from looking into the startup scene is that there are two fundamentally different approaches. The first is vertical: most startups in Silicon Valley focus on developing solutions for specific vertical markets such as e-commerce fulfillment, manufacturing, or agriculture. This full-stack approach makes sense while the technology is still nascent. Instead of relying on others to supply critical modules or components, companies build the end-to-end solution, which is faster and gives them more control over the end use cases and performance.

However, scalable use cases are not that easy to identify. Warehouse piece-picking is low-hanging fruit, with relatively high customer willingness to pay and technical feasibility, and almost every warehouse has the same piece-picking needs. But in other sectors, like manufacturing, assembly tasks can vary factory by factory, and they require higher degrees of accuracy and speed than warehouse tasks. Even though machine learning allows robots to improve over time, robots that learn in this way still cannot match the accuracy of closed-loop robots, because the learning depends on trial and error. This is why startups such as Mujin and CapSen Robotics choose traditional computer vision over deep reinforcement learning. However, traditional computer vision requires every object to be registered beforehand and hence lacks the ability to scale and adapt to changes. Once deep reinforcement learning reaches the performance threshold and becomes the industry mainstream, this traditional approach could become irrelevant.

Another issue with these startups is that their valuations tend to be high. We often see Silicon Valley startups raising tens of millions of dollars without the promise of any significant revenue stream. It’s easy for entrepreneurs to paint a rosy picture of deep reinforcement learning, but the reality is that it will take years to get there. Venture capitalists bet on teams with strong talent and technology even though these companies are still far from generating revenue.

A more practical but rarer approach is to go horizontal: building tech stacks and enablers that can be used across different industries. We can simplify the robotics technology stack into three components: sensing (input), processing, and actuation (output), with development tools alongside them. I use the term processing loosely here to include everything that is not sensing or actuation: the controller, machine learning, operating system, and modules for robots. This is the segment I think has the most growth potential in the near future.

One pain point for robotics customers is that the market is extremely fragmented. Every robot maker has its own proprietary languages and interfaces, making it difficult for system integrators and end users to integrate robots with their systems. As the industry matures and more robots are used beyond automotive and electronics factories, we will need standard operating systems, protocols, and interfaces for better efficiency and shorter time to market. A number of startups in Boston are working on this modular approach. For example, Veo Robotics develops safety modules that allow robots and humans to work together, and Realtime Robotics provides solutions to accelerate motion planning.
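The value of a standard interface can be sketched in a few lines. The two “vendor” classes and their command syntaxes below are entirely invented; the point is that a task written against a common interface runs unchanged on either vendor’s robot, which is exactly what fragmented proprietary APIs prevent today.

```python
from abc import ABC, abstractmethod

# Sketch of a vendor-neutral robot interface. Both vendor classes and their
# command formats are made up for illustration.

class RobotArm(ABC):
    """Common interface a system integrator programs against."""
    @abstractmethod
    def move_to(self, x, y, z): ...
    @abstractmethod
    def grip(self, closed): ...

class VendorAArm(RobotArm):
    def move_to(self, x, y, z):
        return f"A-proto: PTP {x} {y} {z}"        # vendor A's motion command
    def grip(self, closed):
        return f"A-proto: GRIP {'ON' if closed else 'OFF'}"

class VendorBArm(RobotArm):
    def move_to(self, x, y, z):
        return f"B-proto: movej([{x},{y},{z}])"   # vendor B's scripting call
    def grip(self, closed):
        return f"B-proto: set_tool({int(closed)})"

def pick(arm: RobotArm):
    """One task definition, reusable across vendors."""
    return [arm.move_to(0.3, 0.1, 0.05), arm.grip(True),
            arm.move_to(0.3, 0.1, 0.2)]

print(pick(VendorAArm())[1])  # A-proto: GRIP ON
print(pick(VendorBArm())[1])  # B-proto: set_tool(1)
```

Without such an abstraction layer, the `pick` logic would have to be rewritten for every robot brand a factory deploys, which is the integration cost the modular startups are attacking.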

What other topics would you like to read about in AI and robotics? I’d love to hear more. Let me know in the comments.

More by Bastiane Huang


Wevolver 2022