October 12, 2021

When most people think about humanoid robots, they instantly imagine robots like C-3PO from Star Wars, Rosie from The Jetsons, or Star Trek’s Data, all of which can handle multiple tasks. That contrasts with non-humanoid robots, which have found success doing a single task over and over again, usually faster than a human.

Much of the work being done with humanoid robots, however, deals with advanced machine perception: getting robots to understand the world around them better than before. One of the leaders in this space is Immervision, which launched the JOYCE robot as a way for computer vision experts to improve how machines can gain human-like perception. Earlier this year, Immervision teamed up with Hanson Robotics, creator of the humanoid robot Sophia, to find ways to improve robot vision.

Robotics World chatted with Patrice Roulet Fontani, the co-founder and vice president of technology at Immervision, about the future of humanoid robots and machine perception.

Robotics World: Much of the conversation and growth in robotics over the past few years has focused on non-humanoid robots that help humans get things done more easily, from mobile robots filling e-commerce orders, to agricultural robots weeding fields, to delivery robots bringing food or parcels to your doorstep. How much of an impact will the recent Tesla/Elon Musk announcement of its humanoid robot have on how we think about robots moving forward?

Fontani: The recent Tesla Bot announcement has once again put a spotlight on humanoid robots and the virtually limitless ways they can be used in the workplace and in our everyday lives. While there is a lot of fanfare about how Tesla’s robot could navigate its surroundings and eliminate repetitive, mundane or dangerous tasks, the reality is that there have been many advancements in robotics over the years to make humanoid robots possible.

Our launch of JOYCE – the world’s first humanoid robot developed by the machine perception community, and equipped with three ultra-wide-angle cameras – is a prime example. At Immervision, we’ve spent the last 20+ years in R&D and commercializing advanced vision systems, so that we can bring human-like sight and a visual cortex to technologies across consumer electronics, automotive, drones, robots and more.

As the computer vision community, enterprises, government organizations, and even the general public become aware of what humanoid robots can already achieve and of their future potential, we’ll see robots used in a growing variety of applications in the near future.

R-W: A lot of work with humanoid robotics is around social and emotional needs, such as an educational robot that interacts with children with autism, or robots that assist the elderly with daily tasks (social interaction to fight loneliness, or reminders on how to take pills). Can humanoid robots move beyond this category into other areas?

Fontani: Absolutely. As humanoid robots develop the ability to perceive their surroundings at a level on par with our own perception, we’re still scratching the surface of how robots can be used to automate tasks in every industry.

For example, robots can be used in hospitals for patient care, where contextual awareness could trigger an automated response by the robot to administer medication or emergency care to patients in need. Another example is in the manufacturing industry, where robots can fulfill logistics and warehouse requests at scale.

As technology advancements bring human-like vision to robotics, there will be numerous advantages in many areas, such as autonomous vehicles with self-parking features and in-cabin monitoring, as well as search-and-rescue drones that can cover large geographic areas in low light and tough weather conditions.

R-W: What innovations in computer vision have allowed humanoid robotics to advance? Better resolution for sure, such as facial recognition for identification, but has the software advanced enough that a robot can tell what kind of mood someone is in, or read the inflection in the tone of a question? I’m not sure robotic assistants understand sarcasm yet; will a humanoid robot be able to tell?

Fontani: Robotic perception is essential to the advancement of humanoid robots. Humanoid robots are trying to mimic our actions, and to accomplish that, they need sensors that act like a visual cortex, allowing them to collect and process the data required to “see, understand and act.”

The visual information received, integrated and processed is greatly affected by the optical design of the camera and how well it can identify objects in the environment. Most notably, widening the field of view to provide human-like sight, increasing resolution, and incorporating low-light visibility so robots can see in the dark as humans do are all essential for humanoid robotics to advance.

With superior optical design and high-quality, actionable sensor data, robots will reach a level of full autonomy where they can interpret their visible surroundings with pinpoint accuracy. As for sarcasm, how well a robot understands it will boil down to the data and AI algorithms it “learns” from over time.
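The trade-off Fontani alludes to between field of view and resolution can be made concrete with some rough arithmetic. The sketch below is a minimal illustration under simplified assumptions (an idealized, uniform lens mapping and example sensor widths, not JOYCE’s actual optics): for a fixed number of pixels, widening the field of view lowers the average pixel density per degree, which is why ultra-wide-angle designs also push for higher resolution.

```python
# Illustrative only: how average angular pixel density changes with field of view.
# Assumes an idealized, uniform lens mapping; real wide-angle lenses distribute
# pixels non-uniformly, and these numbers are not JOYCE's specifications.

def pixels_per_degree(horizontal_pixels: int, horizontal_fov_deg: float) -> float:
    """Average pixels available per degree of horizontal field of view."""
    return horizontal_pixels / horizontal_fov_deg

for fov_deg in (70, 120, 180):          # narrow lens vs. ultra-wide-angle lenses
    for width_px in (1920, 3840):       # Full HD vs. 4K sensor widths
        density = pixels_per_degree(width_px, fov_deg)
        print(f"{width_px} px across {fov_deg} deg -> {density:.1f} px/deg")
```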

R-W: The other big problem that people have with humanoid robots is that they have a preconceived notion in their head about what tasks they can perform. They picture robots like Rosie from The Jetsons (a robot maid that cooked, cleaned, did laundry, etc.), or Data from Star Trek, both of which can handle multiple tasks. Even the best robots now can really only focus on one or two tasks at most. How can roboticists change perceptions about what humanoid robots can do? Or is that the ultimate goal – to have a robot that can perform multiple tasks?

Fontani: I think it’s a challenge to change perceptions immediately, as we’re only seeing the tip of the iceberg of what robots can offer and how they can benefit society. There will certainly be some robots specifically designed for basic tasks (e.g., mundane activities like housework or stacking shelves in a factory).

At Immervision, however, our focus is on developing and designing the integral vision systems that can give robots the ability to truly perceive the world around them. Because we focus on the entire development of computer vision and machine perception, this could very well mean that robots can use these sensory inputs to perform multiple tasks and a variety of actions. Some examples include:

  • Being able to navigate in a complex environment surrounded by people and moving objects, like a robot nurse in a crowded emergency room dealing with patients.
  • Being able to locate and grasp tiny objects for complex day-to-day tasks, like finding a keyhole to unlock a door in the dark.
  • Providing telepresence for someone in a constrained environment – like an astronaut in a confined space station.

Ultimately, my view is that if we can enable robots to conduct automated tasks with a high level of safety and comfort, this will give us the opportunity to focus on new goals and aspirations.

R-W: In your experience, do humans need to adapt their behavior around robots as well? In some cases where robots interface with the public, people are either amazed and astonished, or they try to “torture” the robots – standing in their way to see if they move around, or other things like that. Will robots that look more like humans help with this behavior, or make it worse? Even in Star Wars, C-3PO was generally abused by the humans around him (although he was kind of annoying).

Fontani: Definitely. I believe we will have to adapt our behavior around robots until we reach a point where they have the same contextual awareness that we do. For example, I recently visited a factory where many tasks were conducted by autonomous robots. The employees running the production line were instructed to avoid red zones to reduce the likelihood of accidents. In this situation, the best option is to place factory robots in a secluded area or cage, given that the robots do not yet have the ability to perceive when humans are around them and react accordingly.

This is one of the problems we want to solve at Immervision, by providing a visual cortex that will help robots better perceive human behavior.

R-W: When you are working on robots, what kinds of robots do you take your inspiration from? Books, TV, movies, or previous robots in the world, or something else?

Fontani: I would say my first inspiration is from literature: Isaac Asimov and his Three Laws of Robotics:

  • A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  • A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  • A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

In this context, Asimov assumes that the robot has the technology to allow it to make the right decision, in the same way as a human being.

If we look at the robots available today – ranging from autonomous vacuum cleaners to robot taxis – there are many technology improvements that must be made to achieve this goal. It all starts with developing efficient machine perception technology to support the robot’s decision process.

R-W: We spoke with another robotic developer who said his ultimate goal would be to create a robot companion/friend that you get when you’re young (an educational robot that teaches you things), and then it evolves and changes as you grow up. So when you’re an adult, the robot becomes more of an assistant/companion, and then when you become a senior citizen the robot evolves again to become a helper-style robot. Do you see this as a possibility – where people buy a single robot, and that it grows/evolves and changes as you do?

Fontani: This is a great long-term goal to pursue. I see it as a kind of “evolution of species” approach.

If we look closer at this concept, we see different robots optimized for different tasks, but sharing the same developmental building blocks. We call these “software-defined” robots. The idea is simple – a robot whose features and functions are primarily enabled through software, shifting the “robot” from a product that is mainly hardware-based to a software-centric device.

As a concrete example, imagine a robot that has multiple “body casings” depending on the need and the task it performs (e.g., humanoid, vehicle, vacuum cleaner, underwater drone, flying drone), all sharing the same machine perception stack. In this example, the robot’s capabilities evolve through software.
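To make that idea a little more tangible, here is a minimal sketch of the software-defined pattern: one perception stack reused across interchangeable bodies, with behavior that can evolve purely in software. All of the names (PerceptionStack, HumanoidBody, and so on) are hypothetical illustrations, not Immervision’s or JOYCE’s actual interfaces.

```python
# Hypothetical sketch of a "software-defined" robot: one shared perception
# stack, multiple interchangeable bodies. Names are illustrative only and do
# not represent an Immervision or JOYCE API.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Percept:
    label: str           # what was detected, e.g. "person" or "keyhole"
    bearing_deg: float   # direction relative to the camera's optical axis


class PerceptionStack(Protocol):
    def perceive(self, frame: bytes) -> list[Percept]: ...


class StubStack:
    """Placeholder stack that pretends to detect one nearby person."""
    def perceive(self, frame: bytes) -> list[Percept]:
        return [Percept(label="person", bearing_deg=30.0)]


class HumanoidBody:
    def act(self, percepts: list[Percept]) -> None:
        for p in percepts:
            print(f"Humanoid: turning head {p.bearing_deg:+.0f} deg toward the {p.label}")


class VacuumBody:
    def act(self, percepts: list[Percept]) -> None:
        for p in percepts:
            print(f"Vacuum: steering around the {p.label} at {p.bearing_deg:+.0f} deg")


def run_step(stack: PerceptionStack, body, frame: bytes) -> None:
    """One perceive-then-act cycle; swapping `body` changes the casing, not the stack."""
    body.act(stack.perceive(frame))


# Same perception stack, different "body casings":
run_step(StubStack(), HumanoidBody(), frame=b"")
run_step(StubStack(), VacuumBody(), frame=b"")
```

In this sketch the bodies change but the perception code feeding them does not, which is the shift from hardware-defined to software-defined that Fontani describes.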

This idea was the motivation behind JOYCE – to create a humanoid robot with a complex reference design that can work with technology partners to enable state-of-the-art technologies for machine perception.

We are enabling this evolution from “hardware-defined” to “software-defined” by empowering the computer vision and machine perception community with the fundamental building block for perception: JOYCE’s “visual cortex.” Our visual cortex gives computer vision developers the capabilities to develop new machine perception functions for a wide range of robots, including humanoid robots. These technology partnerships are essential to unlocking the next generation of humanoid robots, which could inspire further innovation in this fast-growing field.
