IEEE/CAA Journal of Automatica Sinica

Advancements in Humanoid Robots: A Comprehensive Review and Future Prospects

DOI: 10.1109/JAS.2023.124140

  • Yuchuang Tong
  • Haotian Liu
  • Zhengtao Zhang

Yuchuang Tong (Member, IEEE) received the Ph.D. degree in mechatronic engineering from the State Key Laboratory of Robotics, Shenyang Institute of Automation (SIA), Chinese Academy of Sciences (CAS) in 2022. Currently, she is an Assistant Professor with the Institute of Automation, Chinese Academy of Sciences. Her research interests include humanoid robots, robot control, and human-robot interaction. Dr. Tong has authored more than ten publications in journals and conference proceedings in the areas of her research interests. She was the recipient of the Best Paper Award at the 2020 International Conference on Robotics and Rehabilitation Intelligence, the Dean’s Award for Excellence of CAS, and the CAS Outstanding Doctoral Dissertation Award.

Haotian Liu received the B.Sc. degree in traffic equipment and control engineering from Central South University in 2021. He is currently a Ph.D. candidate in control science and control engineering at the CAS Engineering Laboratory for Industrial Vision and Intelligent Equipment Technology, Institute of Automation, Chinese Academy of Sciences (IACAS) and the University of Chinese Academy of Sciences (UCAS). His research interests include robotics, intelligent control, and machine learning.

Zhengtao Zhang (Member, IEEE) received the B.Sc. degree in automation from the China University of Petroleum in 2004, the M.Sc. degree in detection technology and automatic equipment from the Beijing Institute of Technology in 2007, and the Ph.D. degree in control science and engineering from the Institute of Automation, Chinese Academy of Sciences in 2010. He is currently a Professor with the CAS Engineering Laboratory for Industrial Vision and Intelligent Equipment Technology, IACAS. His research interests include industrial vision inspection and intelligent robotics.

This paper provides a comprehensive review of the current status, advancements, and future prospects of humanoid robots, highlighting their significance in driving the evolution of next-generation industries. By analyzing various research endeavors and key technologies, encompassing ontology structure, control and decision-making, and perception and interaction, a holistic overview of the current state of humanoid robot research is presented. Furthermore, emerging challenges in the field are identified, emphasizing the necessity for a deeper understanding of biological motion mechanisms, improved structural design, enhanced material applications, advanced drive and control methods, and efficient energy utilization. The integration of bionics, brain-inspired intelligence, mechanics, and control is underscored as a promising direction for the development of advanced humanoid robotic systems. This paper serves as an invaluable resource, offering insightful guidance to researchers in the field, while contributing to the ongoing evolution and potential of humanoid robots across diverse domains.

  • Future trends and challenges
  • humanoid robots
  • human-robot interaction
  • key technologies
  • potential applications

Highlights

  • The current state, advancements and future prospects of humanoid robots are outlined
  • Fundamental techniques including structure, control, learning and perception are investigated
  • This paper highlights the potential applications of humanoid robots
  • This paper outlines future trends and challenges in humanoid robot research

  • Figure 1. Historical progression of humanoid robots.
  • Figure 2. The mapping knowledge domain of humanoid robots. (a) Co-citation analysis; (b) Country and institution analysis; (c) Cluster analysis of keywords.
  • Figure 3. The number of papers published each year.
  • Figure 4. Research status of humanoid robots.
  • Figure 5. Comparison of child-size and adult-size humanoid robots.
  • Figure 6. Potential applications of humanoid robots.
  • Figure 7. Key technologies of humanoid robots.

Humanoid Robots Are Getting to Work

Humanoids from Agility Robotics and seven other companies vie for jobs.

Agility Robotics’ Digit carries an empty tote to a conveyor in an Amazon research and development warehouse.

Ten years ago, at the DARPA Robotics Challenge (DRC) Trial event near Miami, I watched the most advanced humanoid robots ever built struggle their way through a scenario inspired by the Fukushima nuclear disaster. A team of experienced engineers controlled each robot, and overhead safety tethers kept them from falling over. The robots had to demonstrate mobility, sensing, and manipulation—which, with painful slowness, they did.

These robots were clearly research projects, but DARPA has a history of catalyzing technology with a long-term view. The DARPA Grand and Urban Challenges for autonomous vehicles, in 2005 and 2007, formed the foundation for today’s autonomous taxis. So, after DRC ended in 2015 with several of the robots successfully completing the entire final scenario, the obvious question was: When would humanoid robots make the transition from research project to a commercial product?

This article is part of our special report Top Tech 2024.

The answer seems to be 2024, when a handful of well-funded companies will be deploying their robots in commercial pilot projects to figure out whether humanoids are really ready to get to work.

One of the robots that made an appearance at the DRC Finals in 2015 was called ATRIAS, developed by Jonathan Hurst at the Oregon State University Dynamic Robotics Laboratory. In 2015, Hurst cofounded Agility Robotics to turn ATRIAS into a human-centric, multipurpose, and practical robot called Digit. Approximately the same size as a human, Digit stands 1.75 meters tall (about 5 feet, 8 inches), weighs 65 kilograms (about 140 pounds), and can lift 16 kg (about 35 pounds). Agility is now preparing to produce a commercial version of Digit at massive scale, and the company sees its first opportunity in the logistics industry, where it will start doing some of the jobs where humans are essentially acting like robots already.

Are humanoid robots useful?

“We spent a long time working with potential customers to find a use case where our technology can provide real value, while also being scalable and profitable,” Hurst says. “For us, right now, that use case is moving e-commerce totes.” Totes are standardized containers that warehouses use to store and transport items. As items enter or leave the warehouse, empty totes need to be continuously moved from place to place. It’s a vital job, and even in highly automated warehouses, much of that job is done by humans.

Agility says that in the United States, there are currently several million people working at tote-handling tasks, and logistics companies are having trouble keeping positions filled, because in some markets there are simply not enough workers available. Furthermore, the work tends to be dull, repetitive, and stressful on the body. “The people doing these jobs are basically doing robotic jobs,” says Hurst, and Agility argues that these people would be much better off doing work that’s more suited to their strengths. “What we’re going to have is a shifting of the human workforce into a more supervisory role,” explains Damion Shelton, Agility Robotics’ CEO. “We’re trying to build something that works with people,” Hurst adds. “We want humans for their judgment, creativity, and decision-making, using our robots as tools to do their jobs faster and more efficiently.”

For Digit to be an effective warehouse tool, it has to be capable, reliable, safe, and financially sustainable for both Agility and its customers. Agility is confident that all of this is possible, citing Digit’s potential relative to the cost and performance of human workers. “What we’re encouraging people to think about,” says Shelton, “is how much they could be saving per hour by being able to allocate their human capital elsewhere in the building.” Shelton estimates that a typical large logistics company spends at least US $30 per employee-hour for labor, including benefits and overhead. The employee, of course, receives much less than that.

Agility is not yet ready to provide pricing information for Digit, but we’re told that it will cost less than $250,000 per unit. Even at that price, if Digit is able to achieve Agility’s goal of minimum 20,000 working hours (five years of two shifts of work per day), that brings the hourly rate of the robot to $12.50. A service contract would likely add a few dollars per hour to that. “You compare that against human labor doing the same task,” Shelton says, “and as long as it’s apples to apples in terms of the rate that the robot is working versus the rate that the human is working, you can decide whether it makes more sense to have the person or the robot.”
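
As a rough sanity check on those numbers, the robot’s effective hourly rate is just the unit price amortized over its working life, plus the service contract. A minimal sketch of that arithmetic, using only the estimates quoted above (the $3-per-hour service figure is an assumption standing in for “a few dollars per hour”):

```python
# Back-of-the-envelope robot vs. human labor cost.
# All numbers are estimates quoted in the article, not official pricing.
UNIT_PRICE = 250_000        # upper bound on Digit's price, USD
LIFETIME_HOURS = 20_000     # Agility's goal: 5 years x 2 shifts per day
SERVICE_PER_HOUR = 3.00     # assumed "few dollars per hour" service contract
HUMAN_COST_PER_HOUR = 30.0  # Shelton's estimate, incl. benefits and overhead

robot_per_hour = UNIT_PRICE / LIFETIME_HOURS + SERVICE_PER_HOUR
print(f"Robot: ${robot_per_hour:.2f}/h vs. human: ${HUMAN_COST_PER_HOUR:.2f}/h")
# -> Robot: $15.50/h vs. human: $30.00/h, assuming equal work rates
```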

Agility’s robot won’t be able to match the general capability of a human, but that’s not the company’s goal. “Digit won’t be doing everything that a person can do,” says Hurst. “It’ll just be doing that one process-automated task,” like moving empty totes. In these tasks, Digit is able to keep up with (and in fact slightly exceed) the speed of the average human worker, when you consider that the robot doesn’t have to accommodate the needs of a frail human body.

Amazon’s experiments with warehouse robots

The first company to put Digit to the test is Amazon. In 2022, Amazon invested in Agility as part of its Industrial Innovation Fund, and late last year Amazon started testing Digit at its robotics research and development site near Seattle, Wash. Digit will not be lonely at Amazon—the company currently has more than 750,000 robots deployed across its warehouses, including legacy systems that operate in closed-off areas as well as more modern robots that have the necessary autonomy to work more collaboratively with people. These newer robots include autonomous mobile robotic bases like Proteus, which can move carts around warehouses, as well as stationary robot arms like Sparrow and Cardinal, which can handle inventory or customer orders in structured environments. But a robot with legs will be something new.

“What’s interesting about Digit is because of its bipedal nature, it can fit in spaces a little bit differently,” says Emily Vetterick, director of engineering at Amazon Global Robotics, who is overseeing Digit’s testing. “We’re excited to be at this point with Digit where we can start testing it, because we’re going to learn where the technology makes sense.”

Where two legs make sense has been an ongoing question in robotics for decades. Obviously, in a world designed primarily for humans, a robot with a humanoid form factor would be ideal. But balancing dynamically on two legs is still difficult for robots, especially when those robots are carrying heavy objects and are expected to work at a human pace for tens of thousands of hours. When is it worthwhile to use a bipedal robot instead of something simpler?

“The people doing these jobs are basically doing robotic jobs.” —Jonathan Hurst, Agility Robotics

“The use case for Digit that I’m really excited about is empty tote recycling,” Vetterick says. “We already automate this task in a lot of our warehouses with a conveyor, a very traditional automation solution, and we wouldn’t want a robot in a place where a conveyor works. But a conveyor has a specific footprint, and it’s conducive to certain types of spaces. When we start to get away from those spaces, that’s where robots start to have a functional need to exist.”

The need for a robot doesn’t always translate into the need for a robot with legs, however, and a company like Amazon has the resources to build its warehouses to support whatever form of robotics or automation it needs. Its newer warehouses are indeed built that way, with flat floors, wide aisles, and other environmental considerations that are particularly friendly to robots with wheels.

“The building types that we’re thinking about [for Digit] aren’t our new-generation buildings. They’re older-generation buildings, where we can’t put in traditional automation solutions because there just isn’t the space for them,” says Vetterick. She describes the organized chaos of some of these older buildings as including narrower aisles with roof supports in the middle of them, and areas where pallets, cardboard, electrical cord covers, and ergonomics mats create uneven floors. “Our buildings are easy for people to navigate,” Vetterick continues. “But even small obstructions become barriers that a wheeled robot might struggle with, and where a walking robot might not.” Fundamentally, that’s the advantage bipedal robots offer relative to other form factors: They can quickly and easily fit into spaces and workflows designed for humans. Or at least, that’s the goal.

Vetterick emphasizes that the Seattle R&D site deployment is only a very small initial test of Digit’s capabilities. Having the robot move totes from a shelf to a conveyor across a flat, empty floor is not reflective of the use case that Amazon ultimately would like to explore. Amazon is not even sure that Digit will turn out to be the best tool for this particular job, and for a company so focused on efficiency, only the best solution to a specific problem will find a permanent home as part of its workflow. “Amazon isn’t interested in a general-purpose robot,” Vetterick explains. “We are always focused on what problem we’re trying to solve. I wouldn’t want to suggest that Digit is the only way to solve this type of problem. It’s one potential way that we’re interested in experimenting with.”

The idea of a general-purpose humanoid robot that can assist people with whatever tasks they may need is certainly appealing, but as Amazon makes clear, the first step for companies like Agility is to find enough value performing a single task (or perhaps a few different tasks) to achieve sustainable growth. Agility believes that Digit will be able to scale its business by solving Amazon’s empty tote-recycling problem, and the company is confident enough that it’s preparing to open a factory in Salem, Ore. At peak production the plant will eventually be capable of manufacturing 10,000 Digit robots per year.

A menagerie of humanoids

Agility is not alone in its goal to commercially deploy bipedal robots in 2024. At least seven other companies are also working toward this goal, with hundreds of millions of dollars of funding backing them. 1X, Apptronik, Figure, Sanctuary, Tesla, and Unitree all have commercial humanoid robot prototypes.

Despite an influx of money and talent into commercial humanoid robot development over the past two years, there have been no recent fundamental technological breakthroughs that will substantially aid these robots’ development. Sensors and computers are capable enough, but actuators remain complex and expensive, and batteries struggle to power bipedal robots for the length of a work shift.

There are other challenges as well, including creating a robot that’s manufacturable with a resilient supply chain and developing the service infrastructure to support a commercial deployment at scale. The biggest challenge by far is software. It’s not enough to simply build a robot that can do a job—that robot has to do the job with the kind of safety, reliability, and efficiency that will make it desirable as more than an experiment.

There’s no question that Agility Robotics and the other companies developing commercial humanoids have impressive technology, a compelling narrative, and an enormous amount of potential. Whether that potential will translate into humanoid robots in the workplace now rests with companies like Amazon, who seem cautiously optimistic. It would be a fundamental shift in how repetitive labor is done. And now, all the robots have to do is deliver.

This article appears in the January 2024 print issue as “Year of the Humanoid.”

Evan Ackerman is a senior editor at IEEE Spectrum. Since 2007, he has written over 6,000 articles on robotics and technology. He has a degree in Martian geology and is excellent at playing bagpipes.

IEEE RAS Technical Committee for Humanoid Robotics

Humanoid robotics is an emerging and challenging research field, which has received significant attention in recent years and will continue to play a central role in robotics research and in many applications of the 21st century. Regardless of the application area, one of the common problems tackled in humanoid robotics is the understanding of human-like information processing and the underlying mechanisms of the human brain in dealing with the real world.

Ambitious goals have been set for future humanoid robotics. They are expected to serve as companions and assistants for humans in daily life and as ultimate helpers in man-made and natural disasters. By 2050, a team of humanoid robot soccer players shall win against the winner of the most recent World Cup. DARPA recently announced the next Grand Challenge in robotics: building robots that do things like humans in a world made for humans.

Considerable progress has been made in humanoid research, resulting in a number of humanoid robots able to move and perform well-designed tasks. Over the past decade of humanoid research, an encouraging spectrum of science and technology has emerged that has led to the development of highly advanced humanoid mechatronic systems endowed with rich and complex sensorimotor capabilities. Of major importance for the advancement of the field is, without doubt, the availability of reproducible humanoid robot systems, which have been used in recent years as common hardware and software platforms to support humanoid research. Many technical innovations and remarkable results by universities, research institutions, and companies are visible.

The major activities of the TC are reflected by the firmly established annual IEEE-RAS International Conference on Humanoid Robots, which is the internationally recognized prime event of the humanoid robotics community. The conference is sponsored by the IEEE Robotics and Automation Society. The level of interest in humanoid robotics research continues to grow, as evidenced by the increasing number of papers submitted to the conference. For more information, please visit the official website of the Humanoids TC: http://www.humanoid-robotics.org

Title: The MIT Humanoid Robot: Design, Motion Planning, and Control for Acrobatic Behaviors

Abstract: Demonstrating acrobatic behavior of a humanoid robot such as flips and spinning jumps requires systematic approaches across hardware design, motion planning, and control. In this paper, we present a new humanoid robot design, an actuator-aware kino-dynamic motion planner, and a landing controller as part of a practical system design for highly dynamic motion control of the humanoid robot. To achieve the impulsive motions, we develop two new proprioceptive actuators and experimentally evaluate their performance using our custom-designed dynamometer. The actuator's torque, velocity, and power limits are reflected in our kino-dynamic motion planner by approximating the configuration-dependent reaction force limits, and in our dynamics simulator by including actuator dynamics along with the robot's full-body dynamics. For landing control, we effectively integrate model-predictive control and whole-body impulse control by connecting them in a dynamically consistent way to accomplish both long-time-horizon optimal control and high-bandwidth full-body dynamics-based feedback. The actuators' torque output over the entire motion is validated based on the velocity-torque model, including battery voltage droop and back-EMF voltage. With the carefully designed hardware and control framework, we successfully demonstrate dynamic behaviors such as back flips, front flips, and spinning jumps in our realistic dynamics simulation.
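
The velocity-torque validation mentioned in the abstract can be illustrated with a textbook DC-motor model: available torque falls with joint speed because back-EMF eats into the (drooping) battery voltage. Below is a minimal sketch of that feasibility envelope; the motor constants are made-up illustrative values, not the MIT robot's actual parameters:

```python
import numpy as np

def available_torque(omega, v_batt, k_t=0.095, k_e=0.095, R=0.2, tau_peak=30.0):
    """Speed-dependent torque limit of a simple DC actuator model.

    omega:    joint speed (rad/s)
    v_batt:   battery voltage after droop under load (V)
    k_t, k_e: torque (Nm/A) and back-EMF (V*s/rad) constants -- illustrative
    R:        winding resistance (ohm); tau_peak: driver current limit (Nm)
    """
    i_max = (v_batt - k_e * omega) / R  # current the driver can still push
    return np.clip(k_t * i_max, 0.0, tau_peak)

# A planned (speed, torque) trajectory point is feasible only if the
# commanded torque stays under this envelope at every instant.
for w in (0.0, 20.0, 40.0):
    print(f"omega={w:5.1f} rad/s -> tau_max={available_torque(w, 22.0):.2f} Nm")
```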

Editorial: Humanoid Robots for Real-World Applications

  • 1 CNRS-AIST Joint Robotics Laboratory (JRL), IRL3218, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
  • 2 Electrical and Computer Engineering Department, Université de Sherbrooke, Sherbrooke, QC, Canada
  • 3 Florida Institute for Human and Machine Cognition, Pensacola, FL, United States

Editorial on the Research Topic Humanoid Robots for Real-World Applications

Since Honda introduced the P2 in 1996, numerous humanoid robots have been developed around the world, and research and development of various fundamental technologies, including bipedal walking, have been conducted. At the same time, attempts have been made to apply humanoid robots to various applications such as plant maintenance, telemedicine, vehicle operation, home security, construction, aircraft manufacturing, disaster response, evaluation of assistive devices, and entertainment.

Humanoid robots have an anthropomorphic body, and their major advantage is that they can move within an environment designed for humans and can use tools and vehicles designed for humans as they are. It is hoped that these advantages can be used to help people focus on more creative activities by replacing activities in harsh environments, hazardous tasks, and low added-value tasks that people are forced to perform because existing fixed, wheeled, or crawler-type robots are unable to deal with them.

In addition, because something shaped like a human that moves like a human naturally attracts people, humanoid robots can be expected to entertain and heal people by interacting with them. Body language also makes it easier for humans to understand a robot's “intention.” This is related to the avatar application introduced later.

Despite these expectations, even today, more than 25 years after the announcement of P2, no humanoid robot has been put to practical use beyond R&D and communication applications. This is because there is no necessity to use humanoid robots in structured environments like conventional factories, where existing robots can be easily applied, and the technology is still too immature to use humanoid robots in environments so unstructured that existing robots cannot deal with them.

This Research Topic introduces two efforts to improve the basic capabilities of humanoid robots and one effort to apply humanoid robots to remote services, with the aim of practical applications of humanoid robots.

Until now, almost all humanoid robots have used a method in which the joints are accurately position-controlled and position commands are updated using joint velocities calculated by inverse kinematics. Recently, methods that update the position commands by calculating joint accelerations using inverse dynamics, and methods that control the joint torques directly, have come into use. Ramuzat et al. implemented these three approaches on the same hardware platform and clarified the advantages and disadvantages of each. The method combining position control and inverse kinematics was found to be the least computationally intensive, while the method using torque control was confirmed to have advantages in terms of smoothness of trajectory tracking, energy consumption, and passivity. Recent improvements in computer performance have made it possible to perform inverse dynamics-based torque control at 1 kHz, and torque control may well become the mainstream of joint control in the future.
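
To make the first approach concrete, here is a minimal resolved-rate sketch of the position-control-plus-inverse-kinematics update: the task-space error is mapped to joint velocities through a damped pseudoinverse of the Jacobian and integrated into the position commands. This is a generic illustration, not the implementation of Ramuzat et al.:

```python
import numpy as np

def ik_position_update(q_cmd, J, x_err, dt, gain=2.0, damping=1e-3):
    """One resolved-rate step mapping task-space error to joint commands.

    q_cmd: current joint position commands, shape (n,)
    J:     task Jacobian J(q) at the current configuration, shape (m, n)
    x_err: task-space error (desired minus measured), shape (m,)
    """
    # Damped least-squares pseudoinverse for robustness near singularities
    dq = J.T @ np.linalg.solve(J @ J.T + damping * np.eye(J.shape[0]),
                               gain * x_err)
    return q_cmd + dq * dt  # integrated command sent to joint position loops
```

A torque-controlled joint, by contrast, bypasses the position loop entirely and receives torques computed by inverse dynamics from the desired accelerations, which is what yields the smoothness and passivity benefits noted above.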

Multi-contact technology is essential to enable humanoid robots to move in unstructured environments where it is difficult for existing robots to operate. By actively bringing various parts of the body into contact with the environment, humanoid robots can move in confined spaces that are inaccessible to wheeled robots with large footprints. One of the basic functions in multi-contact motion generation is the posture generator. This solves the problem of calculating joint angles that realize a given set of contacts without colliding with the environment or the robot itself, and because it is called frequently during multi-contact motion generation, it must be computationally fast. In the past, collision avoidance was often incorporated into the inverse kinematics solver as an inequality constraint. However, when there are many obstacles in close proximity, such as in narrow passages, the number of constraints increases and the computation slows down. Rossini et al. tackled this problem by proposing a method to generate a collision-free posture using an adaptive random velocity vector generator, and showed that it is effective especially in narrow environments.
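
Conceptually, a posture generator solves a small constrained optimization: find joint angles that realize the desired contacts (equality constraints) while every signed distance to the environment and to the robot's own links stays positive (inequality constraints). The toy sketch below shows that formulation with a general-purpose solver; it is a stand-in for the specialized solvers used in the cited works, and the residual functions are assumed to come from a kinematics/collision library:

```python
import numpy as np
from scipy.optimize import minimize

def generate_posture(q0, contact_residual, signed_distances, q_ref):
    """Toy posture generator: realize contacts, stay collision-free.

    contact_residual(q) -> (k,): zero when all desired contacts are realized
    signed_distances(q) -> (m,): distances to obstacles/self, must stay >= 0
    q_ref: a natural reference pose used to regularize the result
    """
    result = minimize(
        lambda q: np.sum((q - q_ref) ** 2),            # stay near q_ref
        q0,
        constraints=[
            {"type": "eq", "fun": contact_residual},    # contacts realized
            {"type": "ineq", "fun": signed_distances},  # no interpenetration
        ],
        method="SLSQP",
    )
    return result.x if result.success else None
```

The difficulty Rossini et al. address shows up here directly: every nearby obstacle adds rows to `signed_distances`, and in narrow passages the growing constraint set is exactly what slows such solvers down.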

Due in part to the influence of COVID-19, the use of avatar robots to provide remote services has attracted much attention in recent years. Baba et al. compared the performance and perceived workload of face-to-face service delivery and service delivery via avatars in a public space. They found no significant difference in performance, but interestingly found that the perceived workload was smaller when the service was provided via an avatar robot.

Further research and development are needed to enable humanoid robots to autonomously perform tasks that are currently difficult for other robots, and it is expected that it will take more time to achieve this goal. To promote the industrialization of humanoid robots as early as possible, deploying them as avatar robots is thought to be an effective path. Using the robot as an avatar will enable humans to perform tasks that are harsh or dangerous while remaining in a safe and comfortable environment. Moreover, it will also enable humans to compensate for the robot’s insufficient abilities, such as advanced situational awareness and higher-level decision making, through remote control. By starting the industrialization of humanoid robots in this manner and utilizing them in real field settings every day, a virtuous cycle can be expected to emerge whereby costs are reduced while reliability and autonomy are improved.

Author Contributions

FK drafted the manuscript. WS and RG made substantial intellectual contributions to the manuscript. All authors approved it for publication.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Keywords: humanoid robots, industrial robots, collaborative robots, avatar robot, whole-body control, posture generation

Citation: Kanehiro F, Suleiman W and Griffin R (2022) Editorial: Humanoid Robots for Real-World Applications. Front. Robot. AI 9:938775. doi: 10.3389/frobt.2022.938775

Received: 08 May 2022; Accepted: 17 May 2022; Published: 07 June 2022.

Copyright © 2022 Kanehiro, Suleiman and Griffin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Fumio Kanehiro, [email protected]

This article is part of the Research Topic

Humanoid Robots for Real-World Applications

Top 22 Humanoid Robots in Use Right Now

While many humanoid robots are still in the prototype phase or other early stages of development, a few have escaped research and development in the last few years, entering the real world as bartenders, concierges, deep-sea divers and companions for older adults. Some work in warehouses and factories, assisting humans in logistics and manufacturing. And others seem to offer more novelty and awe than anything else, conducting orchestras and greeting guests at conferences.

What Are Humanoid Robots?

How Are Humanoid Robots Being Used Today?

While more humanoid robots are being introduced into the world and making a positive impact in industries like logistics, manufacturing, healthcare and hospitality, their use is still limited, and development costs are high.

That said, the sector is expected to grow. The humanoid robot market was valued at $1.8 billion in 2023, according to research firm MarketsandMarkets, and is predicted to increase to more than $13 billion over the next five years. Fueling that growth and demand will be advanced humanoid robots with greater AI capabilities and human-like features that can take on more duties in the service industry, education and healthcare.

In light of recent investments, the dawn of complex humanoid robots may come sooner rather than later. For instance, AI robotics company Figure and ChatGPT-maker OpenAI formed a partnership that’s backed by investors like Jeff Bezos. Under the deal, OpenAI will likely adapt its GPT language models to suit the needs of Figure’s robots. And microchip manufacturer Nvidia revealed plans for Project GR00T, the goal of which is to develop a general-purpose foundation model for humanoid robots. These announcements come in the wake of Elon Musk and Tesla introducing the humanoid robot Optimus in 2022, although that robot remains in the production phase.

How Are Humanoid Robots Being Used?

  • Hospitality: Some humanoid robots, like KIME, are pouring and serving drinks and snacks for customers at self-contained kiosks in Spain. Some are even working as hotel concierges and in other customer-facing roles.
  • Education: Humanoid robots NAO and Pepper are working with students in educational settings, creating content and teaching programming.
  • Healthcare: Other humanoid robots are providing services in healthcare settings, like communicating patient information and measuring vital signs.

But before companies can fully unleash their humanoid robots, pilot programs testing their ability to safely work and collaborate alongside human counterparts on factory floors, in warehouses and elsewhere will have to be conducted.

It’s unclear how well humanoid robots will integrate into society and how well humans will accept their help. While some people will see the proliferation of these robots as creepy, dangerous or unneeded competition in the labor market, the potential benefits like increased efficiency and safety may outweigh many of the perceived consequences.

Either way, humanoid robots are poised to have a tremendous impact, and fortunately, there are already some among us that we can look to for guidance. Here are a few examples of the top humanoid robots working in our world today.

Examples of Humanoid Robots

Ameca (Engineered Arts)

Engineered Arts ’ latest and most advanced humanoid robot is Ameca , which the company bills as a development platform where  AI and machine learning systems can be tested. Featuring sensors that can track movement across the entirety of a room, along with face and multiple voice recognition capabilities, Ameca naturally interacts with humans and detects emotions and age. Ameca is able to communicate common expressions like astonishment and surprise, and gestures like yawning and shrugging. 

Alter 3 (Osaka University and mixi)

Dubbed Alter 3, the latest humanoid robot from Osaka University and mixi is powered by an artificial neural network and has an ear for music. Earlier iterations of Alter sang in an opera. Alter 3, which has enhanced sensors and an improved expressive ability and vocalization system for singing, went even further in 2020 by conducting an orchestra at the New National Theater in Tokyo and taking part in other live performances.

ARMAR-6 (Karlsruhe Institute of Technology)

ARMAR-6 is a humanoid robot developed by researchers at the Karlsruhe Institute of Technology in Germany to work in industrial settings. Capable of using drills, hammers and other tools, ARMAR-6 also features  AI technology allowing it to learn how to grasp objects and hand them to human co-workers. It’s also able to take on maintenance duties like wiping down surfaces and even has the ability to ask for help when needed. 

Apollo (Apptronik)

Apptronik’s Apollo is the result of the company building on its experience with previous robots, including its 2022 humanoid robot Astro. With the ability to carry up to 55 pounds, Apollo is designed to function in plants and warehouses and may expand into industries like retail and construction. An impact zone allows the robot to stop its motion when it detects nearby moving objects, while swappable batteries that last four hours each keep Apollo productive.

Atlas (Boston Dynamics)

Atlas is a leaping, backflipping humanoid robot designed by Boston Dynamics that uses depth sensors for real-time perception and model-predictive control technology to improve motion. Measuring 5 feet tall and weighing in at 196 pounds, Atlas has three onboard computers, 28 hydraulic joints, and moves at speeds of more than 5 miles per hour. Built with  3D-printed parts , Atlas is used by company roboticists as a research and design tool to increase human-like agility and coordination. In April 2024, Boston Dynamics announced plans to retire the hydraulic Atlas in favor of a new electric version that the company says will be stronger and have a broader range of motion.

Beomni (Beyond Imagination)

Created by Beyond Imagination , the humanoid robot Beomni is controlled remotely by “human pilots” donning virtual reality headsets and other wearable devices like gloves, while AI helps Beomni learn tasks so one day it can become autonomous. In 2022, Beyond Imagination CEO and co-founder Harry Kloor told Built In that he’s hopeful Beomni will transform the care older adults receive, while taking over more tedious and dangerous jobs in other industries. The company also agreed to supply 1,000 humanoid robots over five years to SELF Labs as part of a simulated farm game.

Digit (Agility Robotics)

Already capable of unloading trailers and moving packages, Digit, a humanoid robot from Agility Robotics, is poised to take on even more tedious tasks. With fully functioning limbs, Digit is able to crouch and squat to pick up objects, adjusting its center of gravity depending on size and weight, while surface plane-reading sensors help it find the most efficient path and circumvent whatever’s in its way. In 2019, Agility partnered with automaker Ford to test autonomous package delivery, and in 2022, the company raised $150 million from Amazon and other companies to help Digit enter the workforce.

Jiajia (University of Science and Technology of China)

Developed by researchers from the University of Science and Technology of China, Jiajia is the first humanoid robot to come out of China. Researchers spent three years developing Jiajia. Chen Xiaoping, who led the team behind the humanoid robot, told reporters during Jiajia’s 2016 unveiling that he and his team would soon work to make Jiajia capable of crying and laughing, the Independent reports. According to Mashable, its human-like appearance was modeled after five students from USTC.

KIME (Macco Robotics)

KIME, Macco Robotics’ humanoid robotic bartender, serves beer, coffee, wine, snacks, salads and more. Each KIME kiosk is able to dispense 253 items per hour and features a touchscreen and app-enabled ordering, plus a built-in payment system. Though unable to dispense the sage advice of a seasoned bartender, KIME is able to recognize its regular customers and pour two beers every six seconds.

Nadine (Nanyang Technological University)

Researchers from Nanyang Technological University in Singapore developed Nadine, a humanoid social robot with realistic skin, hair, facial expressions and upper body movements that’s able to work in a variety of settings. According to researchers, Nadine can recognize faces, speech, gestures and objects. It even features an affective system that models Nadine’s personality, emotions and mood. So far, Nadine has worked in customer service and led a bingo game for a group of older adults in Singapore.

NAO (Softbank Robotics)

Softbank Robotics’ first humanoid robot, NAO, works as an assistant for companies and organizations in industries ranging from healthcare to education. Only 2 feet tall, NAO features two 2D cameras for object recognition as well as four directional microphones and speakers, plus seven touch sensors, to better interact with people and its surrounding environment. With the ability to speak and converse in 20 languages, NAO is helping create content and teach programming in classrooms and working as an assistant and patient service representative in healthcare settings.

OceanOne (Stanford Robotics Lab)

A diving humanoid robot, OceanOne, from the Stanford Robotics Lab is exploring shipwrecks. In 2016, on its maiden voyage, OceanOne ventured to the Mediterranean Sea off the coast of France to explore the wreckage of La Lune, one of King Louis XIV’s ships, which sank in 1664. In its latest iteration, OceanOneK, Stanford’s humanoid robot is able to dive even deeper, reaching depths of 1,000 meters. Featuring haptic feedback and AI, OceanOneK can operate tools and other equipment, and has already explored underwater wreckage of planes and ships.

Pepper (Softbank Robotics)

Pepper is another humanoid robot from Softbank Robotics working in classrooms and healthcare settings. But unlike NAO,  Pepper is able to recognize faces and track human emotions. Pepper has worked as a  hotel concierge and has been used to monitor contactless care and communication for older adults during the pandemic. A professional baseball team in Japan even used a squad of Peppers to cheer on its players when the pandemic kept the team’s human fans at home.

Promobot (Promobot)

Promobot is a customizable humanoid robot that’s capable of working as a brand ambassador, concierge, tour guide, medical assistant and in other service-oriented roles. Equipped with facial recognition and chat functions, Promobot can issue keycards, scan and auto-fill documents, and print guest passes and receipts. As a concierge, Promobot integrates with a building’s security system and is able to recognize the faces of a building’s residents. At hotels, it can check guests in , and in healthcare settings, Promobot is able to measure key health indicators like blood sugar and blood oxygen levels.  

Robonaut 2 (NASA and General Motors)

Developed by NASA and General Motors, Robonaut 2 is a humanoid robot that works alongside human counterparts in space and on the factory floor. More than a decade ago, Robonaut 2 became the first humanoid robot to enter space, and it worked as an assistant on the International Space Station until 2018, when it returned to Earth for repairs. Today, Robonaut 2 is inspiring other innovations and advancements in robotics, like the RoboGlove and Aquanaut from the ocean robotics company Nauticus.

RoboThespian (Engineered Arts)

Another humanoid robot from Engineered Arts is RoboThespian, which features telepresence software that allows humans to remotely talk through the robot. With automated eye contact and micro-facial expressions, RoboThespian is able to perform for crowds and work in places like the Kennedy Space Center, where it answers questions about the Hubble Telescope from curious visitors.

Sophia (Hanson Robotics)

Hanson Robotics’ AI-powered humanoid robot Sophia has traveled the world, graced the cover of Cosmopolitan Magazine and addressed the United Nations. One of the more widely known humanoid robots, Sophia can process visual, emotional and conversational data to better interact with humans. Sophia has made multiple appearances on The Tonight Show, where she challenged host Jimmy Fallon to a game of rock-paper-scissors, expressed her disdain for nacho cheese and sang a short duet with Fallon. Sophia sees herself as “a personification of our dreams for the future of AI, as well as a framework for advanced AI and robotics research, and an agent for exploring human-robot experience in service and entertainment applications.”

Surena IV (University of Tehran)

Able to grab a water bottle, pose for a selfie and write its own name on a whiteboard, Surena IV is the latest humanoid robot from the University of Tehran. IEEE Spectrum reports that Surena IV has improved tracking capabilities and new hands that allow it to use power tools. It’s also able to adjust the angle and position of its feet, giving it an improved ability to navigate uneven terrain.

T-HR3 (Toyota)

Billed as a “remote avatar robot,” Toyota’s humanoid robot T-HR3 is controlled by humans clad in wearable devices. Introduced in 2017, T-HR3 aims to help the automaker expand its mobility services, according to Tomohisa Moridaira, the T-HR3 development team leader. Toyota envisions that T-HR3 will one day help around the house and assist in childcare, nursing and construction.

Walker (UBTECH Robotics)

With improved hand-eye coordination and autonomous navigation, Walker, a humanoid service robot by UBTECH Robotics, is able to safely climb stairs and balance on one leg. Robotics and Automation News reports that Walker is able to serve tea, water flowers and use a vacuum, showing off just how helpful this humanoid robot could be around the house.

Phoenix (Sanctuary AI)

Sanctuary AI’s sixth-generation robot, named Phoenix, is equipped with human-like hands and the ability to lift up to 55 pounds, making it useful for various roles in the workforce where there are labor shortages. Because Phoenix can be controlled, supervised and trained by humans, it goes beyond specific tasks and demonstrates the competence to complete tasks in various settings.

EVE (1X)

1X claims to be the first company to send an AI-powered humanoid robot into the workforce. The company’s robot, EVE, comes with strong grippers for hands, cameras that support panoramic vision and two wheels for mobility. Most importantly, EVE uses AI to learn new tasks and improve based on past experiences. With these abilities, EVE is on pace to spread into industries like retail, logistics and even commercial security.

Frequently Asked Questions

What are humanoid robots?

Humanoid robots look like humans and mimic human motions and actions to perform various tasks. Some humanoid robots even use materials that resemble human features, like skin and eyes, to appear friendlier. 

What are humanoid robots used for?

Humanoid robots are often used for customer service roles, including concierges, bartenders and greeters. Because of their human shape, humanoid robots can also assist with handling and carrying materials in warehouses and factories.

Perception for Humanoid Robots

  • Open access
  • Published: 28 November 2023
  • Volume 4, pages 127–140 (2023)

  • Arindam Roychoudhury (ORCID: orcid.org/0009-0001-7016-0523),
  • Shahram Khorshidi,
  • Subham Agrawal &
  • Maren Bennewitz

Purpose of Review

In the field of humanoid robotics, perception plays a fundamental role in enabling robots to interact seamlessly with humans and their surroundings, leading to improved safety, efficiency, and user experience. This study surveys the perception modalities and techniques employed in humanoid robots, including visual, auditory, and tactile sensing, by exploring recent state-of-the-art approaches for perceiving and understanding the internal state, the environment, objects, and human activities.

Recent Findings

Internal state estimation makes extensive use of Bayesian filtering methods and optimization techniques based on a maximum a-posteriori formulation, utilizing proprioceptive sensing. In the area of external environment understanding, with an emphasis on robustness and adaptability to dynamic, unforeseen environmental changes, the recent research discussed in this study has focused largely on multi-sensor fusion and machine learning, in contrast to hand-crafted, rule-based systems. Human-robot interaction methods have established the importance of contextual information representation and memory for understanding human intentions.

Summary

This review summarizes the recent developments and trends in the field of perception in humanoid robots. Three main areas of application are identified, namely, internal state estimation, external environment estimation, and human-robot interaction. The applications of diverse sensor modalities in each of these areas are considered and recent significant works are discussed.

Introduction

Perception is of paramount importance for robots to establish a model of their internal state as well as the external environment. These models allow the robot to perform its task safely, efficiently and accurately. Perception is facilitated by various types of sensors which gather both proprioceptive and exteroceptive information. Humanoid robots, especially those which are mobile, pose a difficult challenge for the perception process: mounted sensors are susceptible to jerky and unstable motions due to the very high degrees of freedom afforded by the high number of articulable joints present on a humanoid’s body, e.g., the legs, the hip, the manipulator arms or the neck.

We organize the main areas of perception in humanoid robots into three broad yet overlapping areas for the purposes of this survey, namely, state estimation for balance and joint configurations, environment understanding for navigation, mapping and manipulation, and finally human-robot interaction for successful integration into a shared human workspace; see Fig. 1. For each area we discuss the popular applications, the challenges, and recent methodologies used to surmount them.

Internal state estimation is a critical aspect of autonomous systems, particularly for humanoid robots, in order to address both low-level stability and dynamics and to serve as an auxiliary to higher-level tasks such as localization, mapping and navigation. Legged robot locomotion is particularly challenging given the inherent under-actuated dynamics and the intermittent contact switching with the ground during motion.

The application of external environment understanding has a very broad scope in humanoid robotics but can be roughly divided into navigation and manipulation. Navigation implies the movement of the mobile bipedal base from one location to another without collision thereby leaving the external environment configuration unchanged. On the other hand, manipulation is where the humanoid changes the physical configuration of its environment using its end-effectors.

It could be argued that human-robot interaction, or HRI, is a subset of environment understanding. However, we have separated the two areas based on their ultimate goals. The goal of environment understanding is to interact with inanimate objects, while the goal of HRI is to interact with humans. The challenges posed are different, though similar principles may be reused. Human detection, gesture and activity recognition, teleoperation, object handover and collaborative actions, and social communication are some of the main areas where perception is used.

Fig. 1: Perception for humanoid robots split into three principal areas. Left: state estimation being used to estimate derived quantities like the CoM and ZMP from sensors like IMUs and joint encoders. Right: environment understanding has a very broad scope, varying from localization and mapping to environment segmentation for planning and other application areas; human-robot interaction is closely related but deals exclusively with human beings rather than inanimate objects. Center: some sensors which aid in perception for humanoid robots. Sources for labeled images: (a) [1], (b) [2], (c) [3]

State Estimation

Recent works on humanoid and legged robot locomotion control have focused extensively on state-feedback approaches [4]. Legged robots have highly nonlinear dynamics, and they need high-frequency (\(1\,\mathrm{kHz}\)) and low-latency (\(<1\,\mathrm{ms}\)) feedback in order to have robust and adaptive control systems, which adds complexity to the design and development of reliable estimators for the base and centroidal states and for contact detection.

Challenges in State Estimation

Perceived data is often noisy and biased, and these errors get magnified in derived quantities. For instance, joint velocities tend to be noisier than joint positions, as they are obtained by numerically differentiating joint encoder values. Rotella et al. [5] developed a method to determine the joint velocities and accelerations of a humanoid robot using link-mounted Inertial Measurement Units (IMUs), resulting in less noise and delay compared to filtered velocities from numerical differentiation. An effective approach to mitigate biased IMU measurements is to explicitly introduce the biases as estimated states in the estimation framework [6, 7].
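
The noise amplification from numerical differentiation is easy to demonstrate: differencing divides the encoder noise by the (small) timestep. A quick synthetic illustration, with assumed noise and rate values chosen only for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1e-3                                        # 1 kHz encoder sampling
t = np.arange(0.0, 1.0, dt)
q_true = 0.5 * np.sin(2 * np.pi * t)             # true joint angle (rad)
q_meas = q_true + rng.normal(0.0, 1e-4, t.size)  # 0.1 mrad encoder noise

qd_true = np.pi * np.cos(2 * np.pi * t)          # true joint velocity
qd_fd = np.gradient(q_meas, dt)                  # finite-difference velocity

print("position noise std:", np.std(q_meas - q_true))  # ~1e-4 rad
print("velocity noise std:", np.std(qd_fd - qd_true))  # ~7e-2 rad/s
# The differentiated signal is almost three orders of magnitude noisier,
# which is why link-mounted IMUs or filtering are preferred in practice.
```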

The high dimensionality of humanoids makes it computationally expensive to formulate a single filter for the entire state. As an alternative, Xinjilefu et al. [8] proposed decoupling the full state into several independent state vectors, and used separate filters to estimate the pelvis state and the joint dynamics.

To account for kinematic modeling errors such as joint backlash and link flexibility, Xinjilefu et al. [9] introduced a method using a Linear Inverted Pendulum Model (LIPM) with an offset that represents the modeling error in the Center of Mass (CoM) position and/or external forces. Bae et al. [10] proposed a CoM kinematics estimator that includes a spring and damper in the LIPM to compensate for modeling errors. To address the issue of link flexibility in the humanoid exoskeleton Atalante, Vigne et al. [11] decomposed the full state estimation problem into several independent attitude estimation problems, each corresponding to a given flexibility and a specific IMU, relying only on dependable and easily accessible geometric parameters of the system rather than on the dynamic model.
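
In generic form (our notation; the formulations in [9] and [10] differ in their details), such an offset-augmented LIPM can be written as

\[ \ddot{c} = \frac{g}{z_c}\left(c - p_{\mathrm{zmp}}\right) + \delta, \]

where \(c\) is the horizontal CoM position, \(z_c\) the assumed-constant CoM height, \(p_{\mathrm{zmp}}\) the measured ZMP, and \(\delta\) an estimated offset absorbing CoM modeling errors and/or unmodeled external forces.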

In the remainder of this section, we classify the recent related works on state estimation into three main categories [12]: proprioceptive state estimation, which primarily involves filtering methods that fuse high-frequency proprioceptive sensor data; multi-sensor fusion filtering, which integrates exteroceptive sensor modalities into the filtering process; and multi-sensor fusion with state smoothing, which employs advanced techniques that leverage the entire history of sensor measurements to refine the estimated states.

Finally, we present a list of available open-source software for state estimation from reviewed literature in Table  1 .

Proprioceptive State Estimation

Proprioceptive sensors provide measurements of the robot's internal state. They are commonly used to compute leg odometry, which yields a pose estimate that drifts over time. For a comprehensive review of the evolution of proprioceptive filters for leg odometry, refer to [22] and [23].

Base State Estimation

In humanoid robots, the focus is on estimating the position, velocity, and orientation of the “base” frame, typically located at the pelvis. Recent state estimation approaches in this field often fuse IMU and leg odometry.

The work by Bloesch et al. [6] was a decisive step in introducing a base state estimator for legged robots using a quaternion-based Extended Kalman Filter (EKF) approach. This method made no assumptions about the robot's gait, the number of legs, or the terrain structure, and included the absolute positions of the foot contact points and the IMU bias terms in the estimated states. Rotella et al. [7] extended it to humanoid platforms by considering the full foot plate and adding the foot orientation to the state vector. Both works showed that as long as at least one foot remains in contact with the ground, the base absolute velocity, roll and pitch angles, and IMU biases are observable. Other formulations for base state estimation using only proprioceptive sensing can be found in [16, 24], and [25].
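
To make the shared structure of these filters concrete, the sketch below shows only the strapdown IMU prediction step for the base state. It is a minimal illustration in our own notation, not the filter of [6] or [7]; the covariance propagation and the measurement update from leg kinematics are omitted:

```python
import numpy as np

def so3_exp(w):
    """Rotation matrix from a rotation vector (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def imu_predict(p, v, R, a_m, w_m, b_a, b_w, dt,
                g=np.array([0.0, 0.0, -9.81])):
    """One prediction step for base position p, velocity v, orientation R.

    a_m, w_m : raw accelerometer / gyroscope readings (body frame)
    b_a, b_w : current bias estimates, kept as filter states and
               corrected later by the leg-odometry measurement update.
    """
    a = R @ (a_m - b_a) + g            # specific force -> world acceleration
    p_next = p + v * dt + 0.5 * a * dt**2
    v_next = v + a * dt
    R_next = R @ so3_exp((w_m - b_w) * dt)
    return p_next, v_next, R_next
```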

Centroidal State Estimation

Centroidal states in humanoid robots include the CoM position, linear and angular momentum, and their derivatives. The CoM serves as a vital control variable for stability and robust humanoid locomotion, making accurate estimation of centroidal states crucial in control system design for humanoid robots.

When the full 6-axis contact wrench is not directly available to the estimator, e.g., when the robot's gauge sensors measure only the contact normal force, some works have utilized simplified models of the dynamics, such as the LIPM [26].

Piperakis et al. [27] presented an EKF to estimate centroidal variables by fusing joint encoders, IMU, and foot-sensitive resistors, later also including visual odometry [13]. They formulated the estimator based on the nonlinear Zero Moment Point (ZMP) dynamics, which capture the coupling between the dynamic behavior in the frontal and lateral planes. Their results showed better performance than Kalman filter formulations based on the LIPM.

Mori et al. [ 28 ] proposed a centroidal state estimation framework for a humanoid robot based on real-time inertial parameter identification, using only the robot’s proprioceptive sensors (IMU, foot Force/Torque (F/T) sensors, and joint encoders), and the sequential least squares method. They conducted successful experiments deliberately altering the robot’s mass properties to demonstrate the robustness of their framework against dynamic inertia changes.

With 6-axis F/T sensors on the feet, Rotella et al. [29] utilized the momentum dynamics of the robot to estimate the centroidal quantities. Their nonlinear observability analysis demonstrated that either the biases or the external wrench are observable. In a different approach, Carpentier et al. [30] proposed a frequency analysis of the information sources used in estimating the CoM position, later extended to the CoM acceleration and the derivative of the angular momentum [31]. They introduced a complementary filtering technique that fuses various measurements, including the ZMP position, sensed contact forces, and a geometry-based reconstruction of the CoM from joint encoders, according to their reliability in the respective spectral bandwidth.
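
The sketch below illustrates the basic complementary-filter pattern behind such spectral fusion (illustrative only; the cited works derive the crossover from a careful reliability analysis rather than a single first-order filter):

```python
import numpy as np

def complementary_com(com_kin, com_dyn, fc, dt):
    """Fuse two CoM estimates by their spectral reliability.

    com_kin : CoM from joint encoders + kinematics (reliable at low frequency)
    com_dyn : CoM reconstructed from contact forces/ZMP (reliable at high
              frequency)
    fc      : crossover frequency [Hz]
    """
    alpha = dt / (dt + 1.0 / (2.0 * np.pi * fc))   # first-order filter gain
    lp = np.empty_like(com_kin)
    hp = np.empty_like(com_dyn)
    lp[0], hp[0] = com_kin[0], 0.0
    for k in range(1, len(com_kin)):
        lp[k] = lp[k-1] + alpha * (com_kin[k] - lp[k-1])             # low-pass
        hp[k] = (1.0 - alpha) * (hp[k-1] + com_dyn[k] - com_dyn[k-1])  # high-pass
    return lp + hp
```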

Fig. 2: State estimation with multi-sensor filtering, integrating LiDAR for drift correction and localization. Top row: filtering people from the raw point cloud. Bottom row: state estimation and localization with iterative-closest-point correction on the filtered point cloud. From [12]

Contact Detection and Estimation

Foot contact detection plays a crucial role in locomotion control, gait planning, and proprioceptive state estimation in humanoid robots. Recent approaches can be categorized into two main groups: those directly utilizing measured ground reaction wrenches, and methods integrating kinematics and dynamics to infer the contact status by estimating the ground reaction forces. Fallon et al. [2] employed a Schmitt trigger with a 3-axis foot F/T sensor to classify contact forces and used a simple state machine to determine the most reliable foot for kinematic measurements. Piperakis et al. [13] adopted a similar approach by utilizing pressure sensors on the foot.
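
A Schmitt trigger is simply a threshold with hysteresis; a minimal sketch of such a contact classifier (the thresholds are hypothetical placeholders, not values from [2]) looks as follows:

```python
def schmitt_contact(fz, on_thresh=60.0, off_thresh=20.0, in_contact=False):
    """Hysteresis-based foot contact classification from normal force fz [N].

    Hysteresis (on_thresh > off_thresh) prevents chattering around a single
    threshold; real values depend on robot weight and sensor noise.
    """
    states = []
    for f in fz:
        if not in_contact and f > on_thresh:
            in_contact = True
        elif in_contact and f < off_thresh:
            in_contact = False
        states.append(in_contact)
    return states
```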

Rotella et al. [32] presented an unsupervised method for estimating contact states by applying fuzzy clustering to proprioceptive sensor data alone (foot F/T and IMU sensing), surpassing traditional approaches based on the measured normal force. Including the joint encoders in the proprioceptive sensing, Piperakis et al. [20] proposed an unsupervised learning framework for gait phase estimation, demonstrating its effectiveness on uneven/rough terrain walking gaits. They also developed a deep learning framework utilizing F/T and IMU sensing in each leg to determine contact state probabilities [33]; the generalizability and accuracy of their approach were demonstrated on different robotic platforms. Furthermore, Maravgakis et al. [34] introduced a probabilistic contact detection model using only IMU sensors mounted on the end effectors. Their approach estimated the contact state of the feet without requiring training data or ground-truth labels.

Another active research field in humanoid robotics is monitoring and identifying contact points on the robot's body. Common approaches focus on proprioceptive sensing for contact localization and identification. Flacco et al. [35] proposed using a momentum-based residual to isolate and identify single contacts, along with detecting additional contacts at known locations. Manuelli et al. [36] introduced a contact particle filter for detecting and localizing external contacts using only proprioceptive sensing, such as 6-axis F/T sensors, capable of handling up to three contacts efficiently. Vorndamme et al. [37] developed a real-time method for multi-contact detection using 6-axis F/T sensors distributed along the kinematic chain, capable of handling up to five contacts. Vezzani et al. [38] proposed a memory unscented particle filter algorithm for real-time 6-degrees-of-freedom (DoF) tactile localization using contact point measurements made by tactile sensors.

Multi-Sensor Fusion Filtering

One drawback of base state estimation using proprioceptive sensing is the accumulation of drift over time due to sensor noise. This drift is not acceptable for controlling highly dynamic motions, so it is typically compensated for by integrating other sensor modalities from exteroceptive sensors, such as cameras, depth cameras, and LiDAR.

Fallon et al. [2] proposed a drift-free base pose estimation method by incorporating LiDAR sensing into a high-rate EKF estimator using a Gaussian particle filter for laser scan localization. Although their framework eliminated the drift, a pre-generated map was required as input. Piperakis et al. [39] introduced a robust Gaussian EKF to handle outlier detection in visual/LiDAR measurements for humanoid walking in dynamic environments. To address state estimation challenges in real-world scenarios, Camurri et al. [12] presented Pronto, a modular open-source state estimation framework for legged robots (Fig. 2). It combines proprioceptive and exteroceptive sensing, such as stereo vision and LiDAR, using a loosely-coupled EKF approach.

Multi-Sensor Fusion with State Smoothing

So far, we have explored filtering methods based on Bayesian filtering for sensor fusion and state estimation. However, as the number of states and measurements increases, computational complexity becomes a limitation. Recent advancements in computing power and nonlinear solvers have popularized non-linear iterative maximum a-posteriori (MAP) optimization techniques, such as factor graph optimization.
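
The following toy example conveys the idea in plain SciPy, independently of the cited frameworks: every measurement contributes a weighted residual (a "factor"), and MAP estimation solves one nonlinear least-squares problem over the entire state history:

```python
import numpy as np
from scipy.optimize import least_squares

# Toy 1-D "factor graph": five poses, odometry factors between consecutive
# poses, a prior on the first pose, and one absolute (loop-closure-like)
# measurement on the last pose. All numbers are illustrative.
odom = [1.1, 0.9, 1.05, 1.0]      # relative measurements x[k+1] - x[k]
prior = 0.0                        # absolute prior on x[0]
absolute = 4.0                     # absolute measurement on x[4]

def residuals(x):
    r = [(x[0] - prior) / 0.01]                                  # prior factor
    r += [(x[k+1] - x[k] - odom[k]) / 0.05 for k in range(4)]    # odometry
    r += [(x[4] - absolute) / 0.02]                              # absolute
    return np.array(r)

sol = least_squares(residuals, x0=np.zeros(5))
print(sol.x)   # smoothed trajectory consistent with all factors
```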

To address the issue of visual tracking loss in visual factor graphs, Hartley et al. [ 40 ] introduced a factor graph framework that integrated forward kinematic and pre-integrated contact factors. The work was extended by incorporating the influence of contact switches and associated uncertainties [ 41 ]. Both works showed that the fusion of contact information with IMU and vision data provides a reliable odometry system for legged robots.

Solá et al. [18] presented an open-source modular estimation framework for mobile robots based on factor graphs. Their approach offers systematic methods to handle the complexities arising from multi-sensory systems with asynchronous data sources running at different frequencies. This framework was evaluated on state estimation for legged robots and on landmark-based visual-inertial SLAM for humanoids by Fourmy et al. [26].

Environment Understanding

Environment understanding is a critical area of research for humanoid robots, enabling them to effectively navigate through and interact with complex and dynamic environments. This field can be broadly classified into two key categories: (1) localization, navigation, and planning for the mobile base, and (2) object manipulation and grasping.

Perception in Localization, Navigation and Planning

Localization focuses on precisely and continuously estimating the robot’s position and orientation relative to its environment. Planning and navigation involve generating optimal paths and trajectories for the robot to reach its desired destination while avoiding obstacles and considering task-specific constraints.

Localization, Mapping and SLAM

Localization and SLAM (simultaneous localization and mapping) rely primarily on visual sensors such as cameras and lasers, but often additionally use encoders and IMUs to enhance estimation accuracy.

Localization

Indoor environments are usually considered structured, characterized by the presence of well-defined, repeatable, and often geometrically consistent objects. Landmarks can be uniquely identified by encoded vectors obtained from visual sensors such as depth or RGB cameras, allowing the robot to essentially build up a visual map of the environment and then compare newly observed landmarks against a database to localize via object or landmark identification. In recent years, the use of handcrafted image features such as SIFT and SURF and feature dictionaries such as the Bag-of-Words (BoW) model for landmark representation has been superseded by feature representations learned through training on large example sets, usually by variants of artificial neural networks such as convolutional neural networks (CNNs). CNNs have also outperformed classifiers such as support vector machines (SVMs) in deriving inferences [42, 43]. However, several rapidly evolving CNN architectures exist. Ovalle-Magallanes et al. [44] performed a comparative study of four such networks while successfully localizing in a visual map.
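
At its core, such learned localization reduces to matching the embedding of the current view against a database; a minimal sketch (the function and variable names are our own, not from any cited system) could look like this:

```python
import numpy as np

def localize(query_emb, map_embs, map_places, min_sim=0.8):
    """Match a CNN embedding of the current view against a visual map.

    map_embs  : (N, D) array of L2-normalized landmark embeddings
    map_places: list of N place labels corresponding to map_embs
    query_emb : (D,) L2-normalized embedding of the current camera image
    Returns (place, similarity) of the best match, or None if the best
    similarity falls below min_sim.
    """
    sims = map_embs @ query_emb            # cosine similarity per landmark
    best = int(np.argmax(sims))
    return (map_places[best], sims[best]) if sims[best] >= min_sim else None
```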

The RoboCup Soccer League is popular in humanoid research due to the visual identification and localization challenges it presents. [45, 46] and [47] are examples of real-time, CNN-based ball detection approaches utilizing RGB cameras developed specifically for RoboCup. Cruz et al. [48] could additionally estimate player poses, goal locations, and other key pitch features using intensity images alone. Due to the low on-board computational power of the humanoids, others have used fast, low-power external mobile GPU boards such as the Nvidia Jetson to aid inference [47, 49].

Unstructured and semi-structured environments are encountered outdoors or in hazardous and disaster rescue scenarios. Such environments offer few reliably trackable features, exhibit unpredictable lighting conditions, and make gathering training data challenging. Thus, instead of features, researchers have focused on raw point clouds or on combining different sensor modalities for navigating such environments. Starr et al. [50] presented a sensor fusion approach which combined long-wavelength infrared stereo vision and a spinning LiDAR for accurate rangefinding in smoke-obscured environments. Nobili et al. [51] successfully localized robots constrained by a limited field-of-view LiDAR in a semi-structured environment. They proposed a novel strategy for tuning outlier filtering based on point cloud overlap, which achieved good localization results in the DARPA Robotics Challenge Finals. Raghavan et al. [52] presented simultaneous odometry and mapping by fusing LiDAR and kinematic-inertial data from IMU, joint encoders, and foot F/T sensors while navigating a disaster environment.

SLAM subsumes localization through its additional map construction and loop closing aspects, whereby the robot has to re-identify a previously visited place, match it to its current surroundings, and adjust its pose history and recorded landmark locations accordingly. A humanoid robot intended to share human workspaces needs to deal with moving objects, both rapid and slow, which can disrupt its mapping and localization capabilities. Thus, recent works on SLAM have focused on handling the presence of dynamic obstacles in visual scenes. While the most popular approach remains sensor fusion [53, 54], purely visual approaches have also been proposed, such as [55], which introduced a dense RGB-D SLAM solution that utilized optical flow residuals to achieve accurate and efficient dynamic/static segmentation for camera tracking and background reconstruction. Zhang et al. [56] took a more direct approach, employing deep-learning-based human detection and graph-based segmentation to separate moving humans from the static environment. They further presented a SLAM benchmark dedicated to dynamic-environment SLAM solutions [57], which includes RGB-D data acquired from an on-board camera on the HRP-4 humanoid robot, along with other sensor data. Adapting publicly available SLAM solutions and tailoring them for humanoid use is not uncommon: Sewtz et al. [58] adapted ORB-SLAM [59] to a multi-camera setup on the DLR Rollin' Justin system, while Ginn et al. [60] did the same for the iGus, a midsize humanoid platform, to keep computational demands low.

Navigation and Planning

Navigation and planning algorithms use perception information to generate a safe, optimal and reactive path, considering obstacles, terrain, and other constraints.

Local Planning

Local planning or reactive navigation is generally concerned with local real-time decision-making and control, allowing the robot to actively respond to perceived changes in the environment and adjust its movements accordingly. Especially in highly controlled applications, rule-based, perception-driven navigation is still popular and yields state-of-the-art performance both in terms of time demands and task accomplishment. Bista et al. [61] achieved real-time navigation in indoor environments by representing the environment with key RGB images and deriving a control law based on common line segments and feature points between the current image and nearby key images. Regier et al. [62] determined appropriate actions based on a pre-defined set of mappings between object class and action, using a CNN to classify objects from monocular RGB vision. Ferro et al. [63] integrated information from a monocular camera, joint encoders, and an IMU to generate a collision-free visual servo control scheme. Juang et al. [64] developed a line follower able to infer forward, lateral, and angular velocity commands from monocular RGB images using path curvature estimation and PID control (see the sketch below). Magassouba et al. [65] introduced an aural servo framework based on auditory perception, enabling robot motions to be directly linked to low-level auditory features through a feedback loop.
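
As an illustration of the curvature-based pattern above, a hedged sketch of velocity commands with PID heading control (gains and the image-derived inputs are hypothetical placeholders, not the controller of [64]) might look like this:

```python
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# The lateral offset and path curvature would come from the image pipeline;
# here they are placeholders. Angular velocity corrects the offset, while
# forward speed is reduced in tight curves.
pid_yaw = PID(kp=1.2, ki=0.0, kd=0.1, dt=0.05)

def velocity_commands(lateral_offset, curvature, v_max=0.3):
    wz = pid_yaw.step(-lateral_offset)          # steer back onto the line
    vx = v_max / (1.0 + 5.0 * abs(curvature))   # slow down in curves
    return vx, wz
```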

We also see the use of a diverse array of classifiers to learn navigation schemes from perception information. Their generalization capability allows adaptation to unforeseen obstacles and events in the environment. Abiyev et al. [66] presented a vision-based path-finding algorithm which segregated captured images into free and occupied areas using an SVM. Lobos-Tsunekawa et al. [67] and Silva et al. [68] proposed deep-learned visual (RGB) navigation systems for humanoid robots that achieved real-time performance. The former used a reinforcement learning (RL) system with an actor-critic architecture, while the latter utilized a decision tree of deep neural networks deployed on a soccer-playing robot.

Global Planning

These algorithms operate globally, taking long-term objectives into account and optimizing movements to minimize cost, maximize efficiency, or achieve a specific outcome on the basis of a perceived environment model.

Footstep planning is a crucial part of humanoid locomotion and has generated substantial research interest in its own right. Recent works exhibit two primary trends related to perception. The first provides humanoids the capability of rapidly perceiving changes in the environment and reacting through fast re-planning. The second endeavors to segment and/or classify uneven terrain to find stable 6-DoF footholds for highly versatile navigation.

Fig. 3: Footstep planning on the humanoid Lola, from [69]. Top left: the robot's vision system and a human causing a disturbance. Bottom right: the collision model with geometric obstacle approximations

Tanguy et al. [54] proposed a model predictive control (MPC) scheme that fused visual SLAM and proprioceptive F/T sensing for accurate state estimation. This allowed rapid reaction to external disturbances through adaptive stepping, leading to balance recovery and improved localization accuracy. Hildebrandt et al. [69] used the point cloud from an RGB-D camera to model obstacles as swept-sphere-volumes (SSVs) and step-able surfaces as convex polygons for real-time reactive footstep planning with the Lola humanoid robot. Their system was capable of handling rough terrain as well as external disturbances such as pushes (see Fig. 3). Others have also used geometric primitives to aid footstep planning, such as surface patches for foothold representation [70, 71], environment segmentation to find step-able regions such as 2D plane segments embedded in 3D space [72, 73], or obstacles represented by their polygonal ground projections [74]. Suryamurthy et al. [75] assigned pixel-wise terrain labels and rugosity measures using a CNN consuming RGB images for footstep planning on the CENTAURO robot.
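
Many of these pipelines start by fitting planes to point-cloud regions; a minimal RANSAC plane-fitting sketch (parameters are illustrative, not taken from the cited systems) is shown below:

```python
import numpy as np

def ransac_plane(points, iters=200, tol=0.01, rng=np.random.default_rng(0)):
    """Fit a dominant plane to a point cloud of shape (N, 3) by RANSAC.

    Returns (normal, d, inlier_mask) for the plane n . x + d = 0. Regions
    whose normals are near-vertical and whose inlier sets are large enough
    could then be treated as step-able, in the spirit of plane-segment-based
    footstep planners.
    """
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:                 # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ sample[0]
        inliers = np.abs(points @ n + d) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model[0], best_model[1], best_inliers
```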

Whole-body planning in humanoid robots involves the coordinated planning and control of the robot's entire body to achieve an objective. Coverage planning is a subset of whole-body planning in which a minimal sequence of whole-body robot poses is estimated to completely explore a 3D space via robot-mounted visual sensors [76, 77]. Target finding is a special case of coverage planning where the exploration stops when the target is found [78, 79]. These concepts are related primarily to view planning in computer vision. In other applications, Wang et al. [80] presented a method for trajectory planning and formation building of a robot fleet using local positions estimated from onboard optical sensors, and Liu et al. [81] presented a temporal planning approach for choreographing dancing robots in response to microphone-sensed music.

Perception in Grasping and Manipulation

Manipulation and grasping in humanoid robots involve the ability to interact with objects of varying shapes, sizes, and weights, performing dexterous manipulation tasks using sensor-equipped end-effectors that provide visual or tactile feedback for grip adjustment.

Grasp Planning

Grasp planning is a lower-level task specifically focused on determining the optimal manipulator pose sequence to securely and effectively grasp an object. Visual information is used to find grasping locations and also serves as feedback to minimize the difference between the target grasp pose and the current end-effector pose.

Schmidt et al. [82] utilized a CNN trained on object depth images and pre-generated analytic grasp plans to synthesize grasp solutions. The system generated full end-effector poses and could produce poses not limited to the camera view direction. Vezzani et al. [83] modeled the shape and volume of the target object, captured from stereo vision, in real time using superquadric functions, allowing grasping even when parts of the object were occluded. Vicente et al. [84] and Nguyen et al. [85] focused on achieving accurate hand-eye coordination in humanoids equipped with stereo vision: while the former compensated for kinematic calibration errors between the robot's internal hand model and captured images using particle-based optimization, the latter trained a deep neural network predictor to estimate the robot arm's joint configuration. Nguyen et al. [86] proposed a combination of CNNs and dense conditional random fields (CRFs) to infer action possibilities on an object (affordances) from RGB images.
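
To give a flavor of the superquadric representation used in [83], the standard inside-outside function below defines the surface being fitted; the fitting and grasp synthesis steps of the cited work are omitted:

```python
import numpy as np

def superquadric_F(p, scale, eps1, eps2):
    """Inside-outside function of a superquadric in its canonical frame.

    p     : (..., 3) query points
    scale : semi-axis lengths (a1, a2, a3)
    F < 1 inside, F = 1 on the surface, F > 1 outside. Fitting (scale,
    eps1, eps2, pose) to a partial point cloud yields a closed-surface
    object model that supports grasp pose synthesis despite occlusions.
    """
    p = np.asarray(p, dtype=float)
    x = np.abs(p[..., 0] / scale[0])
    y = np.abs(p[..., 1] / scale[1])
    z = np.abs(p[..., 2] / scale[2])
    return ((x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
            + z ** (2.0 / eps1))
```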

Fig. 4: Left: a Nao humanoid equipped with artificial skin cells on the chest, hand, forearm, and upper arm. Right: visualization of the skin cell coordinate frames on the Nao. Figure taken from [87]

Tactile sensors, such as pressure-sensitive skins or fingertip sensors, provide feedback about contact (surface normal) forces, slip, object texture, and shape during object grasping. Kaboli et al. [87] extracted tactile descriptors for material and object classification that are agnostic to sensor type, covering dynamic pressure sensors, accelerometers, capacitive sensors, and impedance electrode arrays. A Nao with artificial skin used for their experiments is shown in Fig. 4. Hundhausen et al. [88] introduced a soft humanoid hand equipped with in-finger integrated cameras and an in-hand real-time image processing system based on CNNs for fast reactive grasping.

Manipulation Planning

Manipulation planning involves the higher-level decision-making process of determining how the robot should manipulate an object once it is grasped. It generates a sequence of motions or actions which is updated based on the continuously perceived robot and grasped object state.

Deep recurrent neural networks (RNNs) can predict the next element in a sequence based on the previous elements. This property is exploited in manipulation planning by breaking down a complex task into a series of manipulation commands generated by RNNs conditioned on past commands. Such networks can map features extracted from a sequence of RGB images, usually by CNNs, to a sequence of motion commands [89, 90]. Inceoglu et al. [91] presented a multimodal failure monitoring and detection system for robots which integrated high-level proprioceptive, auditory, and visual information during manipulation tasks. Robot-assisted dressing is a challenging manipulation task that has been addressed by multiple authors. Zhang et al. [92] utilized a hierarchical multi-task control strategy to adapt the forces applied by the humanoid robot Baxter, measured using joint torques, to the user's movements during dressing. By tracking the human subject's pose in real time using capacitive proximity sensing with low latency and a high signal-to-noise ratio, Erickson et al. [93] developed a method to adapt to human motion and adjust for errors in pose estimation during dressing assistance by the PR2 robot. Zhang et al. [94] computed suitable grasping points on garments from depth images using a deep neural network to facilitate robot manipulation in robot-assisted dressing tasks.
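
A minimal sketch of the CNN-feature-to-command pattern in PyTorch (dimensions and the command vocabulary are hypothetical; this is not the architecture of [89] or [90]):

```python
import torch
import torch.nn as nn

class Features2Commands(nn.Module):
    """Map a sequence of per-frame visual features (e.g., from a CNN)
    to a sequence of discrete manipulation commands."""
    def __init__(self, feat_dim=512, hidden=256, n_commands=10):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_commands)

    def forward(self, feats):            # feats: (batch, time, feat_dim)
        out, _ = self.rnn(feats)
        return self.head(out)            # (batch, time, n_commands) logits

model = Features2Commands()
logits = model(torch.randn(2, 30, 512))  # 2 clips, 30 frames each
commands = logits.argmax(dim=-1)         # predicted command index per frame
```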

Human-Robot Interaction

Human-robot interaction can be viewed as the subset of environment understanding that deals with interactions with humans as opposed to inanimate objects. To achieve this, a robot needs diverse capabilities, ranging from detecting humans and recognizing their pose, gestures, and emotions, to predicting their intent and even proactively performing actions to ensure a smooth and seamless interaction.

There are two main challenges to perception in HRI: perception of users, and inference, which involves making sense of the perceived data and making predictions.

Perception of Users

This involves identifying humans in the environment and detecting their pose, facial features, and the objects they interact with. This information is crucial for action prediction and emotion recognition [95]. Robots rely on vision-based, audio-based, tactile-based, and range-sensor-based sensing techniques for detection, as explained in the survey on perception methods for social robots by Yan et al. [96].

Robinson et al. [ 97 ] showed how vision-based techniques have evolved from using facial features, motion features, and body appearance to deep learning-based approaches. Motion-based features separate moving objects from the background to detect humans. Body appearance-based algorithms use shape, curves, posture, and body parts to detect humans. Deep learning models like R-CNN, Faster R-CNN, and YOLO have also been applied for human detection [ 96 ].

Pose detection is essential for understanding human body movements and postures. Sensors such as RGB cameras, stereo cameras, depth sensors, and motion tracking systems are used to extract pose information, as explained in detail by Möller et al. [98] in their survey of human-aware robot navigation. Facial features play a significant role in pose detection as they provide additional points of interest and enable emotion recognition [99]. A great demonstration of detecting pose and using it for bi-manual robot control with an RGB-D range sensor employed a CNN from the OpenPose package to extract human skeleton poses, which were then mapped to drive robotic hands; implemented on the CENTAURO robot, the method successfully performed box and lever manipulation tasks in real time. Hwang et al. [100] presented a real-time pose imitation method for a mid-size humanoid robot equipped with a servo-cradle-head RGB-D vision system; using eight pre-trained neural networks, the system accurately captured and imitated 3D motions performed by a target human, enabling effective pose imitation and complex motion replication in the robot. Lv et al. [101] presented a novel motion synchronization method called GuLiM for teleoperation of medical assistive robots, particularly in the context of combating the COVID-19 pandemic. Li et al. [102] presented a multimodal mobile teleoperation system that integrated a vision-based hand pose regression network and an IMU-based arm tracking method. The system allowed real-time control of a robot hand-arm system using depth camera observations and IMU readings from the observed human hand, enabled through the Transteleop neural network, which generated robot hand poses based on a depth image of a human hand.

Audio communication is vital for human interaction, and robots aim to mimic this ability. Microphones are used for audio detection, and speakers reproduce sound. Humanoid robots are usually designed to be binaural, i.e., they have two separate microphones on either side of the head which receive transmitted sound independently. Several researchers have exploited this property to localize both the sound source and the robot in complex auditory environments. Such techniques are used in speaker localization, as well as in other semantic understanding tasks such as automatic speech recognition (ASR), auditory scene analysis, emotion recognition, and rhythm recognition [96, 103].

Benaroya et al. [104] employed non-negative tensor factorization for binaural localization of multiple sound sources within unknown environments. Schymura et al. [105] focused on combined audio-visual speaker localization and proposed a closed-form solution to compute dynamic stream weighting between audio and visual streams, improving state estimation in a reverberant environment. This study was later extended to incorporate dynamic stream weights into nonlinear dynamical systems, which improved speaker localization performance even further [106]. Dávila-Chacón et al. [107] used a spiking neural network for sound source localization and a feed-forward neural network for ego-noise removal to enhance ASR in challenging environments. Trowitzsch et al. [108] presented a joint solution for sound event identification and localization, utilizing spatial audio stream segregation in a binaural robotic system.
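
For intuition, the classical signal-processing baseline for binaural localization estimates the inter-aural time difference by generalized cross-correlation; a minimal GCC-PHAT sketch under far-field, free-field assumptions (not any of the learned methods above) is given below:

```python
import numpy as np

def tdoa_azimuth(left, right, fs, mic_dist=0.15, c=343.0):
    """Estimate sound direction from a binaural microphone pair via GCC-PHAT.

    Returns the azimuth in radians (0 = straight ahead). Assumes a far-field
    source and free-field propagation; real robot heads need HRTF-aware models.
    """
    n = len(left) + len(right)
    L, R = np.fft.rfft(left, n), np.fft.rfft(right, n)
    cross = L * np.conj(R)
    cross /= np.maximum(np.abs(cross), 1e-12)        # PHAT weighting
    cc = np.fft.irfft(cross, n)
    max_shift = int(fs * mic_dist / c)               # physically possible lags
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    tau = (np.argmax(np.abs(cc)) - max_shift) / fs   # inter-aural time delay
    return np.arcsin(np.clip(tau * c / mic_dist, -1.0, 1.0))
```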

Ahmad et al. [109], in their survey on physiological signal-based emotion recognition, showed that physiological signals from the human body, such as heart rate, blood pressure, body temperature, brain activity, and muscle activation, can provide insights into emotions. Tactile interaction is an inherent part of natural interaction between humans, and the same holds true for robots interacting with humans. The type of touch can be used to infer many things, such as the human's state of mind, the nature of the object, and what is expected of the interaction [96]. Two kinds of tactile sensors are mainly used for this purpose: sensors embedded in the robot's arms and grippers, and cover-based sensors used to detect touch across entire regions or the whole body [96]. Khurshid et al. [110] investigated the impact of grip-force, contact, and acceleration feedback on human performance in a teleoperated pick-and-place task. Results indicated that grip-force feedback improved stability and delicate control, while contact feedback improved spatial movement, though its effects may vary depending on object stiffness.

An important aspect of inference over all of this detected data is aligning the perspectives of the user and the robot. This allows the robot to better understand the user's intent regarding the objects or locations they are looking at. This skill is called perspective taking and requires the robot to consider and understand other individuals through their motivation, disposition, and context. Paired with a shared knowledge base, it allows humans and robots to build a reliable theory of mind and collaborate effectively during various types of tasks [3].

Bera et al. [ 111 ] proposed an emotion-aware navigation algorithm for social robots which combined emotions learned from facial expressions and walking trajectories using an onboard and an overhead camera respectively. The approach achieved accurate emotion detection and enabled socially conscious robot navigation in low-to-medium-density environments.

Conclusions

Substantial progress has been made in all three principal areas discussed in this survey. In Table 2 we compile a list of the most commonly cited humanoids in the literature corresponding to the aforementioned categorization. We conclude with a summary of the trends and possible areas of further research we observed in each of these areas.

State Estimation

Tightly-coupled formulations of state estimation based on MAP seem promising for future work: they offer modularity, enable seamless integration of new sensor types, and allow generic estimators to be extended to accommodate a wider range of perception sources toward a whole-body estimation framework. By integrating high-rate control estimation with non-drifting localization based on SLAM, such a framework could provide real-time estimation for locomotion control purposes and facilitate gait and contact planning.

Another important area of focus is the development of multi-contact detection and estimation methods for arbitrary, unknown contact locations. Moving beyond rigid-segment assumptions for the humanoid structure and augmenting robots with additional sensors, such as strain gauges that directly measure segment deflections, would allow multi-contact detection and compensation for modeling errors, leading to more accurate state estimation and improved human-robot interaction.

Environment Understanding

With the availability of improved inference hardware, learning techniques are increasingly being applied in localization, object identification, and mapping, replacing handcrafted feature descriptors. However, visual classifiers like CNNs struggle with unstructured "stuff" compared to regularly shaped objects, necessitating memory-intensive representations such as point clouds as well as enhanced classifier capabilities. In the field of SLAM, which has robust solutions for static environments, research is focused on handling dynamic obstacles, favoring multi-sensor fusion for increased robustness. Scalability and real-time capability remain challenging due to the potential overload of a humanoid's onboard computer from wrangling multiple data streams over long sequences. Footstep planning shows a trend towards rapid environment modeling for quick responses, but consistent modeling of dynamic obstacles remains an open challenge. Manipulation and long-term global planning also rely on learning techniques to adapt to unforeseen constraints, requiring representations or embeddings of high-dimensional interactions between perceived elements to reduce complexity. Finding more efficient, comprehensive, and accurate methods to express these relationships is an ongoing challenge.

Human Robot Interaction

Research in the field of HRI has focused on understanding human intent and emotion through elements such as body pose, motion, expressions, audio cues, and behavior. Though this may seem natural and trivial from a human's perspective, incorporating the same abilities into robotic systems is often very challenging. Despite considerable progress in the above approaches, the ever-changing and unpredictable nature of human interaction necessitates additional steps that incorporate concepts like shared autonomy and shared perception. In this context, contextual information and memory play a crucial role in accurately perceiving the state and intentions of the humans with whom interaction is desired. Current research endeavors actively focus on these pivotal topics, striving to enhance the capabilities of humanoid robots in human-robot interaction while also considering trust, safety, explainability, and ethics during these interactions.

Tanguy A, Gergondet P, Comport AI, Kheddar A. Closed-loop RGB-D SLAM multi-contact control for humanoid robots. In: IEEE/SICE Intl symposium on system integration (SII); 2016. p. 51–57.

Fallon MF, Antone M, Roy N, Teller S. Drift-free humanoid state estimation fusing kinematic, inertial and lidar sensing. In: IEEE-RAS Intl Conf on humanoid robots (Humanoids); 2014. p. 112–119.

Matarese M, Rea F, Sciutti A. Perception is only real when shared: a mathematical model for collaborative shared perception in human-robot interaction. Frontiers Robotics AI. 2022; 733954.

Carpentier J, Wieber PB. Recent progress in legged robots locomotion control. Current Robotics Reports. 2021;.

Rotella N, Mason S, Schaal S, Righetti L. Inertial sensor-based humanoid joint state estimation. In: IEEE Intl Conf on Robotics & Automation (ICRA); 2016. p. 1825–1831.

Bloesch M, Hutter M, Hoepflinger MA, Leutenegger S, Gehring C, Remy CD, et al. State estimation for legged robots-consistent fusion of leg kinematics and IMU. Robotics. 2013;.

Rotella N, Blösch M, Righetti L, Schaal S. State estimation for a humanoid robot. In: IEEE/RSJ Intl Conf on intelligent robots and systems (IROS); 2014. p. 952–958.

Xinjilefu X, Feng S, Huang W, Atkeson CG. Decoupled state estimation for humanoids using full-body dynamics. In: IEEE Intl conf on robotics & automation (ICRA); 2014. p. 195–201.

Xinjilefu X, Feng S, Atkeson CG. Center of mass estimator for humanoids and its application in modelling error compensation, fall detection and prevention. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2015. p. 67–73.

Bae H, Jeong H, Oh J, Lee K, Oh JH. Humanoid robot COM kinematics estimation based on compliant inverted pendulum model and robust state estimator. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2018. p. 747–753.

Vigne M, El Khoury A, Di Meglio F, Petit N. State estimation for a legged robot with multiple flexibilities using IMUs: A kinematic approach. IEEE Robotics Auto Lett (RA-L). 2020;.

Camurri M, Ramezani M, Nobili S, Fallon M. Pronto: A multi-sensor state estimator for legged robots in real-world scenarios. Frontiers Robotics AI. 2020;.

Piperakis S, Koskinopoulou M, Trahanias P. Nonlinear state estimation for humanoid robot walking. IEEE Robotics Auto Lett (RA-L). 2018;.

Piperakis S, Koskinopoulou M, Trahanias P. SEROW. GitHub; 2016. https://github.com/mrsp/serow .

Camurri M, Fallon M, Bazeille S, Radulescu A, Barasuol V, Caldwell DG, et al. Pronto. GitHub; 2020. https://github.com/ori-drs/pronto .

Hartley R, Ghaffari M, Eustice RM, Grizzle JW. Contact-aided invariant extended Kalman filtering for robot state estimation. Intl J Robotics Res (IJRR). 2020;.

Hartley R, Ghaffari M, Eustice RM, Grizzle JW. InEKF. GitHub; 2018. https://github.com/RossHartley/invariant-ekf .

Solá J, Vallvé J, Casals J, Deray J, Fourmy M, Atchuthan D, et al. WOLF: A modular estimation framework for robotics based on factor graphs. IEEE Robotics Auto Lett (RA-L). 2022;.

Solá J, Vallvé J, Casals J, Deray J, Fourmy M, Atchuthan D, et al. WOLF. IRI; 2022. https://mobile_robotics.pages.iri.upc-csic.es/wolf_projects/wolf_lib/wolf-doc-sphinx/ .

Piperakis S, Timotheatos S, Trahanias P. Unsupervised gait phase estimation for humanoid robot walking. In: IEEE Intl Conf on Robotics & Automation (ICRA); 2019. p. 270–276.

Piperakis S, Timotheatos S, Trahanias P. GEM. GitHub; 2019. https://github.com/mrsp/gem .

Bloesch M. State Estimation for Legged Robots - Kinematics, inertial sensing, and computer vision [Thesis]. ETH Zurich; 2017.

Camurri M. Multisensory state estimation and mapping on dynamic legged robots [Thesis]. Istituto Italiano di Tecnologia and Univ Genoa; 2017.

Flayols T, Del Prete A, Wensing P, Mifsud A, Benallegue M, Stasse O. Experimental evaluation of simple estimators for humanoid robots. In: IEEE-RAS Intl Conf on Humanoid Robots (Humanoids); 2017. p.889–895.

Xinjilefu X, Feng S, Atkeson CG. Dynamic state estimation using quadratic programming. In: IEEE/RSJ Intl Conf on Intelligent Robots and Systems (IROS); 2014. p. 989–994.

Fourmy M. State estimation and localization of legged robots: a tightly-coupled approach based on a-posteriori maximization [Thesis]. INSA: Toulouse; 2022.


Piperakis S, Trahanias P. Non-linear ZMP based state estimation for humanoid robot locomotion. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2016. p. 202–209.

Mori K, Ayusawa K, Yoshida E. Online center of mass and momentum estimation for a humanoid robot based on identification of inertial parameters. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2018. p. 1–9.

Rotella N, Herzog A, Schaal S, Righetti L. Humanoid momentum estimation using sensed contact wrenches. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2015. p.556–563.

Carpentier J, Benallegue M, Mansard N, Laumond JP. Center-of-mass estimation for a polyarticulated system in contact–a spectral approach. IEEE Trans Robotics (TRO). 2016;.

Bailly F, Carpentier J, Benallegue M, Watier B, Souéres P. Estimating the center of mass and the angular momentum derivative for legged locomotion–a recursive approach. IEEE Robotics Auto Lett (RA-L). 2019;.

Rotella N, Schaal S, Righetti L. Unsupervised contact learning for humanoid estimation and control. In: IEEE Intl conf on robotics & automation (ICRA); 2018. p. 411–417.

Piperakis S, Maravgakis M, Kanoulas D, Trahanias P. Robust contact state estimation in humanoid walking gaits. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2022. p. 6732–6738.

Maravgakis M, Argiropoulos DE, Piperakis S, Trahanias P. Probabilistic contact state estimation for legged robots using inertial information. In: IEEE Intl conf on robotics & automation (ICRA); 2023. p.12163–12169.

Flacco F, Paolillo A, Kheddar A. Residual-based contacts estimation for humanoid robots. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2016. p.409–415.

Manuelli L, Tedrake R. Localizing external contact using proprioceptive sensors: The contact particle filter. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2016. p. 5062–5069.

Vorndamme J, Haddadin S. Rm-Code: proprioceptive real-time recursive multi-contact detection, isolation and identification. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2021. p. 6307–6314.

Vezzani G, Pattacini U, Battistelli G, Chisci L, Natale L. Memory unscented particle filter for 6-DOF tactile localization. IEEE Trans Robotics (TRO). 2017; 1139–1155.

Piperakis S, Kanoulas D, Tsagarakis NG, Trahanias P. Outlier-robust state estimation for humanoid robots. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2019. p. 706–713.

Hartley R, Mangelson J, Gan L, Jadidi MG, Walls JM, Eustice RM, et al. Legged robot state-estimation through combined forward kinematic and preintegrated contact factors. In: IEEE Intl conf on robotics & automation (ICRA); 2018. p. 4422–4429.

Hartley R, Jadidi MG, Gan L, Huang JK, Grizzle JW, Eustice RM. Hybrid Contact preintegration for visual-inertial-contact state estimation using factor graphs. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2018. p. 3783–3790.

Wozniak P, Afrisal H, Esparza RG, Kwolek B. Scene recognition for indoor localization of mobile robots using deep CNN. In: Intl conf on computer vision and graphics (ICCVG); 2018. p.137–147.

Wozniak P, Kwolek B. Place inference via graph-based decisions on deep embeddings and blur detections. In: Intl conf on computational science (ICCS); 2021. p. 178–192.

Ovalle-Magallanes E, Aldana-Murillo NG, Avina-Cervantes JG, Ruiz-Pinales J, Cepeda-Negrete J, Ledesma S. Transfer learning for humanoid robot appearance-based localization in a visual Map. IEEE Access. 2021;p. 6868–6877.

Speck D, Bestmann M, Barros P. Towards real-time ball localization using CNNs. In: RoboCup 2018: Robot World Cup XXII; 2019. p. 337–348.

Teimouri M, Delavaran MH, Rezaei M. A real-time ball detection approach using convolutional neural networks. In: RoboCup 2019: Robot World Cup XXIII; 2019. p. 323–336.

Gabel A, Heuer T, Schiering I, Gerndt R. Jetson, Where is the ball? using neural networks for ball detection at RoboCup 2017. In: RoboCup 2018: Robot World Cup XXII; 2019. p. 181–192.

Cruz N, Leiva F, Ruiz-del-Solar J. Deep learning applied to humanoid soccer robotics: playing without using any color information. Autonomous Robots. 2021;.

Chatterjee S, Zunjani FH, Nandi GC. Real-time object detection and recognition on low-compute humanoid robots using deep learning. In: Intl conf on control, automation and robotics (ICCAR); 2020. p.202–208.

Starr JW, Lattimer BY. Evidential sensor fusion of long-wavelength infrared stereo vision and 3D-LIDAR for rangefinding in fire environments. Fire Technol. 2017;1961–1983.

Nobili S, Scona R, Caravagna M, Fallon M. Overlap-based ICP tuning for robust localization of a humanoid robot. In: IEEE Intl conf on robotics & automation (ICRA); 2017. p. 4721–4728.

Raghavan VS, Kanoulas D, Zhou C, Caldwell DG, Tsagarakis NG. A study on low-drift state estimation for humanoid locomotion, using LiDAR and kinematic-inertial data fusion. In: IEEE-RAS Intl Conf on Humanoid Robots (Humanoids); 2018. p. 1–8.

Scona R, Nobili S, Petillot YR, Fallon M. Direct visual SLAM fusing proprioception for a humanoid robot. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS). IEEE; 2017. p. 1419–1426.

Tanguy A, De Simone D, Comport AI, Oriolo G, Kheddar A. Closed-loop MPC with dense visual SLAM-stability through reactive stepping. In: IEEE Intl conf on robotics & automation (ICRA). IEEE; 2019. p. 1397–1403.

Zhang T, Zhang H, Li Y, Nakamura Y, Zhang L. Flowfusion: Dynamic dense rgb-d slam based on optical flow. In: IEEE Intl conf on robotics & automation (ICRA). IEEE; 2020. p.7322–7328.

Zhang T, Uchiyama E, Nakamura Y. Dense rgb-d slam for humanoid robots in the dynamic humans environment. In: IEEE-RAS Intl conf on humanoid robots (Humanoids). IEEE; 2018. p. 270–276.

Zhang T, Nakamura Y. Hrpslam: A benchmark for rgb-d dynamic slam and humanoid vision. In: IEEE Intl conf on robotic computing (IRC). IEEE; 2019. p. 110–116.

Sewtz M, Luo X, Landgraf J, Bodenmüller T, Triebel R. Robust approaches for localization on multi-camera systems in dynamic environments. In: Intl conf on automation, robotics and applications (ICARA). IEEE; 2021. p. 211–215.

Mur-Artal R, Montiel JMM, Tardos JD. ORB-SLAM: A Versatile and accurate monocular SLAM system. IEEE Trans Robotics (TRO). 2015;1147–1163.

Ginn D, Mendes A, Chalup S, Fountain J. Monocular ORB-SLAM on a humanoid robot for localization purposes. In: AI: Advances in artificial intelligence; 2018. p. 77–82.

Bista SR, Giordano PR, Chaumette F. Combining line segments and points for appearance-based indoor navigation by image based visual servoing. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2017. p. 2960–2967.

Regier P, Milioto A, Karkowski P, Stachniss C, Bennewitz M. Classifying obstacles and exploiting knowledge about classes for efficient humanoid navigation. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2018. p. 820–826.

Ferro M, Paolillo A, Cherubini A, Vendittelli M. Vision-Based navigation of omnidirectional mobile robots. IEEE Robotics Auto Lett (RA-L). 2019;2691–2698.

Juang LH, Zhang JS. Robust visual line-following navigation system for humanoid robots. Artif Intell Rev. 2020;653–670.

Magassouba A, Bertin N, Chaumette F. Aural Servo: Sensor-based control from robot audition. IEEE Trans Robotics (TRO). 2018;572–585.

Abiyev RH, Arslan M, Gunsel I, Cagman A. Robot pathfinding using vision based obstacle detection. In: IEEE Intl conf on cybernetics (CYBCONF); 2017. p. 1–6.

Lobos-Tsunekawa K, Leiva F, Ruiz-del-Solar J. Visual navigation for biped humanoid robots using deep reinforcement learning. IEEE Robotics Auto Lett (RA-L). 2018; 3247–3254.

Silva IJ, Junior COV, Costa AHR, Bianchi RAC. Toward robotic cognition by means of decision tree of deep neural networks applied in a humanoid robot. J Control Autom Electr Syst. 2021; 884–894.

Hildebrandt AC, Wittmann R, Sygulla F, Wahrmann D, Rixen D, Buschmann T. Versatile and robust bipedal walking in unknown environments: real-time collision avoidance and disturbance rejection. Autonomous Robots. 2019;1957–1976.

Kanoulas D, Stumpf A, Raghavan VS, Zhou C, Toumpa A, Von Stryk O, et al. Footstep planning in rough terrain for bipedal robots using curved contact patches. In: IEEE Intl Conf on Robotics & Automation (ICRA); 2018. p. 4662–4669.

Kanoulas D, Tsagarakis NG, Vona M. Curved patch mapping and tracking for irregular terrain modeling: application to bipedal robot foot placement. J Robotics Autonomous Syst (RAS). 2019; 13–30.

Bertrand S, Lee I, Mishra B, Calvert D, Pratt J, Griffin R. Detecting usable planar regions for legged robot locomotion. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2020. p. 4736–4742.

Roychoudhury A, Missura M, Bennewitz M. 3D Polygonal mapping for humanoid robot navigation. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2022. p.171–177.

Missura M, Roychoudhury A, Bennewitz M. Polygonal perception for mobile robots. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2020. p. 10476–10482.

Suryamurthy V, Raghavan VS, Laurenzi A, Tsagarakis NG, Kanoulas D. Terrain segmentation and roughness estimation using rgb data:path planning application on the CENTAURO robot. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2019. p. 1–8.

Osswald S, Karkowski P, Bennewitz M. Efficient coverage of 3D environments with humanoid robots using inverse reachability maps. In: IEEE-RAS Intl conf on humanoid robots (Humanoids); 2017. p.151–157.

Osswald S, Bennewitz M. GPU-accelerated next-best-view coverage of articulated scenes. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2018. p. 603–610.

Monica R, Aleotti J, Piccinini D. Humanoid robot next best view planning under occlusions using body movement primitives. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2019. p. 2493–2500.

Tsuru M, Escande A, Tanguy A, Chappellet K, Harad K. Online object searching by a humanoid robot in an unknown environment. IEEE Robotics Auto Lett (RA-L). 2021; 2862–2869.

Wang X, Benozzi L, Zerr B, Xie Z, Thomas H, Clement B. Formation building and collision avoidance for a fleet of NAOs based on optical sensor with local positions and minimum communication. Sci China Inf Sci. 2019;335–350.

Liu Y, Xie D, Zhuo HH, Lai L, Li Z. Temporal planning-based choreography from music. In: Sun Y, Lu T, Guo Y, Song X, Fan H, Liu D, et al., editors. Computer Supported Cooperative Work and Social Computing (CSCW); 2023. p. 89–102.

Schmidt P, Vahrenkamp N, Wachter M, Asfour T. Grasping of unknown objects using deep convolutional neural networks based on depth images. In: IEEE Intl conf on robotics & automation (ICRA); 2018. p.6831–6838.

Vezzani G, Pattacini U, Natale L. A grasping approach based on superquadric models. In: IEEE Intl conf on robotics & automation (ICRA); 2017. p.1579–1586.

Vicente P, Jamone L, Bernardino A. Towards markerless visual servoing of grasping tasks for humanoid robots. In: IEEE Intl conf on robotics & automation (ICRA); 2017. p.3811–3816.

Nguyen PDH, Fischer T, Chang HJ, Pattacini U, Metta G, Demiris Y. Transferring visuomotor learning from simulation to the real world for robotics manipulation tasks. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2018. p. 6667–6674.

Nguyen A, Kanoulas D, Caldwell DG, Tsagarakis NG. Object-based affordances detection with convolutional neural networks and dense conditional random fields. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2017. p. 5908–5915.

Kaboli M, Cheng G. Robust Tactile descriptors for discriminating objects from textural properties via artificial robotic skin. IEEE Trans on Robotics (TRO). 2018;985–1003.

Hundhausen F, Grimm R, Stieber L, Asfour T. Fast reactive grasping with in-finger vision and In-Hand FPGA-accelerated CNNs. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2021. p. 6825–6832.

Nguyen A, Kanoulas D, Muratore L, Caldwell DG, Tsagarakis NG. Translating videos to commands for robotic manipulation with deep recurrent neural networks. In: IEEE Intl conf on robotics & automation (ICRA); 2017. p. 3782–3788.

Kase K, Suzuki K, Yang PC, Mori H, Ogata T. Put-in-box task generated from multiple discrete tasks by a humanoid robot using deep learning. In: IEEE Intl conf on robotics & automation (ICRA); 2018. p. 6447–6452.

Inceoglu A, Ince G, Yaslan Y, Sariel S. Failure detection using proprioceptive, auditory and visual modalities. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2018. p. 2491–2496.

Zhang F, Cully A, Demiris Y. Personalized robot-assisted dressing using user modeling in latent spaces. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2017. p. 3603–3610.

Erickson Z, Collier M, Kapusta A, Kemp CC. Tracking human pose during robot-assisted dressing using single-axis capacitive proximity sensing. IEEE Robotics Auto Lett (RA-L). 2018; 2245–2252.

Zhang F, Demiris Y. Learning grasping points for garment manipulation in robot-assisted dressing. In: IEEE Intl conf on robotics & automation (ICRA); 2020. p. 9114–9120.

Narayanan V, Manoghar BM, Dorbala VS, Manocha D, Bera A. Proxemo: Gait-based emotion learning and multi-view proxemic fusion for socially-aware robot navigation. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS); 2020. p. 8200–8207.

Yan H, Ang MH, Poo AN. A survey on perception methods for human–robot interaction in social robots. Intl J Soc Robotics. 2014; 85–119.

Robinson N, Tidd B, Campbell D, Kulić D, Corke P. Robotic vision for human-robot interaction and collaboration: a survey and systematic review. ACM Trans on Human-Robot Interaction; 2023. p. 1–66.

Möller R, Furnari A, Battiato S, Härmä A, Farinella GM. A survey on human-aware robot navigation. J Robotics Autonomous Syst (RAS). 2021;103837.

Samadiani N, Huang G, Cai B, Luo W, Chi CH, Xiang Y, et al. A review on automatic facial expression recognition systems assisted by multimodal sensor data. IEEE Sensors J. 2019;1863.

Hwang CL, Liao GH. Real-time pose imitation by mid-size humanoid robot with servo-cradle-head RGB-D vision system. IEEE Intl conf on systems, man, and cybernetics (SMC). 2019;p.181–191.

Lv H, Kong D, Pang G, Wang B, Yu Z, Pang Z, et al. GuLiM: A hybrid motion mapping technique for teleoperation of medical assistive robot in combating the COVID-19 pandemic. IEEE Trans Medical Robotics Bionics. 2022;106–117.

Li S, Jiang J, Ruppel P, Liang H, Ma X, Hendrich N, et al. A mobile robot hand-arm teleoperation system by vision and IMU. In: IEEE/RSJ Intl conf on intelligent robots and systems (IROS);2020. p. 10900–10906.

Badr AA, Abdul-Hassan AK. A review on voice-based interface for human-robot interaction. Iraqi J Electric Electron Eng. 2020;91–102.

Benaroya EL, Obin N, Liuni M, Roebel A, Raumel W, Argentieri S. Binaural localization of multiple sound sources by non-negative tensor factorization. IEEE/ACM Trans on Audio Speech Lang Process. 2018; 1072–1082.

Schymura C, Isenberg T, Kolossa D. Extending linear dynamical systems with dynamic stream weights for audiovisual speaker localization. In: International workshop on acoustic signal enhancement (IWAENC); 2018. p. 515–519.

Schymura C, Kolossa D. Audiovisual speaker tracking using nonlinear dynamical systems with dynamic stream weights. IEEE/ACM Trans on Audio Speech Lang Process. 2020; 1065–1078.

Dávila-Chacón J, Liu J, Wermter S. Enhanced robot speech recognition using biomimetic binaural sound source localization. IEEE Trans on Neural Netw Learn Syst. 2019; 138–150.

Trowitzsch I, Schymura C, Kolossa D, Obermayer K. Joining sound event detection and localization through spatial segregation. IEEE/ACM Trans on Audio Speech Lang Process. 2020; 487–502.

Ahmad Z, Khan N. A survey on physiological signal-based emotion recognition. Bioengineering. 2022;p. 688.

Khurshid RP, Fitter NT, Fedalei EA, Kuchenbecker KJ. Effects of grip-force, contact, and acceleration feedback on a teleoperated pick-and-place task. IEEE Trans on Haptics. 2017; 985–1003.

Bera A, Randhavane T, Manocha D. Modelling multi-channel emotions using facial expression and trajectory cues for improving socially-aware robot navigation. In: IEEE/CVF Conf on computer vision and pattern recognition workshops (CVPRW); 2019. p. 257–266.

Jin Y, Lee M. Enhancing binocular depth estimation based on proactive perception and action cyclic learning for an autonomous developmental robot. IEEE Intl conf on systems, man, and cybernetics (SMC). 2018;p. 169–180.

Hoffmann M, Straka Z, Farkaš I, Vavrečka M, Metta G. Robotic homunculus: learning of artificial skin representation in a humanoid robot motivated by primary somatosensory cortex. IEEE Trans Cognitive Develop Syst. 2017;163–176.


Humanoid robots are learning to fall well

Boston Dynamics and Agility are teaching their bipedal robots to brace for the inevitable.

Boston Dynamics... robot down!

The savvy marketers at Boston Dynamics produced two major robotics news cycles last week. The larger of the two was, naturally, the electric Atlas announcement. As I write this, the sub-40-second video is steadily approaching five million views. A day prior, the company tugged at the community’s heartstrings when it announced that the original hydraulic Atlas was being put out to pasture, a decade after its introduction.

The accompanying video was a celebration of the older Atlas’ journey from DARPA research project to an impressively nimble bipedal ’bot. A minute in, however, the tone shifts. Ultimately, “Farewell to Atlas” is as much a celebration as it is a blooper reel. It’s a welcome reminder that for every time the robot sticks the landing on video there are dozens of slips, falls and sputters.

I’ve long championed this sort of transparency; it’s the sort of thing I would like to see more of from the robotics world. Simply showcasing the highlight reel does a disservice to the effort that went into getting those shots. In many cases, we’re talking about years of trial and error spent getting robots to look good on camera. When you only share the positive outcomes, you’re setting unrealistic expectations. Bipedal robots fall over. In that respect, at least, they’re just like us. As Agility put it recently, “Everyone falls sometimes, it’s how we get back up that defines us.” I would take that a step further, adding that learning how to fall well is equally important.

The company’s newly appointed CTO, Pras Velagapudi, recently told me that seeing robots fall on the job at this stage is actually a good thing. “When a robot is actually out in the world doing real things, unexpected things are going to happen,” he notes. “You’re going to see some falls, but that’s part of learning to run a really long time in real-world environments. It’s expected, and it’s a sign that you’re not staging things.”

A quick scan of Harvard’s rules for falling without injury reflects what we intuitively understand about falling as humans:

  • Protect your head
  • Use your weight to direct your fall
  • Bend your knees
  • Avoid taking other people with you

As for robots, this IEEE Spectrum piece from last year is a great place to start.

“We’re not afraid of a fall—we’re not treating the robots like they’re going to break all the time,” Boston Dynamics CTO Aaron Saunders told the publication last year. “Our robot falls a lot, and one of the things we decided a long time ago [is] that we needed to build robots that can fall without breaking. If you can go through that cycle of pushing your robot to failure, studying the failure, and fixing it, you can make progress to where it’s not falling. But if you build a machine or a control system or a culture around never falling, then you’ll never learn what you need to learn to make your robot not fall. We celebrate falls, even the falls that break the robot.”

The subject of falling also came up when I spoke with Boston Dynamics CEO Robert Playter ahead of the electric Atlas’ launch. Notably, the short video begins with the robot in a prone position. The way the robot’s legs arc around is quite novel, allowing the system to stand up from a completely flat position. At first glance, it almost feels as though the company is showing off, using the flashy move simply as a method to showcase the extremely robust custom-built actuators.

“There will be very practical uses for that,” Playter told me. “Robots are going to fall. You’d better be able to get up from prone.” He adds that the ability to get up from a prone position may also be useful for charging purposes.

Much of Boston Dynamics’ learnings around falling came from Spot. While there’s generally more stability in the quadrupedal form factor (as evidenced by decades of trying and failing to kick the robots over in videos), there are simply far more hours of Spot robots working in real-world conditions.

“Spot’s walking something like 70,000 km a year on factory floors, doing about 100,000 inspections per month,” adds Playter. “They do fall, eventually. You have to be able to get back up. Hopefully you get your fall rate down — we have. I think we’re falling once every 100-200 km. The fall rate has really gotten small, but it does happen.”

Playter adds that the company has a long history of being “rough” on its robots. “They fall, and they’ve got to be able to survive. Fingers can’t fall off.”

Watching the above Atlas outtakes, it’s hard not to project a bit of human empathy onto the ’bot. It really does appear to fall like a human, drawing its extremities as close to its body as possible, to protect them from further injury.


When Agility added arms to Digit, back in 2019, it discussed the role they play in falling. “For us, arms are simultaneously a tool for moving through the world — think getting up after a fall, waving your arms for balance, or pushing open a door — while also being useful for manipulating or carrying objects,” co-founder Jonathan Hurst noted at the time.

I spoke a bit to Agility about the topic at Modex earlier this year. Video of a Digit robot falling over on a convention floor a year prior had made the social media rounds. “With a 99% success rate over about 20 hours of live demos, Digit still took a couple of falls at ProMat,” Agility noted at the time. “We have no proof, but we think our sales team orchestrated it so they could talk about Digit’s quick-change limbs and durability.”

As with the Atlas video, the company told me that something akin to a fetal position is useful in terms of protecting the robot’s legs and arms.

The company has been using reinforcement learning to help fallen robots right themselves. Agility shut off Digit’s obstacle avoidance for that video to force a fall. In the video, the robot uses its arms to mitigate the fall as much as possible. It then draws on its reinforcement learning to return to a familiar position from which it can stand again with a robotic pushup.

One of humanoid robots’ main selling points is their ability to slot into existing workflows — these factories and warehouses are known as “brownfield,” meaning they weren’t custom built for automation. In many existing cases of factory automation, errors mean the system effectively shuts down until a human intervenes.

"We'll be singing / when we're winning." 🔊 pic.twitter.com/51DYD1Avvg — Brian Heater (@bheater) August 17, 2021

“Rescuing a humanoid robot is not going to be trivial,” says Playter, noting that these systems are heavy and can be difficult to manually right. “How are you going to do that if it can’t get itself off the ground?”

If these systems are truly going to ensure uninterrupted automation, they’ll need to fall well and get right back up again.

“Every time Digit falls, we learn something new,” adds Velagapudi. “When it comes to bipedal robotics, falling is a wonderful teacher.”


Humanoid Robots: Sooner Than You Might Think


GS Research makes an additional, more ambitious projection as well. “Should the hurdles of product design, use case, technology, affordability and wide public acceptance be completely overcome, we envision a market of up to US$154bn by 2035 in a blue-sky scenario,” say the authors of the report “The investment case for humanoid robots.” A market that size could fill from 48% to 126% of the labor gap, and as much as 53% of the elderly caregiver gap.

Obstacles remain: Today’s humanoid robots can work in only short one- or two-hour bursts before they need recharging. Some humanoid robots have mastered mobility and agility movements, while others can handle cognitive and intellectual challenges – but none can do both, the research says. One of the most advanced robot-like technologies on the commercial market is the self-driving vehicle, but a humanoid robot would need intelligence and processing abilities well beyond that. “In the history of humanoid robot development,” the report says, “no robots have been successfully commercialized yet.”

That said, there’s a pathway to humanoid robots becoming smarter and less expensive than a new electric vehicle. Goldman Sachs suggests humanoid robots could be economically viable in factory settings between 2025 and 2028, and in consumer applications between 2030 and 2035. Several assumptions support that outlook, and the Goldman Sachs Research report details the multiple breakthroughs that have to happen for it to come to fruition.

  • The battery life of humanoid robots would have to improve to the point where one could work for up to 20 hours before requiring recharging (or need fast charging for one hour and work for four to five hours, then repeat).
  • Mobility and agility would have to incrementally increase, and the processing abilities of such robots would also have to make steady gains. In addition, depth cameras, force feedback, visual and voice sensors, and other aspects of sensing—the robot’s nerves and sensory organs—will all need to get incrementally better.
  • There will also need to be gains in computation, so that robots can avoid obstacles, select the shortest route to complete a task, and react to questions.
  • Still more challenging will be the process of training and refining the abilities of humanoid robots once they begin working. This process can take upwards of a year.
  • And finally, robot makers will need to bring down production costs by roughly 15-20% a year in order for a humanoid robot to pay for itself within two years (a rough illustration of this arithmetic follows below).
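To make that last assumption concrete, here is a rough back-of-the-envelope payback sketch in Python. All figures are illustrative assumptions of ours, not estimates from the Goldman Sachs report.

```python
# Rough payback arithmetic for a humanoid robot purchase.
# Every figure below is an illustrative assumption, not a report estimate.

def payback_years(robot_cost, hourly_rate=25.0,
                  hours_per_day=8,      # limited shifts, reflecting today's battery life
                  days_per_year=300):
    """Years of displaced labor cost needed to recoup the robot's price."""
    annual_savings = hourly_rate * hours_per_day * days_per_year
    return robot_cost / annual_savings

cost = 250_000.0        # assumed current unit cost (USD)
annual_decline = 0.18   # midpoint of the 15-20% yearly cost reduction

for year in range(2024, 2031):
    print(f"{year}: unit cost ${cost:,.0f} -> payback {payback_years(cost):.1f} years")
    cost *= 1.0 - annual_decline
```

Under these assumptions, the two-year payback threshold is crossed around 2028, roughly in line with the report's factory-viability window.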

Those difficulties may seem daunting, but there’s a precedent for working through them. The report draws on the experience of factory collaborative robots – or “cobots” – that are now regularly part of manufacturing centers such as auto plants. These took roughly seven to ten years to go from their first commercially available versions to batch sales. They faced, as humanoid robots now do, significant skepticism.

And like today’s still-struggling humanoid robots, they had much to learn in terms of dexterity and responsiveness. But today, cobots are commonplace in certain industrial applications, and humanoid robots could find a place as well. “Humanoid robot solutions could also be attractive in fields that current major industrial robot makers are having difficulty serving,” states the report, “including the warehouse management/logistics management field and fields that are simple and have a heavy human burden, such as moving goods up and down stairs.”

In a household setting, the range of challenges would be far more complex. “Consumer/household applications are significantly harder to design due to more diverse application scenarios, diverse object recognition, more complicated navigation system, etc,” says the report. That’s setting aside how ordinary people react and respond to the humanoid robots themselves. “There are many other issues to consider, including conflicts surrounding replacing human workers, trust and safety, privacy of the data that they collect, their resemblance to actual humans, the inability to truly replace human emotions, brain-computer-interaction-related issues, and ethics concerning their autonomy.”


Generative AI is speeding up human-like robot development. What that means for jobs

  • ChatGPT-like artificial intelligence is speeding up research and bringing humanoid robots closer to reality in China, home to many of the world's factories.

  • In robotics, the development of generative AI can help machines with understanding and perceiving their environment, said Li Zhang, chief operating officer of Shenzhen-based LimX Dynamics.

  • Even if AI allows a robot to think and make decisions on par with humans, mechanical limitations are a major reason why humanoids can't yet replace human laborers, he said.

BEIJING — ChatGPT-like artificial intelligence is speeding up research and bringing humanoid robots closer to reality in China, home to many of the world's factories.

AI has been around for decades. What's changed with the emergence of OpenAI's ChatGPT chatbot is the ability of AI to better understand and generate content in a human-like way. While the U.S.-based tech is not officially available in China, local companies such as Baidu have released similar chatbots and AI models.

About three months after joining the two-year-old startup, Li said he had shortened his expectations for how long it would take LimX to produce a humanoid robot capable of not just factory work, but also helping out in a household.

Li originally expected the entire process to take eight to ten years, but now anticipates some use cases will be ready in five to seven years. "After working for a few months, I saw how various tools' abilities were improved because of AI," he said in Mandarin, translated by CNBC.

"It has accelerated our entire research and development cycle," he said.

Companies are rushing into the opportunity. OpenAI itself is backing humanoid robot startups, while Elon Musk's Tesla is developing its own, called Optimus.

Electric car giant BYD last year invested in Shanghai-based Agibot just months after its founding, according to PitchBook.

And at a high level, Chinese state media in November published a photo of Chinese President Xi Jinping watching a humanoid robot at an exhibition center during his first trip to Shanghai since the pandemic. The robot was developed by Fourier Intelligence.

Before humanoid robots reach households, as LimX ultimately intends, factories can be a lucrative, enclosed scenario in which to deploy them.

China surpassed Japan in 2013 as the world's largest installer of industrial robots, and now accounts for more than 50% of the global total, according to Stanford's latest AI Index report.

Electronics, automotive and metal and machinery were the three leading sectors for industrial robot installation in China, the report said.

Impact on human jobs

When it comes to fully replacing human workers, however, AI advancements alone aren't enough.

Even if AI allows a robot to think and make decisions on par with humans, mechanical limitations are a major reason why humanoids can't yet replace human laborers, LimX's Li said.

One of LimX's backers, Future Capital, has also invested in a company called Pan Motor that specializes in motors for humanoids.

Generative AI doesn't directly help with robotic motion, pointed out Eric Xia, partner at Future Capital. But "advances in large language models can help humanoid robots with advanced task planning," he said in Chinese, translated by CNBC.

LimX's other investors include Lenovo Capital.

A shift toward factory robots can accelerate once the cost per robot comes down.

Steve Hoffman, chairman of a startup accelerator called Founders Space, said he is working with a Chinese startup called Fastra, which he expects can begin mass robot production in one year. He said he spent time in China this year teaching local businesses how to integrate generative AI.

"We have already received six orders from research institutions," he said, noting the startup aims to lower the cost per robot to between $50,000 to $100,000 by rollout.

"If we can hit a $50,000 price point, we can sell a lot of robots," he said, pointing out the robots' batteries can be charged as they work, 24 hours a day. "Could pay for the robot in a year."

In pharmaceutical research, generative AI can reduce costs without cutting into human labor.

"You don't save costs in our business by having less people. You actually save costs by making fewer experiments that fail," said Alex Zhavoronkov, chairman of the board, executive director and CEO of Insilico Medicine, which has offices in Hong Kong, New York and other parts of the world.

He noted how large pharmaceutical companies have typically had to spend thousands of dollars to replicate a molecule for testing — and would run a few thousand such tests per program. He claimed that with the help of AI, Insilico only needs to synthesize about 70 molecules per program.

The company published a paper in Nature in March claiming to have reached phase 2 clinical trials for an AI-generated drug.

Shenzhen-based LimX Dynamics shows off one of its humanoid robots.

TALOS research: torque-controlled locomotion for humanoids in unknown environments

TALOS - locomotion for humanoids in unknown environments

In recent research, the team at PAL Robotics and our French partners, TOWARD and Dynamograde, using a new framework, enabled the biped robot TALOS to navigate and climb stairs of up to 15 cm without prior knowledge of the environment for the first time. Read on for our interview with Pierre Fernbach to find out more about this research, which aims to help humanoids develop the skills needed for future work in industry.

TALOS navigating with limited knowledge of the environment and limited instructions

Pierre Fernbach, Software Engineer at PAL Robotics’ French partner TOWARD, explained this project and its importance for the progress and development of humanoid robots, telling us, “This project was the first time we have used this approach on life-sized robots, such as PAL Robotics’ humanoid biped, TALOS. The project started about six months ago, but is based on years of prior development and research. We wanted to work on generalised locomotion – for example, to request a robot to navigate to a certain point, and to be able to reach that point without providing the robot with further information, even if it means that the robot needs to climb stairs, open doors, etc., to reach that point. With humanoid robots in general, for them to help in industrial or domestic environments, for example, this is a very beneficial goal.”

Pierre added, “Research papers and their approach, including ‘SL1M: Sparse L1-norm Minimization for contact planning on uneven terrain’, helped to inspire this research.”

An additional challenge is that for legged robots, including TALOS, following a pre-planned trajectory can also present complexities due to the possible accumulation of errors in the control and estimation processes. To cope with these errors, the robot needs to be able to re-evaluate its plans. Here the team worked with a new framework that helps the robot to do just that by creating a local map of the environment. For this, the ground in front of the robot is captured using the LiDAR sensor on the robot’s base.

The robot’s base state is then reconstructed using the IMU and kinematic odometry, and an elevation map is constructed. A contact planner is then used to compute a path through the measured environment. From the contact sequence, a trajectory is computed for the next few seconds of the walk using a model predictive controller.

Finally, the joint torques are computed using inverse dynamics and sent to the robot’s actuators. With this framework, the biped robot TALOS was able, for the first time, to climb 15 cm stairs without prior knowledge of the environment.
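The paragraphs above describe one full iteration of a sense-plan-act loop. The sketch below shows that data flow only: every component is a trivial stub, and the function names, interfaces, and 32-joint assumption are ours, not PAL Robotics' or TOWARD's actual code.

```python
# Schematic sketch of the locomotion pipeline described above.
# All names and interfaces are hypothetical stand-ins.
import numpy as np

def build_elevation_map(lidar_points):
    # Discretize LiDAR returns in front of the robot into a height grid.
    grid = np.zeros((50, 50))
    for x, y, z in lidar_points:
        i, j = int(x * 10) % 50, int(y * 10) % 50
        grid[i, j] = max(grid[i, j], z)
    return grid

def estimate_base_state(imu_reading, joint_positions):
    # Fuse IMU data with kinematic odometry (stubbed to a fixed pose here).
    return {"position": np.zeros(3), "orientation": np.array([0.0, 0.0, 0.0, 1.0])}

def plan_contacts(elevation_map, goal):
    # Contact planner: choose footstep locations on the measured surfaces.
    return [("left", (0.3, 0.1)), ("right", (0.6, -0.1))]

def mpc_trajectory(base_state, contact_sequence, horizon_s=2.0):
    # Model predictive controller: reference trajectory for the next seconds.
    return np.linspace(0.0, 1.0, int(horizon_s * 100))

def inverse_dynamics(trajectory, base_state):
    # Map the reference trajectory to joint torques (32 joints assumed).
    return np.zeros(32)

# One control iteration: sensor data in, actuator torques out.
lidar = [(0.5, 0.0, 0.15), (0.8, 0.1, 0.30)]   # fake stair-like returns
emap = build_elevation_map(lidar)
state = estimate_base_state(imu_reading=None, joint_positions=None)
contacts = plan_contacts(emap, goal=(2.0, 0.0))
traj = mpc_trajectory(state, contacts)
torques = inverse_dynamics(traj, state)
print(f"{len(contacts)} planned contacts, {len(torques)} joint torques")
```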

Pierre told us, “For this project we have been collaborating with our research partner, LAAS-CNRS, as the contact planner was originally developed by them.” Pierre continued, “In order to control TALOS in torque, we have been using inverse dynamics and a stabiliser for several years already, with good results. However, to use these for dynamic motions, both on flat floor and complex terrain, we implemented an MPC method that generates the reference trajectories that we send to the inverse dynamics.”

Pierre explained, “As shown in the video, this method worked well and allowed us to generate different kinds of dynamic motions in challenging environments, but for the complex terrains we needed to provide the position of the footsteps. So the next step was to compute these footsteps automatically. For this we needed perception; the robot already had LiDARs, but we needed software that is able to correctly extract possible contact surfaces from the sensor’s data. We implemented our solution based on open-source, state-of-the-art libraries for perception and mapping. We also used a state-of-the-art contact planner. After this, we had the footsteps and could loop with the method mentioned previously.”


Illustration of “generalised locomotion” with TALOS in the experimental room of the LAAS-CNRS

Impact of the research: contributing to greater autonomy of humanoid robots

Pierre told us, “We believe projects such as this one will contribute to greater autonomy of humanoid robots, with less need to define exactly how the robot will be used.” 

“In relation to this work, we also recently made a video with TALOS where the robot walks up stairs; following this, we move the stairs, and the robot then needs to adapt to the new position of the stairs. Our goal is to give general tasks to humanoid robots such as TALOS, and not tell the robot exactly how to do them.”

Pierre added, “We have a software architecture with blocks that are connected together with ROS for solving challenges such as these – where humanoids are given general tasks and not told how to do them. However, if a customer has a specific need, we can adapt this software architecture for them. The customer can also replace any of the blocks in order to experiment with their own methodology.” (A sketch of what such a replaceable block might look like follows the figure caption below.)


This figure is the final “autonomous” version used at the end of the project. The blocks “Stabiliser, Inverse Dynamics, Estimator” already existed and were made by PAL Robotics, the block “MPC” was made by TOWARD, and the block “Perception” was made in collaboration between PAL Robotics and TOWARD. For the “Contact planner” block we used an open source package.
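To illustrate the swappable-block idea, here is a minimal ROS node sketch in which a customer substitutes their own MPC block. The topic names and message type are hypothetical placeholders of ours, not PAL Robotics' actual interfaces.

```python
#!/usr/bin/env python
# Minimal sketch of swapping in a custom block (here, the MPC) via ROS.
# Topic names and message types are hypothetical placeholders.
import rospy
from std_msgs.msg import Float64MultiArray

class CustomMPCBlock:
    def __init__(self):
        rospy.init_node("custom_mpc")
        # Publish reference trajectories for the inverse-dynamics block.
        self.traj_pub = rospy.Publisher("/locomotion/reference_trajectory",
                                        Float64MultiArray, queue_size=1)
        # Consume footstep plans produced by the contact-planner block.
        rospy.Subscriber("/locomotion/contact_sequence",
                         Float64MultiArray, self.on_contacts)

    def on_contacts(self, msg):
        # Replace this pass-through with your own trajectory optimization.
        self.traj_pub.publish(Float64MultiArray(data=list(msg.data)))

if __name__ == "__main__":
    CustomMPCBlock()
    rospy.spin()
```

Because each block only sees topics, any one of them can be replaced without touching the others, which is the point Pierre makes above.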

TALOS locomotion integrated with ROS to be run fully onboard the robot 

Pierre told us, “The project is now complete and the software is operational, however, we will of course continue working on it to make a second version. The walking work will also be applied to our new biped robot Kangaroo in the future.”

Pierre concluded, “With this project, we have worked on the implementation of an architecture for TALOS locomotion in unknown environments, that is able to be fully integrated with ROS, and can be run fully onboard the robot.”

Future work in Whole-Body Model Predictive Control for TALOS and Kangaroo

Regarding future projects, Pierre explained, “We also plan to work on a project in Whole-Body Model Predictive Control – for both the TALOS and Kangaroo robots. This will be a way to generate more optimal and dynamic motions using the whole bodies of the robots.”

“In terms of future work on TALOS only, we would like to focus on making TALOS’ perception block more robust, improving the robot base estimation with visual odometry, fusion of the robot head and waist sensors, and global path planning.”

We would like to thank Pierre Fernbach for taking the time to talk with us! To learn about the capabilities of our advanced biped research robot TALOS and Kangaroo take a look at our website. Finally, if you would like to ask us more about TALOS or Kangaroo as research platforms for your organisation, do not hesitate to get in touch with us. For more news on our research and robots, follow our blog on robotics and technology!



Research News

Generative AI that imitates human motion


An international group of researchers has created a new approach to imitating human motion by combining central pattern generators (CPGs) and deep reinforcement learning (DRL). The method not only imitates walking and running motions but also generates movements for frequencies where motion data is absent, enables smooth transitions from walking to running, and allows the robot to adapt to environments with unstable surfaces.

Details of their breakthrough were published in the journal IEEE Robotics and Automation Letters on April 15, 2024.

We might not think about it much, but walking and running involve inherent biological redundancies that enable us to adjust to the environment or alter our walking/running speed. Given the intricacy and complexity of this, reproducing these human-like movements in robots is notoriously challenging.

Current models often struggle to accommodate unknown or challenging environments, which makes them less efficient and effective. This is because AI is suited for generating one or a small number of correct solutions. With living organisms and their motion, there isn't just one correct pattern to follow. There's a whole range of possible movements, and it is not always clear which one is the best or most efficient.

DRL is one way researchers have sought to overcome this. DRL extends traditional reinforcement learning by leveraging deep neural networks to handle more complex tasks and learn directly from raw sensory inputs, enabling more flexible and powerful learning capabilities. Its disadvantage is the huge computational cost of exploring vast input space, especially when the system has a high degree of freedom.

Another approach is imitation learning, in which a robot learns by imitating motion measurement data from a human performing the same motion task. Although imitation learning is good at learning in stable environments, it struggles when faced with new situations or environments it hasn't encountered during training. Its ability to adapt its movements and navigate effectively becomes constrained by the narrow scope of its learned behaviors.

"We overcame many of the limitations of these two approaches by combining them," explains Mitsuhiro Hayashibe, a professor at Tohoku University's Graduate School of Engineering. "Imitation learning was used to train a CPG-like controller, and, instead of applying deep learning to the CPGs itself, we applied it to a form of a reflex neural network that supported the CPGs."


CPGs are neural circuits located in the spinal cord that, like a biological conductor, generate rhythmic patterns of muscle activity. In animals, a reflex circuit works in tandem with CPGs to provide adequate feedback that allows them to adjust their speed and walking/running movements to suit the terrain.

By adopting the structure of CPGs and their reflexive counterpart, the adaptive imitated CPG (AI-CPG) method achieves remarkable adaptability and stability in motion generation while imitating human motion.
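For a flavor of what a CPG looks like computationally, the sketch below couples two phase oscillators in anti-phase, like alternating left and right legs. This is a generic, textbook-style illustration, not the AI-CPG implementation from the paper.

```python
# Minimal central pattern generator: two phase oscillators locked in
# anti-phase, producing an alternating left/right rhythm.
# A generic illustration, not the paper's AI-CPG implementation.
import math

def simulate_cpg(freq_hz=1.0, coupling=2.0, dt=0.01, steps=300):
    phase = [0.0, math.pi]           # start the legs in anti-phase
    omega = 2.0 * math.pi * freq_hz  # intrinsic stepping frequency
    outputs = []
    for _ in range(steps):
        # Each oscillator is pulled toward anti-phase with the other.
        d0 = omega + coupling * math.sin(phase[1] - phase[0] - math.pi)
        d1 = omega + coupling * math.sin(phase[0] - phase[1] - math.pi)
        phase[0] += d0 * dt
        phase[1] += d1 * dt
        # The oscillator outputs drive rhythmic joint activations.
        outputs.append((math.sin(phase[0]), math.sin(phase[1])))
    return outputs

out = simulate_cpg()
for k in (0, 25, 50, 75):
    left, right = out[k]
    print(f"t={k * 0.01:.2f}s  left={left:+.2f}  right={right:+.2f}")
```

In the AI-CPG method described above, imitation learning trains the CPG-like controller itself, while deep reinforcement learning is applied to a reflex network that adjusts it using sensory feedback.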

"This breakthrough sets a new benchmark in generating human-like movement in robotics, with unprecedented environmental adaptation capability," adds Hayashibe "Our method represents a significant step forward in the development of generative AI technologies for robot control, with potential applications across various industries."

The research group comprised members from Tohoku University's Graduate School of Engineering and the École Polytechnique Fédérale de Lausanne, or the Swiss Federal Institute of Technology in Lausanne.


Publication Details:
Title: AI-CPG: Adaptive Imitated Central Pattern Generators for Bipedal Locomotion Learned through Reinforced Reflex Neural Networks
Authors: G. Li, A. Ijspeert and M. Hayashibe
Journal: IEEE Robotics and Automation Letters
DOI: 10.1109/LRA.2024.3388842
