For decades, robots have captivated people with their ability to emulate human capabilities. Movies like Blade Runner and Terminator have fueled our imagination with their portrayals of a future where robots and humans coexist. While robots tailored for specific tasks are increasingly common, versatile humanoid robots remain rare. A crucial element for their widespread adoption had been missing: the ability to communicate fluidly with humans. Early attempts at human-computer interaction, like Apple’s Siri, were rudimentary. However, the advent of Large Language Models (LLMs) marked a significant leap forward, allowing machines to communicate with humans more effectively and intelligently.
ChatGPT
In November 2022, the world witnessed a significant advancement in human-computer interaction with the launch of ChatGPT by OpenAI. ChatGPT uses a Large Language Model, a type of Machine Learning (ML) model trained on vast amounts of text, enabling it to understand and generate human-like language. For the first time, it was possible to seamlessly communicate with a computer using plain English.
OpenAI has continued to improve its models since the release of ChatGPT. The model now powering ChatGPT is GPT-4, an advanced multimodal model capable of processing both text and images. Furthermore, OpenAI has announced Sora, a text-to-video model that can create videos with complex scenes lasting up to a minute, marking another leap forward in OpenAI’s offerings. However, it is the company’s partnership with Figure AI that brings us closer to the reality of interactive humanoid robots, merging Artificial Intelligence (AI) with practical robot applications.
Figure AI
Figure AI has been developing Figure 01, a general-purpose humanoid robot designed to mimic human appearance and movement. Founded in 2022, the company has assembled a team from tech giants like Boston Dynamics, Tesla, Google DeepMind, and Apple to bring humanoid robots to life. Its notable backers include Jeff Bezos, Microsoft, Nvidia, Intel, and OpenAI. Figure AI plans to commercialize its humanoid robot to address labor shortages, unsafe or undesired jobs, and global supply chain needs.
In February 2024, the company announced a strategic partnership with OpenAI. Under this collaboration, OpenAI will build specialized models for Figure’s humanoid robots, effectively giving ChatGPT a physical body.
As part of its partnership announcement with OpenAI, Figure AI released a video highlighting the capabilities of its Figure 01 robot running on OpenAI’s LLM. In the video, Figure 01 interacts with a human and performs basic tasks. The video may appear simplistic, but it does illustrate the robot’s ability to operate autonomously. All behaviors are learned; Figure 01 isn’t remotely controlled or preprogrammed like Boston Dynamics’ dancing robots. The images from Figure 01’s camera feed into OpenAI’s multimodal model, enabling it to perceive its surroundings. The model shows its reasoning capabilities by translating an ambiguous request like “Can I have something to eat?” into a context-aware action, such as handing over the only edible item in view, an apple. Similarly, when asked where the dishes should go, Figure 01 deduces the next logical step, anticipating that they belong in the drying rack, and proceeds to move them. This level of reasoning has practical implications. For instance, a future version of the robot could autonomously identify and dispose of trash, a capability useful in both domestic and commercial settings.
Figure 01 operates at 200 Hz, meaning that internally it is thinking and adjusting 200 times a second, which makes its movements smooth. The model also uses pause words and speech hesitation, a nice touch that shows OpenAI is thinking about making the robots sound more human.
While the Figure 01 demo is impressive, there are some notable limitations. Primarily, its movement speed is slow. Additionally, there is a noticeable lag in communication, likely caused by the round trip to OpenAI’s servers for instructions. Despite these issues, the advancements in Figure 01 represent a significant leap forward in robotics technology.
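To put the 200 Hz figure in perspective, the short sketch below converts an update rate into the time available for each update. It is only illustrative arithmetic: the function names are hypothetical and say nothing about how Figure AI actually structures its control software.

```python
def cycle_period_ms(rate_hz: float) -> float:
    """Return the time budget, in milliseconds, for one update at the given rate."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


def updates_during(rate_hz: float, duration_s: float) -> int:
    """Return how many updates fit into a time span of duration_s seconds."""
    return round(rate_hz * duration_s)


# At 200 Hz, each update has a 5 ms budget.
print(cycle_period_ms(200))

# A half-second reach would span 100 updates at that rate.
print(updates_during(200, 0.5))
```

In other words, a robot adjusting 200 times a second gets a fresh correction every 5 milliseconds, which helps explain why the motion looks continuous rather than jerky.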
Figure AI is far from the only player making strides in the field of humanoid robots. Established companies like Boston Dynamics have been working on advanced robots for decades, while other big names such as X, Agility Robotics, and Apptronik have also made significant progress. While each company’s robots have distinct advantages and drawbacks, Figure AI stands out for its reasoning and dexterity capabilities. In February 2024, Figure AI raised $675M at a $2.6B valuation, underscoring the substantial investment flowing into this capital-intensive sector. Moreover, Figure AI’s partnership with BMW Manufacturing marks a pivotal moment: by bringing its humanoid robots into the automotive production process, the company can demonstrate the practical applications and potential impact of the technology.
OpenAI and Robots
It’s likely that OpenAI will make its own robots at some point. Co-founder and CEO Sam Altman hinted at this on a podcast with Lex Fridman. Before musing about his dislike of using capital letters in social media posts, he confirmed that although building robots is not an area of focus for him right now, he plans to eventually work on them. The need for focus makes sense; although OpenAI surpassed $2B in annualized revenue in December 2023, the costs associated with operating ChatGPT are notably high. The popular chat program is rumored to cost the company $700k a day in computing costs, and Altman once remarked that OpenAI was going to be “the most capital-intensive startup in Silicon Valley history”. Now that the board drama has been resolved, Altman can focus on growing the company. The company is in talks to raise a fresh round of capital at a valuation of $100B, which would make it one of the world’s most valuable startups.
As the field of embodied robots continues to emerge, it’s important to recognize that humanoid robots are still in their infancy. Figure AI has made great strides considering it was founded less than two years ago. Its partnership with BMW is a great step towards commercializing humanoid robots, but it’s unclear how effectively these robots will perform in real-world environments. Startups often adopt a ‘move fast and break things’ ethos, yet the intricate nature of humanoid robotics requires a more cautious and deliberate approach. Nevertheless, the potential to alleviate labor shortages makes this endeavor worthwhile. By striking the right balance between innovation and prudence, companies like Figure AI can enhance productivity and transform industries.
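For a rough sense of scale, the sketch below annualizes the rumored daily computing cost and compares it with the annualized revenue figure quoted above. Treat it as back-of-the-envelope arithmetic only: the $700k figure is a rumor, the two numbers may not refer to the same period, and the variable and function names are hypothetical.

```python
# Figures quoted in the text; the daily cost is a rumor, not a reported number.
DAILY_COMPUTE_COST_USD = 700_000
ANNUALIZED_REVENUE_USD = 2_000_000_000


def annualize_daily_cost(daily_cost_usd: int, days_per_year: int = 365) -> int:
    """Scale a daily cost to a full year, assuming it stays constant."""
    return daily_cost_usd * days_per_year


annual_compute_cost = annualize_daily_cost(DAILY_COMPUTE_COST_USD)
share_of_revenue = annual_compute_cost / ANNUALIZED_REVENUE_USD

print(f"Annualized compute cost: ${annual_compute_cost:,}")
print(f"Share of annualized revenue: {share_of_revenue:.1%}")
```

Under these assumptions, running ChatGPT would cost roughly $255M a year in compute, or about 13 percent of annualized revenue. The estimate covers only the rumored cost of serving the chat program.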
“More human than human is our motto.” – Eldon Tyrell, Blade Runner (1982)