Earlier this week, Nvidia unveiled GR00T, a general-purpose generative artificial intelligence (AI) platform for humanoid robots. The platform combines hardware and software, and it is meant to let humanoid robots perform complex tasks and learn new ones from human-assisted inputs. The AI model is paired with the company’s new chipset, Jetson Thor, which will power the AI capabilities. Nvidia revealed that GR00T was developed with the help of its Isaac Robotics Platform, which has also received multiple upgrades to enable end-to-end development of platforms for humanoid robots.
Announcing GR00T, an acronym for Generalist Robot 00 Technology, Jensen Huang, founder and CEO of Nvidia, said, “Building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today.” The statement reflects decades of research into human-shaped robots that not only look like us but also function like us. Researchers have built robots that come close in appearance and in certain functions, such as the football-playing robot ARTEMIS developed at the University of California, Los Angeles (UCLA). The struggle, however, has always been to develop a software platform powerful enough to handle the complex movements of a human and to gain knowledge about the surrounding world. With GR00T, Nvidia appears to have taken a big step in that direction. Here are five things you should know about this monumental development.
GR00T the AI Model
It is easy to mistake GR00T for a humanoid robot, given that several robots featured in Disney projects appeared on stage alongside Huang during his keynote at Nvidia’s GPU Technology Conference (GTC) in San Jose. However, GR00T is not a robot. Instead, it can be understood as the brain of a robot.
What Nvidia has created is a generative AI model that can power robots, just as OpenAI’s GPT-4 powers ChatGPT and Google’s Gemini Pro powers the Gemini chatbot. An AI model combines an algorithmic architecture with the data set that enables it to generate information and perform specific tasks. AI models that become the foundation of the chatbots and AI tools we end up using are also called foundation models. As such, GR00T is a foundation model.
What does GR00T do?
As the brain of a robot, Nvidia’s GR00T AI model is responsible for taking in inputs and turning them into outputs, which in this case are the robot’s actions. Nvidia said that the foundation model can take text, speech, videos, and live demonstrations by humans as input. This means any robot powered by GR00T can be instructed verbally, through written text, through a video, or even by being shown how to do something.
During the demonstration, Huang told a robot to move towards him and it did. In principle, a robot could also watch a human do a push-up and then perform one itself. Video inputs should work the same way: watching a drummer’s jam session should teach a robot how to play the drums, though perhaps not as well.
The role of the Jetson Thor chipset
While GR00T is the most important part of the software platform for humanoid robots, it is not the only part. At the event, Huang also introduced Jetson Thor, the chipset that will power the robots. The AI system-on-a-chip (SoC) is based on the Nvidia Thor chipset and was developed as part of the company’s Isaac Robotics Platform. The company said that the chip was made powerful enough to run GR00T seamlessly. The SoC comes with specific enhancements for generative AI models and tools for simulation and AI workflow infrastructure.
Isaac Robotics Platform
Another important part of the GR00T ecosystem is the Isaac Robotics Platform. The platform has been upgraded with various tools that enable end-to-end development of AI-based robots. The process covers development, simulation, and deployment, and it relies on dedicated software platforms and the computers that power them. Let us break it down.
While the GR00T AI model enables humanoid robots to learn on the job, the robots still need some idea of the world in which they will be performing these tasks and a basic understanding of how to perform them. For this, they need a training ground where they can learn how to be robots before they head out into the world.
This is what the Isaac Robotics Platform does. It is a gym for AI-powered robots: through simulations in the virtual world and training in the real world, it prepares functional robots that are ready to be deployed. For GPU-powered simulations, Nvidia uses its new Isaac Lab, which is based on the Isaac Sim platform. Additionally, Nvidia is using its OSMO compute platform and the OVX and DGX systems to train the robots simultaneously.
The future of robotics
Nvidia’s GR00T AI system is a significant leap in the technology that is used to train and deploy robots that come close to what is known as artificial general intelligence (AGI). For the uninitiated, AGI refers to AI whose capabilities are on par with those of humans. These include both cognitive and physical tasks.
OpenAI has also recently begun work on AGI through collaborations with 1X Technologies and Figure, two robotics startups that will use the AI firm’s models to power their machines.
However, these are still early days for humanoid robots. Despite doom warnings from AI researchers that we may end up in the world of the Terminator movies, where the AI Skynet becomes the overlord of humans, there is a very long way to go. Both Nvidia and OpenAI are in the early stages of developing robotics technologies, and at present it is unlikely that the robots they power will be capable of handling complex tasks without human supervision. But this is a start, and in a decade’s time, we might see humanoid robots enter the workforce and work alongside humans. How that will change the world is something no one can predict.