Robotics machine learning company Generalist has announced GEN-1, a new physical AI system that it says “crosses into production-level success rates” on “a broad range of physical skills” that used to require the dexterity and muscle memory of human hands. Generalist is also touting the new model’s ability to respond to disruptions by improvising new moves and “connect[ing] ideas from different places in order to solve new problems.”
GEN-1 builds on Generalist’s previous GEN-0 model, which the company touted in November as a proof of concept for the applicability of scaling laws in robotics training, showing how more pre-training data and compute time improve post-training performance. But while large language models have been able to effectively process trillions of words collectively written on the Internet as part of their training, robotic models don’t have a similar, readily accessible source of quality data about how humans manipulate objects.
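For a sense of what those scaling laws look like in practice, the relationship is typically modeled as a power law, with loss falling predictably as training data grows. The sketch below fits such a curve to made-up numbers; the data points, units, and coefficients are illustrative assumptions, not Generalist's published results.

```python
import numpy as np

# Illustrative scaling-law fit: validation loss falling as a power law in
# pretraining data volume. All numbers here are hypothetical, chosen only
# to show the shape of the relationship, loss ≈ a * data**(-b).
hours = np.array([1e3, 1e4, 1e5, 5e5])     # hours of manipulation data
loss = np.array([0.92, 0.61, 0.40, 0.31])  # hypothetical validation loss

# Fitting a line in log-log space recovers the power-law exponent.
slope, log_a = np.polyfit(np.log(hours), np.log(loss), 1)
a = np.exp(log_a)
print(f"loss ≈ {a:.2f} * hours^{slope:.2f}")  # slope comes out negative
```

If the exponent holds, each order-of-magnitude increase in data buys a predictable drop in loss, which is exactly why the scarcity of quality manipulation data is the binding constraint.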
To help solve this problem, Generalist has relied on “data hands,” a set of wearable pincers that capture micro-movements and visual information as humans perform manual tasks. Generalist now claims it has collected over half a million hours of these recordings, amounting to “petabytes of physical interaction data,” to help train its physical model.
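There's no public spec for what one of those captures contains, but a plausible record would pair synchronized video with fingertip kinematics and force readings. The dataclass below is a guess at that shape; every field name, dimension, and unit is a hypothetical assumption, not Generalist's actual schema.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical layout of a single "data hands" sample. Field names,
# shapes, and units are assumptions for illustration only.
@dataclass
class HandSample:
    timestamp_s: float         # time within the recorded episode, seconds
    rgb_frame: np.ndarray      # (H, W, 3) camera image of the workspace
    pincer_pose: np.ndarray    # (2, 7) per-hand position + orientation quaternion
    pinch_force_n: np.ndarray  # (2,) measured grip force per hand, newtons

# Half a million hours at, say, 30 samples per second works out to roughly
# 5e10 records, consistent with "petabytes" once video frames are included.
```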
The result is an autonomous system that is precise enough to put money into a wallet and adaptable enough to fold laundry or sort auto parts. According to Generalist, the model now reaches 99 percent success rates on repetitive but delicate mechanical tasks such as folding boxes, packing phones, and servicing robot vacuums, while running at roughly three times the speed of the previous GEN-0 model. GEN-1 can hit those marks after only about an hour of post-training on “robot data” specific to its particular robotic embodiment, the company says.
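Mechanically, that hour of adaptation amounts to a short post-training run on embodiment-specific demonstrations. Here is a minimal sketch of such a loop, assuming a generic PyTorch policy and placeholder data rather than anything Generalist has described:

```python
import torch
from torch import nn

# Minimal post-training sketch: adapt a pretrained policy to one robot's
# embodiment using a small supervised dataset. The network, data shapes,
# and loss here are placeholder assumptions, not Generalist's method.
policy = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 14))
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

obs = torch.randn(256, 64)      # stand-in observation embeddings
actions = torch.randn(256, 14)  # stand-in joint-space target commands

for step in range(100):  # a brief adaptation run, compressed for illustration
    loss = nn.functional.mse_loss(policy(obs), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```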
Recovering from mistakes
In the past, complex robotic systems have usually relied on carefully pre-programmed motions or been trained to focus exclusively on a single task with little variation. What sets GEN-1 apart, Generalist says, is a single model’s ability to improvise based on its previous experience and respond naturally to disruptions, even when they are “well outside the training distribution.”
In an interview with Forbes, for instance, Generalist engineers describe the model giving a plastic bag a little shake to get a plush toy to shimmy inside, even though such a move wasn’t explicitly represented in the training data. A video posted by Generalist also shows robot hands adjusting on the fly as flexible objects spring out of their expected positions, or refolding a shirt that gets moved in the middle of a folding task. Generalist also describes the model regrasping small washers when they get nudged out of place, using both hands to insert them into their desired spots.
“Nobody has programmed the robot to make mistakes, therefore nobody has programmed the robot to recover from mistakes,” Generalist engineer Felix Wang says in that video. “And that just happens for free.”
Generalist isn’t the only company working to bring machine learning techniques into the physical realm. Last year, Google showed off the “vision-language-action” capabilities of its Gemini Robotics models, which can understand and respond to general action prompts from humans. And Physical Intelligence has made waves with a pair of robotic hands on a wheeled platform, trained in specially designed simulated household environments to perform tasks from cleaning up spills to making beds.
Then there’s Tesla, which showed off its humanoid Optimus robots in late 2024 with staged demos that were actually teleoperated by remote human pilots. In January, Tesla CEO Elon Musk admitted that current Optimus robots are still not doing “useful work” at Tesla, despite previous claims to the contrary.
With GEN-1, though, Generalist says its physical models have reached a GPT-3-style inflection point, where some tasks are starting to “cross the level of performance needed to be deployed in economically useful settings” and where “we can expect each new generation of model to result in a new set of increasingly complex tasks that can be mastered.” Color us hopeful that this means we’re finally on the path to an affordable, at-home laundry-folding robot sometime in the near future.