Jun 17, 2026
The Dirty Work of Robot Training: XDOF Emerges to Fill the Data Gap
XDOF, a new startup, is addressing the bottleneck in AI robotics by collecting and annotating high-…
The Rise of Robot Training Data
The AI industry is racing to teach machines to operate in the physical world, but it faces a major bottleneck: a lack of high-quality training data. Language models were trained on vast amounts of publicly available text, but robots need data that captures physical interaction, and no comparable public corpus exists for it.
XDOF's Solution
XDOF, emerging from stealth, aims to build the data pipelines, collection tools, and annotation systems that frontier labs and robotics companies can't easily build themselves. The startup has raised $70 million from top investors and is already working with 20 customers, including several frontier AI labs.
The Data Gap in Robotics
The company's co-founder and CEO, Philippe Wu, experienced the problem firsthand as a PhD student at UC Berkeley. He worked on a project called GELLO, a low-cost teleoperation system that lets a human operator control a robotic arm to generate training data.
Partnership with UC Berkeley
XDOF is partnering with UC Berkeley's AI Research lab to release a large collection of high-quality robot training data, dubbed ABC. The dataset includes 130,000 trajectories of robot manipulation data, 300 hours of simulation, and 100 hours of evaluations.
The Future of Robot Training
The company plans to work across three tiers of a data pyramid: teleoperation data collected on actual robots, general data gathered by teleoperated robots, and egocentric data gathered by humans performing everyday tasks. To do so, XDOF aims to hire and train armies of teleoperators and egocentric data operators around the world.