We wanted to give robots a brain: not just motion, but understanding and purpose. The project began as part of the "Seeed 2025 Build Cooking and Home Robot with NVIDIA Jetson Thor" hackathon.
What Is Brainbot?
Brainbot is a modular teleoperation + learning platform. It includes a unified system for teleoperation, demonstration data collection, inference, and visualization. Modes can be switched at runtime, and you can monitor your robot's action states and camera streams in real time from anywhere in the world. It is also multi-device, robot-agnostic, and fully wireless!
The Brainbot interface is compatible with any robot definition in Hugging Face's LeRobot, and the data format and pipeline are a customized mirror of LeRobot's.
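To make that compatibility concrete, here is a minimal, hypothetical sketch of an adapter that exposes Brainbot observations and actions under LeRobot-style feature keys (`observation.state`, `observation.images.<camera>`, `action`). The class and method names are placeholders for illustration, not Brainbot's actual code.

```python
import numpy as np

class LeRobotStyleAdapter:
    """Hypothetical adapter that maps raw robot state into LeRobot-style
    feature dictionaries (names are illustrative only)."""

    def __init__(self, joint_names, camera_names):
        self.joint_names = joint_names
        self.camera_names = camera_names

    def to_observation(self, joint_positions, images):
        # LeRobot datasets conventionally store proprioception under
        # "observation.state" and each camera under "observation.images.<name>".
        obs = {"observation.state": np.asarray(joint_positions, dtype=np.float32)}
        for name in self.camera_names:
            obs[f"observation.images.{name}"] = images[name]  # HxWx3 uint8 frame
        return obs

    def to_action(self, target_joint_positions):
        # Actions are stored as a flat float vector under "action".
        return {"action": np.asarray(target_joint_positions, dtype=np.float32)}


# Example usage with dummy data
adapter = LeRobotStyleAdapter(
    joint_names=["shoulder", "elbow", "wrist", "gripper"],
    camera_names=["front"],
)
obs = adapter.to_observation(
    joint_positions=[0.0, 0.5, -0.3, 0.02],
    images={"front": np.zeros((480, 640, 3), dtype=np.uint8)},
)
print(obs["observation.state"], obs["observation.images.front"].shape)
```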
We trained our own GR00T model, an embodied policy that learns from human demonstrations, to perform home-oriented tasks such as:
- Fold clothes, organize toys, stack blocks
- Open / close drawers, put items into baskets
- Bring water / get water
- Seasonal Mode: Trick-or-treat with kids, organize candies, clean up wrappers, move paper cups
How It Works
Service Provider Architecture
Any action provider, such as idle, teleoperation, or AI actions, is hosted on a corresponding dedicated service server. The service servers talk to a command server, which sends states to the robot server. In the command server, a mode dispatcher manages the switching between different action providers.
Of course, all states and camera streams are sent to a web server in real time, which allows you to monitor the robot from anywhere in the world. The streams are also used in some teleoperation modes, such as AR mode, where you can visualize all streams right from the headset.
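A minimal sketch of this layering, under the assumption of a simple in-process dispatcher, is shown below. The names `ActionProvider`, `ModeDispatcher`, and the provider classes are illustrative placeholders, not Brainbot's actual modules.

```python
from abc import ABC, abstractmethod

class ActionProvider(ABC):
    """One provider per mode (idle, teleoperation, AI, ...); in Brainbot each
    provider is hosted on its own dedicated service server."""

    @abstractmethod
    def get_action(self, observation: dict) -> dict:
        ...

class IdleProvider(ActionProvider):
    def get_action(self, observation: dict) -> dict:
        # Hold the current joint positions.
        return {"action": observation["joint_positions"]}

class TeleopProvider(ActionProvider):
    def __init__(self, input_source):
        self.input_source = input_source  # e.g. AR headset, Joy-Con, leader arm

    def get_action(self, observation: dict) -> dict:
        return {"action": self.input_source.read_targets()}

class ModeDispatcher:
    """Lives in the command server; routes the active provider's output
    toward the robot server."""

    def __init__(self, providers: dict[str, ActionProvider], initial: str = "idle"):
        self.providers = providers
        self.active = initial

    def switch(self, mode: str) -> None:
        if mode not in self.providers:
            raise ValueError(f"unknown mode: {mode}")
        self.active = mode

    def step(self, observation: dict) -> dict:
        # Query whichever provider is active and forward its action.
        return self.providers[self.active].get_action(observation)
```

In the real system, each `get_action` call would correspond to a message exchange with that mode's service server, and the resulting states would also be published to the web server for monitoring.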
Runtime Mode Switching
- AR Teleop (Quest 3): full 3D control and labeling in mixed reality
- Joy-Con / Gamepad Teleop: intuitive joystick control for fast data collection
- Leader-Follower Arm Teleop: physical dual-arm mirroring for dexterous tasks
- AI Inference Mode: runs the trained GR00T policy directly on the Jetson Thor
- Data Collection Mode: mirrors the LeRobot data pipeline and formats
All modes communicate through a unified interface that synchronizes joint states, actions, and video feeds, allowing smooth switching between human and AI control.
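One way to picture that unified interface is a single timestamped message carrying the joint states, the commanded action, and the latest camera frames. The schema below is a hedged sketch under that assumption, not Brainbot's actual wire format.

```python
import time
from dataclasses import dataclass, field

@dataclass
class StepMessage:
    """Hypothetical unified message passed between the providers, the command
    server, the robot server, and the web visualizer."""
    timestamp: float
    mode: str                      # "ar_teleop", "joycon", "leader_follower", "ai", ...
    joint_positions: list[float]   # measured joint state at this step
    action: list[float]            # commanded target for this step
    camera_frames: dict[str, bytes] = field(default_factory=dict)  # encoded RGB feeds

def make_step(mode: str, joints: list[float], action: list[float],
              frames: dict[str, bytes]) -> StepMessage:
    # Stamping every message with a shared clock keeps joint states, actions,
    # and video feeds aligned when switching between human and AI control.
    return StepMessage(time.time(), mode, joints, action, frames)

msg = make_step("ai", [0.0, 0.5, -0.3], [0.1, 0.5, -0.2], {"front": b""})
print(msg.mode, msg.timestamp)
```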
Real + Sim Data Pipeline
- We collected data both in the real world and in NVIDIA Isaac Lab simulation
- Real data from Quest 3 AR teleoperation (hand tracking + pose capture)
- Sim data from Isaac Lab teleoperation, a reinforcement learning pipeline, and domain randomization
- Visual + IMU SLAM for real-time scene reconstruction, mapping, and 3D perception alignment
- The GR00T model learns language-conditioned visuomotor policies that connect perception, intent, and control.
- Using the teleop data, we trained policies to map camera frames and state inputs to actions (a minimal sketch follows this list)
- We explored both imitation learning (for short-horizon skill reproduction) and reinforcement learning (for long-horizon optimization) within simulation before deploying to Jetson Thor.
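For concreteness, a behavior-cloning-style training step on the teleop data could look roughly like the sketch below. The tiny network, the 7-DoF dimensions, and the placeholder dataset are assumptions for illustration, not the actual GR00T fine-tuning code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

class TeleopDataset(Dataset):
    """Placeholder dataset: each sample is (camera frame, robot state, action)."""
    def __init__(self, num_samples: int = 256):
        self.images = torch.rand(num_samples, 3, 96, 96)   # RGB frames
        self.states = torch.rand(num_samples, 7)            # joint positions
        self.actions = torch.rand(num_samples, 7)            # target joints

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.states[idx], self.actions[idx]

class VisuomotorPolicy(nn.Module):
    """Tiny CNN + MLP mapping (image, state) -> action."""
    def __init__(self, state_dim: int = 7, action_dim: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 + state_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, image, state):
        features = self.encoder(image)
        return self.head(torch.cat([features, state], dim=-1))

policy = VisuomotorPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loader = DataLoader(TeleopDataset(), batch_size=32, shuffle=True)

for image, state, action in loader:
    pred = policy(image, state)
    loss = nn.functional.mse_loss(pred, action)   # behavior cloning objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```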
We built a web-based visualization interface connected to Brainbot's unified communication protocol:
- Real-time joint state graphs
- Camera streaming from the robot's RGB feeds
This makes it easy to monitor the robot's internal state during training or teleoperation, which is crucial for human-in-the-loop supervision.
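As an illustration of how such a web monitor can be fed, the sketch below streams joint states over a WebSocket using the third-party `websockets` package. The port, message fields, and stand-in data are assumptions, and the handler signature varies slightly across `websockets` versions.

```python
import asyncio
import json
import random
import time

import websockets  # third-party: pip install websockets

async def stream_joint_states(websocket):
    """Push a telemetry message every 100 ms; the browser UI plots it live."""
    while True:
        message = {
            "timestamp": time.time(),
            "joint_positions": [random.uniform(-1, 1) for _ in range(7)],  # stand-in data
        }
        await websocket.send(json.dumps(message))
        await asyncio.sleep(0.1)

async def main():
    # Recent websockets versions pass only the connection object to the handler.
    async with websockets.serve(stream_joint_states, "0.0.0.0", 8765):
        await asyncio.Future()  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())
```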
Tech Stack
Deployment
- NVIDIA Jetson Thor
Simulation
- Isaac Lab + PyBullet
Teleoperation
- Meta Quest 3 (AR)
- Nintendo Joy-Con
- Gamepad
- SO101 leader arms
Web
- JavaScript
AI Training
- PyTorch + RL + IL + VLA
Scene Mapping
- VSLAM + Open3D reconstruction

