From humble beginnings, our project explores the use of the LeRobot SO101 robotic arm paired with our Phosphbot system. Together, they form a real-time, camera-guided platform that can stack (and sometimes throw) cups.
Our main goal: train the robot to stack cups using two camera views, one overhead for spatial awareness and one mounted directly on the arm for close-up precision.
To teach the robot how to stack the cones, we recorded teleoperated demonstrations using the Phosphbot interface.
During each session, the SO101 arm was guided to:
- Detect a cone on the right side of the workspace.
- Move toward it, align the arm with it, then grasp it.
- Transfer the cone smoothly to the left side of the table and release it in the target zone (cone).
- Return to the starting position and repeat the cycle until all cones are stacked (80 more times).
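The cycle above can be sketched as a simple ordered sequence of phases. This is only an illustration of the demonstration structure: in our recordings a human operator drove every phase, and the names `Phase`, `CYCLE`, and `run_cycles` are hypothetical, not part of LeRobot or Phosphbot.

```python
from enum import Enum, auto


class Phase(Enum):
    """Phases of one demonstration cycle (hypothetical labels)."""
    DETECT = auto()
    APPROACH = auto()
    GRASP = auto()
    TRANSFER = auto()
    RELEASE = auto()
    RETURN_HOME = auto()


# Order of phases within one cycle, following the list above.
CYCLE = [
    Phase.DETECT,
    Phase.APPROACH,
    Phase.GRASP,
    Phase.TRANSFER,
    Phase.RELEASE,
    Phase.RETURN_HOME,
]


def run_cycles(cones_to_stack):
    """Return the sequence of phases visited while stacking all cones."""
    log = []
    stacked = 0
    while stacked < cones_to_stack:
        # One full pass through the cycle moves exactly one cone.
        for phase in CYCLE:
            log.append(phase)
        stacked += 1
    return log
```

Calling `run_cycles(3)` yields three back-to-back passes through the six phases, which is a handy way to label or segment recorded episodes.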
Once the teleoperation dataset was collected, we moved to the training phase using ACT (Action Chunking with Transformers).
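For readers new to ACT: the policy predicts a chunk of several future actions at once, and overlapping chunks can be blended with exponentially decaying weights (temporal ensembling), with the oldest prediction weighted most. The following is a minimal pure-Python sketch of that blending step only. It is not the LeRobot implementation, and the function name, the `chunks` layout, and the default `m` are assumptions made for illustration.

```python
import math


def temporal_ensemble(chunks, t, m=0.1):
    """Blend every prediction that was made for timestep t.

    chunks[s] is the action chunk predicted at timestep s, and
    chunks[s][j] is the action it proposes for timestep s + j.
    Each action is a list of floats (one value per joint).
    """
    # Collect all predictions for timestep t, oldest chunk first.
    preds = [
        chunks[s][t - s]
        for s in range(t + 1)
        if t - s < len(chunks[s])
    ]
    # Weight i = exp(-m * i); index 0 is the oldest prediction.
    weights = [math.exp(-m * i) for i in range(len(preds))]
    total = sum(weights)
    dim = len(preds[0])
    # Weighted average, computed per action dimension.
    return [
        sum(w * p[d] for w, p in zip(weights, preds)) / total
        for d in range(dim)
    ]
```

A larger `m` makes the result lean more heavily on older predictions, while `m = 0` reduces to a plain average.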
We built a containerized environment with Docker to ensure consistent dependencies across our development machines. The container included all required packages for LeRobot, ACT, and GPU-accelerated training.
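A quick sanity check inside the container can confirm that the expected packages are importable and that the GPU is visible. The sketch below assumes the image installs packages importable as `torch` and `lerobot`; those names and the helper functions are assumptions, so adjust them to match the actual image.

```python
import importlib.util

# Assumed top-level package names; change to match your image.
REQUIRED = ["torch", "lerobot"]


def missing_packages(names):
    """Return the top-level names that cannot be imported here."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def cuda_available():
    """Report whether PyTorch can see a GPU; False if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check works without torch
    return torch.cuda.is_available()
```

Running `missing_packages(REQUIRED)` and `cuda_available()` right after starting the container gives an early warning before a long training job is launched.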
DEMO! And throwing?!?




