ASGARD: Autonomous System for General Assistance and Robotic Dynamics
Our home robot, ASGARD, tackles everyday kitchen chores like sorting groceries, setting a table, or assisting with cooking by combining a 6‑DoF robot arm with vision‑language‑action (VLA) policies fine‑tuned on our demonstrations. We collect demos via teleoperation and simulation (optionally Isaac Sim), train ACT / Diffusion Policy / Pi0.5 variants, and deploy the best policy on a Jetson AGX Thor for low‑latency inference. A compact perception and planning stack handles object detection, grasping, and safe motions; a ReSpeaker mic array enables hands‑free voice prompts like “place the red cup on the tray.”
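To make the pipeline concrete, here is a minimal sketch of the deploy-time control loop we target on the Thor. Every helper name in it (wait_for_prompt, read_rgb, read_joint_positions, select_action, send_joint_targets) is a hypothetical placeholder for the real speech, camera, policy, and motor interfaces, not our actual API.

```python
# Illustrative deploy-time loop: observe -> infer -> act at a fixed rate.
# All device/policy helpers below are hypothetical placeholders.
import time

def control_loop(policy, camera, arm, mic, hz=30):
    prompt = mic.wait_for_prompt()            # e.g. "place the red cup on the tray"
    period = 1.0 / hz
    while not policy.episode_done():
        t0 = time.perf_counter()
        obs = {
            "image": camera.read_rgb(),        # RealSense / wrist camera frame
            "state": arm.read_joint_positions(),
            "task": prompt,                    # language conditioning for the VLA policy
        }
        action = policy.select_action(obs)     # joint targets from the fine-tuned policy
        arm.send_joint_targets(action)
        # Sleep off the rest of the control period to hold a steady rate.
        time.sleep(max(0.0, period - (time.perf_counter() - t0)))
```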
We are Team Proxemics! We are exploring user-adjacent robotic assistants: robots designed to stay close enough to be helpful, but far enough away not to impede the user's actions in a kitchen workspace. This project is our submission to the NVIDIA + Seeed Studio Embodied AI Hackathon (October 2025) at Circuit Launch in Mountain View, CA!
Our robot is built on a series of open-source projects: (1) the XLeRobot platform, (2) the SO-ARM101 kit (2 x leader arms and 2 x follower arms), and (3) the LeKiwi robotic base. The build uses an NVIDIA Jetson AGX Thor developer kit, an Intel RealSense D455f RGBD camera, a WowRobo WowSkin touch sensor, and 2 x wrist-mounted camera modules.
To run tests and experiments in parallel, we used a BeeLink Mini-S PC. All hardware was first assembled according to the instructions provided by the SO-ARM101 kit and XLeRobot authors. Additions to those designs include (1) a StarTech 7-port USB hub, (2) a WowSkin touch sensor, (3) a ReSpeaker microphone array module, and (4) modified cable routing and mounting, using adhesive wire ties plus cable splitters and adapters where needed. To keep track of the routing, every cable was labeled on both ends. The robot was first built in Texas, then broken down and reassembled in the Bay Area!
To validate that the assembly was properly completed, first we ran a calibration script for each of the 2 x servo-driven arms to determine their maximum and minimum position feedback values, and then verified that the leader arm could control each follower (on-board) arm.
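As a rough illustration, the range check boils down to sweeping each joint by hand while logging the extremes of its raw feedback. In this sketch, read_position is a placeholder for whatever call returns a servo's raw position (the real script uses the calibration tooling that ships with the arm software):

```python
# Illustrative joint-limit sweep: move each joint through its full range by
# hand while the extremes of the raw position feedback are recorded.
# `read_position(motor_id)` is a placeholder for the actual servo-bus call.
import time

def record_joint_limits(read_position, motor_ids, duration_s=20.0, hz=50):
    limits = {m: {"min": float("inf"), "max": float("-inf")} for m in motor_ids}
    t_end = time.time() + duration_s
    while time.time() < t_end:
        for m in motor_ids:
            pos = read_position(m)
            limits[m]["min"] = min(limits[m]["min"], pos)
            limits[m]["max"] = max(limits[m]["max"], pos)
        time.sleep(1.0 / hz)
    return limits

# e.g. limits = record_joint_limits(bus.read_position, motor_ids=range(1, 7))
```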
The ReSpeaker module was tested using simple scripts from its original GitHub documentation, printing the detected direction-of-arrival angle as we yelled at the sensor.
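For reference, that check looks roughly like the example below, assuming the USB Mic Array v2.0 (USB IDs 0x2886:0x0018) and the tuning.py helper from the respeaker/usb_4_mic_array repository; if your ReSpeaker model differs, the IDs and helper will too.

```python
# Print the direction-of-arrival angle while someone speaks at the array.
# Based on the examples in the respeaker/usb_4_mic_array repo; assumes the
# USB Mic Array v2.0 and its tuning.py helper are on the Python path.
import time

import usb.core
from tuning import Tuning  # from respeaker/usb_4_mic_array

dev = usb.core.find(idVendor=0x2886, idProduct=0x0018)
if dev is None:
    raise RuntimeError("ReSpeaker mic array not found on USB")

mic = Tuning(dev)
try:
    while True:
        print("DOA angle (deg):", mic.direction)
        time.sleep(0.5)
except KeyboardInterrupt:
    pass
```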
The WowSkin sensor was tested using a built-in example and visualizer to verify that the hall-effect sensors embedded in the compliant "finger" pad could detect orthogonal and shear forces. We then trained a classifier model to determine when the user truly intended to retrieve an object from the WowSkin-equipped robot arm.
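The classifier itself can be quite small. The sketch below shows the general idea, windowed features over the 5 hall-effect channels fed to a logistic-regression model, but the feature set, file names, and model choice here are illustrative, not our exact pipeline.

```python
# Illustrative handoff classifier over the WowSkin's 5 hall-effect channels.
# Labels: 1 = user is genuinely pulling the object, 0 = incidental contact.
# Feature design, file names, and model choice are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def window_features(readings, window=10):
    """Summarise the last `window` samples of (T, 5) readings into one row."""
    w = np.asarray(readings[-window:], dtype=float)
    # Mean and peak per channel plus net change, so both steady shear and
    # sudden tugs are visible to the model.
    return np.concatenate([w.mean(axis=0), w.max(axis=0), w[-1] - w[0]])

# X: (N, 15) feature rows from labelled windows, y: (N,) handoff labels.
X = np.load("wowskin_features.npy")   # placeholder dataset files
y = np.load("wowskin_labels.npy")
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```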
Data Sources
The data collected by the NVIDIA Jetson Thor during operation includes the following (a minimal recording sketch follows the list):
- 2 x RGB streams of wrist camera data
- RGBD stream from the Intel RealSense camera
- 5 x hall-effect sensor readings from the WowSkin pad
- 6 + 6 + 3 position feedback values: 6 from each of the 2 x robot arms, plus 3 from the base's continuous-rotation servo motors
- Object retrieval and binning (and the reverse). These demonstrations were performed by manually controlling the off-board (5V) leader arm and repeating the action many times while collecting data from all the sources above.
- Sound input and the corresponding robot action response.
- Object handoff to the user. The classifier model mentioned above is used to determine when the user truly intends to retrieve an object from the robot arm.
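To show how these streams fit together during teleop recording, here is a minimal per-timestep bundling sketch; all device handles and field names are placeholders, not our actual dataset schema.

```python
# Illustrative per-timestep record bundling the data sources listed above.
# Device handles and field names are placeholders, not the real schema.
import time

def capture_step(wrist_cams, realsense, wowskin, arms, base, prompt=None):
    rgb, depth = realsense.read_rgbd()
    return {
        "timestamp": time.time(),
        "wrist_rgb": [cam.read_rgb() for cam in wrist_cams],             # 2 x RGB
        "realsense_rgb": rgb,
        "realsense_depth": depth,
        "touch": wowskin.read_channels(),                                # 5 hall-effect values
        "arm_positions": [arm.read_joint_positions() for arm in arms],   # 6 + 6
        "base_positions": base.read_wheel_positions(),                   # 3 continuous servos
        "prompt": prompt,                                                # optional voice command
    }
```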
We successfully demonstrated that the robot could localize the user's voice and move towards it, and could classify object retrieval during the user handoff (pass) move set. Next steps include incorporating a touch sensor into the second arm, along with a number of improvements to the robustness and feature set of the robot. Thank you for checking out our project!
Demonstration