Moving a robotic arm usually requires writing rigid scripts or wiring up complex ROS nodes. With the emergence of AI agents, however, we can treat our development environment as a collaborator. Using Claude Code and its skills framework, we can control the 4-DOF myPalletizer 260 M5 robot with natural language, letting the AI handle the coordinate math and error correction.

Setting Up the Environment

First, ensure your hardware is connected and the pymycobot library is installed.
This open-source project is available on GitHub at: https://github.com/vhp8rc7p/hackster
The Hardware: myPalletizer 260 M5

The myPalletizer 260 M5 is a specialized 4-axis robotic arm designed for palletizing tasks. You can learn more about it here.
Unlike 6-axis robot arms, it operates on a specific coordinate system:
• X, Y, Z: Linear movement in 3D space.
• Rz: Rotation of the end-effector (crucial for aligning with cubes or markers).
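To make this concrete, here is a minimal sketch of commanding the arm in that coordinate system with pymycobot. The workspace bounds, serial port, and speed are illustrative assumptions, not datasheet values, and the hardware calls are left commented out so the helper reads on its own:

```python
# from pymycobot import MyPalletizer  # hardware driver (requires the arm attached)

def make_coords(x, y, z, rz):
    """Build an [X, Y, Z, Rz] target list and sanity-check the ranges.

    The bounds below are illustrative guesses, not official limits --
    check the myPalletizer 260 datasheet for the real workspace.
    """
    if not (-300 <= x <= 300 and -300 <= y <= 300 and 0 <= z <= 300):
        raise ValueError("target outside assumed workspace")
    return [float(x), float(y), float(z), float(rz)]

# With the arm attached (port and baud rate are assumptions):
# arm = MyPalletizer("/dev/ttyUSB0", 115200)
# arm.send_coords(make_coords(150, 0, 120, 0), 40)  # move at speed 40
```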
The most difficult hurdle in vision-guided robotics isn’t the vision or the movement—it’s the mapping. Your camera sees the world in pixels (u, v), while your myPalletizer 260 M5 moves in millimeters (X, Y, Z). To bridge these two worlds, we need to find a mathematical “bridge” called a Transformation Matrix.
The Algorithm: 2D Affine TransformationSince the myPalletizer 260 is a 4-axis palletizing robot arm typically used for picking items off a flat surface, we can simplify the complex 3D “Hand-Eye” problem into a 2D Affine Transformation. An Affine Transformation is a geometric mapping that preserves points, straight lines, and planes. It accounts for four specific types of changes:
1. Translation: Moving the origin (The camera isn’t exactly at the robot’s [0, 0] ).
2. Rotation: The camera might be mounted at an angle relative to the robot’s base.
3. Scaling: Converting “pixels” to “millimeters.”
4. Shear: Correcting any slight tilt in the camera lens.

Mathematically, Claude uses OpenCV's cv2.estimateAffine2D function to solve for a 2 × 3 matrix M:
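Written out in the standard affine form, the mapping from a pixel (u, v) to a robot-frame point (x, y) is:

```latex
\begin{bmatrix} x \\ y \end{bmatrix}
= M \begin{bmatrix} u \\ v \\ 1 \end{bmatrix},
\qquad
M =
\begin{bmatrix}
a_{11} & a_{12} & t_x \\
a_{21} & a_{22} & t_y
\end{bmatrix}
```

The left 2 × 2 block of M encodes rotation, scale, and shear, while the last column (t_x, t_y) is the translation.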
The “Nine-Point” Strategy

The code implements a Nine-Point Grid movement. While you technically only need three points to define an affine transform, using nine points provides redundancy.
- Sampling the Workspace: By moving the arm in a 3x3 grid, we sample the camera’s entire field of view.
- Error Minimization: Claude uses Least Squares Estimation (via OpenCV) to find the matrix that best fits all nine points, effectively “averaging out” any small errors caused by motor jitter or lens distortion.
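The fit itself can be sketched in a few lines. The point values below are synthetic stand-ins for the nine real samples, and the closed-form least-squares solve mirrors what a single call to cv2.estimateAffine2D(pixel_pts, robot_pts) does (minus its RANSAC outlier rejection):

```python
import numpy as np

# Nine pixel centers seen by the camera (synthetic stand-ins for real samples).
pixel_pts = np.array([[u, v] for u in (100, 320, 540) for v in (80, 240, 400)],
                     dtype=np.float64)

# Ground-truth affine, used here only to fabricate matching robot coordinates.
M_true = np.array([[0.5, 0.0, 150.0],
                   [0.0, -0.5, 40.0]])

ones = np.ones((len(pixel_pts), 1))
A = np.hstack([pixel_pts, ones])          # (9, 3) homogeneous pixel coords
robot_pts = A @ M_true.T                  # (9, 2) robot coords in mm

# Least-squares fit of the 2x3 matrix: minimizes ||A @ X - robot_pts||.
X, *_ = np.linalg.lstsq(A, robot_pts, rcond=None)
M = X.T                                   # the (2, 3) calibration matrix

np.save("calibration_matrix.npy", M)      # the "Rosetta Stone" file
```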
The beauty of using Claude Code here is the synchronization. Ordinarily, you would have to manually move the robot, write down coordinates, click a point in the camera, and then manually run a separate math script. Claude’s script automates the “handshake”:
1. It commands the arm with arm.send_coords.
2. It waits for the physical vibration to stop (time.sleep).
3. It captures the image and detects the marker.
4. It correlates the two datasets instantly.
At the end of the process, it saves a calibration_matrix.npy file. This file becomes the “Rosetta Stone” for all future tasks—whether picking a red cube or an ArUco-labeled box.
Perceptive Picking (Color & ArUco)

With our calibration matrix saved, the myPalletizer 260 now has a “sense of space.” However, to make it truly autonomous, we need it to “see” specific objects. In this phase, we move from simple movement to Perceptive Picking using a suction pump end-effector.
The Hardware: Why a Suction Pump?

For this project, we swapped the standard gripper for a suction pump. In palletizing, suction is often superior for wooden cubes because:
- Zero-Grip Clearance: You don’t need space around the cube for “fingers”; you only need a flat top surface.
- Precision: As long as the nozzle creates a vacuum seal, the pick is very secure.
- Offset Management: We include a TOOL_OFFSET in the code to account for the physical distance between the camera lens and the center of the suction nozzle.
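Applying that offset is a one-line correction once the target is in robot coordinates. The offset values below are placeholders, not measured numbers:

```python
import numpy as np

# Hypothetical camera-to-nozzle offset in mm; measure this on your own rig.
TOOL_OFFSET = np.array([12.0, -5.0])

def apply_tool_offset(target_xy):
    """Shift a camera-derived (x, y) target so the suction nozzle,
    rather than the camera lens, ends up centered over the object."""
    return np.asarray(target_xy, dtype=np.float64) + TOOL_OFFSET
```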
Inside the Code: How Claude “Sees”

I asked Claude to write a modular script that handles both detection and the coordinate handshake.
Here is the logic Claude implemented:
1. The HSV Advantage

Claude chose the HSV (Hue, Saturation, Value) color space over standard RGB.
- Why? RGB is very sensitive to brightness. By using HSV, Claude can isolate the “Hue” (the color itself) from the “Value” (how much light is hitting it).
- The Reality Check: Even with HSV, I found that direct overhead lighting is a challenge. For wooden cubes, direct glare can wash out the “Saturation,” causing the robot to “blindly” skip a block. To fix this, Claude added a GaussianBlur and Morphological Operations (OPEN and CLOSE). These functions act like “digital sandpaper,” smoothing out noise and filling in small holes in the color mask caused by glare or wood grain.
2. Pixel-to-Millimeter Conversion

Once a color contour is found, the code calculates the image “moment” (the center of the cube). It then performs the following transformation:
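Applying the saved calibration is a single matrix multiply. The matrix values below are synthetic; in practice the script loads the file produced during calibration:

```python
import numpy as np

def pixel_to_mm(M, u, v):
    """Map a pixel centroid (u, v) to robot-frame (x, y) in mm
    using the 2x3 affine calibration matrix M."""
    return M @ np.array([u, v, 1.0])

# Example with a made-up matrix (in practice: M = np.load("calibration_matrix.npy")).
M = np.array([[0.5, 0.0, 150.0],
              [0.0, -0.5, 40.0]])
x, y = pixel_to_mm(M, 320, 240)   # -> x = 310.0, y = -80.0
```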
This allows the robot to calculate exactly how many millimeters to move from its “Home” observation position to the center of the detected object.
The “Agentic” Workflow

What makes this script special isn’t just the OpenCV logic—it’s how it interacts with Claude Code.
- Scan Mode: The script runs in a loop, showing the camera feed and highlighting detected cubes.
- The Handshake: Once I press ‘C’ to confirm, the script outputs a clean JSON-like list of detected cubes.
- Decision Making: Claude reads this output, identifies which cube matches my natural language request (e.g., “Pick the red one”), and then calculates the movement sequence: Move ➡ Descend ➡ Pump ON ➡ Lift.
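The confirmed output is a plain machine-readable list. A sketch of how the script might emit it, with hypothetical detection values:

```python
import json

# Hypothetical detections after pressing 'C' -- robot-frame coordinates in mm.
detections = [
    {"color": "red",  "x_mm": 185.2, "y_mm": -42.7},
    {"color": "blue", "x_mm": 140.9, "y_mm": 63.1},
]
print(json.dumps(detections, indent=2))  # Claude reads this from the terminal
```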
Now, we combine these logic blocks for a real-world scenario: Pick a color cube and place it on a specific ArUco ID.
The Workflow:
1. Find the red cube.
2. Pick it up.
3. Search for ArUco ID 10.
4. Place the cube on the marker.
See the demo here:
The “Skill” Transformation

This is where we move from running individual scripts to an Agentic Skill.
What is a Skill?

Per the anthropics skills repo:

> Skills are folders of instructions, scripts, and resources that Claude loads dynamically to improve performance on specialized tasks. Skills teach Claude how to complete specific tasks in a repeatable way.
A skill needs a SKILL.md file, much like a program needs a main entry point. The file starts with YAML frontmatter containing the skill’s name and description. The description matters because it is how Claude Code decides whether to invoke the skill when the user asks for something.
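For our arm, that frontmatter might look like the following. The name and wording are illustrative, not the project's actual file:

```yaml
---
name: palletizer-pro
description: Controls the myPalletizer 260 M5 arm. Use when the user asks to
  calibrate the camera, or to pick, sort, or place cubes by color or ArUco ID.
---
```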
Anthropic even created a skill to create other skills:
And we will use it to create our robot arm skill.
Using the skill-creator Plugin

We use the skill-creator plugin to automate the setup.
The Command:
```bash
claude plugin call skill-creator --name "PalletizerPro" --dir "./skills/robotics"
```

The Final Prompt to Claude:
“Claude, use the skill-creator to bundle our calibration, color detection, and ArUco logic into the ‘PalletizerPro’ skill. Define a command called /sort-by-id that takes a color and an ID as arguments.”
Now, you can simply type:
/sort-by-id color=blue id=12
or just tell it to pick the blue cube in natural language.
Claude will autonomously decide which scripts to run, handle the vision processing, and execute the physical move.
Reflections: The Reality of the Human-AI Collaboration Loop

Working with an agentic tool like Claude Code to control hardware is a unique experience. It feels less like using a traditional IDE and more like supervising a very fast, occasionally over-confident junior engineer.
The “Superpower”: Auto-Correction

The most impressive part of this workflow is the terminal feedback loop. When Claude writes a script to move the myPalletizer 260 and the pymycobot library throws a serial exception or a joint-limit error, I don’t have to copy-paste the error back into a chat window.
Claude sees the traceback in the terminal immediately. It often realizes its own mistake before I do, saying, “I see the Serial port was busy; let me close the connection and try a different baud rate.” This “self-healing” code ability is a massive time-saver when debugging hardware-software interfaces.
The “Trap”: When the AI Goes the Wrong Way

However, it isn’t magic. There are two distinct ways the process can stall:
1. The Logic Loop: Sometimes, Claude identifies a problem but proposes a fix that doesn’t work. If it fails again, it might try a slightly different version of the same wrong fix. I’ve watched it get stuck in a “retry loop” where it just moves the same error around in the code.
2. The Wrong Direction: Occasionally, it can fundamentally misunderstand why a physical movement failed (e.g., thinking it’s a software bug when a cable is actually unplugged).
The Verdict: Is It Worth $20 a Month?

This is the big question for 2026. While the Pro plan at $20/month is currently the entry point for Claude Code, it comes with caveats:
• The Quota Crunch: Community reports suggest Anthropic has been quietly decreasing user quotas “behind the scenes” to manage the massive compute load of the newer Opus models.
• The Opus Trap: Asking just a few complex architectural questions using Opus 4.6 can burn through your 5-hour session limit in minutes. If you are doing heavy robotics work, you’ll find yourself hitting that “Usage limited until 4:00 PM” message faster than you’d like.
• Strategy: If you want to try this setup, subscribe monthly. AI is evolving so quickly in 2026 that an annual commitment is risky—today’s “best” tool might be eclipsed by a new model or a better pricing structure next month.
Summary

At the end of the day, for the speed of building a perceptive robotic agent in a single afternoon? Yes, it’s worth it—as long as you treat Claude as a collaborator you need to supervise, not an “autopilot” you can ignore.
Developers are welcome to participate in our User Case Initiative and showcase your innovative projects: https://www.elephantrobotics.com/en/call-for-user-cases-en/.




