Bot Pursuit

Building robots is hard, but AI is making it easier. Ben Caunt's bot follows objects using Moondream AI, and you can try it for yourself.

Nick Bild
10 months ago · Robotics
Follow me! (📷: Ben Caunt)

Building a robot comes with more than its fair share of challenges, but the nuts and bolts of assembly are not where the biggest difficulties lie. The real challenge is writing the software that lets the robot carry out complex actions. That software must integrate sensors, actuators, and decision-making algorithms so the robot can perceive its environment, make informed decisions, and execute tasks precisely. It must also adapt to changing situations and handle unexpected conditions, which makes programming difficult even for experienced engineers.

In some ways, however, software development is getting easier. Thanks in large part to increasingly accessible artificial intelligence (AI) tools, tasks like perception and decision-making are far more manageable than they used to be. Ben Caunt recently built an interesting, and capable, little robot that leverages existing AI services to simplify complex tasks like object detection and tracking. Given a short text description of an object, this wheeled robot uses computer vision to locate that object and follow it around.

Caunt’s robot relies on three main components, all running on the same local network as the robot, to make this possible. The first service captures image frames from a webcam mounted on the robot and preprocesses them for use by downstream algorithms. An object tracking service then feeds these images into a 2-billion-parameter object detection model provided by Moondream. A user prompt tells the model which object to track, and the model returns that object's coordinates whenever it is detected.
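To make the last step concrete, here is a minimal sketch of how detection coordinates might be reduced to a steering signal. It assumes the detector reports a bounding box in normalized image coordinates (0 to 1); the `Box` type, its field names, and `target_offset` are hypothetical illustrations, not taken from Caunt's code or the Moondream API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Box:
    """Hypothetical bounding box in normalized image coordinates (0 to 1)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def target_offset(box: Optional[Box]) -> Optional[Tuple[float, float]]:
    """Return (horizontal offset from image center, box area), or None if no target.

    The offset is negative when the target is left of center and positive
    when it is right of center. The area is a rough stand-in for distance:
    a larger box suggests a closer target.
    """
    if box is None:
        return None  # nothing matched the prompt in this frame
    center_x = (box.x_min + box.x_max) / 2
    area = (box.x_max - box.x_min) * (box.y_max - box.y_min)
    return center_x - 0.5, area
```

A box covering the right half of the frame's middle band, `Box(0.5, 0.25, 1.0, 0.75)`, yields an offset of 0.25 to the right and an area of 0.25.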

Finally, a visual servoing service uses the coordinates provided by the Moondream model to control the robot’s actuators, via Robot Operating System (ROS) commands, so that it follows the target object and keeps it in view. This object could be a person, a pet, or anything else that the Moondream model can recognize.
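The idea behind visual servoing can be sketched with a simple proportional controller. This is an illustration under stated assumptions, not Caunt's implementation: the function name, gains, and target box size are all hypothetical, and the inputs are assumed to be a normalized horizontal offset from the image center (-0.5 to 0.5) and the fraction of the frame the target occupies. In a ROS setup, the two returned values would typically fill the `linear.x` and `angular.z` fields of a velocity message.

```python
from typing import Optional, Tuple


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def follow_command(
    offset_x: Optional[float],
    box_area: float = 0.0,
    target_area: float = 0.2,   # assumed: desired fraction of the frame
    k_turn: float = 1.5,        # assumed steering gain
    k_drive: float = 2.0,       # assumed drive gain
    max_linear: float = 0.5,    # assumed speed limit, m/s
    max_angular: float = 1.0,   # assumed turn-rate limit, rad/s
) -> Tuple[float, float]:
    """Return (linear, angular) velocities that steer toward the target.

    Follows the ROS convention that positive angular velocity turns left,
    so a target to the right of center (positive offset) gives a negative
    turn rate. The robot drives forward when the target looks small (far
    away) and backs off when it looks large (too close).
    """
    if offset_x is None:
        return 0.0, 0.0  # target lost: stop rather than guess
    angular = clamp(-k_turn * offset_x, -max_angular, max_angular)
    linear = clamp(k_drive * (target_area - box_area), -max_linear, max_linear)
    return linear, angular
```

Proportional control is chosen here only because it is the simplest scheme that keeps the target centered; a real robot would likely need tuning, smoothing, or a more complete controller.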

If you have a ROS-powered robot with a webcam ready to go and are looking for something interesting to do with it, it would be worthwhile to reproduce Caunt’s project. The GitHub repository has the source code (under an Apache license) and setup instructions you need to get your robot up and running quickly. Oh, and don’t forget to post your project on Hackster so we can check it out!

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.