Robot Projects => Stationary => Topic started by: Ralph on January 09, 2016, 01:58:02 PM

Title: Buster – A Voice Controlled Robot Arm (WIP)
Post by: Ralph on January 09, 2016, 01:58:02 PM
Buster is my work-in-progress, interactive voice controlled robot arm.  Buster accepts basic commands in spoken English.  Buster will also answer basic questions about his status.

After several months of tinkering, I was pleased to recently hit the major milestone of having the foundational elements for speech recognition, speech synthesis, and arm control integrated and playing nicely together.  I now have a solid platform for exploring some of my natural language processing ideas.

Even at this nascent stage, Buster can be fairly clever.  He recognizes that “raise the arm,” “lift arm,” and “move the arm up,” all are roughly equivalent.  Similarly, he recognizes that “what is the height of the arm” and “how high is the arm” are asking for the same information.  Buster thinks in millimeters, but if you give a command in centimeters he will convert the value.  If you give a command that would move beyond the arm's range, Buster will remind you what the limit is, and won't execute the command.
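
The unit conversion and limit check described above can be sketched roughly as follows.  This is an illustration only, not Buster's actual code: the 0 to 120 mm travel range and the names to_millimeters and check_height are assumptions made for the example.

```cpp
#include <string>

// Hypothetical travel limits in millimeters; the real arm's range is not stated in the post.
const double kMinHeightMm = 0.0;
const double kMaxHeightMm = 120.0;

// Convert a spoken quantity to millimeters.
// Returns false (and leaves out_mm untouched) if the unit is not recognized.
bool to_millimeters(double value, const std::string& unit, double& out_mm) {
    if (unit == "millimeter" || unit == "millimeters") {
        out_mm = value;
        return true;
    }
    if (unit == "centimeter" || unit == "centimeters") {
        out_mm = value * 10.0;
        return true;
    }
    return false;
}

struct MoveCheck {
    bool ok;            // true if the target is within range
    std::string reply;  // what the robot would say back
};

// Refuse any target outside the range and report the limit instead of moving.
MoveCheck check_height(double target_mm) {
    if (target_mm > kMaxHeightMm) {
        return {false, "The upper limit is " +
                           std::to_string(static_cast<int>(kMaxHeightMm)) + " millimeters."};
    }
    if (target_mm < kMinHeightMm) {
        return {false, "The lower limit is " +
                           std::to_string(static_cast<int>(kMinHeightMm)) + " millimeters."};
    }
    return {true, "OK"};
}
```

Converting everything to millimeters first means the range check only has to be written once, in a single unit.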

I have a lot of ambition for Buster.  In the near term I'll be focused on making him smarter, not just in terms of more commands and system queries, but also adding general conversational and question answering functionality.  I'd also like to integrate some basic vision functions and perhaps other sensors and even other appendages.  To that end, modularity has been a primary focus in Buster's design.  I'm going on the assumption that over time virtually every element of hardware and software will be upgraded, replaced, enhanced or added to.

For now, I wanted to share my progress here and show this short video of Buster 0.1 in action:

Some facts/details about the current Buster build:

Buster's brain is a Raspberry Pi 2 running the standard Raspbian operating system.

I'm programming in C++ using the GNU GCC compiler.

Speech recognition is handled using the open source PocketSphinx library.  (I have also written a walkthrough on installing PocketSphinx and a boilerplate code example.)

For speech synthesis I'm using the open source Flite library.

PocketSphinx and Flite are Carnegie Mellon University Language Technologies Institute projects.  They are both offered as lightweight implementations of more comprehensive tools (Sphinx and Festival respectively) which made them appropriate choices for Buster.  Running realtime speech recognition and speech synthesis is pushing the Pi somewhat to its limits, so lightweight is the order of the day.

Buster's secret sauce is the command and query parser, a set of C++ routines that rely heavily on regular expression pattern matching.  PocketSphinx outputs a string of words with no notion of their meaning.  The parser examines this output for known structures, considering both keywords and word order, and then decomposes the string of words into a structured command or query.  Also, PocketSphinx returns spoken numbers as text (e.g. “TEN”), which the parser converts to a numeric value.

The parser accommodates a lot of variation in terms of sentence structure and synonyms.  At the same time it is fairly restrictive, in that it doesn't make too many educated guesses.  If too much information is missing or not understood, Buster will simply say “I did not understand what you said” and take no action.
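
To give a flavor of this approach, here is a minimal sketch using std::regex.  It is not Buster's actual parser: the pattern, the Command struct, the function names, and the small number-word table are all assumptions for illustration, and it assumes the recognizer delivers upper-case words.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical structured form of a parsed command.
struct Command {
    bool understood;
    std::string action;  // "raise" or "lower"
    int amount;          // numeric value of the spoken number
    std::string unit;    // unit word as spoken, e.g. "MILLIMETERS"
};

// Map a spoken number word to its value; returns -1 if the word is unknown.
// Only a handful of words are listed to keep the sketch short.
int word_to_number(const std::string& word) {
    static const std::map<std::string, int> numbers = {
        {"ONE", 1}, {"TWO", 2}, {"THREE", 3}, {"FOUR", 4}, {"FIVE", 5},
        {"SIX", 6}, {"SEVEN", 7}, {"EIGHT", 8}, {"NINE", 9}, {"TEN", 10},
        {"TWENTY", 20}};
    std::map<std::string, int>::const_iterator it = numbers.find(word);
    return it == numbers.end() ? -1 : it->second;
}

// Match a verb, an optional "THE", "ARM", an optional direction,
// a number word, and a unit. Anything else is rejected outright.
Command parse_command(const std::string& text) {
    static const std::regex pattern(
        R"((RAISE|LIFT|LOWER|MOVE)\s+(?:THE\s+)?ARM(?:\s+(UP|DOWN))?\s+(\w+)\s+(MILLIMETERS?|CENTIMETERS?))");
    const Command not_understood = {false, "", 0, ""};

    std::smatch m;
    if (!std::regex_match(text, m, pattern)) return not_understood;

    const std::string verb = m[1].str();
    const std::string direction = m[2].str();  // empty when no direction was spoken

    std::string action;
    if (verb == "MOVE") {
        // "MOVE" alone is ambiguous, so a direction is required.
        if (direction.empty()) return not_understood;
        action = (direction == "UP") ? "raise" : "lower";
    } else {
        action = (verb == "LOWER") ? "lower" : "raise";
        // Reject contradictions such as "RAISE THE ARM DOWN ..." rather than guess.
        if ((action == "raise" && direction == "DOWN") ||
            (action == "lower" && direction == "UP")) {
            return not_understood;
        }
    }

    const int amount = word_to_number(m[3].str());
    if (amount < 0) return not_understood;

    const Command result = {true, action, amount, m[4].str()};
    return result;
}
```

With this sketch, "MOVE THE ARM UP TEN MILLIMETERS" and "LIFT ARM TEN MILLIMETERS" both reduce to the same raise-by-10 command, while "MOVE THE ARM TEN MILLIMETERS" is rejected because the direction is missing.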

Buster's arm is a MeArm that I assembled from a kit.  At around US$50, I think the MeArm is a pretty good value.  Since it uses standard 9-gram hobby servos, it was familiar and easy to interface with the electronics.  MeArm has some obvious limitations and a well-known issue with strain on the servo that rotates the base.  But I've successfully picked up small objects and moved them around, so for the moment I'm happy enough.

To drive the servos I'm using a bare ATmega328P IC programmed as an Arduino.  The Arduino and the Pi are coupled using SPI.

The microphone is the mic from a Logitech USB webcam.  I've been using the webcam in other projects, and since Buster will hopefully have vision soon it made sense to use it now.

The speaker is a cheap amplified unit I picked up from DX for a couple of bucks.  Overall right now speech quality is fairly poor, but this can't all be blamed on the speaker.  The actual synthesis out of Flite and the Pi's PCM audio output are both culpable as well.
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: erco on January 09, 2016, 04:10:49 PM
Ralph: LIKE. No, LOVE.

That is seriously AMAZING! Well done, great execution. Neatest thing I've seen in a long time. What great potential. I'm a fan!

Write it up and submit to SERVO!
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: cevinius on January 10, 2016, 02:18:00 AM
Fantastic work!!! Thank you for posting this.
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: mogul on January 10, 2016, 07:30:31 AM
Now this is really cool.

How often does it get things wrong? Is it tuned to your voice? Or did you learn to speak in a way that pleases the program?

Also, I notice a delay between your command and the machine starting to execute. Is the Raspberry Pi the bottleneck here?

Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: Ralph on January 10, 2016, 11:57:12 AM
First – thanks to all for the enthusiastic response.  I sincerely appreciate the interest and encouragement!

@mogul  Great questions.  Thanks for asking.
With PocketSphinx, speed, accuracy, and the size of the domain are all interrelated.  You can expect better performance with a smaller vocabulary and less complex potential inputs.

Even though I allowed for a lot of natural language variation for the potential inputs, Buster's current domain is still very small.  As I add functionality and the domain grows, my challenge will be to continue optimizing so that both speed and accuracy continue to be acceptable. 
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: MEgg on January 10, 2016, 12:15:59 PM
Great stuff!

Since you are using the Raspberry Pi 2, did you check whether PocketSphinx uses all 4 CPU cores or, if it uses only one, whether you can dedicate a core to it?
Perhaps "taskset" would help?
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: Ralph on January 10, 2016, 09:20:47 PM
The PocketSphinx decoder runs single threaded.  For what it's worth, it is said to be thread safe, and I've come across references to people running multiple instances of the decoder on a multicore system using multiple threads.

The entire Buster application is running in a single thread on a single core.  The flow moves sequentially in an endless loop of listen > decode > parse > execute.  The problem with the current design is that while listening, Buster can't do anything but wait patiently for the next command.  Eventually, I'd like to run the PocketSphinx decoder in its own thread and get Buster multitasking.  I optimistically plan to handle the threading within the C++ code itself.
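
One common way to structure that split is a producer/consumer queue built on std::thread.  The sketch below is only an assumption about how it might look, not a working Buster build: UtteranceQueue and run_pipeline are hypothetical names, and the decoder thread just replays canned strings where the real listen and decode calls would go.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Thread-safe queue that hands decoded utterances from the decoder thread
// to the main loop.
class UtteranceQueue {
public:
    void push(std::string text) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(text));
        }
        ready_.notify_one();
    }

    // Signal that no more utterances will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Block until an utterance is available or the queue is closed.
    // Returns false once the queue is closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> items_;
    bool closed_ = false;
};

// The decoder thread stands in for listen > decode; the calling thread does
// parse > execute. Here "execute" just records what was received.
std::vector<std::string> run_pipeline(const std::vector<std::string>& fake_utterances) {
    UtteranceQueue queue;

    std::thread decoder([&queue, &fake_utterances] {
        for (const std::string& utterance : fake_utterances) {
            queue.push(utterance);  // a real build would push decoder output here
        }
        queue.close();
    });

    std::vector<std::string> handled;
    std::string text;
    while (queue.pop(text)) {
        handled.push_back(text);  // parse and execute would happen here
    }

    decoder.join();
    return handled;
}
```

On the Pi this would typically be built with g++ -std=c++11 -pthread.  Because the main loop only blocks inside pop, it could be extended to wait with a timeout and do other work between commands.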
Title: Re: Buster – A Voice Controlled Robot Arm (WIP)
Post by: craighissett on January 11, 2016, 06:47:58 AM
This is simply tremendous.
I have always wanted to use PocketSphinx and hope to do so in a desktop assistant project coming up soon.
This is bloody brilliant!