This is a slightly tongue-in-cheek project motivated by my desire to interact with a voice assistant as if it were my own private secretary. A rotary phone is perfect for creating this illusion.
Setup
I started by getting the Alexa Voice Service (AVS) running on a Raspberry Pi. This was relatively straightforward, as Amazon provides a Pi-compatible SDK with a sample project and a detailed guide.
At a high level, it just involves registering a product and setting up a security profile with Amazon via the AVS dashboard, building the sample project, and then authenticating it with Amazon. At this point, I could interact with Alexa on the Pi.
The Rotary Phone
There is a surprisingly large number of modern reproduction rotary phones available. The one I chose is a GPO (General Post Office) 746 which, as I understand it, was a very common phone in the UK 30-plus years ago.
After taking the phone apart, I could see, on the base of the phone, there were four wires from the handset, two for the microphone and two for the speaker.
I took these wires and screwed them into two 3.5mm mono adapters. I then plugged the speaker into the audio jack on the Pi and the microphone into a USB audio adapter. After this step, I was able to interact with Alexa via the handset.
The top section of the phone houses the switchhook. This is an important component of the project, as I was keen to remove the need for a wake-word and trigger Alexa just by lifting the handset.
I connected the wires from the switch to the Pi's GPIO pins via a breadboard, using the circuit below to avoid floating inputs. More information regarding connecting a switch to a Raspberry Pi can be found here.
I wanted Alexa to be muted by default when the handset is down, and then to unmute and trigger the 'tap to talk' feature when the handset is lifted off the switchhook. My initial plan was to modify the sample app that comes with the SDK. It's a C++ app that should work with the WiringPi library but, unfortunately, I had issues building the app with my additions. Thankfully, I realised there was a simple alternative approach.
The sample app is started with a bash script and then gives you a list of options for interacting with Alexa. The options I wanted to invoke are listed there as 'Privacy mode' and 'Tap to talk':
+----------------------------------------------------------------------------+
| Options: |
| Wake word: |
| Simply say Alexa and begin your query. |
| Tap to talk: |
| Press 't' and Enter followed by your query (no need for the 'Alexa').|
| Hold to talk: |
| Press 'h' followed by Enter to simulate holding a button. |
| Then say your query (no need for the 'Alexa'). |
| Press 'h' followed by Enter to simulate releasing a button. |
| Stop an interaction: |
| Press 's' and Enter to stop an ongoing interaction. |
| Privacy mode (microphone off): |
| Press 'm' and Enter to turn on and off the microphone. |
| Playback Controls: |
| Press '1' for a 'PLAY' button press. |
| Press '2' for a 'PAUSE' button press. |
| Press '3' for a 'NEXT' button press. |
| Press '4' for a 'PREVIOUS' button press. |
| Settings: |
| Press 'c' followed by Enter at any time to see the settings screen. |
| Speaker Control: |
| Press 'p' followed by Enter at any time to adjust speaker settings. |
| Firmware Version: |
| Press 'f' followed by Enter at any time to report a different |
| firmware version. |
| Info: |
| Press 'i' followed by Enter at any time to see the help screen. |
| Reset device: |
| Press 'k' followed by Enter at any time to reset your device. This |
| will erase any data stored in the device and you will have to |
| re-register your device. |
| This option will also exit the application. |
| Reauthorize device: |
| Press 'z' followed by Enter at any time to re-authorize your device. |
| This will erase any data stored in the device and initiate |
| re-authorization. |
| |
| Quit: |
| Press 'q' followed by Enter at any time to quit the application. |
+----------------------------------------------------------------------------+
To invoke these commands programmatically, I wrote a Node script that spawns the sample app as a child process and injects the required commands via stdin. It uses the pigpio library to watch the relevant GPIO pin for changes from the switch, and each change triggers the appropriate command.
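The approach can be sketched roughly as follows. This is not the original script, just a minimal illustration under several assumptions: the sample app is launched by a script at ./startsample.sh, the switchhook is wired to BCM pin 17, the pin reads high when the handset is lifted, and the app starts with the microphone on. Those values, along with the names createHandsetController, SWITCH_PIN, START_SCRIPT and the ROTARY_ALEXA_RUN environment variable, are placeholders for illustration. The key presses ('m' and 't') come from the menu above, and the GPIO calls are from the pigpio npm package.

```javascript
const { spawn } = require("node:child_process");

// Placeholder values (assumptions): adjust to your own wiring and SDK install.
const SWITCH_PIN = 17;                   // BCM pin the switchhook is wired to
const START_SCRIPT = "./startsample.sh"; // script that launches the sample app

// Keys taken from the sample app's options menu.
const KEY_TOGGLE_MIC = "m";  // 'Privacy mode' toggles the microphone
const KEY_TAP_TO_TALK = "t"; // 'Tap to talk'

// Wraps any object with a write() method (such as a child process's stdin)
// and tracks the mute state, because 'm' is a toggle rather than a setter.
function createHandsetController(stdin) {
  let muted = false; // assumption: the sample app starts with the mic on

  const send = (key) => stdin.write(key + "\n"); // each command needs Enter

  const setMuted = (wanted) => {
    if (muted !== wanted) {
      send(KEY_TOGGLE_MIC);
      muted = wanted;
    }
  };

  return {
    init: () => setMuted(true),   // handset starts on the hook: mute
    lift: () => {                 // handset lifted: unmute, then listen
      setMuted(false);
      send(KEY_TAP_TO_TALK);
    },
    hangUp: () => setMuted(true), // handset replaced: mute again
  };
}

// Hardware wiring. Only meaningful on a Pi with pigpio installed.
function main() {
  const { Gpio } = require("pigpio"); // loaded lazily so the logic above is testable anywhere

  // Pipe our stdin into the app; let its output go straight to the terminal.
  const app = spawn("bash", [START_SCRIPT], {
    stdio: ["pipe", "inherit", "inherit"],
  });
  const controller = createHandsetController(app.stdin);
  controller.init();

  // No internal pull resistor is set here because the breadboard circuit
  // already keeps the input from floating.
  const hook = new Gpio(SWITCH_PIN, { mode: Gpio.INPUT, alert: true });
  hook.glitchFilter(10000); // ignore level changes shorter than 10 ms (switch bounce)
  hook.on("alert", (level) => {
    if (level === 1) controller.lift(); // assumption: high means off the hook
    else controller.hangUp();
  });
}

// Hypothetical opt-in flag so the file can be loaded without touching hardware.
if (process.env.ROTARY_ALEXA_RUN === "1") main();
```

Keeping the mute bookkeeping separate from the GPIO code means the command sequence can be checked on any machine by passing in a fake stdin, which is useful given that a missed or doubled 'm' would leave the microphone in the wrong state.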
Conclusion
I'm really pleased with how this project turned out. The phone is really fun to use. I keep it on my desk and mostly use it for controlling lights and checking the weather. Not needing a wake-word is great, as I've always found the wake-word to be the most flawed aspect of voice assistants. It's far too easy to trigger one with any word that sounds even vaguely similar.
One small limitation of AVS is that Amazon doesn't allow you to make calls with it unless you're building a commercial device, in which case they whitelist your access token. This is unfortunate, as being able to make calls from a rotary phone would obviously be brilliant.