After discovering that my 25-year-old Onkyo stereo amplifier still works, I wanted to give it a second life. An M5StickC Plus (ESP32-PICO), which I had obtained some time ago, should serve as an internet radio player. It provides built-in WiFi, an LCD, some buttons, and I²S output for digital audio. To achieve sufficiently high audio quality, I decided to use an external digital-to-analog converter (DAC), the PCM5102A, with a line-out port.
Luckily, some similar projects already exist, and there is a great library - the ESP32-audioI2S - which does most of the work. So the project was feasible in a reasonable amount of time. In the course of the project, I learned quite a few things about audio streaming.

Key Features
- Listening to internet radio stations
- Station and song information shown on display
- Switching the radio station on button press
- Audio playback using I²S and a DAC with up to 32-bit resolution
- Mute and soft unmute during station change
- Device and network information shown during startup
I used Visual Studio Code with the PlatformIO IDE for this project. The platformio.ini file of the project declares library dependencies on the M5StickCPlus library and the ESP32-audioI2S library.
platform = espressif32
board = m5stick-c
framework = arduino
upload_speed = 1500000
monitor_speed = 115200
build_type = debug
build_flags = -D CORE_DEBUG_LEVEL=4 ; 'Debug'
monitor_filters = log2file, esp32_exception_decoder, default
Furthermore, I use the build_flags option to enable debug-level log messages, and the monitor_filters option to enable file logging and exception stack trace decoding.
The commented source code is available in the GitHub repository of this project. To build the application, you additionally need to create a WifiCredentials.cpp file in the src folder, with the following content:
const char *WifiCredentials::SSID = "Your WiFi SSID";
const char *WifiCredentials::PASSWORD = "Your WiFi password";
The following figure shows the required connections between the M5StickC (Plus) and the PCM5102 DAC board (digital-to-analog converter). Two wires power the DAC board with 5 V, and three wires carry the 3-wire I²S digital audio connection. In addition, the SCK pin of the DAC board must be connected to GND, which can be done with a solder bridge on the board.
The following figure shows a simplified overview of the processing of the internet radio. The horizontal swimlanes represent its major components: the application program, the audio library ESP32-audioI2S, and the ESP32 itself along with its API (application programming interface), the ESP-IDF.
- The application configures the GPIO pins, establishes the WiFi connection, and invokes the functions of the audio library in order to connect to an internet radio station and to process the incoming stream from that station. Furthermore, the application displays stream metadata on the LCD screen and reacts to button presses.
- The audio library generates the required HTTP requests and processes the HTTP response of the streaming server according to the Icecast protocol. It decompresses the incoming compressed audio data frames and provides the raw audio data (PCM) to the ESP32 API in order to generate I²S audio output through the GPIO pins.
- The ESP32 API provides the required low-level functionality, e.g. for creating operating system tasks, networking (TCP/IP stack, WiFi), and accessing the ESP32 peripherals such as I²S, DMA, and GPIO. Some details on the use of the I²S driver and DMA buffers are described in my Audio Visualization project.
When the radio application calls the function Audio::connecttohost, for example, with the URL http://streams.radiobob.de/bob-national/mp3-192/streams.radiobob.de/ as parameter, the function sends the following HTTP request to the host:
GET /bob-national/mp3-192/streams.radiobob.de/ HTTP/1.0
Authorization: Basic Og==
User-Agent: ESP32 audioI2S
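As a side note, the value Og== in the Authorization line is simply the Base64 encoding of ":", i.e. an empty user name and an empty password. The following is a minimal sketch in plain C++ of how such a request could be assembled from a URL. It reproduces only the three lines quoted above and is not the library's actual code; the names Url, splitHttpUrl, base64Encode, and buildStreamRequest are my own illustrative choices.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Host and path part of a plain "http://" URL (hypothetical helper type).
struct Url {
    std::string host;
    std::string path;
};

// Split "http://host/path" into host and path; the path defaults to "/".
Url splitHttpUrl(const std::string &url) {
    const std::string scheme = "http://";
    const std::string rest =
        url.compare(0, scheme.size(), scheme) == 0 ? url.substr(scheme.size()) : url;
    const std::size_t slash = rest.find('/');
    if (slash == std::string::npos) return {rest, "/"};
    return {rest.substr(0, slash), rest.substr(slash)};
}

// Standard Base64 encoding with '=' padding.
std::string base64Encode(const std::string &in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t remaining = in.size() - i;
        // Pack up to three input bytes into a 24-bit group.
        std::uint32_t chunk = static_cast<unsigned char>(in[i]) << 16;
        if (remaining > 1) chunk |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (remaining > 2) chunk |= static_cast<unsigned char>(in[i + 2]);
        out += table[(chunk >> 18) & 0x3F];
        out += table[(chunk >> 12) & 0x3F];
        out += remaining > 1 ? table[(chunk >> 6) & 0x3F] : '=';
        out += remaining > 2 ? table[chunk & 0x3F] : '=';
    }
    return out;
}

// Assemble the request text; only the header lines quoted in this article
// are generated here.
std::string buildStreamRequest(const Url &url, const std::string &user,
                               const std::string &password) {
    std::string request = "GET " + url.path + " HTTP/1.0\r\n";
    request += "Authorization: Basic " + base64Encode(user + ":" + password) + "\r\n";
    request += "User-Agent: ESP32 audioI2S\r\n";
    request += "\r\n";  // an empty line terminates the header
    return request;
}
```

Calling buildStreamRequest with the split URL from above and two empty strings yields exactly the three request lines shown, followed by the empty line that ends the header.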
The host answers this request, for example, with the following HTTP response which redirects the client to a different location:
HTTP/1.0 302 Found
Cache-Control: no-cache, must-revalidate
Expires: Sat, 05 Feb 2022 11:16:31 GMT
Status: 302 Moved Temporarily
Date: Sat, 05 Feb 2022 11:16:31 GMT
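A client has to recognise such a redirect and extract the new address. In HTTP, the redirect target is carried in a Location header, which is not part of the excerpt above. The following sketch in plain C++ shows one way to read the status code and the Location field from a response header. It is an illustration under these assumptions, not the library's implementation; HttpResponseHead, parseResponseHead, and isRedirect are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// The parts of the response header needed to follow a redirect.
struct HttpResponseHead {
    int statusCode = 0;
    std::string location;  // stays empty if there is no Location header
};

static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

static std::string trim(const std::string &s) {
    const char *whitespace = " \t";
    const std::size_t begin = s.find_first_not_of(whitespace);
    if (begin == std::string::npos) return "";
    const std::size_t end = s.find_last_not_of(whitespace);
    return s.substr(begin, end - begin + 1);
}

// Parse the status line and look for a Location header (case-insensitive).
HttpResponseHead parseResponseHead(const std::string &head) {
    HttpResponseHead result;
    std::istringstream lines(head);
    std::string line;
    bool isStatusLine = true;
    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;  // an empty line ends the header
        if (isStatusLine) {
            // Status line, e.g. "HTTP/1.0 302 Found"
            std::istringstream status(line);
            std::string version;
            status >> version >> result.statusCode;
            isStatusLine = false;
            continue;
        }
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;
        if (toLower(line.substr(0, colon)) == "location") {
            result.location = trim(line.substr(colon + 1));
        }
    }
    return result;
}

// A 3xx status with a target address means: send a new request there.
bool isRedirect(const HttpResponseHead &head) {
    return head.statusCode >= 300 && head.statusCode < 400 && !head.location.empty();
}
```

For a response that starts with "HTTP/1.0 302 Found" and contains a Location line, isRedirect returns true and the location field holds the address for the next request.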
After the client sends another request to that location, it receives the following response header containing information about the radio stream, followed by the actual audio data:
HTTP/1.0 200 OK
icy-description: RADIO BOB! National
Server: bob.hoerradar.de 8.6.1
Expires: Mon, 26 Jul 1997 05:00:00 GMT
icy-name: RADIO BOB! National
Set-Cookie: AISSessionId=60aea84800a186e4_22000699_hXEVrHIJ__0000004vGPw; Path=/; Domain=bob.hoerradar.de; Max-Age=6000; Expires=Sat, 05 Feb 2022 12:56:31 GMT
According to this response, the audio stream contains audio/mpeg (i.e. mp3) data and has a bit rate of 192 kbit/s. Furthermore, metadata (e.g. artist and song title) is inserted into the stream every 16000 bytes. After receiving this header, the client can periodically read audio data from the connection. The following figure shows the arrival of audio data through the connection.
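To make the interleaving of audio data and metadata more tangible, here is a minimal sketch in plain C++. It assumes the usual ICY framing: after each interval of audio bytes (16000 in this case, a value the server typically announces in an icy-metaint header field), one length byte follows whose value times 16 gives the size of a zero-padded text block such as StreamTitle='...';. A length byte of 0 means that nothing has changed. The names IcyDemuxer and extractStreamTitle are my own; the library handles this internally in its own way.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Separates audio bytes from interleaved ICY metadata blocks (illustrative).
class IcyDemuxer {
public:
    explicit IcyDemuxer(std::size_t metaInterval)
        : metaInterval_(metaInterval), audioLeft_(metaInterval) {}

    // Feed received bytes; audio bytes are appended to `audio`. The state is
    // kept between calls, so chunks may end anywhere in the stream.
    void feed(const std::uint8_t *data, std::size_t len, std::vector<std::uint8_t> &audio) {
        std::size_t i = 0;
        while (i < len) {
            if (audioLeft_ > 0) {
                // Inside an audio interval: pass the bytes through.
                const std::size_t n = std::min(audioLeft_, len - i);
                audio.insert(audio.end(), data + i, data + i + n);
                audioLeft_ -= n;
                i += n;
            } else if (!haveLength_) {
                // Length byte: the metadata block is 16 times this value long.
                metaLeft_ = static_cast<std::size_t>(data[i]) * 16;
                ++i;
                haveLength_ = true;
                pending_.clear();
                if (metaLeft_ == 0) finishBlock();
            } else {
                // Inside a metadata block: collect its text.
                const std::size_t n = std::min(metaLeft_, len - i);
                pending_.append(reinterpret_cast<const char *>(data + i), n);
                metaLeft_ -= n;
                i += n;
                if (metaLeft_ == 0) finishBlock();
            }
        }
    }

    // Most recent non-empty metadata text, without the zero padding.
    const std::string &lastMetadata() const { return lastMetadata_; }

private:
    void finishBlock() {
        const std::size_t padding = pending_.find('\0');
        if (padding != std::string::npos) pending_.resize(padding);
        if (!pending_.empty()) lastMetadata_ = pending_;
        haveLength_ = false;
        audioLeft_ = metaInterval_;  // the next audio interval starts
    }

    std::size_t metaInterval_;
    std::size_t audioLeft_;
    std::size_t metaLeft_ = 0;
    bool haveLength_ = false;
    std::string pending_;
    std::string lastMetadata_;
};

// Returns the text between StreamTitle=' and '; or an empty string.
std::string extractStreamTitle(const std::string &metadata) {
    const std::string key = "StreamTitle='";
    const std::size_t start = metadata.find(key);
    if (start == std::string::npos) return "";
    const std::size_t valueStart = start + key.size();
    const std::size_t end = metadata.find("';", valueStart);
    if (end == std::string::npos) return "";
    return metadata.substr(valueStart, end - valueStart);
}
```

For the stream described here, the demuxer would be created with IcyDemuxer demux(16000) and fed with every chunk read from the connection; only the bytes collected in the audio vector would go on to the mp3 decoder.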
From the incoming mp3 audio data stream, the esp32-audioI2S library functions extract mp3 frames. An mp3 frame contains 1152 audio samples. At a sample rate of 44.1 kHz, this corresponds to a duration of approx. 26.1 ms (1152 samples / 44.1 kHz). Each mp3 frame starts with an MPEG audio frame header of 4 bytes length. The first 11 bits of the header are sync bits which are always set to 1. The header contains, among other things, the MPEG version, MPEG layer, bit rate, sampling rate, channel mode, and a padding flag. Decoding the header FF FB B2 44 (see figure below) yields the following information:
- MPEG Version 1
- Layer III
- bit rate 192 kbit/s
- sampling rate 44.1 kHz
- frame is padded with one extra slot
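The decoding of the header bytes can be reproduced with a few bit operations. The following sketch in plain C++ handles only MPEG Version 1, Layer III headers, using the bit rate and sampling rate tables of that combination, and rejects everything else. Mp3FrameHeader and decodeMpeg1Layer3Header are illustrative names, not functions of the audio library.

```cpp
#include <cstdint>

// Decoded fields of an MPEG audio frame header (illustrative subset).
struct Mp3FrameHeader {
    bool valid = false;
    int mpegVersion = 0;   // 1 = MPEG Version 1
    int layer = 0;         // 3 = Layer III
    int bitrateKbps = 0;   // bit rate in kbit/s
    int sampleRateHz = 0;  // sampling rate in Hz
    bool padded = false;   // true if the frame has one extra slot
    int channelMode = 0;   // 0 stereo, 1 joint stereo, 2 dual channel, 3 mono
};

// Decode a 4-byte header; only MPEG Version 1, Layer III is supported here.
Mp3FrameHeader decodeMpeg1Layer3Header(const std::uint8_t h[4]) {
    // Lookup tables for MPEG Version 1, Layer III (index 0 = free format
    // and index 15 = invalid are both treated as unsupported).
    static const int kBitratesKbps[16] = {0,   32,  40,  48,  56,  64,  80,  96,
                                          112, 128, 160, 192, 224, 256, 320, 0};
    static const int kSampleRatesHz[4] = {44100, 48000, 32000, 0};

    Mp3FrameHeader out;

    // 11 sync bits: all 8 bits of byte 0 plus the top 3 bits of byte 1.
    if (h[0] != 0xFF || (h[1] & 0xE0) != 0xE0) return out;

    const int versionBits = (h[1] >> 3) & 0x03;  // binary 11 = MPEG Version 1
    const int layerBits = (h[1] >> 1) & 0x03;    // binary 01 = Layer III
    if (versionBits != 3 || layerBits != 1) return out;

    const int bitrateIndex = (h[2] >> 4) & 0x0F;
    const int sampleRateIndex = (h[2] >> 2) & 0x03;
    if (bitrateIndex == 0 || bitrateIndex == 15 || sampleRateIndex == 3) return out;

    out.valid = true;
    out.mpegVersion = 1;
    out.layer = 3;
    out.bitrateKbps = kBitratesKbps[bitrateIndex];
    out.sampleRateHz = kSampleRatesHz[sampleRateIndex];
    out.padded = ((h[2] >> 1) & 0x01) != 0;
    out.channelMode = (h[3] >> 6) & 0x03;
    return out;
}
```

Applied to the bytes FF FB B2 44, the function reports MPEG Version 1, Layer III, 192 kbit/s, 44100 Hz, and a set padding flag, which matches the list above.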
The byte size of the mp3 frame can be computed from the bit rate, sample rate, and padding flag. With a bit rate of 192 kbit/s, a sampling rate of 44.1 kHz, and a padding slot included, the frame has a size of 627 bytes including the frame header: floor(144 * 192/44.1) + 1 = 626 + 1. Some examples of mp3 frames from a data packet received via HTTP are shown in the following figure.
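The same arithmetic as a small sketch in plain C++, valid for MPEG Version 1, Layer III frames with 1152 samples each. The function names are my own. Integer division takes the place of floor().

```cpp
// Frame size in bytes: floor(144 * bit rate / sampling rate) + padding slot.
// The bit rate is given in kbit/s and converted to bit/s first.
int mp3FrameSizeBytes(int bitrateKbps, int sampleRateHz, bool padded) {
    return (144 * bitrateKbps * 1000) / sampleRateHz + (padded ? 1 : 0);
}

// Playback time of one frame in milliseconds (1152 samples per frame).
double mp3FrameDurationMs(int sampleRateHz) {
    return 1152.0 * 1000.0 / sampleRateHz;
}
```

For 192 kbit/s and 44100 Hz, mp3FrameSizeBytes returns 627 with the padding slot and 626 without it, and mp3FrameDurationMs(44100) gives roughly 26.1 ms.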
Decompressing such an mp3 frame into PCM audio data, i.e. raw audio samples, results in 2 channels * 2 bytes/sample * 1152 samples = 4608 bytes which are passed on to the I²S driver. The I²S driver takes care of sending the audio samples to the I²S controller via DMA transfer. The I²S controller eventually provides a (digital) I²S audio signal at the GPIO pins of the ESP32.

Digital-to-Analog Conversion
The I²S DAC board connected to the M5StickC GPIO pins converts the digital I²S audio data into an analog signal which can then run through a stereo amplifier, in my case a somewhat older 2 x 110 W Onkyo amplifier.
In Part II of this project, the audio device gets a Bluetooth receiver mode in addition to the internet radio mode.
In Part III, the artist and title of the current radio song are sent to an IFTTT webhook in order to store them in a list of favourite songs.