
Published March 20, 2020 © GPL3+

Spark Sound Detection From Raw Audio Data

We solve a spark sound recognition task from raw audio data on a SparkFun RedBoard Artemis ATP board. Spark detection on a SparkFun!

Advanced. Full instructions provided. 20 hours.


Things used in this project

Hardware components

SparkFun RedBoard Artemis ATP

We recorded and classified spark sound with the Artemis board. A neural network was deployed that processes the raw audio signal in real-time.

Arduino Due

We used the Arduino Due to control electric spark generation.

SparkFun Micro OLED Breakout (Qwiic)

We employed the OLED screen to display classification results.

Software apps and online services

Microsoft Windows 10

We built the project on Windows 10.

TensorFlow

The neural network was developed in TensorFlow+Keras. Later, TensorFlow Lite Micro was used to deploy the trained model on the RedBoard Artemis ATP. Also, advanced methods were implemented in Keras to enhance model accuracy and robustness.

Ambiq SDK

The data acquisition was implemented by using the Ambiq SDK.

Story

Spark sound detection from raw audio data

Abstract:

The project's aim was to solve a pattern recognition task on a raw time-domain signal. We employed the SparkFun RedBoard Artemis ATP module and its integrated MEMS microphone to record and classify environmental sounds. In this project summary, we present a simple pipeline that beginners can use to train and deploy a simple Neural Network (NN), as well as advanced methods that can be used to improve model performance. We hope that the experience collected during this challenge can be utilized in our gunshot detector designed for elephants.

Introduction:

Neural networks used for sound classification usually interpret their inputs as images. This is done by calculating the 2D spectrogram of the raw audio recording. However, there are situations when the spectrogram conversion leads to the loss of relevant information. One example is gunshot detection, where the ballistic shockwave sound has such a unique shape (it resembles a capital letter N) that detectors based on this raw signal shape work more accurately than spectrogram-based solutions. Our idea comes from this scenario, because many other events may also have a characteristic shape in the time domain.

In our case, these uniquely shaped audio signals were generated by sparks. A spark is an abrupt electrical discharge that produces a brief emission of light and a sharp crack or snapping sound. This sound contains very high frequencies and is short (around 4 ms). The spark acoustic event can be recorded by the MEMS microphone integrated on the RedBoard Artemis ATP; an example recording is illustrated in Figure 1.

Figure 1.: Example spark sound recorded by the MEMS microphone

The recorded spark noises don't have exactly the same shape, but all of them contain several spikes with similar lengths. An NN must learn these similarities to perform the detection task.

Goals, experimental setup, and data collection:

In summary, we built a classifier that can detect spark noises. To achieve this goal, we collected spark sounds with different impulsive background noises by employing a loudspeaker, a spark generator, and the RedBoard Artemis as the data collector. The background noises help the detector to generalize and also make the detection task harder.

The background noises used were: car horn, spoken digits, dog barking, Gaussian noise, gunshots, jackhammer, various music, siren, silence.

The basic pipeline was the following:

Record sparks with different background noises produced by a speaker
Record only background noises as negative samples
Collect these recordings into a dataset with binary labels – 0: no spark; 1: contains a spark
Train a simple model and deploy it on the SparkFun RedBoard Artemis ATP
Train various models with advanced methods involved
Evaluate models

The data collection setup used the RedBoard Artemis as the recording device. An extra device, an Arduino Due, controlled a relay that let a high current through to produce sparks with a DC-DC booster. The whole procedure was synchronized by a PC, which also played back various background noises from a loudspeaker. The setup is illustrated in Figure 2. The RedBoard Artemis ATP recorded the superposition of the background noise and the spark sound. One such combined recording is illustrated in Figure 3, where car_horn noise was produced during the measurement. The impulsive region in the middle of the recording corresponds to the spark sound. Within the recordings, the locations of the sparks vary to prevent over-fitting to a specific location.

Figure 2. Data collection setup.

Figure 3.: Example recording that contains a spark with car horn background noise.

The resulting dataset contained:

Dataset

From each class, 100 samples were added to the training set, 30 samples to the test set, and 20 samples to the validation set. The remaining samples were left out for the possible future work directions mentioned later.

The model is first fit on the training dataset, a set of examples used to fit the parameters of the model.

Next, the fitted model is used to predict the responses for the observations in a second dataset, called the validation dataset. The validation dataset provides an unbiased evaluation of a model fit on the training dataset while the model's hyper-parameters are tuned.

Finally, the test dataset is used to provide an unbiased evaluation of the final model fit on the training dataset.
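The per-class split described above can be sketched as follows. This is a minimal illustration, not the project's dataloader: the function name, the fixed seed, and the placeholder clip IDs are assumptions made for the example.

```python
import random

def split_class(samples, n_train=100, n_val=20, n_test=30, seed=0):
    """Shuffle one class and cut it into train/val/test plus a held-back remainder."""
    needed = n_train + n_val + n_test
    if len(samples) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(samples)}")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # deterministic shuffle for repeatability
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:needed]
    leftover = shuffled[needed:]            # kept aside for future experiments
    return train, val, test, leftover

# Example with 170 placeholder recording IDs for a single class (made-up count).
recording_ids = [f"clip_{i:03d}" for i in range(170)]
train, val, test, leftover = split_class(recording_ids)
print(len(train), len(val), len(test), len(leftover))
```

Splitting each class separately keeps the three sets balanced across the background-noise classes.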

In the following, we will describe a simple pipeline for beginners that will result in a deployable model. Later, more advanced techniques will be presented that help to improve accuracy and robustness.

Simple pipeline:

In this section, we cover the major steps that are required to train and deploy a baseline neural network architecture. The methods and results are limited, but this could serve as a good starting point for further improvements and a stable initial project that can be upgraded.

The major steps, which will be presented in detail with example code, are:

Data collection on the RedBoard Artemis ATP with the MEMS microphone
Baseline model training and conversion to a quantized model.
Model deployment and inference.

[Data collection details] -> [Training on a GPU] -> [Model deployment] -> [Inference]

Let's start with the data collection! In the previous section, we already presented the measurement setup. The interesting code related to the measurements is the audio recording on the RedBoard Artemis ATP through the PDM interface. You will need the AmbiqSuite-R2.3.2 SDK with the SparkFun board BSP files included. You can get the SparkFun extension from here:

https://github.com/sparkfun/SparkFun_Apollo3_AmbiqSuite_BSPs

Just move the boards_sfe folder next to the AmbiqSuite-R2.3.2/boards folder.

We started from the AmbiqSuite-R2.3.2/boards_sfe/common/examples/pdm_fft example code. This code collects audio and computes the recorded signal's Fourier Transform continuously. It also processes this information, then transmits it to the user through the serial port. Our code will implement the following:

PDM interface initialization with a reduced sampling frequency (11718 Hz)
UART initialization
Waiting for a command from the user: 'r' represents the recording command
If 'r' was received, start recording 12000 samples (around 1 second)
Once finished, send the data to the user at 1 MB/sec speed.
Waiting for another command, and so on...
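The reduced sampling frequency and the "around 1 second" recording length follow from the PDM settings in the firmware listing at the end of this write-up (1.5 MHz PDM clock, clock divider 1, decimation rate 64). The short sketch below only reproduces that arithmetic on the PC side; the constant names are chosen for illustration.

```python
def effective_sample_rate(pdm_clk_hz, mclk_div, decimation):
    # Same integer formula as pdm_config_print() in the firmware listing.
    return pdm_clk_hz // (mclk_div * 2 * decimation)

PDM_CLK_HZ = 1_500_000    # AM_HAL_PDM_CLK_1_5MHZ
MCLK_DIV = 1              # AM_HAL_PDM_MCLKDIV_1
DECIMATION = 64           # ui32DecimationRate
N_SAMPLES = 12000         # PDM_DATA_SIZE
BYTES_PER_SAMPLE = 2      # samples are stored as int16_t

fs = effective_sample_rate(PDM_CLK_HZ, MCLK_DIV, DECIMATION)
duration_s = N_SAMPLES / fs                    # length of one recording
payload_bytes = N_SAMPLES * BYTES_PER_SAMPLE   # bytes sent over the UART per recording

print(fs, round(duration_s, 3), payload_bytes)
```

With these settings one recording is slightly longer than one second and produces 24000 bytes to transfer.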

We included the source code with a bunch of comments, so interested readers can go through and understand the details. The interesting part of the code implements the functionality explained above:

/*waiting for a trigger from the PC*/
while(am_bsp_com_uart_transfer(&transfer_config) != AM_HAL_STATUS_SUCCESS)
    ;
/*if the received character is an 'r', start a ~1 second long recording*/
if(readBuffer[0] == 'r') {
    am_devices_led_on(am_bsp_psLEDs, 0);
    g_bPDMDataReady = false;
    am_hal_pdm_fifo_flush(PDMHandle);
    /*start data collection by utilizing the DMA*/
    pdm_data_get();
    /*go to sleep*/
    am_hal_sysctrl_sleep(AM_HAL_SYSCTRL_SLEEP_DEEP);
    /*wake-up trigger from the DMA*/
    while(!g_bPDMDataReady)
        ;
    am_devices_led_off(am_bsp_psLEDs, 0);
    /*send the data through the UART*/
    am_bsp_com_uart_transfer(&transfer_config_writebuffer);
}

With this application, we could perform data collection triggered by the PC. We also implemented the PC-side data collector in Python, which utilized the PySerial module to transmit commands and to receive the recorded data as a response.

The Python program also sent commands to an Arduino Due board, which controlled a relay to generate sparks. Before sending the 'recording' and 'spark generation' commands, the PC program started playing a randomly chosen audio file on a loudspeaker. These clips served as the background noises explained earlier.
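The PC side of the 'r' command exchange might look like the sketch below. It is not the project's collector script: `record_clip` and `FakeBoard` are hypothetical names, the little-endian sample order is an assumption, and the fake port only exists so the example runs without hardware.

```python
import struct

N_SAMPLES = 12000        # samples per recording, as in the firmware
BYTES_PER_SAMPLE = 2     # int16 samples

def record_clip(port, n_samples=N_SAMPLES):
    """Send 'r' and read back one recording as a list of signed 16-bit ints.

    `port` is any object with pyserial-style write(bytes) and read(n) methods.
    """
    expected = n_samples * BYTES_PER_SAMPLE
    port.write(b"r")
    raw = bytearray()
    while len(raw) < expected:
        chunk = port.read(expected - len(raw))
        if not chunk:    # the read timed out without delivering data
            raise TimeoutError(f"got {len(raw)} of {expected} bytes")
        raw.extend(chunk)
    # Assumption: samples arrive little-endian, as stored in the MCU's memory.
    return list(struct.unpack(f"<{n_samples}h", bytes(raw)))

class FakeBoard:
    """Hypothetical stand-in for the serial port so the sketch runs offline."""
    def __init__(self, samples):
        self._payload = struct.pack(f"<{len(samples)}h", *samples)
        self._pos = 0
        self.commands = []

    def write(self, data):
        self.commands.append(data)
        return len(data)

    def read(self, n):
        n = min(n, 4)    # deliver small chunks to exercise the read loop
        chunk = self._payload[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk

board = FakeBoard([0, 1, -1, 32767, -32768])
clip = record_clip(board, n_samples=5)
print(clip)
```

With real hardware, `port` would be a `serial.Serial(...)` object from PySerial opened with a read timeout; the port name and baud rate depend on the setup and are not shown here.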

A video recording about the data collection experiment can be accessed here:

Video 1.: Data collection - spark generation

[Data collection details] -> [Training on a GPU] -> [Model deployment] -> [Inference]

Once we had collected enough data, we could implement our first, simple training process, which produced a baseline NN model.

We have added the source code of the training process to this project, but it is also available as a Python notebook here: Training notebook. Google Colaboratory is a great place for beginners to test out their ideas in a controlled environment with freely available GPUs.

This notebook contains the main steps of model training, which include:

Data loading: positive and negative examples
Data separation into train, validation and test sets: 100 + 20 + 30 samples
Neural network model creation: simple convolutional neural network
Model fitting - training: default training parameters
Model evaluation: evaluated on the test dataset
Model conversion to TensorFlow Lite model.
Model conversion to a byte array, which can be uploaded to the Artemis Board.
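The last step, turning the converted model into a byte array that can be compiled into the firmware, can be sketched in plain Python. In a TensorFlow environment the bytes would typically come from `tf.lite.TFLiteConverter.from_keras_model(model).convert()`; here a few placeholder bytes are used instead so the example runs anywhere. The array name `g_model_data` is hypothetical and would need to match whatever name the firmware expects.

```python
def bytes_to_c_array(data, name="g_model_data", per_line=12):
    """Render `data` as C source defining an unsigned char array and its length."""
    lines = [f"const unsigned char {name}[] = {{"]
    for start in range(0, len(data), per_line):
        row = ", ".join(f"0x{b:02x}" for b in data[start:start + per_line])
        lines.append(f"    {row},")
    lines.append("};")
    lines.append(f"const int {name}_len = {len(data)};")
    return "\n".join(lines)

# Placeholder bytes; in practice this would be the contents of the .tflite model.
demo = bytes([0x1C, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4C, 0x33])
c_source = bytes_to_c_array(demo, per_line=4)
print(c_source)
```

The output is equivalent in spirit to what `xxd -i` produces for a binary file.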

The model used in this example consists of a convolutional layer with 2 kernels and a maximum pooling layer that covers the whole feature vectors produced by the convolutions. Because of its simplicity, this model could hardly generalize, but it achieved an accuracy of around 94% on the test dataset, which is in the acceptable range. The architecture is illustrated in Figure 4. (Note: Conv2D is used because TF Lite Micro only supports this operation; otherwise Conv1D would be the natural choice.)
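To get a feeling for how small such a network is, the trainable parameters can be tallied by hand. The text only fixes the number of kernels (2); the kernel length of 33 and the single-unit dense output used below are assumptions chosen for illustration. They happen to add up to the 71 parameters quoted for the baseline model, but the actual architecture in Figure 4 may differ.

```python
def conv_params(n_kernels, kernel_len, in_channels=1, bias=True):
    """Weights (+ optional bias) of a convolutional layer."""
    return n_kernels * (kernel_len * in_channels + (1 if bias else 0))

def dense_params(n_inputs, n_outputs, bias=True):
    """Weights (+ optional bias) of a fully connected layer."""
    return n_outputs * (n_inputs + (1 if bias else 0))

N_KERNELS = 2      # stated in the text
KERNEL_LEN = 33    # ASSUMPTION: not stated in the text
N_OUTPUTS = 1      # ASSUMPTION: one sigmoid output for the binary label

conv = conv_params(N_KERNELS, KERNEL_LEN)
dense = dense_params(N_KERNELS, N_OUTPUTS)   # pooling leaves one value per kernel
total = conv + dense                         # the pooling layer has no parameters
print(conv, dense, total)
```

Note that the parameter count does not depend on the input length, because the pooling layer collapses each feature vector to a single value.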

Figure 4.: Structure of the proposed neural network that performs spark sound detection from raw audio input

The trained model had pros and cons:

+ small size, only 71 trainable parameters

+ can handle the 12000 samples long input

+ the Artemis board can run it within 1 sec

- very sensitive to noise, cannot generalize

[Data collection details] -> [Training on a GPU] -> [Model deployment] -> [Inference]

The deployment starts from the byte array, which was generated in the previous section during training. To implement an application that continuously records and processes audio signals, we started from the micro_speech example application found in TensorFlow Lite Micro examples. This example code collects audio signals and tries to detect and classify 'yes' and 'no' spoken keywords. We changed the following parts of the code:

Data acquisition: 12000-sample buffers were filled at a sampling rate of approximately 12 kHz
The model structure: the model trained earlier was used
Detection response: UART communication instead of LED blinking

The TensorFlow library does not support the Artemis boards, but porting has already started in a separate repository, in which, for example, the micro_speech example can be compiled and uploaded.

Note that during the first build the makefile triggers the download of an outdated version of the Ambiq SDK, which contains errors that were fixed in newer versions. One such error is related to the PDM clock configuration: because of it, the sampling frequency of the audio recording cannot be changed through the corresponding am_hal_ interface. A possible solution is to modify the base makefile so that it downloads the newer version of the SDK.

To deploy our model, we only substituted the array found in micro_features/tiny_conv_micro_features_model_data.cc with the byte array that was generated earlier during the model training, so the manipulation of the Makefile was not required.

The data acquisition part was similar to the method introduced earlier. When the DMA finishes collecting a 1-second-long part, it generates an interrupt and starts recording a new 1-second-long period into another buffer. Meanwhile, the already filled buffer is processed by invoking the trained neural network, and the result is forwarded to the PC through the UART.
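The control flow of this two-buffer scheme can be illustrated with a small host-side simulation. This is only a sketch of the idea, not the firmware: the class and function names are hypothetical, the DMA is simulated by a plain copy, and the toy peak detector stands in for the neural network.

```python
class PingPongRecorder:
    """Two buffers: one is being filled while the other one is processed."""
    def __init__(self, buffer_len):
        self.buffers = [[0] * buffer_len, [0] * buffer_len]
        self.dma_index = 0    # index of the buffer the (simulated) DMA is filling

    def on_dma_complete(self):
        """Called from the (simulated) DMA-complete interrupt.

        Returns the buffer that has just been filled; recording continues
        in the other buffer.
        """
        filled = self.buffers[self.dma_index]
        self.dma_index ^= 1
        return filled

def run(recorder, blocks, classify):
    results = []
    for block in blocks:
        # Simulate the DMA writing one block of samples into the active buffer.
        recorder.buffers[recorder.dma_index][:] = block
        ready = recorder.on_dma_complete()
        results.append(classify(ready))   # inference runs on the filled buffer
    return results

def peak_detector(buf):
    # Toy classifier (hypothetical): flags blocks whose peak exceeds a threshold.
    return max(abs(v) for v in buf) > 100

rec = PingPongRecorder(buffer_len=4)
outcomes = run(rec, [[0, 1, 0, -1], [0, 500, -400, 3], [2, 2, 2, 2]], peak_detector)
print(outcomes)
```

On the real device the processing of one buffer has to finish before the other buffer fills up, which is why the inference time has to stay below the buffer length of about one second.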

The operation of the detector is presented in Video 2. On the left side, the Arduino connection is visible, which shows that a spark is generated when an 's' character is sent. Shortly afterwards, a "Spark detected!" message is expected on the right side, which shows the messages printed by the Artemis board.

Video 2.: Spark detection demo

Summary:

In this section, we presented a baseline solution for a detection problem that aims to classify audio recordings according to whether they contain a spark sound. We included source code for data collection, for model training, and for model deployment and inference.

Advanced methods:

The simple model trained in the previous section achieved acceptable accuracy on the test dataset. However, during its real-world evaluation we tested its robustness against other loud impulsive events such as claps or knocks. These experiments showed that the model recognized loud, impulsive events in general rather than spark sounds only. This is understandable: such a simple architecture cannot generalize well enough to detect these complicated patterns in complex background noises.

In the current section, we demonstrate the usage of advanced methods that can help to find more suitable models with enhanced accuracy and robustness, and optimal memory and computational complexities.

accuracy: the ratio of correctly classified examples

robustness: the measure of the average input perturbation amplitude that misleads a classifier

memory complexity: the total amount of memory required to run a model

computational complexity: the total number of floating point operations that must be executed to run a model

The simple model presented earlier was created in an ad-hoc way, based on some experience. Even if an initial architecture is known, the hyper-parameters that provide the best results are unknown. Therefore, we started from the baseline model and implemented a search algorithm that is capable of finding a superior hyper-parameter set. This approach is called grid search: it forms hyper-parameter sets from given value ranges and tests these configurations against some metrics. In our case, the parameters that were taken into consideration were the following:

the number of kernels in the convolutional layer: [3, 5, 8, 13]
dilation rate of the convolutional kernels: [1, 2, 3]
size of the convolutional kernels: [15, 36, 57, 93, 150]

To evaluate a particular hyper-parameter set, we employed the accuracy and robustness metrics. Accuracy is simple: it is the ratio of correctly classified examples to the total number of examples. Robustness is more complicated. Without the full scientific background, it can be summarized as the model's insensitivity to input perturbations; it is measured by the average amplitude of the perturbations that cause false classifications. The research field that studies this property is called adversarial machine learning. We used a slightly modified version of the DeepFool method to measure this property of our NNs.
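Both metrics can be made concrete with a toy example. The sketch below is not the modified DeepFool procedure used in the project; it only covers the special case of a linear binary classifier, where the smallest L2 perturbation that flips the decision is the distance from the input to the decision boundary. All names and numbers are illustrative.

```python
import math

def accuracy(y_true, y_pred):
    """Ratio of correctly classified examples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def min_perturbation_linear(w, b, x):
    """L2 distance from x to the decision boundary w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

def average_robustness(w, b, samples):
    """Mean size of the smallest perturbation that changes the predicted label."""
    return sum(min_perturbation_linear(w, b, x) for x in samples) / len(samples)

acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
w, b = [3.0, 4.0], 0.0   # toy linear classifier
rob = average_robustness(w, b, [[1.0, 0.0], [0.0, 1.0]])
print(acc, rob)
```

For a neural network the boundary is not a plane, so DeepFool-style methods approximate this distance iteratively by linearizing the network around the current input.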

Besides the hyper-parameter optimization, we also extended the model performance checking by adding Gaussian noise with different standard deviation values to the inputs. As the noise level increases, the signal-to-noise ratio decreases, which makes the detection problem even harder. The noise standard deviations were chosen from the set [0.00, 0.01, 0.05, 0.1]. To make these values interpretable, an example recording is shown in Figure 5 with the different noise levels. It can be observed that in the most extreme case the spark shape is totally lost in the noise.
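The relation between the noise standard deviation and the signal-to-noise ratio can be checked numerically. The sketch below uses a placeholder sine wave with amplitude 0.5 instead of a real recording, and it assumes that inputs are scaled to roughly the [-1, 1] range; neither detail is taken from the project.

```python
import math
import random

def add_gaussian_noise(signal, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB; infinite when no noise was added."""
    p_signal = sum(s * s for s in clean) / len(clean)
    p_noise = sum((n - s) ** 2 for s, n in zip(clean, noisy)) / len(clean)
    if p_noise == 0:
        return math.inf
    return 10 * math.log10(p_signal / p_noise)

rng = random.Random(0)   # fixed seed so the example is repeatable
# Placeholder signal: 3000 samples covering 30 full sine periods, amplitude 0.5.
clean = [0.5 * math.sin(2 * math.pi * 30 * i / 3000) for i in range(3000)]

snrs = {}
for sigma in (0.0, 0.01, 0.05, 0.1):   # noise levels used in the grid search
    snrs[sigma] = snr_db(clean, add_gaussian_noise(clean, sigma, rng))
    print(sigma, snrs[sigma])
```

Each tenfold increase of the standard deviation lowers the SNR by about 20 dB, which is why the spark shape disappears at the highest noise level.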

Figure 5.: Visualization of the effects of various noise levels.

All combinations of the presented parameter values were taken, and the corresponding neural networks were generated (4 x 3 x 5 hyper-parameter combinations times 4 noise levels). This resulted in 240 models. Each network was trained on the same training dataset and evaluated on the validation dataset. The Gaussian noise was generated on-the-fly during the training runs, which were carried out with the following parameters:

Batch size: 5
Early Stopping: monitored the training loss with the patience of 10 epochs
Optimizer: Adam
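Enumerating the grid is straightforward with `itertools.product`. The sketch below only generates the configurations and does not train anything; the dictionary keys and helper names are chosen for illustration. The helper `kernel_span` uses the standard relation for dilated kernels, span = (size - 1) * dilation + 1.

```python
from itertools import product

GRID = {
    "n_kernels": [3, 5, 8, 13],
    "dilation_rate": [1, 2, 3],
    "kernel_size": [15, 36, 57, 93, 150],
    "noise_std": [0.00, 0.01, 0.05, 0.1],
}

def iter_configs(grid):
    """Yield one dict per combination of the listed values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def kernel_span(kernel_size, dilation_rate):
    """Number of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation_rate + 1

configs = list(iter_configs(GRID))
largest_span = max(kernel_span(c["kernel_size"], c["dilation_rate"]) for c in configs)
print(len(configs), largest_span)
```

Each configuration would then be passed to a model-building and training routine, and the resulting accuracy and robustness values recorded for comparison.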

The result of the grid-search is illustrated in Figure 6. Here, the x-axis represents the accuracy, and the y-axis shows the logarithm of the average perturbation size. Larger perturbations represent better robustness. Each symbol on the plot has a shape that encodes the noise level, a diameter that expresses memory complexity, and a color that encodes the computational complexity of a neural network. Noise levels' symbols: star - no noise added; circle - noise level 0.01; square - noise level 0.05; triangle - noise level 0.1.

Figure 6.: The 5-dimensional plot about the 240 trained models.

In Figure 6, several point-clusters can be identified. For example, it is observable that a higher noise level reduces accuracy but enhances robustness (triangles in the upper-left corner). Another example is the cluster of squares in the middle, which extends from left to right and from bottom to top at the same time, meaning that some parameter sets improve both accuracy and robustness.

In our case, a model with good accuracy and robustness is required, but as we want to deploy it on a microcontroller, the memory and computational complexities must be taken into consideration too. These parameters are encoded into the color and size of a point. According to the color bar, we need a blue point with a small diameter from the right side of the plot that also maximizes the robustness. We selected the model represented by the single, outlying blue circle that lies above the cluster of circles, to the right of the top of the squares cluster. This model was evaluated on the test dataset. The parameters and the performance of the model are:

Accuracy on the test dataset:        0.99074
Accuracy on the training dataset:    0.99444
Robustness:                          0.00136
---------------------------------------------
Dilation rate:         1
Kernel size:           57
Number of kernels:     5
Added noise level:     0.01
---------------------------------------------
Memory complexity:         238 KB
Computational complexity:  3.4 MFLOP  (12kS input size)

This model has a higher computational complexity than our baseline model had, so the inference requires the activation of the Burst Mode of the Apollo 3 MCU. In this state, the core clock frequency is doubled from 48 MHz to 96 MHz.

Another advantage of the proposed NN architecture is that the full-window maximum pooling (this operation is called GlobalMaxPooling in Keras, but that layer is not supported in TF Lite Micro) enables the model to accept various input lengths. For example, it was found that if we reduce the input length from 12000 samples to 3000 samples, the memory complexity can be reduced significantly: from 238 KB to 14 KB. A disadvantage is that if we want to run the detector on the signal with overlapping regions to ensure full spark event containment, we must invoke the inference 7 times, instead of the previous 2. However, the MCU is fast enough to handle the computational overhead (a total of 5.6 MFLOPs).
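Why full-window max pooling makes the input length flexible can be seen in a few lines of plain Python. This is a toy illustration with made-up kernels and signals, not the trained network: each kernel is slid over the signal, and only the maximum response is kept, so the number of features equals the number of kernels regardless of the signal length.

```python
def conv1d_valid(signal, kernel):
    """Sliding dot product ('valid' mode, as in deep-learning convolution layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def global_max_features(signal, kernels):
    """One feature per kernel: the largest response over the whole input."""
    return [max(conv1d_valid(signal, kernel)) for kernel in kernels]

# Two toy kernels (made up): an edge detector and a two-sample average.
kernels = [[1.0, -1.0], [0.5, 0.5]]

long_signal = [0.0] * 20
long_signal[7] = 4.0             # a single spike somewhere in the recording
short_signal = long_signal[4:12]  # a shorter window that still contains the spike

features_long = global_max_features(long_signal, kernels)
features_short = global_max_features(short_signal, kernels)
print(features_long, features_short)
```

Both windows give the same feature vector, so any layers after the pooling see a fixed-size input, while the memory needed for the intermediate convolution outputs shrinks with the window length.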

As we applied adversarial attacks to measure the robustness of the NN structures, it is easy to visualize some of these adversarial examples. One such example is shown in Figure 7. Here, the goal is to generate a recording that lies on the edge of the decision surface of an already trained neural network. This example was generated from an originally negative sample (absolute silence), but in its current form it fools the network into producing a positive label.

Figure 7.: Adversarial example generated from silence. It fools the network into producing a positive label.

These methods are complicated, and we think that publishing the source code would not contribute to the general applicability of the mentioned directions; therefore, we only share these files via e-mail upon request.

Project summary:

We implemented a spark sound detector based on a neural network that can be deployed on the SparkFun RedBoard Artemis ATP. The data was collected by utilizing the same device and its integrated MEMS microphone. The data acquisition employed spark generation with different background noises.

A simple pipeline for beginners was explained and a baseline neural network model was deployed. We shared the source code for all the major steps required to solve a similar problem.

Additionally, more advanced methods and ideas were included that enable the enhancement of model performance and robustness.

In the future, we plan to integrate the Artemis board into our animal-borne gunshot detector, which is under active development. The advanced results presented in this report may provide a basis for research in these directions.


Code

//*****************************************************************************
//
//! @file pdm_fft.c
//!
//! @brief An example to show basic PDM operation.
//!
//! Purpose: This example enables the PDM interface to record audio signals from an
//! external microphone. The required pin connections are:
//!
//! GPIO 10 - PDM DATA
//! GPIO 11 - PDM CLK
//!
//! Printing takes place over the ITM at 1M Baud.
//
//*****************************************************************************

//*****************************************************************************
//
// Copyright (c) 2019, Ambiq Micro
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its
// contributors may be used to endorse or promote products derived from this
// software without specific prior written permission.
//
// Third party software included in this distribution is subject to the
// additional license terms as defined in the /docs/licenses directory.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//
// This is part of revision v2.2.0-7-g63f7c2ba1 of the AmbiqSuite Development Package.
//
//*****************************************************************************


#include "am_mcu_apollo.h"
#include "am_bsp.h"
#include "am_util.h"


//*****************************************************************************
//
// Example parameters.
//
//*****************************************************************************
#define PRINT_PDM_DATA              1
#define PDM_DATA_SIZE				12000

//*****************************************************************************
//
// Global variables.
//
//*****************************************************************************
volatile bool g_bPDMDataReady = false;
uint32_t g_ui32SampleFreq;
int16_t g_ui32PDMDataBuffer[PDM_DATA_SIZE];
uint8_t readBuffer[2];
uint32_t bytesWritten;


//*****************************************************************************
//
// PDM configuration information.
//
//*****************************************************************************
void *PDMHandle;

am_hal_pdm_config_t g_sPdmConfig =
{
		.eClkDivider = AM_HAL_PDM_MCLKDIV_1,
		.eLeftGain = AM_HAL_PDM_GAIN_0DB,
		.eRightGain = AM_HAL_PDM_GAIN_0DB,
		.ui32DecimationRate = 64,
		.bHighPassEnable = 0,
		.ui32HighPassCutoff = 0xB,
		.ePDMClkSpeed = AM_HAL_PDM_CLK_1_5MHZ,
		.bInvertI2SBCLK = 0,
		.ePDMClkSource = AM_HAL_PDM_INTERNAL_CLK,
		.bPDMSampleDelay = 0,
		.bDataPacking = 1,
		.ePCMChannels = AM_BSP_PDM_CHANNEL,
		.ui32GainChangeDelay = 1,
		.bI2SEnable = 0,
		.bSoftMute = 0,
		.bLRSwap = 0,
};

//*****************************************************************************
//
// PDM initialization.
//
//*****************************************************************************
void pdm_init(void)
{
	//
	// Initialize, power-up, and configure the PDM.
	//
	am_hal_pdm_initialize(0, &PDMHandle);
	am_hal_pdm_power_control(PDMHandle, AM_HAL_PDM_POWER_ON, false);
	am_hal_pdm_configure(PDMHandle, &g_sPdmConfig);
	am_hal_pdm_enable(PDMHandle);

	//
	// Configure the necessary pins.
	//
	am_hal_gpio_pinconfig(AM_BSP_PDM_DATA, g_AM_BSP_PDM_DATA);
	am_hal_gpio_pinconfig(AM_BSP_PDM_CLOCK, g_AM_BSP_PDM_CLOCK);

	//
	// Configure and enable PDM interrupts (set up to trigger on DMA
	// completion).
	//
	am_hal_pdm_interrupt_enable(PDMHandle, (AM_HAL_PDM_INT_DERR
			| AM_HAL_PDM_INT_DCMP
			| AM_HAL_PDM_INT_UNDFL
			| AM_HAL_PDM_INT_OVF));

	NVIC_EnableIRQ(PDM_IRQn);
}

//*****************************************************************************
//
// Print PDM configuration data.
//
//*****************************************************************************
void
pdm_config_print(void)
{
	uint32_t ui32PDMClk;
	uint32_t ui32MClkDiv;

	//
	// Read the config structure to figure out what our internal clock is set
	// to.
	//
	switch (g_sPdmConfig.eClkDivider)
	{
	case AM_HAL_PDM_MCLKDIV_4: ui32MClkDiv = 4; break;
	case AM_HAL_PDM_MCLKDIV_3: ui32MClkDiv = 3; break;
	case AM_HAL_PDM_MCLKDIV_2: ui32MClkDiv = 2; break;
	case AM_HAL_PDM_MCLKDIV_1: ui32MClkDiv = 1; break;

	default:
		ui32MClkDiv = 0;
	}

	switch (g_sPdmConfig.ePDMClkSpeed)
	{
	case AM_HAL_PDM_CLK_12MHZ:  ui32PDMClk = 12000000; break;
	case AM_HAL_PDM_CLK_6MHZ:   ui32PDMClk =  6000000; break;
	case AM_HAL_PDM_CLK_3MHZ:   ui32PDMClk =  3000000; break;
	case AM_HAL_PDM_CLK_1_5MHZ: ui32PDMClk =  1500000; break;
	case AM_HAL_PDM_CLK_750KHZ: ui32PDMClk =   750000; break;
	case AM_HAL_PDM_CLK_375KHZ: ui32PDMClk =   375000; break;
	case AM_HAL_PDM_CLK_187KHZ: ui32PDMClk =   187000; break;

	default:
		ui32PDMClk = 0;
	}

	//
	// Record the effective sample frequency. We'll need it later to print the
	// loudest frequency from the sample.
	//
	g_ui32SampleFreq = (ui32PDMClk /
			(ui32MClkDiv * 2 * g_sPdmConfig.ui32DecimationRate));

	am_util_stdio_printf("Settings:\n");
	am_util_stdio_printf("PDM Clock (Hz):         %12d\n", ui32PDMClk);
	am_util_stdio_printf("Decimation Rate:        %12d\n", g_sPdmConfig.ui32DecimationRate);
	am_util_stdio_printf("Effective Sample Freq.: %12d\n", g_ui32SampleFreq);
}

//*****************************************************************************
//
// Start a transaction to get some number of bytes from the PDM interface.
//
//*****************************************************************************
void
pdm_data_get(void)
{
	//
	// Configure DMA and target address.
	//
	am_hal_pdm_transfer_t sTransfer;
	sTransfer.ui32TargetAddr = (uint32_t) g_ui32PDMDataBuffer;
	sTransfer.ui32TotalCount = PDM_DATA_SIZE*2;

	//
	// Start the data transfer.
	//
	am_hal_pdm_enable(PDMHandle);
	am_util_delay_ms(100);
	am_hal_pdm_fifo_flush(PDMHandle);
	am_hal_pdm_dma_start(PDMHandle, &sTransfer);
}

//*****************************************************************************
//
// PDM interrupt handler.
//
//*****************************************************************************
void am_pdm0_isr(void)
{
	uint32_t ui32Status;

	//
	// Read the interrupt status.
	//
	am_hal_pdm_interrupt_status_get(PDMHandle, &ui32Status, true);
	am_hal_pdm_interrupt_clear(PDMHandle, ui32Status);

	//
	// Once our DMA transaction completes, we disable the PDM and send a
	// flag back down to the main routine. Disabling the PDM is only necessary
	// because this example uses a single buffer for the recorded samples.
	// More complex programs could use a system of multiple buffers to
	// allow the CPU to process one buffer while the DMA pulls PCM data
	// into another buffer.
	//
	if (ui32Status & AM_HAL_PDM_INT_DCMP)
	{
		am_hal_pdm_disable(PDMHandle);
		g_bPDMDataReady = true;
	}
}

//*****************************************************************************
//
// Main
//
//*****************************************************************************
int main(void)
{
	//
	// Perform the standard initialization for clocks, cache settings, and
	// board-level low-power operation.
	//
	am_hal_clkgen_control(AM_HAL_CLKGEN_CONTROL_SYSCLK_MAX, 0);
	am_hal_cachectrl_config(&am_hal_cachectrl_defaults);
	am_hal_cachectrl_enable();
	//am_bsp_low_power_init();

	am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_BLUE, g_AM_HAL_GPIO_OUTPUT);
	//am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_BLUE, g_AM_BSP_GPIO_LED_BLUE);

	//
	// Initialize the printf interface for UART output
	//
	am_bsp_uart_printf_enable();

	//
	// Turn on the PDM, set it up for our chosen recording settings, and start
	// the first DMA transaction.
	//
	pdm_init();
	//You can print the current configuration
	//pdm_config_print();
	am_hal_pdm_fifo_flush(PDMHandle);

	//config UART receive: waiting for 1 B
	am_hal_uart_transfer_t transfer_config;
	transfer_config.ui32Direction = AM_HAL_UART_READ;
	transfer_config.pui8Data = readBuffer;
	transfer_config.ui32NumBytes = 1;
	transfer_config.ui32TimeoutMs = AM_HAL_UART_WAIT_FOREVER;
	transfer_config.pui32BytesTransferred = &bytesWritten;
	//config UART send: it sends the whole recording (12k samples)
	am_hal_uart_transfer_t transfer_config_writebuffer;
	transfer_config_writebuffer.ui32Direction = AM_HAL_UART_WRITE;
	transfer_config_writebuffer.pui8Data = (uint8_t*)g_ui32PDMDataBuffer;
	transfer_config_writebuffer.ui32NumBytes = PDM_DATA_SIZE*2;
	transfer_config_writebuffer.ui32TimeoutMs = AM_HAL_UART_WAIT_FOREVER;
	transfer_config_writebuffer.pui32BytesTransferred = &bytesWritten;
	//
	// Loop forever while sleeping.
	//
	while (1)
	{
		//waiting for a trigger from the PC
		while(am_bsp_com_uart_transfer(&transfer_config) != AM_HAL_STATUS_SUCCESS)
			;
		//if the received character is an 'r', start a 1 second long recording
		if(readBuffer[0] == 'r') {
			am_devices_led_on(am_bsp_psLEDs, 0);
			g_bPDMDataReady = false;
			am_hal_pdm_fifo_flush(PDMHandle);
			//start data collection by utilizing the DMA
			pdm_data_get();
			//go to sleep 
			am_hal_sysctrl_sleep(AM_HAL_SYSCTRL_SLEEP_DEEP);
			//wake-up trigger from the DMA
			while(!g_bPDMDataReady)
				;
			am_devices_led_off(am_bsp_psLEDs, 0);
			//send the data through the USB
			am_bsp_com_uart_transfer(&transfer_config_writebuffer);
		}

	}
}
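
On the PC side, the loop above expects a single 'r' byte and answers with PDM_DATA_SIZE*2 bytes. A minimal host-side sketch of that exchange could look like the following. It assumes that the samples are 16-bit signed little-endian integers and that one recording holds 12000 samples (the "12k samples" mentioned in the comment above). The function name `request_recording` and the `port` argument are hypothetical; any object with `write()` and `read(n)` methods works.

```python
import struct

NUM_SAMPLES = 12000      # assumed: "12k samples" per recording
BYTES_PER_SAMPLE = 2     # assumed: 16-bit samples (PDM_DATA_SIZE*2 bytes are sent)


def request_recording(port, num_samples=NUM_SAMPLES):
    """Send the 'r' trigger and read back one recording as a list of ints."""
    port.write(b"r")
    expected = num_samples * BYTES_PER_SAMPLE
    raw = bytearray()
    # read() may return fewer bytes than requested, so loop until complete
    while len(raw) < expected:
        chunk = port.read(expected - len(raw))
        if not chunk:
            raise TimeoutError("board stopped sending after %d bytes" % len(raw))
        raw.extend(chunk)
    # "<" = little-endian, "h" = signed 16-bit integer
    return list(struct.unpack("<%dh" % num_samples, bytes(raw)))
```

With pyserial installed, `port` could be a `serial.Serial` instance opened with a read timeout; the port name and baud rate depend on the setup and are not given in this write-up.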

# -*- coding: utf-8 -*-
"""simple_pipeline.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1zHi_UF1MfRlDoTdRtT9CsOAVtJHenCsa
"""

import numpy as np
import matplotlib.pyplot as plt
import os, glob

# Commented out IPython magic to ensure Python compatibility.
#@title Mount Google Drive to get data {display-mode: "form"}
from google.colab import drive
drive.mount('/gdrive')

from dataloader import *

#load data - all background noise classes - full dataset
(x_train,y_train,x_val,y_val,x_test,y_test) = getDataset(src_dir="../data",train_class=(0,1,2,3,4,5,6,7,8,10,11,12,13,14,15,16,17,18))
x_train = np.concatenate(x_train,axis=0)
if(x_train.ndim < 4):
  x_train = np.expand_dims(x_train, axis=2)
  x_train = np.expand_dims(x_train, axis=2)
y_train = np.concatenate(y_train,axis=0)
y_train = np.argmax(y_train, axis=1)

x_val = np.concatenate(x_val,axis=0)
if(x_val.ndim < 4):
  x_val = np.expand_dims(x_val, axis=2)
  x_val = np.expand_dims(x_val, axis=2)
y_val = np.concatenate(y_val,axis=0)
y_val = np.argmax(y_val, axis=1)

x_test = np.concatenate(x_test,axis=0)
if(x_test.ndim < 4):
  x_test = np.expand_dims(x_test, axis=2)
  x_test = np.expand_dims(x_test, axis=2)
y_test = np.concatenate(y_test,axis=0)
y_test = np.argmax(y_test, axis=1)

#final dataset split sizes
print(x_train.shape)
print(y_train.shape)
print(x_val.shape)
print(y_val.shape)
print(x_test.shape)
print(y_test.shape)

import tensorflow as tf  # needed below for tf.keras.models and tf.lite
from tensorflow.keras import *
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import *

#create a very simple model
model = Sequential()
# 2 convolutional kernels
# Tf lite micro only supports Conv2D (Conv1D would be better for us here)
model.add(Conv2D(filters=2, kernel_size=(33,1), dilation_rate=2, activation='relu', input_shape=(12000,1,1)))
# Global max pooling: the window spans the whole convolution output
model.add(MaxPool2D((11936,1)))
model.add(Flatten())
#produce an output [0-no spark --- 1-spark]
model.add(Dense(1, activation='sigmoid'))
#compile the model with the following training parameters
model.compile(loss='binary_crossentropy', optimizer=optimizers.Adam(), metrics=['binary_accuracy'])
# print the model summary, then fit the network
model.summary()
model.fit(x_train, y_train, validation_data=(x_val,y_val) ,epochs=10, batch_size=8, verbose=1, shuffle=True)
#evaluate network on test data
result = model.evaluate(x_test,y_test)
print("Accuracy on the test set: "+str(result[1]))

# Save tf.keras model in HDF5 format.
keras_file = "sparkCNN_baseline.h5"
tf.keras.models.save_model(model, keras_file)

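# Convert to TensorFlow Lite model.
# Note: from_keras_model_file is the TF 1.x converter API; under TF 2.x the
# equivalent call is tf.lite.TFLiteConverter.from_keras_model(model).
converter = tf.lite.TFLiteConverter.from_keras_model_file(keras_file)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_LATENCY]
tflite_model = converter.convert()
with open("sparkCNN_baseline.tflite", "wb") as f:
  f.write(tflite_model)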

# Install xxd if it is not available
!apt-get -qq install xxd
# Save the file as a C source file
!xxd -i sparkCNN_baseline.tflite > sparkCNN_baseline.cc
# Print the source file
!cat sparkCNN_baseline.cc
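
The pooling window of 11936 in the model above equals the length of the convolution output, so the max pooling acts as a global maximum over the whole recording. A small sketch of that arithmetic follows, assuming the Keras defaults of 'valid' padding and stride 1. The helper name `conv_output_length` is ours and is not part of Keras' public API.

```python
def conv_output_length(input_length, kernel_size, dilation_rate=1, stride=1):
    """Output length of an unpadded ('valid') 1-D convolution."""
    # A dilated kernel covers dilation_rate * (kernel_size - 1) + 1 input samples
    effective_kernel = dilation_rate * (kernel_size - 1) + 1
    return (input_length - effective_kernel) // stride + 1


pool_size = conv_output_length(12000, kernel_size=33, dilation_rate=2)
print(pool_size)  # 11936
```

If the recording length, kernel size, or dilation rate changes, the pool size passed to MaxPool2D has to be recomputed in the same way.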

Credits

György Kalmár

Istvan Megyeri
