
ML model update

A project log for PionEar: Making Roads Safer for Deaf Drivers

PionEar provides early warning to deaf drivers of an approaching emergency vehicle

Jan Říha 08/23/2023 at 17:37

Today, I want to provide an update on the ML model, which I have just revised. First, let me describe an issue with the old model. During testing, I noticed that detection sensitivity was very poor when driving at moderate to high speed. After analyzing the problem, I found that it is caused by additional low-frequency noise generated by the car (engine, tire rolling, etc.) that masks the sound of sirens.

At first, I wanted to implement a high-pass or band-pass filter so that only the frequency band in which a siren typically emits sound would be analyzed and evaluated. However, I found that there is no easy way to do this with the TinyML board or Edge Impulse Studio.
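To illustrate the idea, here is a minimal offline sketch of such a filter in plain Python: a standard second-order (RBJ-style) band-pass biquad. This is not project code and does not run on the TinyML board; the sample rate, center frequency, and Q below are illustrative assumptions.

```python
import math

def bandpass_biquad(samples, fs, f0, q=0.707):
    """Apply a second-order RBJ band-pass biquad to a list of float
    samples. fs is the sample rate in Hz, f0 the center frequency in Hz,
    q the quality factor (larger q = narrower band)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    # RBJ "constant 0 dB peak gain" band-pass coefficients
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    # Normalize so that a0 == 1
    b0, b1, b2, a1, a2 = b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x          # shift input history
        y2, y1 = y1, y          # shift output history
        out.append(y)
    return out
```

A filter like this would strongly attenuate the low-frequency cabin rumble while passing a tone near the chosen center frequency, which is exactly the effect I was after.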

After this, I decided to create a new ML model that would "learn" that low-frequency cabin noise combined with a siren sound should result in a positive class detection. So I turned my current prototype into a sound recorder and recorded several hours of driving sounds. I then mixed these recordings with the existing siren sounds.
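Conceptually, the mixing step is simple: add the siren waveform onto the cabin-noise waveform sample by sample. The sketch below shows this for float samples in [-1, 1]; the function name and the gain parameter are illustrative, not the exact tooling I used.

```python
def mix_clips(noise, siren, siren_gain=1.0):
    """Mix a siren clip into a cabin-noise clip sample by sample.
    The shorter clip determines the output length; the sum is clipped
    to the [-1.0, 1.0] float range to avoid overflow."""
    n = min(len(noise), len(siren))
    return [max(-1.0, min(1.0, noise[i] + siren_gain * siren[i]))
            for i in range(n)]
```

Mixing real cabin noise under the sirens, rather than training on clean siren clips alone, is what lets the model treat the noise as part of the positive class instead of as a masker.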

I also used a new dataset for siren sounds: sireNNet. This dataset seems to be of better quality: the recordings are cleaner and do not contain the other (often very strange) types of sirens that were present in the previous dataset. Even though this dataset is much smaller, the results observed in real driving conditions were much better.

For a better understanding of how I created the current dataset, please see the block diagram "Dataset_8_2023_diagram.pdf", which I share together with the other files. I believe this might also be useful for other people trying to build an ML model that has to run in a real environment.

In the block diagram, you'll see that I apply varying amplification levels to certain classes. I've found that this greatly helps the model achieve a higher level of abstraction when recognizing sounds. Without this adjustment, the model frequently distinguished classes (sounds) by their average sound level rather than by their frequency content.
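The augmentation itself amounts to duplicating each clip at several gains, so loudness stops being a usable cue. A minimal sketch (the specific gain values here are illustrative, not the ones from my pipeline):

```python
def augment_gains(clip, gains=(0.25, 0.5, 1.0)):
    """Return one amplitude-scaled copy of the clip per gain value.
    Training on all copies forces the model to rely on frequency
    content rather than average sound level."""
    return [[g * s for s in clip] for g in gains]
```

Each scaled copy keeps the original class label, so the model sees the same spectral content at several loudness levels within one class.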

My current dataset is again shared on the Edge Impulse Platform here:

https://studio.edgeimpulse.com/studio/267982
