
Voice control (speech intent) with Picovoice AI Rhino

A project log for ML Hat Cam

Auto zoom Raspberry Pi camera for filming model airplanes

Jacob David C Cunningham, 02/26/2023 at 00:40

So I need a way to control the zoom with my voice.

I tried a few different libraries initially, e.g. Sphinx and Vosk, but their accuracy was bad.

Also, I was probably using them for the wrong thing: they do general speech-to-text, not intent recognition.

Someone mentioned Picovoice to me and that's what I'm using now.

I got a model with two intents: ZoomIn and ZoomOut (for the utterances "Zoom in" and "Zoom out").

I got this running locally and verified that it works offline by turning the Wi-Fi off (I wanted to check, since it requires an access key).

Then I paired it with the stepper control.
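Here is a minimal sketch of how the two intents could be routed to the stepper. This is not my actual code: the step count, `steps_for_intent`, `handle_intent` and the `move_stepper` callback are all made-up names for illustration, and it assumes the speech engine hands back the intent name as a plain string (or `None` when nothing was understood).

```python
from typing import Callable, Optional

# Hypothetical number of stepper steps per voice command;
# the real value depends on the lens gearing.
STEPS_PER_COMMAND = 50

# Intent names from the model, mapped to a signed direction.
INTENT_DIRECTIONS = {"ZoomIn": 1, "ZoomOut": -1}


def steps_for_intent(intent: Optional[str], steps: int = STEPS_PER_COMMAND) -> int:
    """Return signed stepper steps for an intent, or 0 if it is not recognized."""
    return INTENT_DIRECTIONS.get(intent, 0) * steps


def handle_intent(intent: Optional[str], move_stepper: Callable[[int], None]) -> int:
    """Call move_stepper with the signed step count; do nothing for unknown intents."""
    delta = steps_for_intent(intent)
    if delta != 0:
        move_stepper(delta)
    return delta
```

In the listener loop, `handle_intent` would be called once per finished inference, with `move_stepper` being whatever function actually drives the motor. Keeping the mapping in one dict means adding a third intent later is a one-line change.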

See recent video/demo below

My code flow/architecture is still garbage at this point; I'm still just bridging things together.

The display, for example, is messed up since it needs a single instance and I have to propagate that all the way down.

I added battery life tracking too (based on the max measured uptime and a cron job that fires in 5-minute increments). I also switched from drawing a box around the active choice in the menu to highlighting it with color, since the box has to be drawn one piece at a time (4 draw calls) with the current code I'm using.
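The battery estimate above boils down to simple arithmetic, roughly like the sketch below. The 240-minute max uptime is a placeholder, not my measured number, and the function names are made up; storing the counter between cron runs (e.g. in a file) is left out.

```python
# Interval of the cron job that bumps the uptime counter.
CRON_INTERVAL_MIN = 5

# Hypothetical max measured uptime on a full charge, in minutes.
MAX_UPTIME_MIN = 240


def record_tick(minutes_used: int) -> int:
    """Called once per cron run: add one interval to the uptime counter."""
    return minutes_used + CRON_INTERVAL_MIN


def battery_percent(minutes_used: int, max_uptime_min: int = MAX_UPTIME_MIN) -> int:
    """Estimate remaining battery as a percentage, clamped at 0."""
    remaining = max(0.0, 1.0 - minutes_used / max_uptime_min)
    return round(remaining * 100)
```

This assumes the battery drains linearly with uptime, which is only a rough approximation, and the counter has to be reset by hand after each recharge.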

I also fixed an audio issue. In the above you can kind of see that the 3.5mm audio jack plug does not have a ground going to the speaker; that was causing my constant buzzing/hissing issue.

It got really bad when the speech intent listener loop was running.
