The speech waveform has lots of redundancy, so compression is useful.

The filter bank formulation described below was used to analyse speech
into a number of log-distributed frequency bands. The energy in each of
the N filter outputs was sampled at a rate around 128 times slower than
the original waveform, then reconstructed using just N pure sine waves
with amplitudes set to the N filter amplitudes. The result is a slightly
sing-song version of the original voice signal. The sample rate for the
audio is 8 kHz. Analysis was done in Matlab, with reconstruction also in
Matlab, and the filter outputs were also written to a header file for a
C program running on the PIC32. A voice sample was analysed and
reconstructed using 15 filter channels, with the filter power
down-sampled to one value every 16 milliseconds (approximately one
fundamental period). Overall compression is about 8:1 from the 8-bit,
8 kHz input samples to the sampled filter outputs. The Matlab analyser
and reconstruction code is here.
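
The frame size and compression figure quoted above can be checked with a few lines of C. This is only a sketch of the arithmetic: the function names are hypothetical, and it assumes each down-sampled filter value is stored in 8 bits, which the text does not state explicitly.

```c
#include <assert.h>

/* Input samples covered by one frame of filter outputs. */
int samples_per_frame(int sample_rate_hz, int frame_ms) {
    return sample_rate_hz * frame_ms / 1000;
}

/* Ratio of input bits to stored bits for a single frame.
   bits_per_value is an assumption: the width of one stored filter output. */
double compression_ratio(int frame_samples, int bits_per_sample,
                         int channels, int bits_per_value) {
    assert(channels > 0 && bits_per_value > 0);
    double bits_in  = (double)frame_samples * bits_per_sample;
    double bits_out = (double)channels * bits_per_value;
    return bits_in / bits_out;
}
```

With an 8000 Hz sample rate and a 16 ms frame, samples_per_frame gives 128, which matches the "around 128 times slower" figure. Fifteen 8-bit values per 128 8-bit samples gives a ratio of roughly 8.5, consistent with the quoted 8:1.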

The C header file written by the Matlab program contains the filter
coefficients and some constants that the PIC32 reconstruction program
needs. The C program defines N direct digital synthesis units and scales
their amplitudes according to the filter coefficients stored in the
header file, then blasts them out to a 12-bit, SPI-attached DAC. The
program runs under ProtoThreads.
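
A minimal sketch of a bank of direct digital synthesis units is shown here to make the idea concrete. It is not the PIC32 program itself. The names (dds_unit_t, dds_increment, dds_next_sample), the 16-entry sine table, and the 8-bit amplitude range are all assumptions chosen for brevity, and the SPI write to the DAC is left out.

```c
#include <assert.h>
#include <stdint.h>

#define N_CHANNELS    15     /* from the text: 15 filter channels */
#define SAMPLE_RATE   8000u  /* from the text: 8 kHz audio */
#define TABLE_BITS    4      /* assumption: tiny 16-entry table for brevity */
#define TABLE_SIZE    (1u << TABLE_BITS)
#define AMP_MAX       255    /* assumption: 8-bit channel amplitudes */

/* One sine cycle scaled to +/-2047 (half the 12-bit DAC range). */
static const int16_t sine_table[TABLE_SIZE] = {
        0,   783,  1447,  1891,  2047,  1891,  1447,   783,
        0,  -783, -1447, -1891, -2047, -1891, -1447,  -783
};

typedef struct {
    uint32_t phase;      /* 32-bit phase accumulator */
    uint32_t increment;  /* phase step per output sample */
    uint8_t  amplitude;  /* current amplitude, updated once per frame */
} dds_unit_t;

/* Phase increment for a given frequency: f * 2^32 / Fs. */
uint32_t dds_increment(uint32_t freq_hz) {
    assert(freq_hz < SAMPLE_RATE / 2);   /* stay below Nyquist */
    return (uint32_t)(((uint64_t)freq_hz << 32) / SAMPLE_RATE);
}

/* Advance every unit by one sample and return a 12-bit DAC code. */
uint16_t dds_next_sample(dds_unit_t units[], int n) {
    assert(n > 0 && n <= N_CHANNELS);
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        units[i].phase += units[i].increment;   /* wraps modulo 2^32 */
        uint32_t index = units[i].phase >> (32 - TABLE_BITS);
        sum += (int32_t)units[i].amplitude * sine_table[index];
    }
    /* Normalise so that all units at full amplitude still fit in 12 bits,
       then offset into the unsigned DAC range. */
    sum /= (AMP_MAX * n);
    return (uint16_t)(sum + 2048);
}
```

In use, the amplitude field of each unit would be reloaded from the header-file data once per frame (every 128 samples at the rates given above), while dds_next_sample is called once per output sample and its result written to the DAC. A practical table would normally be much larger than 16 entries to reduce distortion.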

The spectrum of the reconstructed speech looks nothing like the original, but the speech is understandable.

[Figure: spectrum]