Audio Processing Series Part IV: Encoding and Decoding Audio Data Using the ADPCM Algorithm
Understanding ADPCM: Principles & Implementation
ADPCM (Adaptive Differential Pulse Code Modulation) is an audio compression technique that encodes the difference between consecutive audio samples instead of their absolute values. Because each 16-bit sample is represented by a 4-bit code describing only the change, ADPCM reduces the data rate to roughly a quarter of the original.
This project reads an audio file in blocks of 1024 samples, encodes each block with ADPCM, and stores the resulting ADPCM code in flash memory. The ADPCM code is then read back from flash and decoded, and the decompressed audio is written to flash as well, so that both can be retrieved later to analyze audio quality. The flash memory has a capacity of 8 megabytes and the original uncompressed audio file is approximately 938 kilobytes, so each of the three data sets gets its own 2-megabyte partition: the original audio is stored at the beginning of the flash memory, the ADPCM code starts at page 8192, and the decompressed audio begins at page 16384. The remaining 2 megabytes of flash memory are left unused.
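The partition layout above can be written down as a handful of constants. This is a minimal sketch that assumes 256-byte flash pages, which is what makes page 8192 land on the 2-megabyte boundary; the macro and function names are illustrative and are not taken from the project.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 256-byte pages, so 8192 pages span exactly 2 MB. */
#define FLASH_PAGE_SIZE      256u
#define FLASH_TOTAL_BYTES    (8u * 1024u * 1024u)   /* 8 MB device */
#define FLASH_TOTAL_PAGES    (FLASH_TOTAL_BYTES / FLASH_PAGE_SIZE)

/* Start pages of the three partitions described in the text. */
#define ORIGINAL_START_PAGE  0u
#define ADPCM_START_PAGE     8192u
#define DECODED_START_PAGE   16384u

/* Convert a page number to a byte address inside the flash. */
static uint32_t PageToAddress(uint32_t page)
{
    assert(page < FLASH_TOTAL_PAGES);
    return page * FLASH_PAGE_SIZE;
}

/* Size in bytes of the partition that starts at startPage and ends
 * just before nextStartPage. */
static uint32_t PartitionSize(uint32_t startPage, uint32_t nextStartPage)
{
    assert(nextStartPage >= startPage);
    return (nextStartPage - startPage) * FLASH_PAGE_SIZE;
}
```

Under these assumptions the ADPCM partition starts at byte address 0x200000 and the decoded audio at 0x400000. At a 4:1 ratio, the 938-kilobyte source compresses to roughly 235 kilobytes, so both fit comfortably in their 2-megabyte partitions.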
Microchip's application note on ADPCM gives a detailed guide to implementing the algorithm on PIC microcontrollers. STMicroelectronics also offers a software package, STSW-STM32022, for reconstructing audio signals from compressed samples on Cortex-M3 CPUs.
Audio Data & ADPCM Code Handling
The original audio data stored in the external flash memory is in 16-bit signed little-endian format. The ADPCM Encode and Decode functions handle uncompressed audio in 16-bit integer (int16_t) format and compressed audio in 8-bit integer (int8_t) format.
The ReadAudioData & WriteAudioData functions convert byte values stored in flash memory to 16-bit audio samples when reading and perform the reverse conversion when writing to flash memory. The ReadPCMCode function splits each byte of data read from flash memory into two ADPCM-compressed samples and stores them in a 16-bit integer array for the ADPCM decoder to process. The WritePCMCode function simply wraps the W25Q library function, as the ADPCM codes are already paired in the ADPCM_EncodeBlock function.
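The conversions described above amount to a few shifts and masks. The sketch below shows one way they might look; the helper names are hypothetical stand-ins for the logic inside ReadAudioData, WriteAudioData and ReadPCMCode, the flash access itself is left out, and the codes are kept in a uint8_t array here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuild 16-bit signed samples from little-endian bytes (low byte first). */
static void BytesToSamples(const uint8_t *bytes, int16_t *samples, size_t numSamples)
{
    assert(bytes != NULL && samples != NULL);
    for (size_t i = 0; i < numSamples; i++) {
        uint16_t raw = (uint16_t)(bytes[2 * i] | ((uint16_t)bytes[2 * i + 1] << 8));
        samples[i] = (int16_t)raw;
    }
}

/* Reverse conversion: split each sample into low byte, then high byte. */
static void SamplesToBytes(const int16_t *samples, uint8_t *bytes, size_t numSamples)
{
    assert(bytes != NULL && samples != NULL);
    for (size_t i = 0; i < numSamples; i++) {
        uint16_t raw = (uint16_t)samples[i];
        bytes[2 * i]     = (uint8_t)(raw & 0xFFu);
        bytes[2 * i + 1] = (uint8_t)(raw >> 8);
    }
}

/* Split each packed byte into two 4-bit ADPCM codes.
 * The high nibble holds the first sample, matching the encoder's packing. */
static void UnpackNibbles(const uint8_t *packed, uint8_t *codes, size_t numBytes)
{
    assert(packed != NULL && codes != NULL);
    for (size_t i = 0; i < numBytes; i++) {
        codes[2 * i]     = (uint8_t)(packed[i] >> 4);
        codes[2 * i + 1] = (uint8_t)(packed[i] & 0x0Fu);
    }
}
```

For example, the byte pair 0x34, 0x12 becomes the sample 0x1234, and the packed byte 0xA7 becomes the two codes 0xA and 0x7.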
ADPCM Algorithm for Encoding & Decoding
The indexTable is an array that maps the current step index to a new index based on the ADPCM code. It helps adjust the step size used in predicting the next sample.
The stepTable defines the step sizes used in the prediction process. Each entry corresponds to a step index and provides the step size that determines how large an adjustment is made to the predicted sample during encoding and decoding.
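To make the two tables concrete, the sketch below assumes the project uses the standard IMA ADPCM tables (16 index adjustments and 89 step sizes); the article does not list its values, so treat them as an assumption. ADPCM_NextIndex is a hypothetical helper that shows how a 4-bit code moves the step index.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: standard IMA ADPCM index adjustment table.
 * Small magnitudes (0..3) shrink the step, large ones (4..7) grow it. */
static const int8_t indexTable[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* Assumption: standard IMA ADPCM step size table (89 entries). */
static const int16_t stepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Hypothetical helper: step index after processing one 4-bit code,
 * clamped to the valid range of stepTable. */
static int ADPCM_NextIndex(int index, uint8_t code)
{
    assert(index >= 0 && index <= 88);
    index += indexTable[code & 0x0F];
    if (index < 0)  index = 0;
    if (index > 88) index = 88;
    return index;
}
```

Starting from index 0 (step size 7), a code of 7 moves the index to 8 (step size 16), while a code of 0 leaves it clamped at 0. Large differences therefore widen the step quickly, and small ones narrow it again.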
The ADPCM state variables are reset to zero before both compressing and decompressing audio data. This ensures that each session starts from a known state and that the decoder's predictor begins where the encoder's did, so the two stay in step.
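A minimal sketch of what such a state might contain is shown below. The struct layout, field names and reset function are assumptions for illustration; the article only says that the state variables are zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state layout: the two values an ADPCM codec carries
 * from one sample to the next. */
typedef struct {
    int16_t predictedSample;  /* last reconstructed sample value */
    int8_t  stepIndex;        /* current position in the step table */
} ADPCM_State;

/* Put the codec into its known starting state. */
static void ADPCM_ResetState(ADPCM_State *state)
{
    assert(state != NULL);
    state->predictedSample = 0;
    state->stepIndex = 0;
}
```

Calling the reset once before encoding and once more before decoding gives both passes the same starting point.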
Within each loop iteration, ADPCM_EncodeBlock encodes two 16-bit audio samples into 4-bit ADPCM codes using ADPCM_EncodeSample. It then combines the two 4-bit codes into a single byte, with the first sample in the high nibble and the second in the low nibble. The byte is stored in the adpcmCode array, so every two audio samples occupy one byte of encoded data.
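The encoding loop can be sketched as follows. The function names ADPCM_EncodeSample and ADPCM_EncodeBlock come from the article, but their signatures, the ADPCM_State struct and the use of the standard IMA ADPCM tables are assumptions, so read this as one plausible implementation and not as the project's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: standard IMA ADPCM tables. */
static const int8_t indexTable[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8
};
static const int16_t stepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

typedef struct {              /* hypothetical state layout */
    int16_t predictedSample;
    int8_t  stepIndex;
} ADPCM_State;

/* Encode one 16-bit sample into a 4-bit code (bit 3 = sign). */
static uint8_t ADPCM_EncodeSample(int16_t sample, ADPCM_State *state)
{
    int32_t step = stepTable[state->stepIndex];
    int32_t diff = (int32_t)sample - state->predictedSample;
    uint8_t code = 0;

    if (diff < 0) { code = 8; diff = -diff; }

    /* Quantize the difference and track what the decoder will rebuild. */
    int32_t delta = step >> 3;
    if (diff >= step)        { code |= 4; diff -= step;      delta += step; }
    if (diff >= (step >> 1)) { code |= 2; diff -= step >> 1; delta += step >> 1; }
    if (diff >= (step >> 2)) { code |= 1;                    delta += step >> 2; }

    /* Update the predictor exactly as the decoder would, with clamping. */
    int32_t predicted = state->predictedSample + ((code & 8) ? -delta : delta);
    if (predicted > 32767)  predicted = 32767;
    if (predicted < -32768) predicted = -32768;
    state->predictedSample = (int16_t)predicted;

    /* Adapt the step size for the next sample. */
    int32_t index = state->stepIndex + indexTable[code];
    if (index < 0)  index = 0;
    if (index > 88) index = 88;
    state->stepIndex = (int8_t)index;

    return code;
}

/* Encode pairs of samples: first sample in the high nibble, second in the low. */
static void ADPCM_EncodeBlock(const int16_t *pcm, uint8_t *adpcmCode,
                              size_t numSamples, ADPCM_State *state)
{
    assert(numSamples % 2 == 0);
    for (size_t i = 0; i < numSamples; i += 2) {
        uint8_t high = ADPCM_EncodeSample(pcm[i], state);
        uint8_t low  = ADPCM_EncodeSample(pcm[i + 1], state);
        adpcmCode[i / 2] = (uint8_t)((high << 4) | low);
    }
}
```

With a zeroed state, the two samples 100 and -100 encode to the codes 7 and 15, which pack into the single byte 0x7F. A 1024-sample block therefore produces 512 bytes of ADPCM code.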
In each loop iteration, the block decoding function calls ADPCM_DecodeSample, passing in the current 8-bit ADPCM code and the state, and stores the resulting decoded 16-bit audio sample in the decodedData array. This reconstructs a close approximation of the original audio data for playback or further processing.
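A matching decoder sketch is shown below. ADPCM_DecodeSample is named in the article; the block function name ADPCM_DecodeBlock, the signatures, the ADPCM_State struct and the standard IMA ADPCM tables are assumptions. The sketch expects one 4-bit code per array element, that is, after the packed bytes have already been split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: standard IMA ADPCM tables. */
static const int8_t indexTable[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8
};
static const int16_t stepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

typedef struct {              /* hypothetical state layout */
    int16_t predictedSample;
    int8_t  stepIndex;
} ADPCM_State;

/* Decode one 4-bit code (bit 3 = sign) into a 16-bit sample. */
static int16_t ADPCM_DecodeSample(uint8_t code, ADPCM_State *state)
{
    assert(code <= 0x0F);
    int32_t step = stepTable[state->stepIndex];

    /* Rebuild the quantized difference from the three magnitude bits. */
    int32_t delta = step >> 3;
    if (code & 4) delta += step;
    if (code & 2) delta += step >> 1;
    if (code & 1) delta += step >> 2;

    /* Apply it to the previous prediction and clamp to the int16_t range. */
    int32_t predicted = state->predictedSample + ((code & 8) ? -delta : delta);
    if (predicted > 32767)  predicted = 32767;
    if (predicted < -32768) predicted = -32768;
    state->predictedSample = (int16_t)predicted;

    /* Adapt the step size for the next code. */
    int32_t index = state->stepIndex + indexTable[code];
    if (index < 0)  index = 0;
    if (index > 88) index = 88;
    state->stepIndex = (int8_t)index;

    return state->predictedSample;
}

/* Hypothetical block decoder: one 4-bit code in, one sample out. */
static void ADPCM_DecodeBlock(const uint8_t *codes, int16_t *decodedData,
                              size_t numSamples, ADPCM_State *state)
{
    for (size_t i = 0; i < numSamples; i++) {
        decodedData[i] = ADPCM_DecodeSample(codes[i], state);
    }
}
```

Starting from a zeroed state, the codes 7 and 15 decode to 11 and -19. Samples of 100 and -100 produce exactly those two codes under the same tables, which shows both the step size ramping up from its minimum and why the output is only an approximation.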
Audio Playback for Analysis
In ADPCM encoding, each 16-bit sample is replaced by a compact 4-bit code to save storage space. The code does not store the sample’s amplitude; it stores a quantized version of the change relative to the predicted value. As a result, the encoded data occupies only a narrow range of values (in this case, 0 to 14) that reflects these smaller changes, not the original waveform’s amplitude.
When decoding, the ADPCM algorithm rebuilds the waveform step by step, updating its prediction of each original sample from the compressed codes. This is why the waveform of the decoded audio closely resembles the original while the encoded version looks simplified: the codes represent only the differential steps, not the actual audio amplitude.
ADPCM is a lossy compression method. During encoding, it approximates the changes in the audio waveform rather than preserving exact sample values. This approximation introduces small quantization errors, which show up once the data is decompressed back into its original format. As a result, the decoded audio closely resembles the original waveform but does not match it bit for bit.
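One simple way to put a number on this loss is to compare the original and decoded buffers sample by sample. The helper below is a hypothetical sketch, not part of the project; it reports the largest absolute difference between the two buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: largest absolute difference between the original
 * samples and the decoded samples. */
static int32_t MaxAbsError(const int16_t *original, const int16_t *decoded,
                           size_t numSamples)
{
    assert(original != NULL && decoded != NULL);
    int32_t maxError = 0;
    for (size_t i = 0; i < numSamples; i++) {
        /* Widen to 32 bits so the subtraction cannot overflow. */
        int32_t error = (int32_t)original[i] - (int32_t)decoded[i];
        if (error < 0) error = -error;
        if (error > maxError) maxError = error;
    }
    return maxError;
}
```

A result of 0 would mean a bit-exact round trip. With ADPCM the value is normally greater than zero, and it tends to be largest where the signal changes faster than the step size can adapt.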