Posts

Latest Article

Counting Cache Hits and Misses on an ARM Cortex-M33

Introduction: The Instruction Cache (ICACHE) on the STM32H5 is an 8 KB cache memory positioned between the ARM Cortex-M33 CPU and the MCU Bus Matrix. It connects to the Cortex-M33 via the C-AHB (Code) Bus and features two master ports: M1 (128-bit) and M2 (32-bit). The M1 port leads to a multiplexer that distributes access between internal Flash memory and SRAM (through the Bus Matrix), while the M2 port reaches the external memory controllers (OCTOSPI & FMC) via the Bus Matrix. The ICACHE is a 2-way set-associative cache with 16-byte cache lines, organized as 256 sets of two cache lines each (256 × 2 × 16 bytes = 8 KB). It employs Hit-Under-Miss and Critical-Word-First refill strategies and includes a remapping feature that enables caching of up to four external memory regions by aliasing their addresses. The cache supports a Direct-Mapped mode, but its default state is n-way associative. Our objective is to configure the ICACHE peripheral to monitor Hit-and-Miss counters, providin...
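The geometry quoted above can be checked with a little arithmetic. The following is a minimal host-side sketch, not the article's code: the function names are hypothetical, and the set-index mapping shown is the conventional one for set-associative caches, assumed here rather than taken from the reference manual. It also shows how a hit ratio could be derived from a pair of hit and miss counter readings.

```c
#include <stdint.h>

/* Cache parameters as stated in the excerpt. */
#define ICACHE_SIZE_BYTES 8192u
#define ICACHE_LINE_BYTES 16u
#define ICACHE_WAYS       2u

/* Number of sets = total size / (line size * ways). */
static uint32_t icache_num_sets(void)
{
    return ICACHE_SIZE_BYTES / (ICACHE_LINE_BYTES * ICACHE_WAYS);
}

/* Conventional set-index mapping (an assumption about the hardware):
 * drop the byte offset within the line, then wrap by the set count. */
static uint32_t icache_set_index(uint32_t addr)
{
    return (addr / ICACHE_LINE_BYTES) % icache_num_sets();
}

/* Hit ratio in whole percent from two counter readings.
 * Returns 0 when no accesses were counted. */
static uint32_t icache_hit_percent(uint32_t hits, uint32_t misses)
{
    uint64_t total = (uint64_t)hits + misses;
    if (total == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)hits * 100u) / total);
}
```

With these parameters, two addresses that are 4 KB apart (256 sets × 16 bytes) map to the same set, which is where the second way helps.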

Building a Software RNG v1.0 with Timers, LFSR, XOR Shift, and FNV Hash algorithms

Why a Software Random Number Generator? The STM32F401 microcontroller lacks a hardware Random Number Generator (RNG), making a Software RNG necessary for generating random numbers. While hardware RNGs provide higher-quality randomness, a Software RNG can still achieve essential functionality by generating unique cryptographic keys, mimicking real-world randomness, and introducing system unpredictability. How the Software RNG Generates Random Numbers: Execution begins by generating a hardware seed from the STM32 timers, which provide a source of entropy based on system behavior. Then, a Linear Feedback Shift Register (LFSR) is applied to the seed to add further randomness by shifting and modifying the seed value. The result of the LFSR operation is then combined with the hardware seed using the XOR (Exclusive OR) operation. Finally, the combined result is processed using the FNV Hash function, which generates a 32-bit random number. Extracting Hardware-based See...
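The seed, LFSR, XOR, and hash stages described above can be sketched on a host as follows. This is an illustration under stated assumptions, not the article's implementation: the timer seed is passed in as a plain parameter, the LFSR is a 32-bit Galois form with an assumed tap mask, the hash is assumed to be the 32-bit FNV-1a variant (the excerpt does not say which FNV variant is used), and all function names are hypothetical.

```c
#include <stdint.h>

/* One step of a 32-bit Galois LFSR. The tap mask is an assumed choice. */
static uint32_t lfsr_step(uint32_t state)
{
    uint32_t lsb = state & 1u;
    state >>= 1;
    if (lsb) {
        state ^= 0x80200003u;
    }
    return state;
}

/* 32-bit FNV-1a over the four bytes of a word, least significant first. */
static uint32_t fnv1a_32(uint32_t word)
{
    uint32_t hash = 2166136261u;            /* FNV offset basis */
    for (int i = 0; i < 4; i++) {
        hash ^= (word >> (8 * i)) & 0xFFu;  /* mix in one byte */
        hash *= 16777619u;                  /* FNV prime */
    }
    return hash;
}

/* Pipeline: seed -> LFSR -> XOR with the seed -> FNV hash.
 * hw_seed stands in for a value sampled from the timers. */
static uint32_t soft_rng_next(uint32_t hw_seed)
{
    /* An all-zero LFSR state never changes, so substitute 1. */
    uint32_t lfsr_out = lfsr_step(hw_seed ? hw_seed : 1u);
    return fnv1a_32(lfsr_out ^ hw_seed);
}
```

The pipeline is deterministic for a given seed, so any unpredictability comes entirely from how the timer value is sampled.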

Audio Processing Series Part IV : Encoding and Decoding audio data using ADPCM algorithm

Understanding ADPCM: Principles & Implementation ADPCM (Adaptive Differential Pulse Code Modulation) is an audio compression technique that focuses on encoding the difference between consecutive audio samples instead of their absolute values. By representing only the changes in audio data, ADPCM achieves significant data rate reductions. This project involves reading an audio file in blocks of 1024 samples, encoding it using ADPCM, and storing the ADPCM code in flash memory. Afterward, the ADPCM code will be retrieved from the flash memory, decoded, and the decompressed audio saved back to flash. Both the ADPCM code and the decompressed audio data will be written to the flash memory for analysis of audio quality. The flash memory has a capacity of 8 megabytes, and since the original uncompressed audio file is approximately 938 kilobytes, space will be allocated as follows: the original audio will be stored at the beginning of the flash memory, the ADPCM code will start at page 8192...
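To make the idea of coding differences concrete, here is a deliberately simplified adaptive delta coder. It is not the IMA ADPCM step table and not the article's codec; the step-adaptation rule, the initial step of 16, and all names are assumptions chosen for brevity. Each 16-bit sample becomes a 4-bit code (1 sign bit, 3 magnitude bits), and the encoder runs the decoder's update internally so both sides keep the same predictor.

```c
#include <stdint.h>

typedef struct {
    int32_t predictor;  /* last reconstructed sample */
    int32_t step;       /* current quantizer step size */
} adpcm_state_t;

static void adpcm_init(adpcm_state_t *st)
{
    st->predictor = 0;
    st->step = 16;      /* assumed starting step */
}

/* Reconstruct one sample from a 4-bit code and adapt the step. */
static int16_t adpcm_decode(adpcm_state_t *st, uint8_t code)
{
    int32_t mag = code & 0x7;
    int32_t delta = mag * st->step;

    st->predictor += (code & 0x8) ? -delta : delta;
    if (st->predictor > 32767)  st->predictor = 32767;
    if (st->predictor < -32768) st->predictor = -32768;

    /* Large codes grow the step, zero codes shrink it. */
    if (mag >= 4 && st->step < 4096) {
        st->step *= 2;
    } else if (mag == 0 && st->step > 1) {
        st->step /= 2;
    }
    return (int16_t)st->predictor;
}

/* Quantize the difference between the sample and the prediction. */
static uint8_t adpcm_encode(adpcm_state_t *st, int16_t sample)
{
    int32_t diff = (int32_t)sample - st->predictor;
    uint8_t sign = (diff < 0) ? 0x8u : 0x0u;
    int32_t mag = ((diff < 0) ? -diff : diff) / st->step;
    if (mag > 7) {
        mag = 7;
    }
    uint8_t code = (uint8_t)(sign | (uint8_t)mag);
    (void)adpcm_decode(st, code);  /* keep encoder state in sync */
    return code;
}
```

Two codes fit in one byte, so a block of 1024 16-bit samples (2048 bytes) would shrink to 512 bytes under this scheme.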

Audio Processing Series Part II : Storing Audio data on external Serial Flash Memory

Storing Audio data in an SPI Flash Memory: This is the second article of the Audio Processing Project on the STM32F411 Discovery Board. This article focuses on developing the SPI driver and the Library for the W25Q64FV 64M-bit Serial Flash Memory from Winbond. The W25Qxx series of chips are Serial NOR Flash memory devices with capacities ranging from 4 MB to 128 MB and utilize the Serial Peripheral Interface for communication with microcontrollers. Check out the STM32F411 Reference Manual (RM0383) and the Datasheet for the W25Q64FV 64M-bit SPI Flash Memory. Developing the SPI Driver: Since some of the SPI1 and SPI3 peripheral pins are already occupied by the JTAG interface on the STM32F411 Discovery Board, we will use the SPI2 peripheral for our communication with the external flash memory. Remember to enable the clock to the GPIO Port B and SPI2 peripheral in the RCC registers. For driving the NSS line Low and High to Select and Deselect the Slave device, we can write functions...
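The clock-enable and chip-select steps can be sketched as below. To keep the example runnable on a host, the registers are modelled as plain structs instead of the real memory-mapped peripherals. The struct and function names are hypothetical, and the use of PB12 as the NSS pin is an assumption, since the excerpt does not name the pin. The bit positions (GPIOB enable at bit 1 of AHB1ENR, SPI2 enable at bit 14 of APB1ENR) follow the STM32F411 register layout.

```c
#include <stdint.h>

/* Host-side stand-ins for the RCC and GPIO register blocks. */
typedef struct {
    volatile uint32_t AHB1ENR;
    volatile uint32_t APB1ENR;
} rcc_regs_t;

typedef struct {
    volatile uint32_t BSRR;   /* bits 0-15 set a pin, bits 16-31 reset it */
} gpio_regs_t;

#define GPIOB_CLK_EN_BIT (1u << 1)    /* RCC AHB1ENR, GPIOB enable */
#define SPI2_CLK_EN_BIT  (1u << 14)   /* RCC APB1ENR, SPI2 enable */
#define FLASH_NSS_PIN    12u          /* assumed: PB12 drives NSS */

/* Enable the clocks for GPIO Port B and the SPI2 peripheral. */
static void flash_spi_clock_enable(rcc_regs_t *rcc)
{
    rcc->AHB1ENR |= GPIOB_CLK_EN_BIT;
    rcc->APB1ENR |= SPI2_CLK_EN_BIT;
}

/* Drive NSS low to select the flash (write to the reset half of BSRR). */
static void flash_select(gpio_regs_t *gpio)
{
    gpio->BSRR = 1u << (FLASH_NSS_PIN + 16u);
}

/* Drive NSS high to deselect the flash (write to the set half of BSRR). */
static void flash_deselect(gpio_regs_t *gpio)
{
    gpio->BSRR = 1u << FLASH_NSS_PIN;
}
```

On the target, the same functions would be pointed at the real RCC and GPIOB register blocks; using BSRR avoids a read-modify-write on the output data register.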

Audio Processing Series Part III : Designing an Echo effect

Echo Algorithm & Parameter Tuning The echo effect is a widely used audio processing technique that mimics sound reflections in different environments. Fundamentally, the echo algorithm works by capturing an audio input, introducing a delay to the signal, and blending it back with the original sound. This method requires fine-tuning parameters like delay time, feedback, and amplitude, enabling the generation of multiple repetitions that can differ in intensity and timing. Delay time specifies how long it takes for the echoed sound to return, while feedback determines the amount of the output signal that is redirected back into the input. The delay time is set to 300 milliseconds, determining the duration between the original signal and its echoed repetition. This value creates a noticeable gap that allows listeners to perceive the echo distinctly. The feedback level is configured to 0.7, which indicates that 70% of the output signal is fed back into the input.  Reading Audio Dat...
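The delay-and-feedback structure described above corresponds to a feedback comb filter, y[n] = x[n] + feedback × y[n − D]. The sketch below is one possible way to express it, not the article's code: it uses floating-point samples, a caller-supplied circular buffer, and hypothetical names, and it leaves out output scaling and clipping. The sample rate is not given in the excerpt, so it is a parameter here.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    float  *buf;       /* circular buffer holding the last `len` outputs */
    size_t  len;       /* delay length in samples */
    size_t  pos;       /* index of the oldest stored output */
    float   feedback;  /* fraction of the output fed back, e.g. 0.7 */
} echo_t;

/* Convert a delay in milliseconds to a sample count. */
static size_t echo_delay_samples(uint32_t delay_ms, uint32_t sample_rate_hz)
{
    return (size_t)(((uint64_t)delay_ms * sample_rate_hz) / 1000u);
}

/* Bind a buffer of `len` samples (len must be >= 1) and clear it. */
static void echo_init(echo_t *e, float *buf, size_t len, float feedback)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] = 0.0f;
    }
    e->buf = buf;
    e->len = len;
    e->pos = 0;
    e->feedback = feedback;
}

/* Process one sample: add the delayed output, then store the new output. */
static float echo_process(echo_t *e, float in)
{
    float delayed = e->buf[e->pos];
    float out = in + e->feedback * delayed;
    e->buf[e->pos] = out;
    e->pos = (e->pos + 1u) % e->len;
    return out;
}
```

For example, at an assumed 16 kHz sample rate, a 300 ms delay needs a buffer of 4800 samples. With a feedback of 0.7, each repetition is 70% as loud as the previous one, so the echoes decay rather than build up.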