Posts

Latest Article

Keys That Stick to the Chip: Device‑Specific Root Key & Flash Binding

Why I Needed Both Tricks

The main purpose of deriving a device-specific root key and binding the external Flash to the microcontroller is to close off a major attack vector: direct access to the key storage. The external Flash holds sensitive key material, and it can be physically removed from the PCB and read with tools like USB programmers, so it becomes a weak link if left unprotected. Encrypting all data in Flash with a key that’s tied specifically to the MCU makes any dumped contents meaningless outside that device. Of course, this only holds if Initialization Vectors (IVs) are not reused; we’ll get into that risk shortly. Normally, this kind of protection is handled using a Hardware Unique Key (HUK), but since that wasn't available, I had to build my own mechanism for device binding. The STM32H563ZI used on the Nucleo-H563ZI development board doesn’t support a HUK. That feature is only available on certain STM32H5 series chips like the STM32H573 or...

Keys, Chips, and USB: The Story Behind TrustX

Why I Built This

I've always been curious about how cryptography works on real hardware, not just in code, but on actual devices that store keys and do encryption securely. I’d seen examples of software-based cryptography, but I wanted to build something more hands-on: a device that does cryptographic work on its own, without relying on a PC for any of it. That’s where TrustX started. I wanted to build my own simple Hardware Security Module using just a microcontroller, an STM32H5 in my case, and see how far I could go. The goal wasn’t to build a commercial or certified HSM, but something I could learn from, something that handles keys securely, performs cryptographic operations, and responds to tamper events, all in hardware.

What Does This Device Actually Do?

TrustX isn’t a full-scale enterprise HSM; it’s more like a secure, USB-connected crypto helper. The host PC sends commands, and the device takes care of the actual processing. It can: Encrypt and Decrypt data using AES-128 in C...

Counting Cache Hits and Misses on an ARM Cortex-M33

Introduction

The Instruction Cache (ICACHE) on the STM32H5 is an 8KB cache memory positioned between the ARM Cortex-M33 CPU and the MCU Bus Matrix. It connects to the Cortex-M33 via the C-AHB (Code) Bus and features two master ports: M1 (128-bit) and M2 (32-bit). The M1 port leads to a multiplexer that distributes access between internal Flash memory and SRAM (through the Bus Matrix), while the M2 port interfaces with the external memory controllers (OCTOSPI & FSMC) via the Bus Matrix. This ICACHE is a 2-way set-associative cache with 16-byte cache lines, organized as 256 sets of two cache lines each (256 × 2 × 16 bytes = 8KB). It employs Hit-Under-Miss and Critical-Word-First refill strategies and includes a remapping feature that enables caching of up to four external memory regions by aliasing their addresses. The cache supports a Direct-Mapped mode, but its default state is n-way set-associative. Our objective is to configure the ICACHE peripheral to monitor Hit-and-Miss counters, providin...
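
The geometry quoted in this excerpt can be checked with a little arithmetic. The sketch below is only an illustration in plain C: the helper names are hypothetical, the set-index formula is the textbook one for a set-associative cache (assumed here, not taken from the reference manual), and the hit-ratio helper simply takes two counter values as arguments rather than reading any ICACHE register.

```c
#include <stdint.h>

/* Geometry taken from the excerpt: 8 KB, 2 ways, 16-byte lines. */
#define ICACHE_SIZE_BYTES 8192u
#define ICACHE_WAYS       2u
#define ICACHE_LINE_BYTES 16u

/* Number of sets = cache size / (ways * line size). */
static uint32_t icache_num_sets(void)
{
    return ICACHE_SIZE_BYTES / (ICACHE_WAYS * ICACHE_LINE_BYTES);
}

/* Assumed textbook indexing: line number modulo the set count. */
static uint32_t icache_set_index(uint32_t addr)
{
    return (addr / ICACHE_LINE_BYTES) % icache_num_sets();
}

/* Hit ratio in percent from two counter values (hypothetical helper).
 * Returns 0 when no accesses were counted, to avoid dividing by zero. */
static uint32_t icache_hit_percent(uint32_t hits, uint32_t misses)
{
    uint64_t total = (uint64_t)hits + misses;
    return total ? (uint32_t)(((uint64_t)hits * 100u) / total) : 0u;
}
```

Under these assumptions, two addresses that are 4KB apart (256 sets × 16 bytes) fall into the same set, which is where the second way of each set becomes useful.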

Building a Software RNG v1.0 with Timers, LFSR, XOR Shift, and FNV Hash algorithms

Why a Software Random Number Generator?

The STM32F401 microcontroller lacks a hardware Random Number Generator (RNG), making a Software RNG necessary for generating random numbers. While hardware RNGs provide higher-quality randomness, a Software RNG can still achieve essential functionality by generating unique cryptographic keys, mimicking real-world randomness, and introducing system unpredictability.

How the Software RNG Generates Random Numbers

The Software RNG begins by generating a hardware seed using the STM32 timers, which provide a source of entropy based on system behavior. Then, a Linear Feedback Shift Register (LFSR) is applied to the seed, shifting and modifying the seed value to scramble its bits. The result of the LFSR operation is then combined with the hardware seed using the XOR (Exclusive OR) operation. Finally, the combined result is processed using the FNV Hash function, which generates a 32-bit random number.

Extracting Hardware-based See...
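
The LFSR, XOR, and FNV stages described in this excerpt can be sketched in a few lines of C. This is a minimal illustration, not the post's actual implementation: the function names are hypothetical, the LFSR is assumed to be a 32-bit Galois register with the commonly cited tap set (32, 22, 2, 1), the hash is assumed to be the FNV-1a variant, and the timer-derived seed is passed in as a plain argument instead of being read from hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of a 32-bit Galois LFSR (assumed tap mask 0x80200003,
 * i.e. taps 32, 22, 2, 1). */
static uint32_t lfsr_step(uint32_t state)
{
    uint32_t lsb = state & 1u;
    state >>= 1;
    if (lsb) {
        state ^= 0x80200003u;
    }
    return state;
}

/* 32-bit FNV-1a over a byte buffer (assumed variant). */
static uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t hash = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;                /* FNV prime */
    }
    return hash;
}

/* Seed -> LFSR -> XOR with the seed -> FNV hash -> 32-bit output.
 * hw_seed stands in for the timer-derived seed from the post. */
static uint32_t soft_rng_from_seed(uint32_t hw_seed)
{
    uint32_t mixed = lfsr_step(hw_seed) ^ hw_seed;
    uint8_t bytes[4] = {
        (uint8_t)(mixed & 0xFFu),
        (uint8_t)((mixed >> 8) & 0xFFu),
        (uint8_t)((mixed >> 16) & 0xFFu),
        (uint8_t)((mixed >> 24) & 0xFFu)
    };
    return fnv1a32(bytes, sizeof bytes);
}
```

The pipeline is deterministic for a given seed, so any unpredictability in the output comes entirely from how the seed is sampled from the timers.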

Audio Processing Series Part IV : Encoding and Decoding audio data using ADPCM algorithm

Understanding ADPCM: Principles & Implementation

ADPCM (Adaptive Differential Pulse Code Modulation) is an audio compression technique that focuses on encoding the difference between consecutive audio samples instead of their absolute values. By representing only the changes in audio data, ADPCM achieves significant data rate reductions. This project involves reading an audio file in blocks of 1024 samples, encoding it using ADPCM, and storing the ADPCM code in flash memory. Afterward, the ADPCM code will be retrieved from the flash memory, decoded, and the decompressed audio saved back to flash. Both the ADPCM code and the decompressed audio data will be written to the flash memory for analysis of audio quality. The flash memory has a capacity of 8 megabytes, and since the original uncompressed audio file is approximately 938 kilobytes, space will be allocated as follows: the original audio will be stored at the beginning of the flash memory, the ADPCM code will start at page 8192...