Designing a Software-Based Wear Leveling Subsystem for W25Q64FV Serial Flash Memory

NOR Flash Memories & the Need for Wear Leveling

NOR Flash is a type of non-volatile memory used to store data that must persist when a device is powered off. It is typically organized into sectors or blocks, which can be individually erased and reprogrammed. Due to the physics of the flash cell, every Program/Erase cycle (P/E cycle) wears the cell slightly, so flash memory has a finite usable life: once a block exceeds its rated number of cycles, its storage becomes unreliable. Wear leveling is a common technique for extending the life of the storage medium by spreading P/E cycles evenly across it.

Programming can only change a flash cell from 1 to 0. To return a cell to 1, it has to be erased, and erasure works on a whole sector or block at a time. Updating an already programmed sector/block therefore means erasing it first and then reprogramming it, hence the P/E cycle. Depending on how often an application updates the flash content, frequent P/E cycles can occur, “wearing out” the flash memory cells.


Most Winbond Flash Memory products are rated for a minimum of 100,000 program/erase (P/E) cycles. In a scenario where constant updates are necessary, such as an application that rewrites a 32 KB log file in the same block every 30 minutes, the expected lifespan can be calculated as follows:

( 100,000 cycles * 1 block )/( 2 updates/hour * 24 hours/day ) = 2083 days ( ~5.7 years)

However, if the application updates the log every 5 minutes, the lifespan would be significantly shorter:

( 100,000 cycles * 1 block )/( 12 updates/hour * 24 hours/day ) = 347 days ( < 1 year)
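
The two estimates above can be reproduced with a small helper. This is only a sketch of the arithmetic; the function name and parameters are illustrative and do not come from the original source code.

```c
#include <stdint.h>

/* Days until the rated P/E limit is reached.
 * rated_cycles    : endurance per block (100,000 in the text above)
 * blocks          : number of blocks the updates are spread over
 * updates_per_day : how often the data is rewritten
 * Integer division truncates, matching the rounded figures above. */
static uint32_t lifespan_days(uint32_t rated_cycles, uint32_t blocks,
                              uint32_t updates_per_day)
{
    if (updates_per_day == 0) {
        return UINT32_MAX; /* never rewritten: no P/E wear to account for */
    }
    return (uint32_t)(((uint64_t)rated_cycles * blocks) / updates_per_day);
}
```

For example, lifespan_days(100000, 1, 2 * 24) evaluates to 2083 and lifespan_days(100000, 1, 12 * 24) to 347. Passing blocks = 128 models the ideal case in which updates rotate evenly over every 64KB block, which is an upper bound rather than a measured figure.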

Without wear leveling, the block that holds the log reaches its rated endurance long before the rest of the array, and from that point on it may begin to experience data retention issues.

Memory Structure & Security Register Region of the W25Q64FV

The W25Q64FV array is organized into 32,768 programmable pages of 256 bytes each. Up to 256 bytes can be programmed at a time. Pages can be erased in groups of 16 (4KB sector erase), groups of 128 (32KB block erase), groups of 256 (64KB block erase) or the entire chip (chip erase). This gives the W25Q64FV 2,048 erasable 4KB sectors and 128 erasable 64KB blocks.
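
The geometry described above maps directly onto a few constants and index helpers. The macro and function names below are made up for illustration; only the sizes come from the text.

```c
#include <stdint.h>

#define W25Q64_PAGE_SIZE     256u                  /* bytes per page   */
#define W25Q64_SECTOR_SIZE   4096u                 /* 16 pages         */
#define W25Q64_BLOCK32_SIZE  32768u                /* 128 pages        */
#define W25Q64_BLOCK64_SIZE  65536u                /* 256 pages        */
#define W25Q64_CAPACITY      (8u * 1024u * 1024u)  /* 64 Mbit = 8 MB   */

/* Index of the page, 4KB sector, or 64KB block containing a byte address. */
static uint32_t page_of(uint32_t addr)   { return addr / W25Q64_PAGE_SIZE; }
static uint32_t sector_of(uint32_t addr) { return addr / W25Q64_SECTOR_SIZE; }
static uint32_t block_of(uint32_t addr)  { return addr / W25Q64_BLOCK64_SIZE; }

/* First byte address of a 64KB block, e.g. the address to pass to a block erase. */
static uint32_t block_base(uint32_t block) { return block * W25Q64_BLOCK64_SIZE; }
```

Dividing W25Q64_CAPACITY by the page, sector and 64KB block sizes gives back the 32,768 pages, 2,048 sectors and 128 blocks quoted above.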

The W25Q64FV offers three 256-byte Security Registers which can be erased and programmed individually. Between one and 256 bytes of security register data can be programmed at a time into previously erased (FFh) locations, and one or more data bytes can be read sequentially from any of the three registers.

The Security Register Lock Bits (LB3, LB2, LB1) in the W25Q series are non-volatile One-Time Programmable (OTP) bits located in the Status Register. They control the write protection and status of the Security Registers. By default, the Lock Bits are set to 0, meaning the Security Registers remain unlocked. These bits can be individually set to 1 using the Write Status Register instruction, after which the corresponding 256-byte Security Register becomes permanently read-only. For our application, we will not modify the Lock Bits.
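
Even though the Lock Bits are left untouched here, it can be useful to check them before relying on the Security Registers for metadata. The sketch below assumes that LB1, LB2 and LB3 sit at bits 3, 4 and 5 of Status Register-2 (S11 to S13), which is how I read the datasheet; verify the positions against your part before using it. The function name is hypothetical, and fetching the register value over SPI is left out.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed bit positions within Status Register-2 (S11..S13 -> bits 3..5). */
#define SR2_LB1 (1u << 3)
#define SR2_LB2 (1u << 4)
#define SR2_LB3 (1u << 5)

/* Returns true if Security Register `reg` (1..3) is permanently locked,
 * given the byte read back from Status Register-2. */
static bool security_register_locked(uint8_t sr2, unsigned reg)
{
    if (reg < 1u || reg > 3u) {
        return true; /* unknown register: treat as not writable */
    }
    return (sr2 & (1u << (2u + reg))) != 0u;
}
```

With the factory default of all Lock Bits at 0, the function returns false for registers 1 to 3, meaning all three can still be erased and programmed.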

Working of Wear Leveling Algorithm

Since the W25Q is rated for at least 100,000 P/E cycles, we can use an array of 32-bit unsigned integers to store the Erase Count of each of the 128 Memory Blocks, and an array of 8-bit unsigned integers to store the Logical-to-Physical Block Map of the flash memory. The Erase Counts take 128 x 4 = 512 bytes and the Block Map 128 bytes, 640 bytes in total, which fits in the 768 bytes provided by the three Security Registers. When the application runs for the first time, it erases the 3 Security Registers as a one-time operation so that they can hold the Erase Counts and the Block Map.
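
A minimal sketch of the two metadata arrays, with a compile-time check that they fit in the Security Registers. The type and function names are illustrative, and the initial state (all counts zero, every logical block mapped to the physical block of the same number) is an assumption rather than something taken from the original code.

```c
#include <stdint.h>

#define WL_NUM_BLOCKS  128u  /* 64KB blocks in the W25Q64FV        */
#define SECREG_SIZE    256u  /* bytes per Security Register        */
#define SECREG_COUNT   3u    /* Security Registers on the device   */

typedef struct {
    uint32_t erase_count[WL_NUM_BLOCKS]; /* P/E cycles per physical block    */
    uint8_t  block_map[WL_NUM_BLOCKS];   /* logical index -> physical block  */
} wl_metadata_t;

/* 512 + 128 = 640 bytes must fit in 3 x 256 = 768 bytes. */
_Static_assert(sizeof(wl_metadata_t) <= SECREG_SIZE * SECREG_COUNT,
               "wear-leveling metadata does not fit in the Security Registers");

/* Assumed first-run state: no erases recorded, identity block map. */
static void wl_metadata_init(wl_metadata_t *m)
{
    for (uint32_t i = 0; i < WL_NUM_BLOCKS; i++) {
        m->erase_count[i] = 0u;
        m->block_map[i]   = (uint8_t)i;
    }
}
```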









Then, the application reads these Security Registers to build a working copy of the flash memory's metadata in RAM.
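
Building the working copy amounts to decoding the raw register contents. The sketch below assumes the three registers have already been read into one 768-byte buffer, that the Erase Counts are stored little-endian in the first 512 bytes, and that the Block Map follows in the next 128 bytes. This layout and the function names are assumptions for illustration; the SPI read itself is not shown.

```c
#include <stdint.h>
#include <stddef.h>

#define WL_NUM_BLOCKS 128u
#define SECREG_BYTES  768u  /* three 256-byte Security Registers, concatenated */

/* Decode one little-endian 32-bit value from a byte buffer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Fill the working copy from the raw register image (assumed layout:
 * bytes 0..511 erase counts, bytes 512..639 block map). */
static void wl_load(const uint8_t raw[SECREG_BYTES],
                    uint32_t erase_count[WL_NUM_BLOCKS],
                    uint8_t block_map[WL_NUM_BLOCKS])
{
    for (size_t i = 0; i < WL_NUM_BLOCKS; i++) {
        erase_count[i] = load_le32(&raw[i * 4u]);
        block_map[i]   = raw[512u + i];
    }
}
```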






When the application needs to write a chunk of data to a specific memory block in the flash, the algorithm first checks the erase count of that block. It then scans the Erase Count section of the metadata to find the block with the lowest count. If a less-worn block is found, the write is redirected to it; if no better option is found, the application writes to the specified block. It then increments the Erase Count of the block that was written and updates the Block Map in the working copy of the metadata.
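
The selection step can be sketched as follows. This is one possible reading of the description, not the original implementation: it redirects the write only when some block has a strictly lower erase count, breaks ties in favor of the lowest block number, and ignores whether the chosen block already holds another logical block's data. All names are hypothetical.

```c
#include <stdint.h>

#define WL_NUM_BLOCKS 128u
#define WL_INVALID    0xFFu  /* returned for out-of-range input */

typedef struct {
    uint32_t erase_count[WL_NUM_BLOCKS]; /* P/E cycles per physical block   */
    uint8_t  block_map[WL_NUM_BLOCKS];   /* logical index -> physical block */
} wl_metadata_t;

/* Physical block with the lowest erase count (lowest index wins ties). */
static uint8_t wl_least_worn(const wl_metadata_t *m)
{
    uint8_t best = 0;
    for (uint8_t i = 1; i < WL_NUM_BLOCKS; i++) {
        if (m->erase_count[i] < m->erase_count[best]) {
            best = i;
        }
    }
    return best;
}

/* Pick the physical block for a write to `logical`, then record the
 * erase and the new mapping in the working copy. */
static uint8_t wl_select_block(wl_metadata_t *m, uint8_t logical)
{
    if (logical >= WL_NUM_BLOCKS) {
        return WL_INVALID;
    }
    uint8_t current = m->block_map[logical];
    if (current >= WL_NUM_BLOCKS) {
        return WL_INVALID; /* corrupt or still-erased map entry */
    }
    uint8_t best   = wl_least_worn(m);
    uint8_t target = (m->erase_count[best] < m->erase_count[current])
                         ? best : current;
    m->erase_count[target]++;
    m->block_map[logical] = target;
    return target;
}
```

The caller would erase and program the returned physical block. A complete implementation also has to make sure that the target block does not contain live data belonging to another logical block.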











It also writes the updated Erase Count and Block Map to the Security Registers so that the current values survive a power cycle. These updates are printed on the serial console for real-time viewing.









The working copy of the Erase Count and Block Map arrays is not written back to the Security Registers in full on every data write. Only the entry for the block being modified changes, so only that entry is updated in the arrays stored in flash, which keeps the metadata update short.
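
To update a single entry, the application needs the address of that entry inside the Security Registers. The sketch below assumes the registers start at 0x001000, 0x002000 and 0x003000 (my reading of the datasheet, worth double-checking), that the 128 Erase Counts fill registers 1 and 2 with 64 four-byte entries each, and that the Block Map lives in register 3. The layout and names are illustrative.

```c
#include <stdint.h>

#define SECREG_SIZE     256u
#define COUNTS_PER_REG  (SECREG_SIZE / 4u)  /* 64 four-byte counters per register */

/* Assumed base address of Security Register `reg` (1..3). */
static uint32_t secreg_base(uint32_t reg)
{
    return reg << 12; /* 0x001000, 0x002000, 0x003000 */
}

/* Address of the 4-byte Erase Count of physical block `block` (0..127). */
static uint32_t erase_count_addr(uint8_t block)
{
    uint32_t reg    = 1u + block / COUNTS_PER_REG;
    uint32_t offset = (block % COUNTS_PER_REG) * 4u;
    return secreg_base(reg) + offset;
}

/* Address of the 1-byte Block Map entry of logical block `logical` (0..127). */
static uint32_t block_map_addr(uint8_t logical)
{
    return secreg_base(3u) + logical;
}
```

Keep in mind that programming can only clear bits, so writing a new value over an old one in place may first require erasing the register that holds it.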










Custom Configuration & Memory Usage Enhancement

This is not a generic library: it requires customization, from GPIO pin mapping to memory management, to work with other devices. The serial console is configured on the VCOM port of the Nucleo Development Board and operates at a baud rate of 115200. The W25Q library does not use DMA for data transfer; it blocks CPU execution during certain operations and does not poll the BUSY bit in the Status Register.
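
For reference, polling the BUSY bit could look like the sketch below. It is not part of the library described here. BUSY is bit 0 of Status Register-1, which is read with instruction 05h; the SPI transaction is abstracted behind a function pointer, and the fake reader exists only so the loop can be exercised on a host machine. All names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define W25Q_CMD_READ_SR1  0x05u  /* Read Status Register-1 (sent by the real reader) */
#define W25Q_SR1_BUSY      0x01u  /* bit 0: erase/program/write in progress */

/* Callback that performs the SPI transaction and returns Status Register-1. */
typedef uint8_t (*read_sr1_fn)(void);

/* Poll until BUSY clears or max_polls reads have been made.
 * Returns true if the device became ready, false on timeout. */
static bool w25q_wait_ready(read_sr1_fn read_sr1, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((read_sr1() & W25Q_SR1_BUSY) == 0u) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-in for the SPI read: reports BUSY for the next
 * fake_busy_polls calls, then reports ready. */
static uint32_t fake_busy_polls;

static uint8_t fake_read_sr1(void)
{
    if (fake_busy_polls > 0u) {
        fake_busy_polls--;
        return W25Q_SR1_BUSY;
    }
    return 0u;
}
```

On the target, the callback would send 05h over SPI and return the byte clocked back, and the poll limit would be derived from the erase and program times in the datasheet.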

The algorithm uses a 32-bit integer to store each erase count, which is more than a limit of 100,000 cycles requires and wastes metadata space. Reducing the counter to 24 bits, which can still count up to 16,777,215, frees 128 bytes that can hold an additional section tracking which memory blocks contain valid data. Although C has no native 24-bit integer type, a 3-byte array can be used to handle these values manually.
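
A 24-bit counter stored in a 3-byte array can be handled with two small helpers. This is a sketch with made-up names and a little-endian byte order chosen for illustration.

```c
#include <stdint.h>

#define U24_MAX 0xFFFFFFu  /* largest value a 24-bit counter can hold */

/* Store the low 24 bits of v into dst, least significant byte first.
 * Bits above bit 23 are discarded. */
static void put_u24(uint8_t dst[3], uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFFu);
    dst[1] = (uint8_t)((v >> 8) & 0xFFu);
    dst[2] = (uint8_t)((v >> 16) & 0xFFu);
}

/* Rebuild the 24-bit value from its 3-byte representation. */
static uint32_t get_u24(const uint8_t src[3])
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16);
}
```

With this encoding the 128 counters occupy 128 x 3 = 384 bytes instead of 512.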

This algorithm is ideal for applications where configuration or log files are frequently updated, as it evenly distributes writes across the memory to prevent wear on individual blocks. Additionally, the flash memory retains previously written data until it is explicitly overwritten by the algorithm.

Check out my GitHub repository for the complete source code of this algorithm.
