Design Considerations for Implementing NAND Flash in IoT Applications
IDC, Gartner, Cisco and others estimate that the number of connected devices will grow exponentially from an estimated 5 billion today to 20 to 35 billion in five years. The market is widespread and includes consumer applications (connected homes, security systems), health care, commercial/industrial applications, and automotive applications. As the world gets increasingly connected, IoT will also generate a tremendous amount of data. Trying to transfer these data to centralized data warehouses for data analysis will potentially cause networks to collapse as single streams of data turn into a flood. And this is where NAND flash comes in.
Benefits of NAND Flash
NAND flash, with the unique properties of being non-volatile, cost-effective, and durable, is already enabling billions of devices, storing operating systems, apps, and user data. And as the bandwidth demands continue to grow with the IoT, NAND flash will play an even bigger role in these client devices, maximizing bandwidth utilization and enabling real-time analysis right at the edge.
As NAND becomes more ubiquitous and finds its way into a growing number of products as a managed storage solution, we’d like to share some useful information for developers looking to implement the latest NAND-based solutions in their designs, whether they are SD and microSD cards or embedded devices such as e.MMC.
Before starting a design with any managed storage device, a good understanding of the device’s interface, timing, and protocol will go a long way toward making the design cycle easier and faster. Engineers typically run into issues during a design cycle, and having the full specifications at hand always helps in overcoming them and moving forward with the design. The full specifications for e.MMC are controlled and published by JEDEC. The latest specification is JESD84-B51, which can be downloaded from the link below. Membership is required, but it is free.
Likewise, the full specification for microSD and SD cards is controlled and published by the SD Association (SDA). The latest is Physical Layer Specification Version 5.00, which can be downloaded from the link below. The SDA does require a membership fee to download.
The primary differences that users will see between devices relate to performance and “wear” life. Multiple types of NAND flash memory may be used inside the device, for example 2-bit (MLC) and 3-bit (TLC) per cell flash memory. Many devices may also be configured with SLC areas or caches, and multiple memory devices or “planes” of memory may be accessed in parallel for higher performance. Latencies for read or write operations will differ between devices and need to be taken into consideration on the host side. For example, if you are continuously recording a data stream, you need to buffer enough of that stream in host RAM to cover the worst-case write latency of the managed NAND device.
For SD cards, “Speed Class” refers to the “minimum” sustained write speed under a specific SDA-defined write pattern; a detailed description of this write pattern is available to SDA members in the SD specifications. If your write pattern differs significantly from the SDA spec, you may observe significantly different performance. Maximum write speeds are also specified in datasheets for SD cards and e.MMC devices, but these also depend on the host-side implementation.
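The host-side buffering requirement described above comes down to simple arithmetic: stream rate multiplied by the worst-case stall. The following is a minimal sketch; the function name, the safety margin, and the example numbers are illustrative assumptions, not values from any datasheet.

```python
import math

def min_buffer_bytes(stream_bytes_per_s, worst_case_latency_s, margin=1.5):
    """RAM needed to hold the incoming stream while the device stalls on a write.

    margin is a hypothetical safety factor; pick one that suits your design.
    """
    return math.ceil(stream_bytes_per_s * worst_case_latency_s * margin)

# Hypothetical example: a 2 MB/s stream and a 250 ms worst-case write latency.
print(min_buffer_bytes(2_000_000, 0.250))  # 750000
```

The worst-case latency should come from the device datasheet or from your own measurements, not from typical figures.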
Write Amplification and Write Endurance
Another key consideration is “write amplification”. When the host writes small chunks of data that are random and unaligned, internal operations inside the managed NAND device cause additional writes to the memory (management overhead). Large sequential writes aligned to page boundaries typically result in the least amplification (in a well-designed system, the amplification factor could be ~1.1). Small random writes typically result in the most amplification (in a bad design, the factor could be >2.0). This becomes important when considering the life of the memory device. See the example below:
Example of Write Amplification Impact
|Write Endurance||2K Program/Erase cycles|
|Data Written Per Day to Device||2GB|
|Expected Life w/ WA=1
|Expected Life w/ WA=5
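The relationship behind this kind of table can be sketched as follows. The device capacity is not stated above, so the 8 GB used here is purely an assumption for illustration, and the results are not the original table's values. The sketch also assumes ideal wear leveling, so that every block reaches its rated cycle count at the same time.

```python
def expected_life_years(capacity_gb, pe_cycles, daily_writes_gb, write_amplification):
    """Estimate device life from rated endurance and the daily write load."""
    # Total data the NAND can absorb before reaching its rated P/E cycles.
    total_nand_writes_gb = capacity_gb * pe_cycles
    # Host writes are inflated by the write amplification factor.
    days = total_nand_writes_gb / (daily_writes_gb * write_amplification)
    return days / 365.0

# Assumed 8 GB device, 2K P/E cycles, 2 GB written per day.
for wa in (1, 5):
    print(f"WA={wa}: {expected_life_years(8, 2000, 2, wa):.1f} years")
# WA=1: 21.9 years
# WA=5: 4.4 years
```

Under these assumptions, a fivefold increase in write amplification cuts the expected life by the same factor.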
To estimate the write amplification for any particular device, and hence its expected life, you can capture a log file of your writes. In some cases you may find that a TLC product rated at a fraction of the write life will be sufficient in your design and be a way to reduce cost. Because of SLC caching and other management techniques, vendors are moving away from a pure cycle rating to a terabytes written (TBW) rating on newer devices, which can better reflect the device’s true capability.
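As a rough first pass over such a write log, the sketch below applies a deliberately simplified model in which every page touched by a host write is programmed in full. The 16 KB page size and the (offset, length) log format are assumptions. A real controller also performs garbage collection, caching, and write coalescing, so treat the result as a way to compare access patterns rather than a prediction for a specific device.

```python
def estimate_write_amplification(writes, page_size=16 * 1024):
    """writes: iterable of (offset_bytes, length_bytes) taken from a host write log.

    Simplified model: each page a write touches is counted as fully programmed.
    """
    host_bytes = 0
    nand_bytes = 0
    for offset, length in writes:
        if length <= 0:
            continue  # ignore empty or malformed log entries
        first_page = offset // page_size
        last_page = (offset + length - 1) // page_size
        host_bytes += length
        nand_bytes += (last_page - first_page + 1) * page_size
    return nand_bytes / host_bytes if host_bytes else 0.0

# One large aligned sequential write versus one small unaligned write.
print(estimate_write_amplification([(0, 1024 * 1024)]))  # 1.0
print(estimate_write_amplification([(1000, 4096)]))      # 4.0
```

Your vendor's applications engineering team can tell you the actual page and block sizes, and how closely a model like this tracks the device's behavior.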
As memory cells get closer together, reading one cell can disturb the data in surrounding cells. If the same cell is read enough times, data from surrounding cells may no longer be readable, resulting in a UECC (uncorrectable ECC) error. For many applications, this will not be a critical consideration, since writing to the device triggers wear leveling, which moves the least-used data by re-writing it somewhere else in memory, thus “refreshing” the cells. However, in applications that rarely write data to the device but constantly read the same areas, host-side refresh techniques may be required to ensure you do not reach the read-endurance limits. If this is a concern, you should discuss your access patterns with your memory vendor’s applications engineering team to better understand this phenomenon. The simplest refresh technique is to just cycle through the memory periodically and read and rewrite each page of data.
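The simple read-and-rewrite refresh can be sketched as below. This is an illustration only: `dev` is assumed to be any seekable binary file-like object (an in-memory buffer in the example), and the chunk size is arbitrary. Whether rewriting identical data actually relocates it inside a managed NAND device depends on the controller, so confirm the approach with your vendor before relying on it.

```python
import io

def refresh_region(dev, start, length, chunk_size=512 * 1024):
    """Read and rewrite [start, start + length) in place, one chunk at a time.

    Returns the number of bytes actually refreshed.
    """
    refreshed = 0
    pos = start
    end = start + length
    while pos < end:
        n = min(chunk_size, end - pos)
        dev.seek(pos)
        data = dev.read(n)
        if not data:
            break  # reached the end of the device or file
        dev.seek(pos)
        dev.write(data)  # write the same bytes back to the same location
        pos += len(data)
        refreshed += len(data)
    dev.flush()
    return refreshed

# Example on an in-memory buffer standing in for the storage device.
buf = io.BytesIO(bytes(range(256)) * 10)
print(refresh_region(buf, 0, 2560, chunk_size=100))  # 2560
```

A production version would also need to cope with power loss in the middle of a rewrite, which this sketch does not address.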
Data retention changes with temperature and cycle life of the memory device. JEDEC specifies 10 years of data retention at 55°C for a fresh device (typically that means it has not been erased or programmed more than 10% of its rated cycle life). Data retention is not linear with temperature. The first step in understanding whether data retention is a consideration for the device is to create a worst-case daily temperature profile for your application. The memory vendor should then be able to help you calculate expected life. Life can be easily extended by “refreshing the data”, i.e. re-writing the data back to a different memory location, and in some cases it may be recommended that the host trigger such a function periodically. Again, if you are writing enough data to the device, the standard wear leveling mechanisms should take care of this for you, so it is of most concern in low-write environments.
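To get a feel for how non-linear the temperature dependence can be, the sketch below applies an Arrhenius acceleration model to a daily temperature profile. Both the model and the 1.1 eV activation energy are assumptions made here for illustration; the article does not specify them, the sketch ignores the effect of cycling wear, and your memory vendor's calculation should take precedence.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def acceleration_factor(temp_c, ref_temp_c=55.0, activation_energy_ev=1.1):
    """Arrhenius acceleration of retention loss relative to the reference temperature."""
    temp_k = temp_c + 273.15
    ref_k = ref_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / ref_k - 1.0 / temp_k)
    return math.exp(exponent)

def effective_retention_years(daily_profile, rated_years=10.0):
    """daily_profile: list of (temp_c, fraction_of_day); fractions should sum to 1."""
    weighted_af = sum(fraction * acceleration_factor(temp_c)
                      for temp_c, fraction in daily_profile)
    return rated_years / weighted_af

# Hypothetical profile: 8 hours at 70°C, 16 hours at 40°C.
profile = [(70.0, 8 / 24), (40.0, 16 / 24)]
print(f"{effective_retention_years(profile):.1f} years")
```

Because the factor grows exponentially with temperature, even short periods at the hot end of the profile can dominate the result.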
That’s it for this entry…please check back on our blog for the next entry in this series, where we will cover topics including High Reliability Applications, Operation Time Out, Power Immunity, Layout, and Programming considerations.