High Reliability Applications
This is Part 2 of our series on design considerations for NAND storage solutions such as SD, microSD, and e.MMC in IoT designs. In Part 1, we covered performance, write amplification and endurance, read endurance, and data retention. This week, we cover high reliability applications, operation timeouts, power loss, signal layout and interface design, and device pre-programming.
For most standard applications, an uncorrectable ECC (UECC) error rate on the order of one per 1×10^15 reads may be acceptable. Depending on the type of data that is lost, the host may still be able to recover (e.g. by rebooting) and reload the data. A journaled file system may also be used for better protection. The most critical areas of data typically include the internal flash memory management tables, the boot partition, and the host OS kernel. Memory vendors may keep backup copies of the internal flash memory management tables to enable recovery in the event of a UECC read. In addition, boot partitions are typically kept in SLC areas with a redundant copy for extra reliability. A backup copy of the host OS kernel and other critical data structures may be kept in the user area for additional redundancy; that would typically be managed by the OS vendor.
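The redundant-copy idea above can be sketched on the host side as follows. This is a minimal illustration, not a vendor mechanism: the function name `load_with_fallback`, the use of CRC32 as the integrity check, and the assumption that a UECC read surfaces to the host as an `OSError` are all hypothetical choices made for the example.

```python
import zlib


def load_with_fallback(copies, expected_crc):
    """Return (index, data) for the first copy whose CRC32 matches.

    `copies` is a list of zero-argument callables, each reading one
    stored copy (primary first, then backups). This is an assumed
    interface for illustration only.
    """
    for index, read_copy in enumerate(copies):
        try:
            data = read_copy()
        except OSError:
            # Assumption: an uncorrectable read is reported as an I/O error.
            continue
        if zlib.crc32(data) == expected_crc:
            return index, data
        # CRC mismatch: treat this copy as corrupt and try the next one.
    raise RuntimeError("no valid copy found")
```

A host might call this with one reader for the primary kernel image and one for the backup in the user area, and log whenever the returned index is not 0 so that the primary copy can be rewritten.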
Operation Time Out
During a read or write operation to the flash memory, the operation may time out; this typically happens if the host driver is not following the maximum timeout recommendations for read and write operations. The host needs to handle timeout errors and retry the operation as necessary.
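A bounded retry loop is one common way to handle such timeouts. The sketch below assumes the driver reports a timeout by raising Python's built-in `TimeoutError`; the helper name `retry_on_timeout`, the attempt count, and the linear backoff are illustrative assumptions, not values taken from any e.MMC or SD specification.

```python
import time


def retry_on_timeout(operation, attempts=3, backoff_s=0.1,
                     on_timeout=None, sleep=time.sleep):
    """Call `operation()` and retry it if it raises TimeoutError.

    `on_timeout(attempt)` is an optional hook where a host could, for
    example, reset the device before the next attempt (hypothetical).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError as exc:
            last_error = exc
            if on_timeout is not None:
                on_timeout(attempt)
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                sleep(backoff_s * attempt)
    # All attempts timed out: surface the last error to the caller.
    raise last_error
```

The actual timeout values should come from the device specification or the vendor's recommendations; the retry wrapper only decides what happens after a timeout has been reported.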
The host design needs to consider the effect of power loss during flash memory operation. Many e.MMC and some SD card products are resilient to some degree if power to the memory is interrupted during an erase or write operation. The host should be able to recover from power loss during a read simply by resetting the flash memory device and starting the read again (or, in the worst case, by rebooting). In the case of a write operation, if the host still has the data, the write can be attempted again. The worst-case scenario is typically power loss during an internal “housekeeping” operation in the flash memory. If an internal management table or pointer is corrupted, data can become unrecoverable. Power immunity hardware and software may protect against this to varying degrees, but good host-side power supply design is the key to maximizing robustness (for example, sufficient power supply capacitance and following the specifications for graceful power cycling).
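On the software side, a host application can also limit the damage from an interrupted write by never overwriting a file in place. The sketch below shows the widely used write-to-temporary-file-then-rename pattern; it is a general host-side technique rather than something described in this article, the function name `atomic_write` is hypothetical, and for brevity it does not sync the containing directory, which a production implementation on a POSIX system may also want to do.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so that readers see old or new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final
    # rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to push the data to the storage device.
            os.fsync(tmp.fileno())
        # Swap the new file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If power is lost before the rename, the previous version of the file is still intact and the host can simply attempt the write again after reboot. This does not protect against corruption inside the flash device itself, which remains the job of the device's power immunity features and the host power supply design.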
As e.MMC devices operate on a high-speed bus, host designers should follow the vendor's recommendations on signal layout. Routing through NC balls on the BGA may be possible in order to minimize the number of board layers and vias; again, the host designer should check with the vendor for specific layout guidelines.
SD cards can also operate in a number of interface modes. The SD Association (SDA) provides documentation with recommendations on how to design with this interface, which should be studied carefully, especially when using the higher-speed UHS-104 and UHS-II buses.
Many e.MMC devices are pre-programmed before surface mount (SMT). However, as NAND geometries shrink, careful consideration may be required as to how the device is pre-programmed, and in some cases data may need to be loaded or checked on the host board (after SMT). Again, check with your flash memory vendor for recommended practices for the specific device you are using.
NAND flash memory is a complex technology and is becoming more and more complex with each geometry shrink. Its intrinsic challenges require sophisticated flash management techniques. When designing with today’s high performance SD cards and e.MMC devices, taking into consideration the key topics discussed here will help to optimize system level performance and product life.
Main Author: Benjamin Telya
Contributors: Samuel Yu, Charlene Wan