Although conventional floating gate (FG) Flash memory has already gone into the sub-30 nm node, the technology challenges are formidable beyond 20nm. The fundamental challenges include FG interference, few-electron storage caused statistical fluctuation, poor short-channel effect, WL-WL breakdown, poor reliability, and edge effect sensitivity. Although charge-trapping (CT) devices have been proposed very early and studied for many years, these devices have not prevailed over FG Flash in the > 30nm node. However, beyond 20nm the advantage of CT devices may become more significant. Especially, due to the simpler structure and no need for charge storage isolation, CT is much more desirable than FG in 3D stackable Flash memory. Optimistically, 3D CT Flash memory may allow the Moore's law to continue for at least another decade. In this paper, we review the operation principles of CT devices and several variations such as MANOS and BE-SONOS. We will then discuss 3D memory architectures including the bit-cost scalable approach. Technology challenges and the poly-silicon thin film transistor (TFT) issues will be addressed in detail.