
Abstract

In-memory computing is an emerging approach for overcoming memory-access bottlenecks, by eliminating the costs of explicitly moving data from the point of storage to a point of computation outside the array. However, computation increases the dynamic range of signals, such that performing it via the existing structure of dense memory substantially squeezes the signal-to-noise ratio (SNR). In this brief, we explore how computations can be scaled up, to jointly optimize energy/latency/bandwidth gains with SNR requirements. We employ algorithmic techniques to decompose computations so that they can be mapped to multiple parallel memory banks operating at chosen optimal points. Specifically focusing on in-memory classification, we consider a custom IC in 130-nm CMOS and demonstrate an algorithm combining error-adaptive classifier boosting and multi-armed bandits, to enable segmentation of a feature vector into multiple subsets. The measured accuracy of 10-way MNIST digit classification, using images downsampled to $16 \times 16$ pixels (mapped across four separate banks), is 91%, close to that simulated using full unsegmented feature vectors. The energy per classification is 879.7 pJ, $14.3\times$ lower than that of a system based on separated memory and digital accelerator.
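The abstract does not spell out how boosting and the bandit interact, so the following is only a minimal sketch of one plausible arrangement, not the authors' algorithm. It assumes fixed feature subsets (one per bank), binary labels, plain AdaBoost in place of error-adaptive classifier boosting, threshold stumps as weak classifiers, and a UCB1 bandit that picks which bank trains the next weak classifier. All names (`train_stump`, `boost_over_subsets`, the toy data) are hypothetical.

```python
import math

def train_stump(X, y, w, features):
    """Return (weighted_error, feature, threshold, polarity) of the best
    threshold stump that reads only the given feature indices."""
    best = None
    for f in features:
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] > thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] > thr else -pol

def boost_over_subsets(X, y, subsets, rounds=8):
    """AdaBoost where a UCB1 bandit chooses the feature subset (bank)
    that each new weak classifier is allowed to see. Assumed scheme."""
    n = len(X)
    w = [1.0 / n] * n                # sample weights
    pulls = [0] * len(subsets)       # times each bank was chosen
    reward = [0.0] * len(subsets)    # summed reward per bank
    ensemble = []
    for t in range(1, rounds + 1):
        untried = [k for k, c in enumerate(pulls) if c == 0]
        if untried:
            arm = untried[0]         # play every arm once first
        else:
            arm = max(range(len(subsets)),
                      key=lambda k: reward[k] / pulls[k]
                      + math.sqrt(2.0 * math.log(t) / pulls[k]))
        stump = train_stump(X, y, w, subsets[arm])
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        pulls[arm] += 1
        reward[arm] += 1.0 - err     # reward: weighted accuracy
        # Standard AdaBoost reweighting, then renormalization.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble, pulls

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data (an assumption, not MNIST): 4 features split across 2 "banks".
# The label depends only on feature 2, which lives in the second bank.
X = [[(7 * i % 10) / 10, (3 * i % 10) / 10, (i % 10) / 10, (9 * i % 10) / 10]
     for i in range(40)]
y = [1 if row[2] > 0.45 else -1 for row in X]
subsets = [[0, 1], [2, 3]]

ensemble, pulls = boost_over_subsets(X, y, subsets)
accuracy = sum(predict(ensemble, r) == t for r, t in zip(X, y)) / len(X)
print("pulls per bank:", pulls, "training accuracy:", accuracy)
```

Each weak classifier here touches only one subset of the feature vector, which is the property that would let it map to a single memory bank. The 10-way, four-bank setting described above would need a multi-class extension that this sketch does not attempt.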

Scaling Up In-Memory-Computing Classifiers via Boosted Feature Subsets in Banked Architectures

Keywords

10-way MNIST digit classification, CMOS, computing, full unsegmented feature, segmentation of signals, energy, IC, in-memory classification, parallel memory banks, energy/latency/bandwidth, digital accelerator, signal-to-noise ratio, multi-armed bandits, algorithmic techniques, error-adaptive classifier boosting, dynamic range

Cite this article

Yinqi Tang, Jintao Zhang, Naveen Verma. Scaling Up In-Memory-Computing Classifiers via Boosted Feature Subsets in Banked Architectures. 66(3), 477-481.
