APPLICATIONS OF TECHNOLOGY:
- High performance computing
- IoT
BENEFITS:
- Reduced latency and high throughput
- Improved performance
- High accuracy computations
- Compatibility with various memory devices supporting in-memory computation
BACKGROUND:
Resistive random-access memory (RRAM) devices have garnered significant interest because they can perform many simultaneous operations across a large number of processing units through in-memory computation. RRAM devices are attractive for field-programmable crossbar array (FPCA) architectures because of their small footprint, low power consumption, excellent stability, and re-programmability; an RRAM-based FPCA is therefore both flexible and easy to program.
However, challenges such as long-distance routing, device/crossbar irregularities, and high-power analog-digital interfaces have been identified as performance bottlenecks in this architecture. An FPCA architecture that overcomes these bottlenecks, leading to superior computational performance, is presented.
TECHNOLOGY OVERVIEW:
Scientists at Berkeley Lab have developed a memory-centric, reconfigurable, general-purpose computing platform that performs analog, digital, and memory functions while efficiently handling rapidly growing data volumes and mitigating the performance bottlenecks noted above. The architecture uses a reprogrammable FPCA with crossbar layouts designed for efficient, massively parallel computing and data storage.
RRAMs serve as the primary memory element in this architecture, but the design is flexible enough to accommodate other memory technologies. Furthermore, the number of RRAM programming cycles required was reduced, yielding a high-throughput, low-latency system: crossbar programming time was cut by 17× for in-memory addition and by 8.5× for in-memory high-accuracy multiplication compared to state-of-the-art approaches. The in-memory high-accuracy multiplication method also significantly reduces latency and the number of addition operations compared to implementations on state-of-the-art field-programmable gate arrays (FPGAs). The developed architecture improves peak neural network training throughput by 81.3× over state-of-the-art crossbar-based accelerators.
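To illustrate the general principle of crossbar in-memory computation (not the patented method itself), the sketch below models an idealized RRAM crossbar: each cell stores a conductance, row voltages encode one operand, and Kirchhoff's current law sums the per-cell currents along each column, producing an analog matrix-vector multiply inside the memory array. All values and the noise-free assumption are illustrative.

```python
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Ideal (noise-free) crossbar matrix-vector multiply.

    Each cell (i, j) stores conductance G[i][j] in siemens. Applying
    voltage V[i] to row i produces, by Ohm's law, a current V[i]*G[i][j]
    through that cell; Kirchhoff's current law sums these along column j:
        I[j] = sum_i V[i] * G[i][j]
    so the array computes V @ G in a single analog step.
    """
    G = np.asarray(conductances, dtype=float)  # rows x columns
    V = np.asarray(voltages, dtype=float)      # one voltage per row
    return V @ G                               # column currents (amps)

# Example: a hypothetical 3x2 crossbar with microsiemens-scale cells
G = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
V = [0.1, 0.2, 0.3]
print(crossbar_mvm(G, V))  # column currents [2.2e-6, 2.8e-6] A
```

In a physical device the same read happens in parallel across all columns, which is what makes crossbar arrays attractive for the massively parallel workloads (e.g., neural network training) described above.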
MORE INFORMATION:
DEVELOPMENT STAGE:
Technology concept and/or application formulated
PRINCIPAL INVESTIGATORS:
Hasita Veluri
Dilip Vasudevan
IP STATUS:
Patent pending
OPPORTUNITIES:
Available for licensing or collaborative research