- High performance computing
- Consumer electronics
- Mobile devices
- Maximizes DRAM efficiency and bandwidth
- Reduces power consumption for computing and mobile devices
Researchers at Berkeley Lab have developed a purely hardware last-level collective prefetcher (LLCP) that addresses DRAM performance and power constraints for bulk-synchronous data-parallel applications. These applications are key drivers for multi-core systems and include image processing, climate modeling, physics simulation, gaming, face recognition, and many others.
The Berkeley Lab LLCP exploits the highly correlated prefetch patterns of data-parallel algorithms, which a prefetcher oblivious to data parallelism does not recognize. LLCP generates prefetches on behalf of multiple cores in memory address order to maximize DRAM efficiency and bandwidth. The technology can prefetch from multiple memory pages without expensive translations.
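The benefit of issuing prefetches in memory address order can be illustrated with a small software model. The sketch below is not the LLCP hardware design. It assumes a single DRAM bank with an open-row policy, a 2 KB row, 64-byte cache lines, and four cores that each read a contiguous tile of one shared array. All function names and constants are hypothetical and chosen only for illustration. It counts how many row activations result when requests from the cores arrive interleaved versus merged and sorted by address.

```python
from typing import List, Sequence

ROW_BYTES = 2048   # assumed DRAM row (page) size, illustrative only
LINE_BYTES = 64    # assumed cache-line size, illustrative only


def per_core_lines(core: int, lines_per_core: int) -> List[int]:
    """Cache-line addresses one core touches: a contiguous tile of a shared array."""
    base = core * lines_per_core * LINE_BYTES
    return [base + i * LINE_BYTES for i in range(lines_per_core)]


def interleaved_order(streams: Sequence[Sequence[int]]) -> List[int]:
    """Round-robin across cores, as independent per-core prefetchers might issue."""
    out: List[int] = []
    for group in zip(*streams):
        out.extend(group)
    return out


def collective_order(streams: Sequence[Sequence[int]]) -> List[int]:
    """Merge the requests of all cores and issue them in ascending address order."""
    return sorted(addr for stream in streams for addr in stream)


def row_activations(addresses: Sequence[int], row_bytes: int = ROW_BYTES) -> int:
    """Count requests that land in a different row than the previous request."""
    count = 0
    open_row = None
    for addr in addresses:
        row = addr // row_bytes
        if row != open_row:
            count += 1
            open_row = row
    return count


# Four cores, 64 cache lines each (hypothetical workload).
streams = [per_core_lines(core, 64) for core in range(4)]
interleaved = row_activations(interleaved_order(streams))
collective = row_activations(collective_order(streams))
print("interleaved activations:", interleaved)
print("address-ordered activations:", collective)
```

In this simplified model the interleaved stream switches rows on every request (256 activations), while the address-ordered stream opens each of the 8 rows once. Real DRAM has multiple banks, ranks, and scheduling policies, so the gap in actual hardware would differ from what this toy model suggests.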
Compared to other prefetchers, LLCP improves execution time by 5.5% on average (10% maximum), increases DRAM bandwidth by 9% to 18%, decreases DRAM rank energy by 6%, produces 27% more timely prefetches, and increases coverage by at least 25%.
DEVELOPMENT STAGE: See researcher test results in the IEEE Computer Society publication linked below.
STATUS: Patent pending. Available for licensing or collaborative research.
REFERENCE NUMBER: 2018-061