Abstract Summary

OzMAC targets sparsity-aware inference by reducing unnecessary multiply-accumulate work. The design explores how bit sparsity can translate directly into lower energy cost in deep learning accelerators.
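To make the idea of bit sparsity concrete, here is a minimal sketch of a bit-serial multiply-accumulate that skips the zero bits of each weight. It is an illustration only, not OzMAC's actual datapath: the function name `bit_serial_mac`, the unsigned 8-bit weight format, and the use of the shift-add count as a rough stand-in for energy are all assumptions made for this example.

```python
def bit_serial_mac(weights, activations, bits=8):
    """Compute sum(w * a) with shift-adds, skipping zero weight bits.

    Assumes unsigned integer weights that fit in `bits` bits (hypothetical
    format chosen for illustration). Returns the accumulated sum and the
    number of shift-add steps actually performed.
    """
    acc = 0
    adds = 0
    for w, a in zip(weights, activations):
        for b in range(bits):
            if (w >> b) & 1:       # only set bits contribute a partial product
                acc += a << b      # add the shifted activation
                adds += 1          # count work actually done
    return acc, adds


weights = [3, 0, 8]        # binary 00000011, 00000000, 00001000
activations = [5, 7, 2]
total, adds = bit_serial_mac(weights, activations)
dense_steps = len(weights) * 8   # steps if every bit position were processed
print(total, adds, dense_steps)
```

In this toy case only 3 of the 24 bit positions are non-zero, so a unit that skips zero bits performs 3 shift-adds instead of 24. How closely such a count tracks real energy depends on the hardware, which this sketch does not model.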

Research Context

This paper (VLSI-SoC 2024) contributes to my research program in sparsity and DL inference. It is part of my broader work on efficient ML systems, hardware-software co-design, and deployment-aware computer architecture.

sparsity, DL inference, VLSI-SoC 2024

Related Papers