The GOP/s required for EIE to achieve the same application throughput (Frames/s) is much lower than competing approaches because EIE exploits sparsity to eliminate 97% of the GOP/s performed by dense approaches. 3 TOP/s on an uncompressed network requires only 100 GOP/s on a compressed network. EIE’s throughput is scalable to over 256 PEs. Without EIE’s dedicated logic, however, model compression by itself applied on a CPU/GPU yields only 3× speedup.