Model Compression & Optimization

Model compression has emerged as an important research area for deploying deep learning models on IoT devices. However, compression alone is not sufficient to fit these models within the memory of a single device; as a result, they must be distributed across multiple devices. This leads to a distributed inference paradigm in which communication cost becomes another major bottleneck. To this end, we focus on knowledge distillation and teacher-student architectures for distributed model compression, as well as data-independent model compression.
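As a rough illustration of the teacher-student idea, the sketch below computes the widely used temperature-scaled distillation loss: a KL term that pulls the student's softened predictions toward the teacher's, blended with ordinary cross-entropy on the true label. This is the generic textbook formulation, not necessarily the method used in the publications listed here; the function names, the `temperature` and `alpha` defaults, and the example logits are all assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Blend of soft-target KL divergence and hard-label cross-entropy.

    alpha weights the soft (teacher) term; (1 - alpha) weights the hard term.
    Both default values are illustrative assumptions.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Standard cross-entropy of the student against the true label (T = 1).
    ce = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]   # hypothetical teacher logits
    student = [0.1, 1.0, 2.0]   # hypothetical student logits
    print(distillation_loss(student, teacher, label=0))
```

A student whose logits match the teacher's incurs zero soft-target penalty, so with `alpha=1.0` the loss vanishes; any disagreement with the teacher increases it.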

Selected Publications

11 entries « 1 of 2 »

Li, Guihong; Bhardwaj, Kartikeya; Yang, Yuedong; Marculescu, Radu

TIPS: Topologically Important Path Sampling for Anytime Neural Networks Conference

International Conference on Machine Learning (ICML), 2023.

Goksoy, Alper-A; Li, Guihong; Mandal, Sumit K.; Ogras, Umit Y.; Marculescu, Radu

CANNON: Communication-Aware Sparse Neural Network Optimization Journal Article

In: IEEE Transactions on Emerging Topics in Computing, 2023.

Hoang, Duc N. M; Liu, Shiwei; Marculescu, Radu; Wang, Zhangyang

Revisiting Pruning at Initialization through the Lens of Ramanujan Graph Conference

International Conference on Learning Representations (ICLR), 2023.

Krishnakumar, Anish; Marculescu, Radu; Ogras, Umit Y

INDENT: Incremental Online Decision Tree Training for Domain-Specific Systems-on-Chip Proceedings Article

In: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1-9, 2022.

Farcas, Allen-Jasmin; Chen, Xiaohan; Wang, Zhangyang; Marculescu, Radu

Model Elasticity for Hardware Heterogeneity in Federated Learning Systems Proceedings Article

In: Proceedings of the 1st ACM Workshop on Data Privacy and Federated Learning Technologies for Mobile Edge Network (FedEdge), pp. 19-24, 2022.

Li, Guihong; Mandal, Sumit K; Ogras, Umit Y; Marculescu, Radu

FLASH: Fast Neural Architecture Search with Hardware Optimization Journal Article

In: ACM Transactions on Embedded Computing Systems, vol. 20, no. 63, pp. 1-26, 2021.

