End context-adaptive deep sensing model with edge-end collaboration
Hong-li WANG, Bin GUO*, Si-cong LIU, Jia-qi LIU, Yun-gang WU, Zhi-wen YU
College of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
Abstract: The end-context adaptation of deep models with edge-end collaboration was analyzed. An edge-end collaborative model compression and partition method based on the alternating direction method of multipliers (X-ADMM) was proposed: model compression simplifies the model structure, and the model is then partitioned at layer granularity to find the best partition point, so that edge and end devices cooperate to improve running efficiency. To realize dynamic adaptation of the model partition, a graph-based adaptive DNN surgery algorithm (GADS) was proposed. When the running context of the model (e.g., storage, power, bandwidth) changes, the search prefers the partition point that best satisfies the resource constraints among the partition states adjacent to the current one, achieving rapid adaptation. Experimental results showed that the model re-tunes its partition point adaptively in 0.1 ms on average, and that the total running latency is reduced by up to 56.65% with no more than 2.5% accuracy loss.
Received: 26 January 2021
Published: 07 May 2021
Fund: National Key Research and Development Program of China (2019YFB1703901); National Natural Science Foundation of China (61772428, 61725205)
Corresponding author: Bin GUO
E-mail: wanghongli@mail.nwpu.edu.cn; guob@nwpu.edu.cn
Keywords: deep learning, edge intelligence, model compression, model partition, adaptive sensing
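The two techniques the abstract describes, latency-optimal layer-granularity partitioning and GADS-style neighbor search for rapid re-adaptation, can be sketched as follows. This is a minimal illustrative sketch: all layer profiles, bandwidth figures, and function names below are invented assumptions, not measurements or code from the paper.

```python
# Illustrative sketch of layer-granularity DNN partitioning with edge-end
# collaboration. Layer profiles and bandwidths are invented for illustration.

# Per-layer profile: (device_latency_ms, edge_latency_ms, output_activation_kb)
LAYERS = [
    (4.0, 1.0, 600.0),  # conv1
    (6.0, 1.5, 400.0),  # conv2
    (8.0, 2.0, 150.0),  # conv3
    (3.0, 0.8, 50.0),   # fc1
    (1.0, 0.3, 4.0),    # fc2
]
INPUT_KB = 300.0  # raw input size, sent over the network when all layers run on the edge


def total_latency(p: int, bw_kb_per_ms: float) -> float:
    """End-to-end latency when layers [0, p) run on the end device and
    layers [p, n) run on the edge; p == 0 is all-edge, p == n is all-device."""
    n = len(LAYERS)
    device = sum(d for d, _, _ in LAYERS[:p])
    edge = sum(e for _, e, _ in LAYERS[p:])
    if p == n:               # everything on the device: nothing crosses the network
        transfer = 0.0
    elif p == 0:             # everything on the edge: the raw input crosses
        transfer = INPUT_KB / bw_kb_per_ms
    else:                    # cut after layer p-1: its activation crosses
        transfer = LAYERS[p - 1][2] / bw_kb_per_ms
    return device + transfer + edge


def best_partition(bw_kb_per_ms: float) -> int:
    """Exhaustive layer-granularity search for the latency-optimal cut point."""
    return min(range(len(LAYERS) + 1), key=lambda p: total_latency(p, bw_kb_per_ms))


def neighbor_adapt(current_p: int, bw_kb_per_ms: float, radius: int = 1) -> int:
    """When the running context (here, bandwidth) changes, search only the
    partition states adjacent to the current one for a fast, local re-tune."""
    cands = [p for p in range(current_p - radius, current_p + radius + 1)
             if 0 <= p <= len(LAYERS)]
    return min(cands, key=lambda p: total_latency(p, bw_kb_per_ms))
```

With these invented numbers, a generous bandwidth favors the all-edge cut (p = 0), while a collapsed bandwidth moves the optimum toward running everything on the device; `neighbor_adapt` walks the cut point there one step at a time instead of re-evaluating every layer, which is the intuition behind the sub-millisecond adaptation figure in the abstract.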