Survey on edge deployment and inference acceleration of multimodal large language models

Siru CHEN, Yuanchao SHU

Tab.2 Language model backbones suitable for edge-side deployment

Model series	Model name	Parameters/10⁹
LLaMA	LLaMA^[7]	7.0
	LLaMA2^[40]	7.0
	LLaMA3.2^[67]	1.0/3.0
Qwen	Qwen^[8]	1.8/7.0
	Qwen1.5^[68]	0.5/1.8/4.0/7.0
	Qwen2^[31]	0.5/1.5/7.0
	Qwen2.5^[35]	0.5/1.5/3.0/7.0
	Qwen3^[69]	0.6/1.7/4.0/8.0
Vicuna	Vicuna^[70]	7.0
MobileLLaMA	MobileLLaMA^[11]	1.3/3.1
Gemini	Gemini Nano1^[10]	1.8
	Gemini Nano2^[10]	3.25
Phi	Phi-1^[71]	1.3
	Phi-1.5^[44]	1.3
	Phi-2^[13]	2.7
	Phi-3^[24]	3.8/7.0
InternLM	InternLM2^[27]	1.8
	InternLM2.5^[72]	7.0
TinyLlama	TinyLlama^[53]	1.1