Survey on edge deployment and inference acceleration of multimodal large language models

Siru CHEN, Yuanchao SHU

Table 2 Language model backbones suitable for edge-side deployment

| Model family | Model name | Parameters/10^9 |
| --- | --- | --- |
| LLaMA | LLaMA[7] | 7.0 |
| | LLaMA2[40] | 7.0 |
| | LLaMA3.2[67] | 1.0/3.0 |
| Qwen | Qwen[8] | 1.8/7.0 |
| | Qwen1.5[68] | 0.5/1.8/4.0/7.0 |
| | Qwen2[31] | 0.5/1.5/7.0 |
| | Qwen2.5[35] | 0.5/1.5/3.0/7.0 |
| | Qwen3[69] | 0.6/1.7/4.0/8.0 |
| Vicuna | Vicuna[70] | 7.0 |
| MobileLLaMA | MobileLLaMA[11] | 1.3/3.1 |
| Gemini | Gemini Nano1[10] | 1.8 |
| | Gemini Nano2[10] | 3.25 |
| Phi | Phi-1[71] | 1.3 |
| | Phi-1.5[44] | 1.3 |
| | Phi-2[13] | 2.7 |
| | Phi-3[24] | 3.8/7.0 |
| InternLM | InternLM2[27] | 1.8 |
| | InternLM2.5[72] | 7.0 |
| TinyLlama | TinyLlama[53] | 1.1 |