Android malicious behavior recognition and classification method based on random forest algorithm

doi:10.3785/j.issn.1008-973X.2019.10.019

Journal of ZheJiang University (Engineering Science)

2019, Vol. 53, Issue (10): 2013-2023

Automation Technology, Computer Technology


Dong-xiang KE, Li-min PAN*, Sen-lin LUO, Han-qing ZHANG

Information System and Security Countermeasure Experimental Center, Beijing Institute of Technology, Beijing 100081, China


Abstract

Existing Android malware detection methods cannot identify or classify the malicious behavior they detect. To address this problem, a method for identifying and classifying Android malicious behavior based on the random forest (RF) algorithm was proposed. The types of Android malicious behavior were defined, and potentially malicious behavior was triggered with an induction method that combines multiple trigger mechanisms. Application behavior was captured by hooking key system functions and recorded in a behavior log, from which a behavioral feature set was extracted. The random forest algorithm was then used to identify and classify the malicious behavior in the behavior log. The experimental results showed that the proposed method achieved 91.6% accuracy in malware identification and an average accuracy of 96.8% in malicious behavior classification.

Key words: Android security, machine learning, random forest (RF), malware detection, malicious behavior classification

Received: 15 November 2018 Published: 30 September 2019

CLC: TP 399

Corresponding Authors: Li-min PAN E-mail: 384209891@qq.com;panlimin@bit.edu.cn


Cite this article:

Dong-xiang KE,Li-min PAN,Sen-lin LUO,Han-qing ZHANG. Android malicious behavior recognition and classification method based on random forest algorithm. Journal of ZheJiang University (Engineering Science), 2019, 53(10): 2013-2023.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.10.019 OR http://www.zjujournals.com/eng/Y2019/V53/I10/2013

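Android malicious behavior identification and classification method based on the random forest algorithm

Current Android malware detection methods cannot identify or classify the malicious behavior they detect. To solve this problem, a method for identifying and classifying Android malicious behavior based on the random forest (RF) algorithm is proposed. Building on a definition of Android malware types, the method uses an Android malicious behavior induction approach that combines multiple trigger mechanisms to trigger an application's latent malicious behavior. Key system functions are hooked to capture application behavior and generate a behavior log, and an application behavior feature set is extracted from that log. The random forest algorithm is then used to identify and classify the malicious behavior in the behavior log. Experimental results show that the method reaches 91.6% accuracy in Android malware identification and an average accuracy of 96.8% in malicious behavior classification.

Key words: Android security, machine learning, random forest (RF), malware detection, malicious behavior classification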
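The identification and classification step described in the abstract can be illustrated with a minimal pure-Python random forest. This is only a sketch under stated assumptions: it presumes each behavior log has already been reduced to a numeric feature vector, and the feature columns, toy samples, labels, hyperparameters, and all function names are hypothetical. The paper's actual feature set and RF configuration are not given on this page.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def majority(y):
    return Counter(y).most_common(1)[0][0]

def build_tree(X, y, rng, n_sub, depth=0, max_depth=5):
    """Grow one tree, sampling n_sub candidate features at every node."""
    if len(set(y)) == 1 or depth == max_depth:
        return majority(y)  # leaf: majority class
    split = best_split(X, y, rng.sample(range(len(X[0])), n_sub))
    if split is None:
        return majority(y)  # sampled features cannot split this node
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, n_sub, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, n_sub, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(X[0]) ** 0.5))  # sqrt(#features) candidates per node
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_sub))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Hypothetical feature columns: sms_sends, location_reads, network_calls, file_deletes
X = [[0, 0, 1, 0], [0, 0, 2, 0], [0, 1, 1, 0],
     [5, 0, 1, 0], [7, 0, 0, 0], [6, 0, 2, 0],
     [0, 4, 6, 0], [1, 5, 8, 0], [0, 6, 7, 0]]
y = ["benign"] * 3 + ["malicious_fee_deduction"] * 3 + ["privacy_theft"] * 3

forest = train_forest(X, y, n_trees=25, seed=0)
print(predict_forest(forest, [6, 0, 1, 0]))
```

Each tree sees a different bootstrap sample and a different random subset of features at every node, and the forest returns the majority vote. The same structure would serve both tasks reported in the abstract: a two-class label set for malware identification and a multi-class label set for behavior classification.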

Fig.1 Android malicious behavior recognition and classification framework structure

Fig.2 Multi-level behavior data acquisition schematic

Tab.1 Entry description of behavior log

Tab.2 Geographic information stealing code

Fig.3 API sequence for privacy stealing

Tab.3 List of APIs related to privacy theft

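Malicious behavior	Monitored functions	Notes
Malicious fee deduction	SmsManager.sendTextMessage	Can be used to send SP (service provider) SMS messages and subscribe to value-added services
	BroadcastReceiver.abortBroadcast
	ContentResolver.delete
Tariff consumption	DefaultHttpClient.execute	Receives data over the network, consuming data traffic
	AbstractHttpClient.execute
	Socket.getInputStream
	Socket.getOutputStream
	URL.openConnection
	OutputStream.write	Used to save data received from the network
	InputStream.read	Reads data received from the network
Privacy theft	ContentResolver.query	Depending on its arguments, can be used to obtain private data such as SMS messages, contacts, and photos
	LocationManager.getProvider	Used to obtain geographic location information
	Location.getLatitude
	Location.getLongitude
	PackageManager.getInstalledPackages	Gets the installed applications
	PackageManager.getInstalledApplications	Gets the installed applications
	TelephonyManager.getSubscriberId	Gets the phone's IMSI number
	TelephonyManager.getDeviceId	Gets the phone's device ID
	SmsManager.sendTextMessage	Can be used to send stolen private data by SMS
	DefaultHttpClient.execute	Can be used to send stolen private data over the network
	URL.openConnection
	AbstractHttpClient.execute
Rogue behavior	Runtime.exec("su")	Can be used to obtain root privileges
	DevicePolicyManager.isAdminActive	Obtains device administrator privileges
	ApplicationPackageManager.setComponentEnabledSetting	Hides the application icon
	ApplicationPackageManager.installPackage	Silent installation
	ShortcutIconResource.fromContext	Creates a shortcut
	Dialog.onCreate	Can be used for pop-up advertisements
	java.lang.Runtime.exec("mount")	Sets the application as a system application
	java.lang.Runtime.exec("cp")
	java.lang.Runtime.exec("chmod")
System destruction	Runtime.exec("su")	Can be used to obtain root privileges
	DevicePolicyManager.isAdminActive	Used to obtain device administrator privileges
	ActivityManager.getRunningAppProcesses	Used to view information about running processes
	ActivityManager.killBackgroundProcesses	Used to terminate other processes
	ActivityManager.forceStopPackage	Used to terminate other applications
	ApplicationPackageManager.deletePackage	Used to uninstall other applications
	File.delete	Used to delete user files
	Cipher.getInstance	Used to encrypt user files
	MessageDigest.getInstance	Used to encrypt user files
	ApplicationPackageManager.setComponentEnabledSetting	Can be used to disable other components
	android.app.admin.DevicePolicyManager.resetPassword	Can be used to change the lock-screen password and lock the screen
	android.app.admin.DevicePolicyManager.lockNow	Can be used to change the lock-screen password and lock the screen
Privilege escalation	Runtime.exec("su")	Can be used to obtain root privileges
	DevicePolicyManager.isAdminActive	Used to obtain device administrator privileges
	mmap	Native functions needed for privilege escalation attacks that exploit vulnerabilities such as Dirty COW, Futex, and zergRush
	madvise
	malloc
	pthread_create
	getgid
	futex_lock_pi
	futex_lock_pi_atomic
	mount
	fopen("/proc/mounts", "r")
	setresuid	Sets the S permission bit of a file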
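
One way to turn a behavior log into a feature vector is to count calls to the monitored functions listed above. The sketch below is illustrative only: the log line layout (`<timestamp> <function> <arguments>`), the choice of call counts as features, the sample log, and the names `MONITORED_FUNCTIONS`, `parse_log_line`, and `extract_features` are all assumptions, since the behavior log format (Tab.1) and the paper's feature set are not reproduced on this page.

```python
from collections import Counter

# A few monitored functions taken from the table above, in a fixed feature order.
MONITORED_FUNCTIONS = [
    "SmsManager.sendTextMessage",
    "BroadcastReceiver.abortBroadcast",
    "ContentResolver.query",
    "Location.getLatitude",
    "Location.getLongitude",
    "TelephonyManager.getDeviceId",
    "URL.openConnection",
    "File.delete",
]

def parse_log_line(line):
    """Return the function name from a hypothetical '<timestamp> <function> <args>' entry."""
    parts = line.split(maxsplit=2)
    return parts[1] if len(parts) >= 2 else None

def extract_features(log_lines):
    """Count how often each monitored function appears in a behavior log."""
    counts = Counter(parse_log_line(line) for line in log_lines)
    return [counts[name] for name in MONITORED_FUNCTIONS]

# Hypothetical log of an app that reads location and device ID, then uploads data.
sample_log = [
    "1001 Location.getLatitude ()",
    "1002 Location.getLongitude ()",
    "1003 TelephonyManager.getDeviceId ()",
    "1004 URL.openConnection (http://example.invalid/upload)",
    "1005 URL.openConnection (http://example.invalid/upload)",
    "1006 Activity.onCreate ()",
]

features = extract_features(sample_log)
print(dict(zip(MONITORED_FUNCTIONS, features)))
```

Because several functions, such as `SmsManager.sendTextMessage` and `URL.openConnection`, appear under more than one behavior type in the table, a single call does not determine the category on its own. Counting per function and leaving the decision to a classifier avoids hard-coding a one-to-one mapping.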

Fig.4 Malicious behavior detection process

Tab.4 Malware recognition experiment environment

Tab.5 Malware recognition experiment results

Tab.6 Number of malicious behavior samples

Tab.7 Malicious behavior classification experiment environment

Tab.8 Malicious behavior classification confusion matrix

Tab.9 Malicious behavior classification experiment results


