Semi-supervised patent text classification method based on improved Tri-training algorithm
|
Yun-qing HU, Qing-ying QIU, Xiu YU, Jian-wei WU
|
Tab.5 Comparison results of feature selection on aclImdb dataset (Test 5)

| Classifier | Method | F1 (Dim=150) | F1 (Dim=250) | F1 (Dim=350) | F1 (Dim=450) | F1 (Dim=550) | F1 (Dim=650) | F1 (Dim=750) | F1 (Dim=850) | F1 (Dim=950) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Xgboost | IG_New&Xgboost | 0.665 | 0.670 | 0.670 | 0.675 | 0.708 | 0.704 | 0.700 | 0.700 | 0.700 |
| Xgboost | IG&Xgboost | 0.648 | 0.660 | 0.660 | 0.662 | 0.690 | 0.690 | 0.674 | 0.670 | 0.670 |
| SVM | IG_New&SVM | 0.664 | 0.670 | 0.665 | 0.673 | 0.671 | 0.674 | 0.674 | 0.674 | 0.635 |
| SVM | IG&SVM | 0.595 | 0.652 | 0.584 | 0.594 | 0.635 | 0.604 | 0.585 | 0.592 | 0.592 |
| NB | IG_New&NB | 0.660 | 0.660 | 0.660 | 0.667 | 0.667 | 0.654 | 0.663 | 0.660 | 0.650 |
| NB | IG&NB | 0.625 | 0.594 | 0.610 | 0.653 | 0.660 | 0.622 | 0.600 | 0.615 | 0.602 |