Frontiers of Information Technology & Electronic Engineering  2010, Vol. 11 Issue (11): 882-892    DOI: 10.1631/jzus.C1001007
    
Jie Yuan, Bao-gang Wei, Li-dong Wang, Wei-ming Lu, Yue-ting Zhuang
CMSOF: a structured data organization framework for scanned Chinese medicine books in digital libraries
School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract: Organizing unstructured information from books into a well-defined structure is a significant challenge in digital libraries. Most digital libraries provide search services only at the granularity of whole books, and few allow books to be accessed at the granularity of chapters, because manually constructing directory information for books is time-consuming. Extracting structured data from scanned books therefore remains an urgent and important task. In this paper, we propose a novel structured data organization framework called CMSOF to organize scanned data automatically, and apply it to a Chinese medicine digital library. In the framework, image blocks and text blocks on each scanned page are first separated using either the gray histogram projection method or a hybrid method that combines region growing with an AdaBoost classifier. The text structure is then obtained from the text blocks by recognizing text size and font type. Finally, image blocks and the structured OCRed text are correlated at the semantic level. By integrating the structured data into a Chinese medicine information system (CMIS), we can organize the Chinese medicine books well and users can access them flexibly, which indicates that CMSOF is an efficient framework for organizing books that mix images and text.
Key words: Digital library    Chinese medicine    Structured data organization    Cross media    Image separation
Received: 2010-09-14    Published: 2010-11-04
CLC:  TP391.4  
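
The gray histogram projection step named in the abstract can be illustrated with a small sketch. The abstract does not give the authors' actual procedure, so everything below is an assumption made for illustration: the function names, the ink threshold, and in particular the height-based rule for telling text bands from image bands are hypothetical and are not CMSOF's method. The sketch counts dark pixels per row of a grayscale page, cuts the page at empty rows, and labels each resulting band.

```python
from typing import List, Tuple

Page = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def row_profile(page: Page, ink_threshold: int = 128) -> List[int]:
    """Horizontal projection: number of dark pixels in each row."""
    return [sum(1 for px in row if px < ink_threshold) for row in page]


def split_bands(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, that contain ink."""
    bands: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a band begins at the first inked row
        elif count < min_ink and start is not None:
            bands.append((start, y))       # a blank row closes the band
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def label_band(band: Tuple[int, int], text_max_height: int = 3) -> str:
    """Assumed heuristic: short bands are text lines, tall bands are images."""
    start, end = band
    return "text" if end - start <= text_max_height else "image"


# Toy 10x6 page: a two-row "text line" followed by a five-row "image".
W, B = 255, 0
page = [
    [W, W, W, W, W, W],
    [W, B, B, W, B, W],
    [W, B, W, B, B, W],
    [W, W, W, W, W, W],
    [W, B, B, B, B, W],
    [W, B, B, B, B, W],
    [W, B, B, B, B, W],
    [W, B, B, B, B, W],
    [W, B, B, B, B, W],
    [W, W, W, W, W, W],
]

bands = split_bands(row_profile(page))
labels = [label_band(b) for b in bands]
print(list(zip(bands, labels)))
```

On real scans, a projection profile of this kind is usually applied after binarization and deskewing, and a vertical profile is often used inside each band to split columns. A height rule alone would be too crude for real pages, which may be why the abstract also mentions the region growing and classifier alternative.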

Cite this article:

Jie Yuan, Bao-gang Wei, Li-dong Wang, Wei-ming Lu, Yue-ting Zhuang. CMSOF: a structured data organization framework for scanned Chinese medicine books in digital libraries. Front. Inform. Technol. Electron. Eng., 2010, 11(11): 882-892.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C1001007
http://www.zjujournals.com/xueshu/fitee/CN/Y2010/V11/I11/882
