Issue 2
Apr. 2017
YU Chen, MAO Zhe, GAO Song. An Approach of Extracting Information for Maritime Unstructured Text Based on Rules[J]. Journal of Transport Information and Safety, 2017, 35(2): 40-47. doi: 10.3963/j.issn.1674-4861.2017.02.007

An Approach of Extracting Information for Maritime Unstructured Text Based on Rules

doi: 10.3963/j.issn.1674-4861.2017.02.007
  • Publish Date: 2017-04-28
  • Structured processing of maritime data plays an important role in maritime safety. A large amount of maritime-related information is available on the Internet, but most of it is unstructured data in a variety of formats. This paper proposes an approach for extracting maritime information and converting unstructured text into structured data. Web crawlers are used to obtain text data from maritime-related Web pages. According to the definitions of the texts, their content is divided into four items: time, location, vessel name, and type of accident. Based on the extraction process and its common trigger words, a maritime lexicon is constructed for Chinese word segmentation and part-of-speech tagging. Extraction rules are summarized from an analysis of a large number of accident corpora, and the structured maritime data are then produced. To verify the feasibility of this rule-based extraction approach, data from the website of the Yangtze River Maritime Bureau are used as a case study. The results indicate that time information is extracted with a precision of 100% and a recall of 91%; location information with a precision of 94.52% and a recall of 69%; vessel name information with a precision of 97.75% and a recall of 86%; and accident type information with a precision of 96.6% and a recall of 87%.
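
To make the idea of trigger-word rules and the reported precision/recall figures concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the actual lexicon or rules, so the regular expressions, the trigger words ("vessel", "near"), the accident-type list, and the English sample sentence are all hypothetical stand-ins; the paper itself works on segmented and part-of-speech-tagged Chinese text.

```python
import re

# Hypothetical rules for illustration only; the paper's real lexicon and
# rule set are not stated in the abstract.
ACCIDENT_TYPES = ("collision", "grounding", "fire", "sinking")
TIME_RULE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
# Trigger word "vessel" followed by a quoted name.
VESSEL_RULE = re.compile(r'vessel "([^"]+)"')
# Trigger word "near" followed by one or more capitalized words.
LOCATION_RULE = re.compile(r"\bnear ([A-Z][a-z]+(?: [A-Z][a-z]+)*)")


def first_group(rule, text):
    """Return the first captured group of a rule, or None if it does not fire."""
    match = rule.search(text)
    return match.group(1) if match else None


def extract(text):
    """Turn one unstructured report into a record with the four items."""
    lowered = text.lower()
    return {
        "time": first_group(TIME_RULE, text),
        "location": first_group(LOCATION_RULE, text),
        "vessel_name": first_group(VESSEL_RULE, text),
        "accident_type": next((t for t in ACCIDENT_TYPES if t in lowered), None),
    }


def precision_recall(predicted, gold):
    """Precision = correct / extracted; recall = correct / items present in gold.

    A value of None means "nothing extracted" (predicted) or
    "nothing to extract" (gold).
    """
    extracted = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    correct = sum(1 for p, g in extracted if p == g)
    relevant = sum(1 for g in gold if g is not None)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall


sample = 'On 2016-03-14 the vessel "Hai Xing 8" was involved in a collision near Wuhan Bridge.'
record = extract(sample)
```

Under these definitions, a rule that never produces a wrong value but sometimes fails to fire yields perfect precision with lower recall, which is the pattern the abstract reports for time information.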

     


