Developing an innovative entity extraction method for unstructured data

Author(s):  
Waleed Zaghloul ◽  
Silvana Trimi
Author(s):  
Abeer K. AL-Mashhadany ◽  
Dalal N. Hamood ◽  
Ahmed T. Sadiq Al-Obaidi ◽  
Waleed K. Al-Mashhsdany

Unstructured data has become a challenge because recent years have brought the ability to gather massive amounts of data from annotated documents. This paper is concerned with the analysis of Arabic unstructured text. Manipulating unstructured text and converting it into a form a computer can understand is a high-level aim, and an important step toward it is understanding numerical phrases. This paper aims to extract numerical data from Arabic unstructured text in general. The work attempts to recognize numerical phrases, analyze them, and convert them into integer values. The inference engine is based on Arabic linguistic and morphological rules. The applied method combines the rules for numerical nouns with Arabic morphological rules in order to achieve a highly accurate extraction method. Arithmetic operations are applied to convert a numerical phrase into an integer value; the proper operation is determined by the linguistic and morphological rules. The results show that applying Arabic linguistic rules together with arithmetic operations succeeded in extracting numerical data from Arabic unstructured text with accuracy reaching 100%.
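
The abstract does not give the rule set itself, so the following is only a minimal sketch of the general idea: choosing between addition and multiplication to turn a numerical phrase into an integer. The lexicon, the function names `normalize` and `phrase_to_int`, and the treatment of the conjunction prefix "و" are all assumptions made for illustration, not the authors' inference engine. A real system would also need to handle gender, case, dual forms, and fused words such as "ثلاثمئة".

```python
# Hypothetical mini-lexicon (an assumption, not taken from the paper).
# Values below 1000 are added together; scale words multiply the running group.
UNITS = {
    "واحد": 1, "اثنان": 2, "ثلاثة": 3, "أربعة": 4, "خمسة": 5,
    "ستة": 6, "سبعة": 7, "ثمانية": 8, "تسعة": 9, "عشرة": 10,
    "عشرون": 20, "ثلاثون": 30, "أربعون": 40, "خمسون": 50,
    "مئة": 100, "مئتان": 200,
}
MULTIPLIERS = {"ألف": 1000, "آلاف": 1000, "مليون": 1_000_000}


def normalize(token):
    """Return the lexicon form of a token, stripping a leading conjunction
    'و' only when the token itself is not already a number word."""
    if token in UNITS or token in MULTIPLIERS:
        return token
    if token.startswith("و"):
        rest = token[1:]
        if rest in UNITS or rest in MULTIPLIERS:
            return rest
    return None


def phrase_to_int(phrase):
    """Convert a whitespace-separated numerical phrase into an integer."""
    total = 0   # value of the groups already closed by a scale word
    group = 0   # value accumulated since the last scale word
    for raw in phrase.split():
        word = normalize(raw)
        if word is None:
            raise ValueError(f"not a known number word: {raw!r}")
        if word in MULTIPLIERS:
            # Multiplication rule: a bare scale word counts as one unit of it.
            total += (group or 1) * MULTIPLIERS[word]
            group = 0
        else:
            # Addition rule: units, tens and hundreds simply accumulate.
            group += UNITS[word]
    return total + group


print(phrase_to_int("خمسة وعشرون"))                    # 25
print(phrase_to_int("ثلاثة آلاف ومئة وخمسة وعشرون"))   # 3125
```

Checking the lexicon before stripping the prefix matters because some number words, such as "واحد", themselves begin with "و".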


Author(s):  
Sergey Bratus ◽  
Anna Rumshisky ◽  
Alexy Khrabrov ◽  
Rajenda Magar ◽  
Paul Thompson

Author(s):  
Farhad Abedini ◽  
Fariborz Mahmoudi ◽  
Seyedeh Masoumeh Mirhashem

2021 ◽  
pp. 724-732
Author(s):  
Zeqi Ma ◽  
Lingwei Ma ◽  
Dongmei Fu ◽  
Guangxuan Song ◽  
Dawei Zhang

2014 ◽  
Vol 571-572 ◽  
pp. 1202-1205
Author(s):  
Yuan Sun ◽  
Qian Zhao

Tibetan-Chinese named entity extraction is the foundation of cross-language information processing and provides a basis for research in machine translation and cross-language information retrieval. In this paper, we use the multi-language links of Wikipedia to obtain a Tibetan-Chinese comparable corpus, and combine sentence length, word matching and entity boundary words to obtain parallel sentences. We then extract Tibetan-Chinese named entities from the comparable corpus in three ways: (1) extracting natural labeling information; (2) acquiring the links between Tibetan entries and Chinese entries; (3) using a sequence intersection method, which includes sentence representation, Chinese named entity recognition and intersection of the corresponding Tibetan sentences. Finally, the results show that the extraction method based on the comparable corpus is effective.
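
As an illustration of how sentence length and word matching might be combined to pick parallel sentences, here is a minimal sketch. The scoring functions, the equal weights, and the bilingual `lexicon` mapping are assumptions for illustration; the abstract does not specify the actual formula, and the entity boundary word feature is omitted here.

```python
def length_score(src_len, tgt_len):
    """Ratio of the shorter to the longer sentence length, in [0, 1]."""
    if src_len == 0 or tgt_len == 0:
        return 0.0
    return min(src_len, tgt_len) / max(src_len, tgt_len)


def match_score(src_tokens, tgt_tokens, lexicon):
    """Fraction of source tokens with at least one lexicon translation
    that appears in the target sentence."""
    if not src_tokens:
        return 0.0
    tgt = set(tgt_tokens)
    hits = sum(1 for tok in src_tokens if lexicon.get(tok, set()) & tgt)
    return hits / len(src_tokens)


def pair_score(src_tokens, tgt_tokens, lexicon, w_len=0.5, w_match=0.5):
    """Weighted combination of the two features (weights are hypothetical)."""
    return (w_len * length_score(len(src_tokens), len(tgt_tokens))
            + w_match * match_score(src_tokens, tgt_tokens, lexicon))


# Placeholder tokens stand in for segmented Tibetan and Chinese words.
lexicon = {"a": {"x"}}
print(pair_score(["a", "b"], ["x", "y"], lexicon))  # 0.75
```

Candidate pairs scoring above some threshold would be kept as parallel sentences; the threshold, like the weights, would need tuning on held-out data.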


Author(s):  
Douglas C. Barker

A number of satisfactory methods are available for the electron microscopy of nucleic acids. These methods have concentrated on fragments of nuclear, viral and mitochondrial DNA of less than 50 megadaltons, on denaturation and heteroduplex mapping (Davies et al 1971), or on the interaction between proteins and DNA (Brack and Delain 1975). Less attention has been paid to the experimental criteria necessary for spreading and visualisation by dark field electron microscopy of large intact associations of DNA. This communication will report on those criteria in relation to the ultrastructure of the (approx. 1 × 10⁻¹⁴ g) DNA component of the kinetoplast from Trypanosomes. An extraction method has been developed to eliminate native endonucleases and nuclear contamination and to isolate the kinetoplast DNA (KDNA) as a compact network of high molecular weight. In collaboration with Dr. Ch. Brack (Basel Institute of Immunology), we studied the conditions necessary to prepare this KDNA for dark field electron microscopy using the microdrop spreading technique.


Planta Medica ◽  
2008 ◽  
Vol 74 (09) ◽  
Author(s):  
JR Tormo ◽  
N Tabanera ◽  
D Conway ◽  
P Ramos ◽  
A Redondo ◽  
...  
