A Rapid Method for Information Extraction from Borehole Log Images

Borehole logs are very important for geological analysis and application. Extracting structured information from borehole logs in the image format is the key to any analysis and application based on borehole data. The current method has defects in solving the beard phenomenon of the borehole log and the identification of special geological symbols. This paper proposes an automatic extraction method for borehole log information by combining the structural analysis based on the corner mark, as well as the structural understanding based on deep learning. The principles and key technologies of the method are described in detail. The performance of the method was tested by specific examples. This method is implemented on a geological information platform called QuantyView. The information extraction of 100 borehole logs with the same specification is used to verify the effectiveness of the proposed method. The results show that the method can not only effectively solve the inconsistency between the thickness and the description information in the borehole log but it can also address the low recognition efficiency of professional vocabulary, which can improve the extraction efficiency and accuracy of the borehole log information.

Download Full-text

A FRAME WORK FOR WEB INFORMATION EXTRACTION AND ANALYSIS

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v7i2.3459 ◽

2013 ◽

Vol 7 (2) ◽

pp. 574-579 ◽

Cited By ~ 3

Author(s):

Dr Sunitha Abburu ◽

G. Suresh Babu

Keyword(s):

Information Extraction ◽

Data Extraction ◽

Research Work ◽

Web Pages ◽

Web Documents ◽

E Learning ◽

Structured Information ◽

Frame Work ◽

Effective Decision ◽

The Web

Day by day the volume of information availability in the web is growing significantly. There are several data structures for information available in the web such as structured, semi-structured and unstructured. Majority of information in the web is presented in web pages. The information presented in web pages is semi-structured.Â But the information required for a context are scattered in different web documents. It is difficult to analyze the large volumes of semi-structured information presented in the web pages and to make decisions based on the analysis. The current research work proposed a frame work for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies Â data extraction, data consolidation, data analysis and decision making based on the information presented in the web pages.The proposed frame work integrates web crawling, information extraction and data mining technologies for better information analysis that helps in effective decision making.Â Â It enables people and organizations to extract information from various sourses of web and to make an effective analysis on the extracted data for effective decision making.Â The proposed frame work is applicable for any application domain. Manufacturing,sales,tourisum,e-learning are various application to menction few.The frame work is implemetnted and tested for the effectiveness of the proposed system and the results are promising.

Download Full-text

Channel waves in cross‐borehole data

Geophysics ◽

10.1190/1.1443247 ◽

1992 ◽

Vol 57 (2) ◽

pp. 334-342 ◽

Cited By ~ 13

Author(s):

Larry R. Lines ◽

Kenneth R. Kelly ◽

John Queen

Keyword(s):

Seismic Velocity ◽

Test Facility ◽

Equation Modeling ◽

Borehole Data ◽

Geological Information ◽

Channel Wave ◽

First Arrival ◽

Borehole Seismic ◽

Seismic Velocity Contrasts ◽

Ray Bending

Layered geological formations with large seismic velocity contrasts can effectively create channel waves in cross‐borehole seismic data. The existence of channel waves for such waveguides can be confirmed by ray tracing, wave equation modeling, and modal analysis. Channel wave arrivals are identified in cross‐borehole data recorded at Conoco’s Newkirk test facility. For these data, where velocity contrasts are about 2 to 1, tomography based on first arrival traveltimes, is limited due to problems with extreme ray bending and seismic shadow zones. However, it may be possible to extract geological information using channel wave information. The seismometer differencing method appears to be a promising approach for detecting waveguide boundaries by use of cross‐borehole data.

Download Full-text

Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries

Journal of the American Medical Informatics Association ◽

10.1136/amiajnl-2011-000776 ◽

2012 ◽

Vol 19 (5) ◽

pp. 824-832 ◽

Cited By ~ 38

Author(s):

Yan Xu ◽

Kai Hong ◽

Junichi Tsujii ◽

Eric I-Chao Chang

Keyword(s):

Machine Learning ◽

Information Extraction ◽

Feature Engineering ◽

Rule Based ◽

Structured Information ◽

Discharge Summaries

Download Full-text

Prediction and Analysis of Extracting Relations using Spacy Model

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f8524.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 3281-3287

Keyword(s):

Natural Language ◽

Information Extraction ◽

Performance Measures ◽

Text Summarization ◽

Language Understanding ◽

Language Generation ◽

Automatic Text Summarization ◽

Structured Information ◽

Automatic Text ◽

F Measure

Text is an extremely rich resources of information. Each and every second, minutes, peoples are sending or receiving hundreds of millions of data. There are various tasks involved in NLP are machine learning, information extraction, information retrieval, automatic text summarization, question-answered system, parsing, sentiment analysis, natural language understanding and natural language generation. The information extraction is an important task which is used to find the structured information from unstructured or semi-structured text. The paper presents a methodology for extracting the relations of biomedical entities using spacy. The framework consists of following phases such as data creation, load and converting the data into spacy object, preprocessing, define the pattern and extract the relations. The dataset is downloaded from NCBI database which contains only the sentences. The created model evaluated with performance measures like precision, recall and f-measure. The model achieved 87% of accuracy in retrieving of entities relation.

Download Full-text

Ekstraksi Informasi Halaman Web Menggunakan Pendekatan Bootstrapping pada Ontology-Based Information Extraction

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.7540 ◽

2015 ◽

Vol 9 (2) ◽

pp. 111 ◽

Cited By ~ 1

Author(s):

Erma Susanti ◽

Khabib Mustofa

Keyword(s):

Information Extraction ◽

Language Processing ◽

Semantic Content ◽

Extraction Process ◽

Web Pages ◽

Structured Information ◽

Improved Performance ◽

Types Of Information ◽

Unstructured Information

AbstrakEkstraksi informasi merupakan suatu bidang ilmu untuk pengolahan bahasa alami, dengan cara mengubah teks tidak terstruktur menjadi informasi dalam bentuk terstruktur. Berbagai jenis informasi di Internet ditransmisikan secara tidak terstruktur melalui website, menyebabkan munculnya kebutuhan akan suatu teknologi untuk menganalisa teks dan menemukan pengetahuan yang relevan dalam bentuk informasi terstruktur. Contoh informasi tidak terstruktur adalah informasi utama yang ada pada konten halaman web. Bermacam pendekatan untuk ekstraksi informasi telah dikembangkan oleh berbagai peneliti, baik menggunakan metode manual atau otomatis, namun masih perlu ditingkatkan kinerjanya terkait akurasi dan kecepatan ekstraksi. Pada penelitian ini diusulkan suatu penerapan pendekatan ekstraksi informasi dengan mengkombinasikan pendekatan bootstrapping dengan Ontology-based Information Extraction (OBIE). Pendekatan bootstrapping dengan menggunakan sedikit contoh data berlabel, digunakan untuk memimalkan keterlibatan manusia dalam proses ekstraksi informasi, sedangkan penggunakan panduan ontologi untuk mengekstraksi classes (kelas), properties dan instance digunakan untuk menyediakan konten semantik untuk web semantik. Pengkombinasian kedua pendekatan tersebut diharapkan dapat meningkatan kecepatan proses ekstraksi dan akurasi hasil ekstraksi. Studi kasus untuk penerapan sistem ekstraksi informasi menggunakan dataset “LonelyPlanet”. Kata kunci—Ekstraksi informasi, ontologi, bootstrapping, Ontology-Based Information Extraction, OBIE, kinerja Abstract Information extraction is a field study of natural language processing by converting unstructured text into structured information. Several types of information on the Internet is transmitted through unstructured information via websites, led to emergence of the need a technology to analyze text and found relevant knowledge into structured information. For example of unstructured information is existing main information on the content of web pages. Various approaches for information extraction have been developed by many researchers, either using manual or automatic method, but still need to be improved performance related accuracy and speed of extraction. This research proposed an approach of information extraction that combines bootstrapping approach with Ontology-Based Information Extraction (OBIE). Bootstrapping approach using small seed of labelled data, is used to minimize human intervention on information extraction process, while the use of guide ontology for extracting classes, properties and instances, using for provide semantic content for semantic web. Combining both approaches expected to increase speed of extraction process and accuracy of extraction results. Case study to apply information extraction system using “LonelyPlanet” datasets. Keywords— Information extraction, ontology, bootstrapping, Ontology-Based Information Extraction, OBIE, performance

Download Full-text

Search Engine-Based Web Information Extraction

Web Technologies ◽

10.4018/978-1-60566-982-3.ch109 ◽

2011 ◽

pp. 2048-2081

Author(s):

Gijs Geleijnse ◽

Jan Korst

Keyword(s):

Semantic Web ◽

Information Extraction ◽

Search Engine ◽

Community Based ◽

Web Information Extraction ◽

Structure Information ◽

Web Information ◽

Structured Information ◽

The Web ◽

Standard Semantic

In this chapter we discuss approaches to find, extract, and structure information from natural language texts on the Web. Such structured information can be expressed and shared using the standard Semantic Web languages and hence be machine interpreted. In this chapter we focus on two tasks in Web information extraction. The first part focuses on mining facts from the Web, while in the second part, we present an approach to collect community-based meta-data. A search engine is used to retrieve potentially relevant texts. From these texts, instances and relations are extracted. The proposed approaches are illustrated using various case-studies, showing that we can reliably extract information from the Web using simple techniques.

Download Full-text

A Structured Information Extraction Algorithm for Scientific Papers based on Feature Rules Learning

Journal of Software ◽

10.4304/jsw.8.1.55-62 ◽

2013 ◽

Vol 8 (1) ◽

Cited By ~ 3

Author(s):

Jianguo Chen ◽

Hao Chen

Keyword(s):

Information Extraction ◽

Extraction Algorithm ◽

Scientific Papers ◽

Structured Information

Download Full-text

Three different approaches to provide urban geological information from a geological survey perspective: the Catalan case study

10.5194/egusphere-egu2020-13620 ◽

2020 ◽

Author(s):

Guillem Subiela ◽

Miquel Vilà ◽

Roser Pi ◽

Elena Sánchez

Keyword(s):

Urban Areas ◽

Geological Survey ◽

Anthropogenic Activity ◽

Geological Environment ◽

Borehole Data ◽

Geological Information ◽

Geological Factors ◽

Geological Map ◽

The Government

Studying urban geology is a key way to identify municipal issues involved with urban development and sustainability, land resources and hazard awareness in highly populated areas. In the last decade, one of the lines of work of the Catalan Geological Survey (Institut Cartogr&#224;fic i Geol&#242;gic de Catalunya) has been the development of (i) 1:5.000 scale Urban Geological Map of Catalonia project. Besides, two pilot projects have recently been started: (ii) the system of layers of geological information and (iii) the fundamental geological guides of municipalities. This communication focuses on the presentation of these projects and their utility, with the aim of finding effective ways of transferring geological knowledge and information of a territory, from a geological survey perspective.The 1:5.000 urban geological maps of Catalonia (i) have been a great ambitious project focused on providing detailed, consistent and accurate geological, geotechnical and anthropogenic activity information of the main urban areas of Catalonia. Nevertheless, it must be taken into account that the compilation and elaboration of a large volume of geological information and also the high level of detail require a lot of time for data completeness.In order to optimize a greater distribution of information, a system of layers of geological information (ii) covering urban areas is being developed. This pilot project consists of providing specific layers of Bedrock materials, Quaternary deposits, anthropogenic grounds, structural measures, geochemical compositions, borehole data and so on. However, as information layers are treated individually, it may not be clear the coherence between data from different layers of information and its use is currently limited to Earth-science professionals working with geological data.Hence, as a strategy to reach a wider range of users and also provide a homogeneous and varied geological information, the development of fundamental geological guides for municipalities is also being carried out (iii). These documents include the general geological characterization of the municipality, the description of the main geological factors (related to geotechnical properties, hydrogeology, environmental concerns and geological hazards and resources) and the list of the sources of geological information to be considered. Moreover, each guide contains a 1:50.000 geological map that has cartographic continuity with the neighbouring municipalities. The municipal guides allow a synthesis of the geological environment of the different Catalan municipalities and give fundamental recommendations for the characterization of the geological environment of the municipality.In conclusion, the three projects facilitate the characterization of geological environment of urban areas, the evaluation of geological factors in ground studies and also, in general, the management of the environment. These products differ depending on the degree of detail, the coherence of the geological information, the necessary knowledge for their execution or their purpose of use. This set of projects defines a geological urban framework, which is adjusted depending on the government&#8217;s requirements, the society&#8217;s needs and the geological survey&#8217;s available resources.

Download Full-text

Structured Information Extraction from Natural Disaster Events on Twitter

ACM Turing Centenary Celebration on - ACM-TURING '12 ◽

10.1145/2663792.2663794 ◽

2014 ◽

Cited By ~ 8

Author(s):

Sandeep Panem ◽

Manish Gupta ◽

Vasudeva Varma

Keyword(s):

Information Extraction ◽

Natural Disaster ◽

Structured Information

Download Full-text

Text Mining: Design of Interactive Search Engine Based Regular Expressions of Online Automobile Advertisements

International Journal of Engineering Pedagogy (iJEP) ◽

10.3991/ijep.v10i3.12419 ◽

2020 ◽

Vol 10 (3) ◽

pp. 35

Author(s):

Ahmed Adeeb Jalal

Keyword(s):

Text Mining ◽

Information Extraction ◽

Search Engine ◽

Language Processing ◽

Research Field ◽

Web Pages ◽

Regular Expressions ◽

Data Volume ◽

Structured Information ◽

Mining Design

Technology world has greatly evolved over the past decades, which led to inflated data volume. This progress of technology in the digital form generated scattered texts across millions of web pages. Unstructured texts contain a vast amount of textual data. Discover of useful and interesting relations from unstructured texts requires more processing by computers. Therefore, text mining and information extraction have become an exciting research field to get structured and valuable information. This paper focuses on text pre-processing of automotive advertisements domains to configure a structured database. The structured database was created by extract the information over unstructured automotive advertisements, which is an area of natural language processing. Information extraction deals with finding factual information in text using learning regular expressions. We manually craft rule-based specific approaches to extract structured information from unstructured web pages. Structured information will be provided by user-friendly search engine designed for topic-specific knowledge. Consequently, this information that extracted from these advertisements uses to perform a structured search over certain interesting attributes. Thus, the tuples are assigned a probability and indexed to support the efficiency of extraction and exploration via user queries.

Download Full-text