An Ontology Approach to Data Integration using Mapping Method

Text mining is a technique for discovering meaningful patterns in available text documents. Discovering patterns in text and associating documents with one another is a well-known problem in data mining. Analyzing text content and categorizing documents is a composite data mining task, for which some approaches are supervised and others unsupervised. The term “Federated Databases” refers to the integration of distributed, autonomous and heterogeneous databases. A federation can, however, also include information systems, not only databases. When integrating data, several issues must be addressed. Here, we focus on the problem of heterogeneity, more specifically on semantic heterogeneity: problems related to semantically equivalent concepts or semantically related/unrelated concepts. To address this problem, we apply ontologies as a tool for data integration. In this paper, we explain this concept and briefly describe a technique for constructing an ontology using a hybrid ontology approach.

Use of text mining techniques for unsupervised organization of digital procedural acts

Revista de Informática Teórica e Aplicada ◽ 10.22456/2175-2745.83581 ◽ 2018 ◽ Vol 25 (4) ◽ pp. 74

Author(s): Alfredo Silveira Araújo Neto ◽ Marcos Negreiros

Keyword(s): Data Mining ◽ Text Mining ◽ Text Documents ◽ Digital Format ◽ Large Databases ◽ Context Data ◽ Self Discovery ◽ The Many ◽ Many Sources ◽ And Storage

Rapid advances in technologies for capturing and storing data in digital format have allowed organizations to accumulate extremely large volumes of information, most of it unstructured data in the form of text. Retrieving useful information from these large repositories, however, has proved very challenging. In this context, data mining is presented as a self-discovery process that acts on large databases and enables knowledge extraction from raw text documents. Among the many sources of textual documents are the electronic diaries of justice, which are intended to officially make public all the acts of the Judiciary. Although publication in digital form has removed imperfections associated with dissemination in print, applying data mining methods could make the analysis of their contents faster. To that end, this article establishes a tool capable of automatically grouping and categorizing digital procedural acts, based on an evaluation of text mining techniques applied to the task of determining groups. In addition, the strategy for defining the descriptors of the groups, which is usually based on the most frequent words in the documents, was evaluated and remodeled to use the most regularly identified concepts in the texts instead of words.

Research for Data Exchange Technology of Heterogeneous Database Based on XML

Key Engineering Materials ◽ 10.4028/www.scientific.net/kem.392-394.903 ◽ 2008 ◽ Vol 392-394 ◽ pp. 903-907 ◽ Cited By ~ 2

Author(s): Ya Ping Wang ◽ Jiang Hua Ge ◽ Jun Peng Shao ◽ S.T. Han ◽ Zhi Qiang Li

Keyword(s): Data Integration ◽ Relational Database ◽ Data Exchange ◽ Mapping Method ◽ Exchange Model ◽ Heterogeneous Databases ◽ Heterogeneous Database ◽ Manufacturing Enterprise ◽ Design And Manufacturing ◽ Manufacturing Software

To address existing problems of data exchange between design and manufacturing software in manufacturing enterprises, this paper proposes an XML-based solution for data exchange between heterogeneous databases. Based on a systematic analysis of current data exchange methods between heterogeneous databases, it constructs an XML-based data exchange model. It mainly presents the mapping method used for data exchange between XML and relational databases and analyzes the algorithm for connecting to a relational database, in order to solve the problem of transparent interoperation between heterogeneous databases and to achieve data integration and sharing across the heterogeneous databases of a manufacturing enterprise. The paper thus provides effective methods for this purpose.
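The abstract does not give the mapping method itself. As a rough illustration of the general idea of mapping between XML and a relational table, here is a minimal Python sketch using only the standard library (`xml.etree.ElementTree` and `sqlite3`). The `<parts>`/`<part>` element names, the `part` table and its columns are hypothetical and are not taken from the paper.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML export from a design tool (names are assumptions).
XML_DOC = """
<parts>
  <part id="P-001"><name>Bracket</name><material>Steel</material></part>
  <part id="P-002"><name>Housing</name><material>Aluminium</material></part>
</parts>
""".strip()


def xml_to_rows(xml_text):
    """Map each <part> element to an (id, name, material) tuple."""
    root = ET.fromstring(xml_text)
    return [
        (part.get("id"), part.findtext("name"), part.findtext("material"))
        for part in root.findall("part")
    ]


def load_into_db(conn, rows):
    """Store the mapped rows in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS part "
        "(id TEXT PRIMARY KEY, name TEXT, material TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO part VALUES (?, ?, ?)", rows)
    conn.commit()


def db_to_xml(conn):
    """Reverse mapping: serialize the table rows back to XML."""
    root = ET.Element("parts")
    query = "SELECT id, name, material FROM part ORDER BY id"
    for pid, name, material in conn.execute(query):
        part = ET.SubElement(root, "part", id=pid)
        ET.SubElement(part, "name").text = name
        ET.SubElement(part, "material").text = material
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_into_db(conn, xml_to_rows(XML_DOC))
    print(db_to_xml(conn))
```

In this sketch XML acts as the neutral exchange format: each system only needs a mapping to and from XML, not to every other database schema.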

Inspecting Hybrid Data Mining Approaches in Decision Support Systems for Humanities Texts Criticism

Iraqi Journal of Science ◽ 10.24996/ijs.2021.62.11.30 ◽ 2021 ◽ pp. 4101-4109

Author(s): Baraa Hasan Hadi ◽ Tareef Kamil Mustafa

Keyword(s): Data Mining ◽ Decision Support ◽ Text Mining ◽ Language Processing ◽ Hybrid Approach ◽ Decision Support Techniques ◽ Relevant Information ◽ Delta Method ◽ Technological Advancement ◽ Text Documents

The majority of systems dealing with natural language processing (NLP) and artificial intelligence (AI) can assist in making automated and automatically-supported decisions. However, these systems may face challenges and difficulties or find it confusing to identify the required information (characterization) for eliciting a decision by extracting or summarizing relevant information from large text documents or colossal content. When obtaining these documents online, for instance from social networking or social media, these sites undergo a remarkable increase in the textual content. The main objective of the present study is to conduct a survey and show the latest developments about the implementation of text-mining techniques in humanities when summarizing and eliciting automated decisions. This process relies on technological advancement and considers (1) the automated-decision support-techniques commonly used in humanities, (2) the performance evolution and the use of the stylometric approach in text-mining, and (3) the comparisons of the results of chunking text by using different attributes in Burrows' Delta method. This study also provides an overview of the efficiency of applying some selected data-mining (DM) methods with various text-mining techniques to support the critics' decision in artistry ‒ one field of humanities. The automatic choice of criticism in this field was supported by a hybrid approach to these procedures.
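For readers unfamiliar with Burrows' Delta, the following is a minimal Python sketch of the commonly described form of the measure: the mean absolute difference between the z-scored relative frequencies of the most frequent words in two texts. It is not the survey's own implementation. Whitespace tokenization, the population standard deviation and the small default `n_words` are simplifying assumptions made here.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in vocab]


def burrows_delta(texts, n_words=5):
    """Return a function delta(i, j) comparing texts[i] and texts[j]."""
    # Vocabulary: the n most frequent words across the whole corpus.
    corpus_counts = Counter(w for t in texts for w in t.lower().split())
    vocab = [w for w, _ in corpus_counts.most_common(n_words)]
    freqs = [relative_freqs(t, vocab) for t in texts]

    # Per-word mean and standard deviation across the corpus.
    columns = list(zip(*freqs))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]

    # Words with zero variance carry no signal; skip them (assumption).
    keep = [k for k, s in enumerate(sigmas) if s > 0]
    z = [[(f[k] - mus[k]) / sigmas[k] for k in keep] for f in freqs]

    def delta(i, j):
        if not keep:
            return 0.0
        return mean(abs(a - b) for a, b in zip(z[i], z[j]))

    return delta


if __name__ == "__main__":
    texts = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "a bird flew over a tree a long way",
    ]
    delta = burrows_delta(texts)
    print(delta(0, 1), delta(0, 2))
```

A smaller Delta indicates more similar word-frequency profiles. In practice the vocabulary is usually much larger than five words.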

TEXT AND DATA MINING TECHNIQUES IN ASPECT OF KNOWLEDGE ACQUISITION FOR DECISION SUPPORT SYSTEM IN CONSTRUCTION INDUSTRY / DUOMENŲ RINKIMO METODAI STATYBOS SPRENDIMŲ PARAMOS SISTEMAI

Technological and Economic Development of Economy ◽ 10.3846/tede.2010.14 ◽ 2010 ◽ Vol 16 (2) ◽ pp. 219-232 ◽ Cited By ~ 25

Author(s): Marcin Gajzler

Keyword(s): Data Mining ◽ Decision Support ◽ Text Mining ◽ Knowledge Acquisition ◽ Construction Industry ◽ Support System ◽ Text Documents ◽ Fundamental Feature ◽ Mining Technique ◽ Text And Data Mining

This article presents the possibilities of using mining techniques in building Decision Support Systems. One of the biggest problems is the acquisition of data and knowledge, their mutual representation and reciprocal usage. Data and knowledge make up the resources of the system and are its key link. It has been estimated that 70% to 80% of the sources available for general use are text documents. Text mining is defined as a process that aims to extract previously unknown information from text resources (e.g. technological cards). The fundamental feature of text mining is the ability to convert text documents into a formal form, which opens up great possibilities for further analysis. This article presents selected IT tools that use text mining, along with the elements of text mining analysis. The main objectives are to simplify, automate and shorten the process of knowledge acquisition and to create ready-made models containing knowledge. Previous approaches to knowledge acquisition (surveys, questionnaires) were time-consuming and demanding for experts.

Summary (translated from Lithuanian): The article presents the possibilities of applying information gathering methods to decision support systems in construction. Most problems arise from obtaining information and from representing and using it properly. Data are the main resource of the system. It has been established that 70 to 80% of all available general-use information sources are text documents. Text information gathering is understood as a process that aims to extract previously unknown information from text documents (for example, technological cards). The main feature of this technique is the ability to present the information in text documents in a formalized form, which opens up wide possibilities for further analysis. The article presents selected IT tools used to gather textual information. The author's aim is to simplify information gathering, to automate and shorten it, and to create models containing information. Earlier methods of collecting information (surveys, questionnaires) required a great deal of expert work and time.

A Text and Data Analytics Approach to Enrich the Quality of Unstructured Research Information

Computer and Information Science ◽ 10.5539/cis.v12n4p84 ◽ 2019 ◽ Vol 12 (4) ◽ pp. 84 ◽ Cited By ~ 1

Author(s): Otmane Azeroual

Keyword(s): Data Mining ◽ Information Systems ◽ Text Mining ◽ Semantic Integration ◽ Sources Of Information ◽ Academic Institutions ◽ Quality Model ◽ Research Information ◽ The Individual

With the increased accessibility of research information, demands are growing on research information systems (RIS), which are expected to generate and process knowledge automatically. Furthermore, the quality of the RIS data entries from the individual sources of information causes problems. If the data in a RIS is structured, users can read and filter it to meet their information and knowledge needs without any problems. The technique that allows text databases and text sources to be analyzed, and knowledge to be extracted from unknown texts, is referred to as text mining or text data mining and is based on the principles of data mining. Text mining allows large heterogeneous sources of research information to be classified automatically and assigned to specific topics. Research information has always played a major role in higher education and academic institutions, although it has usually been available in RIS in unstructured form and grows faster than structured data. Searching it can waste the time of RIS staff at universities and can lead to bad decision-making. For this reason, the present paper proposes a new approach to obtaining structured research information from heterogeneous information systems. It is a subset of an approach to the semantic integration of unstructured data, using a RIS as the example. The purpose of this paper is to investigate text and data mining methods in the context of RIS and to develop a quality improvement model that helps universities and academic institutions using RIS to enrich unstructured research information.

Textual Data Mining For Knowledge Discovery and Data Classification: A Comparative Study

European Scientific Journal ESJ ◽ 10.19044/esj.2017.v13n21p429 ◽ 2017 ◽ Vol 13 (21) ◽ pp. 429

Author(s): Nadeem Ur-Rahman

Keyword(s): Data Mining ◽ Text Mining ◽ Language Processing ◽ Business Processes ◽ Decision Makers ◽ Text Documents ◽ Textual Data ◽ Statistical Natural Language Processing ◽ Mining Methods ◽ Many Sources

Business Intelligence solutions are key to enable industrial organisations (either manufacturing or construction) to remain competitive in the market. These solutions are achieved through analysis of data which is collected, retrieved and re-used for prediction and classification purposes. However many sources of industrial data are not being fully utilised to improve the business processes of the associated industry. It is generally left to the decision makers or managers within a company to take effective decisions based on the information available throughout product design and manufacture or from the operation of business or production processes. Substantial efforts and energy are required in terms of time and money to identify and exploit the appropriate information that is available from the data. Data Mining techniques have long been applied mainly to numerical forms of data available from various data sources but their applications to analyse semi-structured or unstructured databases are still limited to a few specific domains. The applications of these techniques in combination with Text Mining methods based on statistical, natural language processing and visualisation techniques could give beneficial results. Text Mining methods mainly deal with document clustering, text summarisation and classification and mainly rely on methods and techniques available in the area of Information Retrieval (IR). These help to uncover the hidden information in text documents at an initial level. This paper investigates applications of Text Mining in terms of Textual Data Mining (TDM) methods which share techniques from IR and data mining. These techniques may be implemented to analyse textual databases in general but they are demonstrated here using examples of Post Project Reviews (PPR) from the construction industry as a case study. The research is focused on finding key single or multiple term phrases for classifying the documents into two classes i.e. good information and bad information documents to help decision makers or project managers to identify key issues discussed in PPRs which can be used as a guide for future project management process.

Data mining in healthcare information systems: Case studies in Northern Lebanon

The Third International Conference on e-Technologies and Networks for Development (ICeND2014) ◽ 10.1109/icend.2014.6991370 ◽ 2014 ◽ Cited By ~ 1

Author(s): Ahmad Shahin ◽ Walid Moudani ◽ Fadi Chakik ◽ Mohamad Khalil

Keyword(s): Data Mining ◽ Information Systems ◽ Case Studies ◽ Healthcare Information ◽ Healthcare Information Systems

Plagiarism Detection Process using Data Mining Techniques

International Journal of Recent Contributions from Engineering Science & IT (iJES) ◽ 10.3991/ijes.v5i4.7869 ◽ 2017 ◽ Vol 5 (4) ◽ pp. 68

Author(s): Mahwish Abid ◽ Muhammad Usman ◽ Muhammad Waleed Ashraf

Keyword(s): Data Mining ◽ Text Mining ◽ Computer Systems ◽ Plagiarism Detection ◽ Data Mining Techniques ◽ Detection Process ◽ Using Data ◽ Day By Day

As technology grows rapidly and the use of computer systems increases, plagiarism is becoming more common day by day. Plagiarism is the wrongful appropriation of someone else's work. Detecting plagiarism manually is difficult, so the process should be automated. Various tools can be used for plagiarism detection: some work on intrinsic plagiarism, while others work on extrinsic plagiarism. Data mining is a field that can help detect plagiarism and improve the efficiency of the process. Different data mining techniques can be used to detect plagiarism; text mining, clustering, bi-grams, tri-grams and n-grams are among the techniques that can help in this process.
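As a simple illustration of how n-grams can support extrinsic plagiarism detection, the sketch below compares two texts by the Jaccard overlap of their word tri-grams. This is a generic technique chosen here for illustration, not the paper's specific process. The function names and the whitespace tokenization are assumptions.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (whitespace tokenization)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(a, b, n=3):
    """Jaccard similarity of the word n-gram sets of two texts."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0  # at least one text is too short to form an n-gram
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    source = "data mining can help detect plagiarism in documents"
    suspect = "text mining can help detect plagiarism quickly"
    print(ngram_overlap(source, suspect))  # 3 shared of 8 distinct tri-grams
```

A score near 1 suggests heavy verbatim reuse, while a score near 0 suggests little shared phrasing. Any threshold for flagging a document would need to be tuned on real data.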
