An Ontology Approach to Data Integration using Mapping Method

Author(s):  
Dr.A.Mekala

Text mining is a technique for discovering meaningful patterns in available text documents. Discovering patterns in text and associating documents with one another is a well-known problem in data mining. Analyzing text content and categorizing documents is a composite data mining task, and documents can be organized in either a supervised or an unsupervised manner. The term “Federated Databases” refers to the information integration of distributed, autonomous and heterogeneous databases. However, a federation can also include information systems, not only databases. When integrating data, several issues must be addressed. Here, we focus on the problem of heterogeneity, more specifically on semantic heterogeneity, that is, problems related to semantically equivalent concepts or semantically related/unrelated concepts. To address this problem, we apply the idea of ontologies as a tool for data integration. In this paper, we clarify this concept and briefly explain a technique for constructing an ontology by using a hybrid ontology approach.
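As a rough illustration of how ontologies can reconcile semantically equivalent concepts, the sketch below maps attribute names from two sources onto a shared vocabulary. It is not the paper's method: the source names, attribute names and concept labels are all hypothetical, and the flat dictionaries merely stand in for the local ontologies that a hybrid approach would build on a shared vocabulary.

```python
# Hypothetical shared vocabulary that every local ontology refers to.
SHARED_VOCABULARY = {"Person.name", "Person.birth_date"}

# Hypothetical local ontologies: each source maps its own terms to shared concepts.
LOCAL_ONTOLOGIES = {
    "hr_db": {"emp_name": "Person.name", "dob": "Person.birth_date"},
    "crm_db": {"customer": "Person.name", "born_on": "Person.birth_date"},
}


def to_shared(source, record):
    """Rename a record's attributes to shared concepts; unmapped attributes are dropped."""
    mapping = LOCAL_ONTOLOGIES[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def semantically_equivalent(source_a, term_a, source_b, term_b):
    """Two local terms are treated as equivalent if they map to the same shared concept."""
    concept_a = LOCAL_ONTOLOGIES[source_a].get(term_a)
    concept_b = LOCAL_ONTOLOGIES[source_b].get(term_b)
    return concept_a is not None and concept_a == concept_b


# Records with different local attribute names end up in one common shape.
hr_row = to_shared("hr_db", {"emp_name": "Ada", "dob": "1990-01-01", "badge": 7})
crm_row = to_shared("crm_db", {"customer": "Ada", "born_on": "1990-01-01"})
```

Here `hr_row` and `crm_row` use the same keys after mapping, so records from both sources can be compared or merged without either source changing its own schema.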

2018 ◽  
Vol 25 (4) ◽  
pp. 74
Author(s):  
Alfredo Silveira Araújo Neto ◽  
Marcos Negreiros

Rapid advances in technologies for capturing and storing data in digital format have allowed organizations to accumulate an extremely high volume of information, most of it in unstructured format, represented by texts. However, retrieving useful information from these large repositories has proven very challenging. In this context, data mining is presented as a process of automated discovery that acts on large databases and enables knowledge extraction from raw text documents. Among the many sources of textual documents are the electronic diaries of justice, which are intended to officially make public all acts of the Judiciary. Although publication in digital form has removed imperfections associated with dissemination in printed format, the application of data mining methods could make the analysis of their contents faster. In this sense, this article establishes a tool capable of automatically grouping and categorizing digital procedural acts, based on an evaluation of text mining techniques applied to the task of determining groups. In addition, the strategy for defining the group descriptors, which is usually based on the most frequent words in the documents, was evaluated and remodeled to use, instead of words, the most regularly identified concepts in the texts.


2008 ◽  
Vol 392-394 ◽  
pp. 903-907 ◽  
Author(s):  
Ya Ping Wang ◽  
Jiang Hua Ge ◽  
Jun Peng Shao ◽  
S.T. Han ◽  
Zhi Qiang Li

To address existing problems of data exchange between design and manufacturing software in manufacturing enterprises, this paper proposes an XML-based solution for data exchange between heterogeneous databases. Building on a systematic analysis of current data exchange methods between heterogeneous databases, it constructs an XML-based data exchange model. The paper mainly presents the mapping method used for data exchange between XML and relational databases and analyzes the algorithm for connecting to relational databases, in order to solve the problem of transparent interoperation of heterogeneous databases and to realize data integration and sharing among the heterogeneous databases of manufacturing enterprises. The approach thus offers an effective method for such data exchange.
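A mapping between XML and a relational table can be sketched as follows. This is only a minimal illustration using Python's standard `xml.etree.ElementTree` and `sqlite3` modules, not the paper's algorithm: the `<parts>` document, the `part` table and its columns are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical exchange document, e.g. exported by a design tool.
XML_DOC = """
<parts>
  <part id="1"><name>bolt</name><material>steel</material></part>
  <part id="2"><name>washer</name><material>brass</material></part>
</parts>
"""


def xml_to_rows(xml_text):
    """Map each <part> element to one (id, name, material) tuple."""
    root = ET.fromstring(xml_text.strip())
    return [
        (int(part.get("id")), part.findtext("name"), part.findtext("material"))
        for part in root.findall("part")
    ]


def rows_to_xml(rows):
    """Inverse mapping: rebuild the XML document from relational rows."""
    root = ET.Element("parts")
    for part_id, name, material in rows:
        part = ET.SubElement(root, "part", id=str(part_id))
        ET.SubElement(part, "name").text = name
        ET.SubElement(part, "material").text = material
    return ET.tostring(root, encoding="unicode")


# Load the XML into an in-memory relational table, then export it again.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, material TEXT)")
conn.executemany("INSERT INTO part VALUES (?, ?, ?)", xml_to_rows(XML_DOC))
stored = conn.execute("SELECT id, name, material FROM part ORDER BY id").fetchall()
exported = rows_to_xml(stored)
```

Keeping the two directions as separate functions means either side of the exchange can be swapped out, for example by replacing SQLite with another relational database, without touching the XML format.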


2021 ◽  
pp. 4101-4109
Author(s):  
Baraa Hasan Hadi ◽  
Tareef Kamil Mustafa

The majority of systems dealing with natural language processing (NLP) and artificial intelligence (AI) can assist in making automated and automatically supported decisions. However, these systems may struggle to identify the information (characterization) required for eliciting a decision when extracting or summarizing relevant information from large text documents or colossal content. When these documents are obtained online, for instance from social networking or social media sites, the textual content of these sites is increasing remarkably. The main objective of the present study is to conduct a survey and show the latest developments in the implementation of text-mining techniques in the humanities for summarizing and eliciting automated decisions. This process relies on technological advancement and considers (1) the automated decision-support techniques commonly used in the humanities, (2) the performance evolution and the use of the stylometric approach in text mining, and (3) comparisons of the results of chunking text by using different attributes in Burrows' Delta method. This study also provides an overview of the efficiency of applying some selected data-mining (DM) methods with various text-mining techniques to support critics' decisions in artistry, one field of the humanities. The automatic choice of criticism in this field was supported by a hybrid approach to these procedures.
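For readers unfamiliar with Burrows' Delta, the measure can be sketched as the mean absolute difference between the z-scored frequencies of the most frequent words in two texts. The version below is a simplified sketch under stated assumptions, not the study's implementation: it tokenizes on whitespace, uses the population standard deviation, expects non-empty texts, and the function and parameter names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in one (non-empty) text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocab]


def burrows_delta(texts, target, candidate, n_words=5):
    """Mean absolute difference of z-scored word frequencies between two texts.

    texts: dict mapping a label to a text; together they form the reference corpus.
    """
    corpus_counts = Counter(" ".join(texts.values()).lower().split())
    vocab = [word for word, _ in corpus_counts.most_common(n_words)]
    freqs = {label: relative_freqs(text, vocab) for label, text in texts.items()}
    differences = []
    for i in range(len(vocab)):
        column = [f[i] for f in freqs.values()]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:  # word used identically in every text: carries no signal
            continue
        z_target = (freqs[target][i] - mu) / sigma
        z_candidate = (freqs[candidate][i] - mu) / sigma
        differences.append(abs(z_target - z_candidate))
    return mean(differences) if differences else 0.0


# Toy corpus: "a" and "b" share their frequent words, "c" does not.
corpus = {
    "a": "the cat and the dog and the bird",
    "b": "the cat and the dog and the fish",
    "c": "of of of a a cat to to",
}
```

A smaller Delta suggests a closer stylistic match, so in the toy corpus `burrows_delta(corpus, "a", "b")` comes out lower than `burrows_delta(corpus, "a", "c")`. Changing `n_words` or the chunking of the input texts changes which attributes enter the comparison.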


2010 ◽  
Vol 16 (2) ◽  
pp. 219-232 ◽  
Author(s):  
Marcin Gajzler

This article presents the possibilities of using mining techniques in building Decision Support Systems. One of the biggest problems is the issue of gaining data and knowledge, their mutual representation and reciprocal usage. Data and knowledge make up the resources of the system and are its key link. It has been estimated that 70% to 80% of the sources available for general use are text documents. The text mining technique is defined as a process aiming to extract previously unknown information from text resources (e.g. technological cards). The fundamental feature of text mining is the ability to convert text documents into a formal form, which opens up great possibilities for conducting further analysis. This article presents chosen IT tools using the text mining technique, along with the elements of the text mining analysis. The main objectives are the simplification of the process of knowledge acquisition, its automation and shortening, as well as the creation of ready-made models containing knowledge. Previous tests with knowledge acquisition (surveys, questionnaires) were time-consuming and exacting for experts.
Summary (translated from the Lithuanian "Santrauka"): The article presents the possibilities of applying information mining methods to decision support systems in construction. Most problems arise from acquiring information and from representing and using it appropriately. Data are the system's main resource. It has been established that 70 to 80% of all available general-use information sources are text documents. The text mining technique is understood as a process that aims to extract previously unknown information from text documents (for example, technological cards). The main feature of this technique is the ability to present the information in text documents in a formalized form, which opens up broad possibilities for further analysis. This article presents selected IT tools used for text mining. The author's aim is to simplify information gathering, to automate and shorten it, and to create models that contain the information. Earlier methods of gathering information (surveys, questionnaires) required a great deal of expert work and time.


2019 ◽  
Vol 12 (4) ◽  
pp. 84 ◽  
Author(s):  
Otmane Azeroual

With the increased accessibility of research information, the demands on research information systems (RIS) that are expected to automatically generate and process knowledge are increasing. Furthermore, the quality of the RIS data entries from the individual sources of information causes problems. If the data in a RIS is structured, users can read and filter it to meet their information and knowledge needs without any problems. The technique that allows text databases and text sources to be analyzed, and knowledge to be extracted from unknown texts, is referred to as text mining or text data mining and is based on the principles of data mining. Text mining allows large heterogeneous sources of research information to be classified automatically and assigned to specific topics. Research information has always played a major role in higher education and academic institutions, although it has usually been available in RIS in unstructured form, which grows faster than structured data. Searching it can waste the time of RIS staff in universities and can lead to bad decision-making. For this reason, the present paper proposes a new approach to obtaining structured research information from heterogeneous information systems. It is a subset of an approach to the semantic integration of unstructured data using the example of a RIS. The purpose of this paper is to investigate text and data mining methods in the context of RIS and to develop a quality improvement model as an aid for universities and academic institutions that use RIS to enrich unstructured research information.


2017 ◽  
Vol 13 (21) ◽  
pp. 429
Author(s):  
Nadeem Ur-Rahman

Business Intelligence solutions are key to enabling industrial organisations (in either manufacturing or construction) to remain competitive in the market. These solutions are achieved through the analysis of data that is collected, retrieved and re-used for prediction and classification purposes. However, many sources of industrial data are not being fully utilised to improve the business processes of the associated industry. It is generally left to the decision makers or managers within a company to take effective decisions based on the information available throughout product design and manufacture or from the operation of business or production processes. Substantial effort, in terms of time and money, is required to identify and exploit the appropriate information that is available from the data. Data Mining techniques have long been applied mainly to numerical forms of data available from various data sources, but their application to semi-structured or unstructured databases is still limited to a few specific domains. Applying these techniques in combination with Text Mining methods based on statistical, natural language processing and visualisation techniques could give beneficial results. Text Mining methods mainly deal with document clustering, text summarisation and classification, and rely largely on methods and techniques available in the area of Information Retrieval (IR). These help to uncover the hidden information in text documents at an initial level. This paper investigates applications of Text Mining in terms of Textual Data Mining (TDM) methods, which share techniques from IR and data mining. These techniques may be implemented to analyse textual databases in general, but they are demonstrated here using examples of Post Project Reviews (PPR) from the construction industry as a case study. The research is focused on finding key single or multiple term phrases for classifying the documents into two classes, i.e. good information and bad information documents, to help decision makers or project managers identify key issues discussed in PPRs, which can be used as a guide for future project management processes.


Author(s):  
Mahwish Abid ◽  
Muhammad Usman ◽  
Muhammad Waleed Ashraf

As technology grows rapidly and the use of computer systems increases, plagiarism is becoming more common day by day. Plagiarism is the wrongful appropriation of someone else's work. Manual detection of plagiarism is difficult, so the process should be automated. Various tools can be used for plagiarism detection: some work on intrinsic plagiarism, while others work on extrinsic plagiarism. Data mining is a field that can help detect plagiarism and improve the efficiency of the process. Different data mining techniques can be used, including text mining, clustering, and bi-grams, tri-grams and n-grams.
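To make the n-gram idea concrete, the sketch below scores the overlap between two documents by comparing their sets of word n-grams. It is only an illustrative assumption of how such a check might look: the Jaccard measure, the whitespace tokenization and the function names are choices made for this example and are not taken from the paper.

```python
def word_ngrams(text, n):
    """Set of word n-grams in a text (lowercased, whitespace tokenization)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(doc_a, doc_b, n=3):
    """Jaccard similarity of the two documents' n-gram sets, between 0 and 1."""
    grams_a, grams_b = word_ngrams(doc_a, n), word_ngrams(doc_b, n)
    if not grams_a and not grams_b:
        return 0.0  # both texts shorter than n words: nothing to compare
    return len(grams_a & grams_b) / len(grams_a | grams_b)


source = "wrongful appropriation of someone else's work is known as plagiarism"
suspect = "the wrongful appropriation of someone else's work is called plagiarism"
unrelated = "relational databases store rows in tables"
```

A lightly reworded copy such as `suspect` still shares several tri-grams with `source`, while `unrelated` shares none. Smaller values of `n` (bi-grams) tolerate more rewording at the cost of more accidental matches.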

