MalayIK: An Ontological Approach to Knowledge Transformation in Malay Unstructured Documents

Author(s):  
Fatimah Sidi ◽  
Iskandar Ishak ◽  
Marzanah A. Jabar

An enormous number of unstructured documents written in the Malay language is available on the web and on intranets. However, unstructured documents cannot be queried in simple ways, so the knowledge they contain can neither be used by automatic systems nor be easily and clearly understood by humans. This paper proposes a new approach that uses an ontology to transform knowledge extracted from Malay unstructured documents by identifying, organizing, and structuring the documents into an interrogative structured form. A Malay knowledge base, the MalayIK corpus, is developed and used to test MalayIK-Ontology against Ontos, an existing data extraction engine. The experimental results show that MalayIK-Ontology achieves a significant improvement in knowledge extraction over the Ontos implementation. This indicates that clear knowledge organization and concept structuring can increase understanding, which in turn makes concepts more sharable and reusable within the community.

Author(s):  
Tianxing Wu ◽  
Guilin Qi ◽  
Bin Luo ◽  
Lei Zhang ◽  
Haofen Wang

Extracting knowledge from Wikipedia has attracted much attention over the last ten years. One of the most valuable kinds of knowledge is type information, i.e. axioms stating that an instance is of a certain type. Current approaches for inferring the types of instances from Wikipedia mainly rely on language-specific rules. Since these rules cannot capture the semantic associations between instances and classes (i.e. candidate types), they may lead to mistakes and omissions during type inference. The authors propose a new approach that leverages attributes to perform language-independent type inference for Wikipedia instances. The approach is applied to the whole English and Chinese Wikipedia, resulting in the first version of MulType (Multilingual Type Information), a knowledge base describing the types of instances from multilingual Wikipedia. Experimental results show that the proposed approach outperforms state-of-the-art comparison methods and that MulType contains a large amount of new, high-quality type information.
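
As a rough illustration of attribute-based type inference (not the authors' actual MulType algorithm), the sketch below scores each candidate class by how characteristic the instance's infobox attributes are of that class. The class profiles, threshold, and toy data are all invented for the example.

```python
# Minimal sketch, assuming a class "profile" = attribute counts observed on known
# instances of that class. Score = average frequency of the instance's attributes
# within the candidate class; classes above a threshold are returned as types.

def attribute_overlap_score(instance_attrs, class_attr_counts, class_size):
    """Fraction of the instance's attributes that are characteristic of the class."""
    if not instance_attrs:
        return 0.0
    weight = sum(class_attr_counts.get(a, 0) / class_size for a in instance_attrs)
    return weight / len(instance_attrs)

def infer_types(instance_attrs, class_profiles, threshold=0.3):
    """Return candidate classes whose attribute profile matches the instance."""
    scores = {
        cls: attribute_overlap_score(instance_attrs, p["attr_counts"], p["size"])
        for cls, p in class_profiles.items()
    }
    return sorted((c for c, s in scores.items() if s >= threshold),
                  key=lambda c: scores[c], reverse=True)

# Toy example: an instance with person-like infobox attributes.
class_profiles = {
    "Person": {"attr_counts": {"birth_date": 900, "occupation": 700}, "size": 1000},
    "City":   {"attr_counts": {"population": 950, "mayor": 400}, "size": 1000},
}
print(infer_types({"birth_date", "occupation"}, class_profiles))  # ['Person']
```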


Author(s):  
Quanzhi Li ◽  
Yi-fang Brook Wu

This chapter presents a new approach to mining the Web to identify people with similar backgrounds. Finding people on the Web who are similar to a given person raises two major research issues: person representation and person matching. The chapter proposes a person representation method that uses a person's personal Web site to represent his or her background. Based on this representation, the main proposed algorithm integrates the textual content and hyperlink information of all the Web pages belonging to a personal Web site to represent a person and to match persons. Other algorithms are also explored and compared with the main proposed algorithm. The evaluation methods and experimental results are presented.
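
As an illustration of combining textual content with hyperlink information into a single similarity score (a simplified stand-in, not the chapter's exact formulation), the sketch below mixes a cosine similarity over page text with a Jaccard similarity over outgoing links. The weight alpha and the toy profiles are assumptions.

```python
# Minimal sketch: represent each person by the aggregated text and outgoing links
# of their personal Web site, then blend text and link similarity.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def person_similarity(p1, p2, alpha=0.7):
    """Weighted mix of textual similarity and hyperlink overlap (alpha is an assumption)."""
    text_sim = cosine(Counter(p1["text"].lower().split()), Counter(p2["text"].lower().split()))
    link_sim = jaccard(set(p1["links"]), set(p2["links"]))
    return alpha * text_sim + (1 - alpha) * link_sim

alice = {"text": "information retrieval web mining research", "links": {"acm.org", "sigir.org"}}
bob   = {"text": "web mining and information retrieval papers", "links": {"sigir.org"}}
print(round(person_similarity(alice, bob), 3))
```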


2020 ◽  
Vol 4 (2) ◽  
pp. 69
Author(s):  
Agung Purnomo Sidik

This research implements the Bayes algorithm in an expert system for diagnosing diseases of cassava plants. The research data were taken from the Binjai City Agriculture and Fisheries Office in 2018. The expert system is web-based, with the application built using the PHP programming language and a MySQL DBMS. The results show that the Bayes algorithm can be used in an expert system application to diagnose cassava plant diseases. In the Bayes algorithm, the knowledge base is derived from counts of cassava plants recorded as suffering from each disease, so the diagnosis of a cassava plant is grounded in existing data. Consequently, the more case data used to build the knowledge base, the better the diagnosis results.
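
A minimal naive Bayes sketch of the kind of inference such an expert system performs is shown below: the posterior probability of each disease given the observed symptoms is estimated from counts of past cases. The diseases, symptoms, and counts are invented for illustration and do not come from the Binjai dataset (the actual system is implemented in PHP/MySQL; Python is used here only for the sketch).

```python
# Rank diseases by P(disease | symptoms) under a naive Bayes assumption,
# estimating priors and likelihoods from case counts in the knowledge base.

def diagnose(symptoms, case_counts, symptom_counts, total_cases):
    """Return (disease, posterior) pairs sorted by posterior probability."""
    posteriors = {}
    for disease, n_d in case_counts.items():
        prob = n_d / total_cases  # prior P(disease)
        for s in symptoms:
            # Laplace-smoothed likelihood P(symptom | disease)
            prob *= (symptom_counts[disease].get(s, 0) + 1) / (n_d + 2)
        posteriors[disease] = prob
    z = sum(posteriors.values()) or 1.0
    return sorted(((d, p / z) for d, p in posteriors.items()), key=lambda x: -x[1])

# Illustrative counts only.
case_counts = {"mosaic_virus": 40, "bacterial_blight": 25}
symptom_counts = {
    "mosaic_virus": {"leaf_mottling": 35, "stunted_growth": 20},
    "bacterial_blight": {"leaf_wilting": 22, "stem_lesions": 15},
}
print(diagnose({"leaf_mottling", "stunted_growth"}, case_counts, symptom_counts, 65))
```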


1999 ◽  
Author(s):  
C. Dumschat ◽  
J. Callaghan ◽  
R. Cockerline ◽  
L. Davison

2013 ◽  
Vol 7 (2) ◽  
pp. 574-579 ◽  
Author(s):  
Dr Sunitha Abburu ◽  
G. Suresh Babu

The volume of information available on the web is growing significantly day by day. Web information appears in several forms: structured, semi-structured, and unstructured. Most of it is presented in web pages, where it is semi-structured, and the information required for a given context is often scattered across different web documents. It is difficult to analyze such large volumes of semi-structured information and to make decisions based on that analysis. This work proposes a framework for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis, simplifying data extraction, data consolidation, data analysis, and decision making based on the information presented in web pages. The framework integrates web crawling, information extraction, and data mining technologies to support better information analysis and thus more effective decision making, enabling people and organizations to extract information from various web sources and analyze it effectively. It is applicable to any application domain; manufacturing, sales, tourism, and e-learning are a few examples. The framework has been implemented and tested for effectiveness, and the results are promising.


Author(s):  
Francisco Lamas ◽  
Miguel A. M. Ramirez ◽  
Antonio Carlos Fernandes

Flow-induced motions are always an important subject during both the design and operational phases of an offshore platform's life. These motions can significantly affect the performance of the platform, including its mooring and oil production systems. Such analyses are performed using essentially two approaches: experimental tests with reduced-scale models and, more recently, dynamic analysis with Computational Fluid Dynamics (CFD). The main objective of this work is to present a new approach, based on an analytical methodology that uses static CFD analyses, to estimate the yaw response of a Tension Leg Wellhead Platform to one type of flow-induced motion, known as galloping. The first step is to review the equations that govern the yaw motion of an ocean platform subjected to currents from different angles of attack. The yaw moment coefficients are obtained from steady-state CFD analyses in which the yaw moment is calculated for several angles of attack placed around the central angle under analysis. With the moment coefficients plotted against the angle of attack, a polynomial curve is fitted around each analysis point in order to evaluate the amplitude of the yaw motion using a limit-cycle approach. Other flow-dependent properties of the system, such as damping and added mass, are also estimated using CFD. The last part of this work compares the analytical results with experimental results obtained at the LOC/COPPE-UFRJ laboratory facilities.
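
As a hedged illustration of the polynomial-fit step described above (not the paper's CFD data or exact procedure), the sketch below fits a cubic to yaw-moment coefficients sampled at several angles of attack and reads off the local slope at the equilibrium angle. All numerical values and the sign convention are assumptions.

```python
# Fit Cmz(alpha) locally from a few steady-state samples and inspect its slope
# at the central angle; under the sign convention assumed here, a negative slope
# acts as a restoring effect, while a positive slope would feed the yaw motion.

import numpy as np

angles_deg = np.array([-6.0, -4.0, -2.0, 0.0, 2.0, 4.0, 6.0])          # angles of attack
cmz = np.array([0.031, 0.022, 0.012, 0.000, -0.011, -0.020, -0.028])   # yaw moment coeff. (invented)

# Cubic fit around the central angle, as a local approximation of Cmz(alpha).
coeffs = np.polyfit(np.radians(angles_deg), cmz, deg=3)
cmz_poly = np.poly1d(coeffs)

# Slope dCmz/dalpha at the equilibrium (alpha = 0).
slope_at_zero = cmz_poly.deriv()(0.0)
print(f"dCmz/dalpha at 0 deg: {slope_at_zero:.3f} (negative -> restoring, under this convention)")
```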


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Eleanor F. Miller ◽  
Andrea Manica

Abstract
Background: Today an unprecedented amount of genetic sequence data is stored in publicly available repositories. For decades now, mitochondrial DNA (mtDNA) has been the workhorse of genetic studies, and as a result there is a large volume of mtDNA data available in these repositories for a wide range of species. Indeed, whilst whole genome sequencing is an exciting prospect for the future, for most non-model organisms classical markers such as mtDNA remain widely used. By compiling existing data from multiple original studies, it is possible to build powerful new datasets capable of exploring many questions in ecology, evolution and conservation biology. One key question that these data can help inform is what happened in a species' demographic past. However, compiling data in this manner is not trivial: there are many complexities associated with data extraction, data quality and data handling.
Results: Here we present the mtDNAcombine package, a collection of tools developed to manage some of the major decisions associated with handling multi-study sequence data, with a particular focus on preparing sequence data for Bayesian skyline plot demographic reconstructions.
Conclusions: There is now more genetic information available than ever before, and large meta-datasets offer great opportunities to explore new and exciting avenues of research. However, compiling multi-study datasets remains technically challenging. The mtDNAcombine package provides a pipeline that streamlines the process of downloading, curating, and analysing sequence data, guiding the process of compiling datasets from the online database GenBank.
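
mtDNAcombine itself is an R package; purely as a language-agnostic illustration of the first step such a pipeline automates, the sketch below downloads mtDNA records for a species from GenBank using Biopython's Entrez and SeqIO modules. The e-mail address and search term are placeholders, and this is not the package's own code.

```python
# Fetch GenBank nucleotide records matching a query and parse them into SeqRecords.
# Requires network access and the biopython package.

from Bio import Entrez, SeqIO

Entrez.email = "you@example.org"  # NCBI requires a contact address (placeholder)

def fetch_mtdna_records(term, retmax=50):
    """Search GenBank for the query term and return the parsed records."""
    search = Entrez.read(Entrez.esearch(db="nucleotide", term=term, retmax=retmax))
    ids = search["IdList"]
    if not ids:
        return []
    handle = Entrez.efetch(db="nucleotide", id=",".join(ids), rettype="gb", retmode="text")
    return list(SeqIO.parse(handle, "genbank"))

# Example query (illustrative): cytochrome b sequences for a single bird species.
records = fetch_mtdna_records('"Parus major"[Organism] AND cytochrome b')
print(len(records), "records downloaded")
```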


2021 ◽  
pp. 004051752098812
Author(s):  
Xixi Qian ◽  
Yuanying Shen ◽  
Qiaoli Cao ◽  
Jun Ruan ◽  
Chongwen Yu

A simulation describing fiber movement during condensation was conducted, and the effect of condensation in the carding machine was studied. The simulation results showed that condensation has a blending and evening effect on the condensed sliver, which can be explained by fiber rearrangement. Moreover, increasing the web width and decreasing the condensing length result in a more uniform sliver. The evening effect of the web width was further verified by experiments, and the simulation results were in general agreement with the experimental results.


2014 ◽  
Vol 596 ◽  
pp. 292-296
Author(s):  
Xin Li Li

PageRank algorithms consider only hyperlink information and ignore other page information such as page hit frequency, page update time, and web page category. As a result, they rank many advertising pages and outdated pages highly and fail to meet users' needs. This paper further studies page meta-information such as category, hit frequency, and update time. A web page with a high hit frequency and a small age should receive a high rank, and both of these factors depend to some extent on the page category. Experimental results show that the proposed algorithm performs well.
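
As a hedged sketch of the kind of re-ranking the paper describes (not its exact formula), the snippet below combines a page's link-based PageRank with its hit frequency and freshness, with weights that vary by page category. All weights, decay constants, and numbers are illustrative assumptions.

```python
# Blend link-based PageRank with hit frequency and page freshness,
# using per-category weights so that, e.g., freshness matters more for news pages.

import math
import time

CATEGORY_WEIGHTS = {            # (pagerank, hits, freshness) weights per category (assumed)
    "news":    (0.4, 0.2, 0.4),
    "default": (0.6, 0.3, 0.1),
}

def combined_score(pagerank, hits, last_update_ts, category="default", now=None):
    """Return a re-ranking score mixing link authority, popularity, and recency."""
    now = now or time.time()
    age_days = max((now - last_update_ts) / 86400.0, 0.0)
    freshness = math.exp(-age_days / 30.0)   # decays as the page gets older
    hit_score = math.log1p(hits) / 10.0      # dampen raw hit counts
    w_pr, w_hits, w_fresh = CATEGORY_WEIGHTS.get(category, CATEGORY_WEIGHTS["default"])
    return w_pr * pagerank + w_hits * hit_score + w_fresh * freshness

old_ad_page = combined_score(pagerank=0.8, hits=5, last_update_ts=time.time() - 400 * 86400)
fresh_page = combined_score(pagerank=0.5, hits=500, last_update_ts=time.time() - 2 * 86400, category="news")
print(round(old_ad_page, 3), round(fresh_page, 3))
```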

