Using Text Mining Algorithms for Patent Documents and Publications

Author(s):  
Bart Van Looy ◽  
Tom Magerman
2014 ◽  
Vol 136 (11) ◽  
Author(s):  
Michael W. Glier ◽  
Daniel A. McAdams ◽  
Julie S. Linsey

Bioinspired design is the adaptation of methods, strategies, or principles found in nature to solve engineering problems. One formalized approach to bioinspired solution seeking is to abstract the engineering problem into a functional need and then seek solutions to this function using a keyword-type search of text-based biological knowledge. These function keyword search approaches have shown potential for success, but as with many text-based search methods, they produce a large number of results, many of little relevance to the problem in question. In this paper, we develop a method to train a computer to identify text passages more likely to suggest a solution to a human designer. The work presented examines the possibility of filtering biological keyword search results by using text mining algorithms to automatically identify which results are likely to be useful to a designer. The text mining algorithms are trained on a pair of surveys administered to human subjects to empirically identify a large number of sentences that are, or are not, helpful for idea generation. We develop and evaluate three text classification algorithms, namely, a Naïve Bayes (NB) classifier, a k-nearest neighbors (kNN) classifier, and a support vector machine (SVM) classifier. Of these methods, the NB classifier generally had the best performance. Based on the analysis of 60 word stems, the NB classifier's precision is 0.87, recall is 0.52, and F score is 0.65. We find that word stem features that describe a physical action or process are correlated with helpful sentences. Similarly, we find that biological jargon feature words are correlated with unhelpful sentences.
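
As a rough illustration of the kind of classifier named in this abstract, the sketch below trains a multinomial Naïve Bayes model with Laplace smoothing on word features and recomputes the F score from the reported precision and recall. The toy sentences, labels, and function names are invented for illustration; they are not the paper's survey data, feature set, or implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Count classes and words from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical training sentences, already reduced to word stems.
training = [
    (["muscle", "contract", "pull", "tendon"], "helpful"),
    (["wing", "flap", "generate", "lift"], "helpful"),
    (["valve", "open", "release", "fluid"], "helpful"),
    (["genus", "taxon", "phylum", "clade"], "unhelpful"),
    (["allele", "locus", "genus", "homolog"], "unhelpful"),
]
model = train_nb(training)
prediction = predict_nb(model, ["tendon", "pull", "lift"])

# Precision and recall as reported in the abstract.
reported_f = f_score(0.87, 0.52)
print(prediction, round(reported_f, 2))
```

The last line checks the abstract's arithmetic: a precision of 0.87 and a recall of 0.52 give an F score that rounds to 0.65.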


Author(s):  
Chandrakant Ekkirala

Semantic technologies have gained prominence over the last several years. This chapter explores semantic technologies in detail and outlines the semantic integration of data. It also touches on the various data integration techniques and approaches, and enumerates text mining, its associated algorithms, and the tools and technologies used in text mining. The chapter has the following sections:

1. Data Integration Techniques
   • Data Integration Technique – Extraction, Transformation and Loading (ETL)
   • Data Integration Technique – Data Federation
2. Data Integration Approaches
   • Need Based Data Integration
   • Periodic Data Integration
   • Continuous Data Integration
3. Semantic Integration
4. Semantic Technologies
5. Semantic Web Technologies
6. Text Mining
7. Text Mining Algorithms
8. Tools and Technologies for Text Mining


2020 ◽  
Vol 202 ◽  
pp. 15004
Author(s):  
Aditya Tegar Satria ◽  
Mustafid ◽  
Dinar Mutiara Kusumo Nugraheni

The Internet of Things (IoT) is now widely used in the tourism industry, including aviation, where passengers can rate their satisfaction with the products and services they use by writing text reviews on many popular websites. These passenger reviews are a potential source of big data and can be analyzed to extract meaningful information. Some text mining algorithms are already in common use, including the Bayes formula and the Support Vector Machine (SVM) method. This research proposes an implementation of the Bayes and SVM methods in which the algorithms operate independently yet are integrated with other modules, such as data input and text pre-processing, and present their output concisely in a single information system. The proposed system was successfully supplied with 1000 passenger review documents as input data. After pre-processing, the Bayes formula was used to classify the reviews into 5 categories: plane condition, flight comfort, staff service, food and entertainment, and price. At the same time, the positive and negative sentiment in each review was analyzed with the SVM method, which achieved an accuracy of 83.6% for a training-to-testing ratio of 50:50, 82.75% for 60:40, and 83.3% for 70:30. This research shows that two different text mining algorithms can be run simultaneously, effectively and efficiently, in one integrated information system while still delivering accurate and satisfactory performance.
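
To make the three training-to-testing ratios concrete, the following minimal sketch splits 1000 placeholder documents at 50:50, 60:40, and 70:30. The function name `split_reviews`, the fixed shuffle seed, and the placeholder review strings are assumptions for illustration; the abstract does not say how the authors partitioned their data.

```python
import random

def split_reviews(reviews, train_fraction, seed=0):
    """Shuffle a copy of the reviews and cut it into train and test parts."""
    shuffled = list(reviews)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder stand-ins for the 1000 passenger review documents.
reviews = [f"review {i}" for i in range(1000)]

split_sizes = {}
for train_fraction in (0.5, 0.6, 0.7):
    train_set, test_set = split_reviews(reviews, train_fraction)
    split_sizes[train_fraction] = (len(train_set), len(test_set))
    print(train_fraction, len(train_set), len(test_set))
```

Under these assumptions the test sets contain 500, 400, and 300 documents, so each reported accuracy is measured on a different number of held-out reviews.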


2016 ◽  
Vol 134 (8) ◽  
pp. 39-43 ◽  
Author(s):  
Shivani Sharma ◽  
Saurabh Kr.

2018 ◽  
Author(s):  
Shatrunjai P. Singh ◽  
Swagata Karkare ◽  
Sudhir M. Baswan ◽  
Vijendra P. Singh

Content summarization is an important area of research in traditional data mining. The volume of studies published on anti-epileptic drugs (AED) has increased exponentially over the last two decades, making it an important area for the application of text mining based summarization algorithms. In the current study, we use text analytics algorithms to mine and summarize 10,000 PubMed abstracts related to anti-epileptic drugs published within the last 10 years. A Term Frequency – Inverse Document Frequency (TF-IDF) based filtering was applied to identify the drugs with the highest frequency of mentions within these abstracts. The US Food and Drug Administration database was scraped and linked to the results to quantify the most frequently mentioned modes of action and to identify the pharmaceutical entities marketing these drugs. A sentiment analysis model was created to score the abstracts for sentiment positivity or negativity. Finally, a modified Latent Dirichlet Allocation topic model was generated to extract key topics associated with the most frequently mentioned AEDs. The results of this study provide accurate and data-intensive insights into the progress of anti-epileptic drug research.
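
For readers unfamiliar with TF-IDF weighting, the sketch below computes one common textbook variant (relative term frequency multiplied by the natural log of inverse document frequency) over three toy token lists. The token lists and the `tf_idf` helper are hypothetical, and the study's exact weighting and filtering thresholds are not given in the abstract.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical tokenized abstracts (not taken from PubMed).
docs = [
    ["valproate", "reduced", "seizure", "frequency"],
    ["levetiracetam", "reduced", "seizure", "frequency"],
    ["valproate", "valproate", "seizure", "outcome"],
]
scores = tf_idf(docs)
for doc_scores in scores:
    top_term = max(doc_scores, key=doc_scores.get)
    print(top_term, round(doc_scores[top_term], 3))
```

A term that appears in every document, such as "seizure" here, gets a score of zero, which is why this weighting tends to surface distinctive terms such as drug names rather than words common to the whole corpus.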


2011 ◽  
pp. 216-222
Author(s):  
Bohdan M. Pavlyshenko

A model of the semantic context of lexemes is proposed that represents the structural semantic configuration of the lexeme corpus of text arrays. It is shown that a partially ordered set of semantic concepts is formed in the lexeme semantic context. The concepts' intents are defined by semantic fields, and their extents by lexemes.
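
The vocabulary of intents, extents, and partially ordered concepts resembles formal concept analysis. Assuming that reading, the sketch below treats lexemes as objects and semantic fields as attributes and enumerates the resulting concepts for a tiny invented context. The lexemes, fields, and function names are hypothetical and are not drawn from the paper.

```python
from itertools import combinations

# Hypothetical incidence table: lexeme -> semantic fields it belongs to.
context = {
    "river": {"water", "nature"},
    "lake": {"water", "nature"},
    "forest": {"nature"},
    "pipe": {"water"},
}
all_fields = sorted(set().union(*context.values()))

def extent(fields):
    """Lexemes that belong to every field in `fields`."""
    return frozenset(lex for lex, fs in context.items() if set(fields) <= fs)

def intent(lexemes):
    """Semantic fields shared by every lexeme in `lexemes`."""
    return frozenset(
        f for f in all_fields if all(f in context[lex] for lex in lexemes)
    )

def concepts():
    """Enumerate (extent, intent) pairs by closing every field subset."""
    found = set()
    for size in range(len(all_fields) + 1):
        for combo in combinations(all_fields, size):
            ext = extent(combo)
            found.add((ext, intent(ext)))
    return found

def is_subconcept(a, b):
    """Concept a lies below concept b when a's extent is inside b's."""
    return a[0] <= b[0]

all_concepts = concepts()
for ext, itt in sorted(all_concepts, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

Ordering the concepts by inclusion of their extents gives the partial order: in this toy context the concept with lexemes river and lake lies below both the "water" concept and the "nature" concept. Enumerating every field subset is exponential and only suits very small contexts.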

