Using Text Mining Algorithms for Patent Documents and Publications

Assessment of Latent Semantic Analysis (LSA) Text Mining Algorithms for Large Scale Mapping of Patent and Scientific Publication Documents

SSRN Electronic Journal ◽

10.2139/ssrn.2096159 ◽

2011 ◽

Cited By ~ 2

Author(s):

Bart Van Looy ◽

Bart Baesens ◽

Tom Magerman ◽

Koenraad Debackere

Keyword(s):

Text Mining ◽

Latent Semantic Analysis ◽

Large Scale ◽

Semantic Analysis ◽

Scientific Publication ◽

Mining Algorithms

Exploring Automated Text Classification to Improve Keyword Corpus Search Results for Bioinspired Design

Journal of Mechanical Design ◽

10.1115/1.4028167 ◽

2014 ◽

Vol 136 (11) ◽

Cited By ~ 8

Author(s):

Michael W. Glier ◽

Daniel A. McAdams ◽

Julie S. Linsey

Keyword(s):

Text Mining ◽

Text Classification ◽

Keyword Search ◽

Idea Generation ◽

Support Vector ◽

Biological Knowledge ◽

Svm Classifier ◽

Search Results ◽

Bioinspired Design ◽

Mining Algorithms

Bioinspired design is the adaptation of methods, strategies, or principles found in nature to solve engineering problems. One formalized approach to bioinspired solution seeking is the abstraction of the engineering problem into a functional need and then seeking solutions to this function using a keyword type search method on text based biological knowledge. These function keyword search approaches have shown potential for success, but as with many text based search methods, they produce a large number of results, many of little relevance to the problem in question. In this paper, we develop a method to train a computer to identify text passages more likely to suggest a solution to a human designer. The work presented examines the possibility of filtering biological keyword search results by using text mining algorithms to automatically identify which results are likely to be useful to a designer. The text mining algorithms are trained on a pair of surveys administered to human subjects to empirically identify a large number of sentences that are, or are not, helpful for idea generation. We develop and evaluate three text classification algorithms, namely, a Naïve Bayes (NB) classifier, a k nearest neighbors (kNN) classifier, and a support vector machine (SVM) classifier. Of these methods, the NB classifier generally had the best performance. Based on the analysis of 60 word stems, a NB classifier's precision is 0.87, recall is 0.52, and F score is 0.65. We find that word stem features that describe a physical action or process are correlated with helpful sentences. Similarly, we find biological jargon feature words are correlated with unhelpful sentences.
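
The reported precision, recall, and F score can be checked against one another. The sketch below assumes the abstract's "F score" is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_score` is only illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for the Naive Bayes classifier (60 word stems).
nb_f_score = f_score(precision=0.87, recall=0.52)
print(round(nb_f_score, 2))
```

Under that assumption the computed value rounds to the 0.65 quoted in the abstract, so the three figures are mutually consistent.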

Semantic Integration, Text Mining, Tools and Technologies

Software Innovations in Clinical Drug Development and Safety - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-4666-8726-4.ch004 ◽

2015 ◽

pp. 47-63

Author(s):

Chandrakant Ekkirala

Keyword(s):

Text Mining ◽

Data Integration ◽

Semantic Integration ◽

Semantic Technologies ◽

Integration Technique ◽

Semantic Web Technologies ◽

Web Technologies ◽

Periodic Data ◽

Integration Techniques ◽

Mining Algorithms

Semantic technologies have gained prominence over the last several years. This chapter explores semantic technologies in detail and outlines the semantic integration of data. It also touches on the various data integration techniques and approaches, and it enumerates text mining, its associated algorithms, and the tools and technologies used in text mining. The chapter has the following sections:

1. Data Integration Techniques
   • Data Integration Technique: Extraction, Transformation and Loading (ETL)
   • Data Integration Technique: Data Federation
2. Data Integration Approaches
   • Need Based Data Integration
   • Periodic Data Integration
   • Continuous Data Integration
3. Semantic Integration
4. Semantic Technologies
5. Semantic Web Technologies
6. Text Mining
7. Text Mining Algorithms
8. Tools and Technologies for Text Mining

Implementation of Integrated Bayes Formula and Support Vector Machine for Analysing Airline’s Passengers Review

E3S Web of Conferences ◽

10.1051/e3sconf/202020215004 ◽

2020 ◽

Vol 202 ◽

pp. 15004

Author(s):

Aditya Tegar Satria ◽

Mustafid ◽

Dinar Mutiara Kusumo Nugraheni

Keyword(s):

Support Vector Machine ◽

Information System ◽

Text Mining ◽

Input Data ◽

Tourism Industry ◽

Support Vector ◽

Accuracy Score ◽

Bayes Formula ◽

Performance Results ◽

Mining Algorithms

Nowadays, the Internet of Things (IoT) is commonly used in the tourism industry, including aviation, where passengers of flight services can rate their satisfaction with the products and services they use by writing text-based reviews on many popular websites. These passenger reviews are a potential source of big data and can be analyzed to extract meaningful information. Some text mining algorithms are already in common use, including the Bayes formula and Support Vector Machine (SVM) methods. This research proposes an implementation of the Bayes and SVM methods in which the algorithms operate independently yet are integrated with other modules, such as data input and text pre-processing, and present their output concisely in a single information system. The proposed system successfully processed 1000 passenger review documents as input data. After pre-processing, the Bayes formula was used to classify the reviews into 5 categories: plane condition, flight comfort, staff service, food and entertainment, and price. Simultaneously, the positive and negative sentiment contained in each review was analyzed with the SVM method, which achieved an accuracy of 83.6% for a training-to-testing ratio of 50:50, 82.75% for the 60:40 ratio, and 83.3% for the 70:30 ratio. This research shows that two different text mining algorithms can be implemented simultaneously in an effective and efficient way while still providing accurate and satisfactory performance in one integrated information system.
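
The evaluation setup described above (one corpus, three training-to-testing ratios, accuracy as the metric) can be sketched as follows. This is a minimal illustration, not the authors' system: the helper names `split_by_ratio` and `accuracy`, the fixed shuffle seed, and the placeholder review strings are all assumptions, and no classifier is trained here.

```python
import random


def split_by_ratio(documents, train_share, test_share, seed=0):
    """Shuffle a copy of the documents and split it, e.g. 60:40."""
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_share // (train_share + test_share)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder corpus standing in for the 1000 passenger reviews.
reviews = [f"review {i}" for i in range(1000)]

for train_share, test_share in [(50, 50), (60, 40), (70, 30)]:
    train, test = split_by_ratio(reviews, train_share, test_share)
    print(f"{train_share}:{test_share} -> {len(train)} train, {len(test)} test")
```

In a full pipeline, a sentiment classifier would be fitted on `train`, its predictions on `test` passed to `accuracy`, and the result compared across the three ratios.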

Benchmarking of Scientific Research Clusters by Use of Text Mining Algorithms on Textual Artefacts

2016 International Conference on Information Systems Engineering (ICISE) ◽

10.1109/icise.2016.9 ◽

2016 ◽

Author(s):

Stefan Schroeder ◽

Christian Tummel ◽

Ingrid Isenhardt ◽

Sabina Jeschke ◽

Anja Richert

Keyword(s):

Text Mining ◽

Scientific Research ◽

Mining Algorithms

On Text Mining Algorithms for Automated Maintenance of Hierarchical Knowledge Directory

Knowledge Science, Engineering and Management - Lecture Notes in Computer Science ◽

10.1007/11811220_18 ◽

2006 ◽

pp. 202-214

Author(s):

Han-joon Kim

Keyword(s):

Text Mining ◽

Mining Algorithms

Review on Text Mining Algorithms

International Journal of Computer Applications ◽

10.5120/ijca2016907972 ◽

2016 ◽

Vol 134 (8) ◽

pp. 39-43 ◽

Cited By ~ 3

Author(s):

Shivani Sharma ◽

Saurabh Kr.

Keyword(s):

Text Mining ◽

Mining Algorithms

RANKING FOUR AND FIVE STAR HOTELS BASED ON CUSTOMER SATISFACTION WITH TEXT MINING ALGORITHMS: A SURVEY RESEARCH ON BANGKOK HOTELS

10.17501/icoht.2016.4102 ◽

2016 ◽

Author(s):

Farhad Safiri ◽

Mohammad Jafar Jalali

Keyword(s):

Text Mining ◽

Customer Satisfaction ◽

Survey Research ◽

Mining Algorithms

The application of text mining algorithms in summarizing trends in anti-epileptic drug research

10.1101/269308 ◽

2018 ◽

Cited By ~ 2

Author(s):

Shatrunjai P. Singh ◽

Swagata Karkare ◽

Sudhir M. Baswan ◽

Vijendra P. Singh

Keyword(s):

Text Mining ◽

Latent Dirichlet Allocation ◽

Drug Research ◽

Topic Model ◽

Analysis Model ◽

Data Intensive ◽

Document Frequency ◽

Anti Epileptic Drug ◽

The Us ◽

Mining Algorithms

Content summarization is an important area of research in traditional data mining. The volume of studies published on anti-epileptic drugs (AED) has increased exponentially over the last two decades, making it an important area for the application of text mining based summarization algorithms. In the current study, we use text analytics algorithms to mine and summarize 10,000 PubMed abstracts related to anti-epileptic drugs published within the last 10 years. Term Frequency-Inverse Document Frequency (TF-IDF) based filtering was applied to identify the drugs with the highest frequency of mentions within these abstracts. The US Food and Drug database was scraped and linked to the results to quantify the most frequently mentioned modes of action and to identify the pharmaceutical entities marketing these drugs. A sentiment analysis model was created to score the abstracts for positive or negative sentiment. Finally, a modified Latent Dirichlet Allocation topic model was generated to extract key topics associated with the most frequently mentioned AEDs. The results of this study provide accurate and data-intensive insights into the progress of anti-epileptic drug research.
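
The frequency-based filtering step can be illustrated with a small TF-IDF computation. The sketch below uses one common weighting (relative term frequency multiplied by the natural log of the inverse document frequency); the abstract does not say which variant the authors applied, and the function name `tf_idf` and the three toy "abstracts" are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (whitespace tokenization)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy stand-ins for PubMed abstracts (not real data).
abstracts = [
    "drug valproate trial",
    "drug levetiracetam trial",
    "drug valproate valproate",
]
scores = tf_idf(abstracts)
print(scores[2])
```

A term that occurs in every document, such as `drug` here, receives a score of zero, while a term concentrated in a few documents scores higher. That is the property that makes the weighting useful for filtering out uninformative words before counting drug mentions.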

MODEL OF SEMANTIC CONTEXT OF LEXEMES IN THE TEXT MINING ALGORITHMS

International Journal of Computing ◽

10.47839/ijc.10.3.751 ◽

2011 ◽

pp. 216-222

Author(s):

Bohdan M. Pavlyshenko

Keyword(s):

Text Mining ◽

Partially Ordered Set ◽

Ordered Set ◽

Semantic Context ◽

Partially Ordered ◽

Semantic Fields ◽

Semantic Concepts ◽

Mining Algorithms

A model of the semantic context of lexemes is proposed, which represents the structural semantic configuration of the lexeme corpus of text arrays. It is shown that a partially ordered set of semantic concepts is formed in the lexeme semantic context. The concepts' intents are defined by semantic fields, and their extents by lexemes.
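
The intent/extent vocabulary matches that of formal concept analysis, so the idea can be sketched under that assumption: a concept is a pair (extent, intent) where the extent is a set of lexemes, the intent is the set of semantic fields they all share, and concepts are ordered by inclusion of their extents. The incidence data and the function names below are hypothetical, and the paper's actual model may differ.

```python
from itertools import combinations

# Hypothetical incidence relation: lexeme -> semantic fields it belongs to.
incidence = {
    "river": {"water", "nature"},
    "lake": {"water", "nature"},
    "pipe": {"water", "artifact"},
}


def shared_fields(lexemes):
    """Intent: the semantic fields common to every lexeme in the set."""
    field_sets = [incidence[lexeme] for lexeme in lexemes]
    if not field_sets:
        # By convention, the empty set of lexemes shares all fields.
        return set().union(*incidence.values())
    return set.intersection(*field_sets)


def lexemes_with(fields):
    """Extent: the lexemes that belong to every field in the set."""
    return {lexeme for lexeme, own in incidence.items() if fields <= own}


def concepts():
    """Enumerate all (extent, intent) pairs by closing every lexeme subset."""
    found = set()
    names = sorted(incidence)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = shared_fields(subset)
            found.add((frozenset(lexemes_with(intent)), frozenset(intent)))
    return found


def is_subconcept(a, b):
    """Partial order: a <= b when a's extent is contained in b's extent."""
    return a[0] <= b[0]


for extent, intent in sorted(concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Enumerating every subset is exponential in the number of lexemes, so this brute-force closure is only suitable for toy examples.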
