Research and Application of Automated Search Engine Based on Machine Learning

Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question of whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine-learning-based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can easily be used for languages other than English by computing semantic relatedness for a German dataset.
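
As a rough illustration of what a graph-based relatedness measure looks like, the sketch below scores two concepts by the inverse of their shortest-path distance. The abstract does not say which measures were used, so this path-based formula, the toy edge list, and all function names are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical toy "category graph"; a real Wikipedia or WordNet graph is far larger.
TOY_EDGES = [
    ("car", "vehicle"),
    ("bicycle", "vehicle"),
    ("vehicle", "artifact"),
    ("banana", "fruit"),
    ("fruit", "food"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:  # breadth-first search
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_relatedness(graph, a, b):
    """Assumed measure: 1 / (1 + shortest path length); 0.0 if no path exists."""
    dist = shortest_path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


graph = build_graph(TOY_EDGES)
print(path_relatedness(graph, "car", "bicycle"))
print(path_relatedness(graph, "car", "banana"))
```

In this toy graph, concepts that share a nearby parent score higher than concepts in disconnected regions, which is the intuition behind path-based measures.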

Building Search Engine Using Machine Learning Technique

2019 International Conference on Intelligent Computing and Control Systems (ICCS) ◽ 10.1109/iccs45141.2019.9065846 ◽ 2019 ◽ Cited By ~ 1

Author(s): Rushikesh Karwa ◽ Vikas Honmane

Keyword(s): Machine Learning ◽ Search Engine ◽ Machine Learning Technique ◽ Learning Technique

User Model of Personalized Search Engine for Product Design Based on Machine Learning

Key Engineering Materials ◽ 10.4028/www.scientific.net/kem.460-461.747 ◽ 2011 ◽ Vol 460-461 ◽ pp. 747-753

Author(s): Ying Shi Kang ◽ Hai Ning Wang

Keyword(s): Machine Learning ◽ Product Design ◽ Search Engine ◽ Interaction Design ◽ Hot Spot ◽ Rapid Development ◽ Web Design ◽ Internet Technology ◽ Web Pages ◽ Personalized Search

With the rapid development of Internet technology, focusing on product design for individual users, emphasizing interaction design for the Web, and improving the user experience have become an inevitable trend in Web design and a hot spot in the design of personalized search engines. This paper proposes an optimized algorithm for building user models for product design websites. To represent the design dimensions of Web pages presented by a browser, the algorithm introduces a concept of freshness. By analyzing users' Web browsing behavior, the model is updated using machine learning methods. Finally, the performance and effectiveness of the algorithm were analyzed and evaluated through a simulation experiment.

A Machine Learning Technique for Semantic Search Engine

Procedia Engineering ◽ 10.1016/j.proeng.2012.06.260 ◽ 2012 ◽ Vol 38 ◽ pp. 2164-2171 ◽ Cited By ~ 7

Author(s): G. Nagarajan ◽ K.K. Thyagharajan

Keyword(s): Machine Learning ◽ Search Engine ◽ Semantic Search ◽ Machine Learning Technique ◽ Learning Technique ◽ Semantic Search Engine

Keyword Categorization using Statistical Methods

TEM Journal ◽ 10.18421/tem103-47 ◽ 2021 ◽ pp. 1377-1384

Author(s): Dominika Krasňanská ◽ Silvia Komara ◽ Mária Vojtková

Keyword(s): Machine Learning ◽ Detailed Analysis ◽ Search Engine ◽ Statistical Methods ◽ Online Advertising ◽ The Internet ◽ Keyword Analysis ◽ Total Analysis Time ◽ Total Analysis ◽ Insight Into

Keyword analysis is a way to gain insight into market behaviour. It is a detailed analysis of words and phrases that are relevant to the selected area. Keyword analysis should be the first step in any search engine optimization, as it reveals what keywords users enter into search engines when searching the Internet. The keyword categorization process takes up almost half of the total analysis time, as it is not automated. There is currently no known tool in the online advertising market that facilitates keyword categorization. The main goal of this paper is to streamline the process of keyword analysis using selected statistical methods of machine learning applied in the categorization of a specific example.
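
To make the idea of automated keyword categorization concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with Laplace smoothing. The abstract does not name the statistical methods the authors selected, so the choice of naive Bayes, the toy training phrases, the category labels, and the function names are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled keywords; a real analysis would use exported search queries.
TRAINING = [
    ("cheap running shoes", "footwear"),
    ("leather boots sale", "footwear"),
    ("wireless headphones", "electronics"),
    ("bluetooth speaker price", "electronics"),
]


def train_naive_bayes(labeled_keywords):
    """Count phrases per category and tokens per category."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for phrase, category in labeled_keywords:
        class_counts[category] += 1
        for token in phrase.lower().split():
            token_counts[category][token] += 1
            vocabulary.add(token)
    return class_counts, token_counts, vocabulary


def categorize(phrase, model):
    """Return the category with the highest log-posterior for the phrase."""
    class_counts, token_counts, vocabulary = model
    total_phrases = sum(class_counts.values())
    best_category, best_score = None, -math.inf
    for category in sorted(class_counts):
        # Log prior: share of training phrases in this category.
        score = math.log(class_counts[category] / total_phrases)
        total_tokens = sum(token_counts[category].values())
        for token in phrase.lower().split():
            if token not in vocabulary:
                continue  # skip tokens never seen in training
            # Laplace (add-one) smoothed token likelihood.
            smoothed = token_counts[category][token] + 1
            score += math.log(smoothed / (total_tokens + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


model = train_naive_bayes(TRAINING)
print(categorize("running boots", model))
print(categorize("bluetooth headphones", model))
```

A classifier of this kind only needs a small hand-labeled sample to start with, which is one way the manual categorization step described above could be shortened.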

Fenix: A Semantic Search Engine Based on an Ontology and a Model Trained with Machine Learning to Support Research

10.5121/csit.2021.110709 ◽ 2021

Author(s): Felipe Cujar-Rosero ◽ David Santiago Pinchao Ortiz ◽ Silvio Ricardo Timaran Pereira ◽ Jimmy Mateo Guerrero Restrepo

Keyword(s): Machine Learning ◽ Virtual Environment ◽ Search Engine ◽ Language Processing ◽ Machine Learning Algorithms ◽ Semantic Search ◽ Research Projects ◽ Machine Learning Model ◽ The University ◽ Semantic Search Engine

This paper presents the final results of a research project that aimed to build a Semantic Search Engine that uses an Ontology and a model trained with Machine Learning to support the semantic search of research projects of the System of Research from the University of Nariño. To build FENIX, as this engine is called, a methodology was used that includes the following stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the Semantic Search Engine. The work produced three main results: a) the complete construction of the Ontology with classes, object properties (predicates), data properties (attributes) and individuals (instances) in Protégé, SPARQL queries with Apache Jena Fuseki, and the respective coding with Owlready2 using Jupyter Notebook with Python within an Anaconda virtual environment; b) the successful training of the model, for which Machine Learning and, specifically, Natural Language Processing tools were used, such as SpaCy, NLTK, Word2vec and Doc2vec; this was also done in Jupyter Notebook with Python within an Anaconda virtual environment and with Elasticsearch; and c) the creation of FENIX, which manages and unifies the queries for the Ontology and for the Machine Learning model. The tests showed that FENIX returned satisfactory results in all the searches that were carried out.

A Kind of Web Database Classification Based on Machine Learning

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.50-51.644 ◽ 2011 ◽ Vol 50-51 ◽ pp. 644-648

Author(s): Xiao Qing Zhou ◽ Xiao Ping Tang

Keyword(s): Machine Learning ◽ Search Engine ◽ Deep Web ◽ Web Database ◽ Taxonomic Approach ◽ The Web

Traditional search engines cannot correctly search the massive amount of information hidden in the Deep Web. Classifying Web databases is the key step in integrating and retrieving them. This article proposes a machine-learning-based method for classifying Web databases. The experiment indicates that the approach achieves very good classification results after training on only a few samples, and that as the number of training samples increases, the classifier's performance remains stable, with the accuracy and recall rates fluctuating within a very small range.

Web Search Engine Misinformation Notifier Extension (SEMiNExt): A Machine Learning Based Approach during COVID-19 Pandemic

Healthcare ◽ 10.3390/healthcare9020156 ◽ 2021 ◽ Vol 9 (2) ◽ pp. 156

Author(s): Abdullah Bin Shams ◽ Ehsanul Hoque Apu ◽ Ashiqur Rahman ◽ Md. Mohsin Sarker Raihan ◽ Nazeeba Siddika ◽ ...

Keyword(s): Public Health ◽ Machine Learning ◽ Real Time ◽ Search Engine ◽ Language Processing ◽ Web Search ◽ Training Data ◽ Small Data ◽ Web Search Engine ◽ User Query

Misinformation from untrusted sources, such as on coronavirus disease 2019 (COVID-19) drugs, vaccination or presentation of its treatment, has had dramatic consequences for public health. Authorities have deployed several surveillance tools to detect and slow the rapid spread of misinformation online. Large quantities of unverified information are available online, and at present there is no real-time tool to alert a user about false information during online health inquiries over a web search engine. To bridge this gap, we propose a web search engine misinformation notifier extension (SEMiNExt). Natural language processing (NLP) and machine learning algorithms have been successfully integrated into the extension. This enables SEMiNExt to read the user query from the search bar, classify the veracity of the query, and notify the user of its authenticity, all in real time, to prevent the spread of misinformation. Our results show that SEMiNExt performs best with an artificial neural network (ANN), achieving an accuracy of 93%, an F1-score of 92%, a precision of 92% and a recall of 93% when 80% of the data is used for training. Moreover, the ANN predicts with very high accuracy even with a small training set. This is important for early detection of new misinformation from the small data samples available online, which can significantly reduce the spread of misinformation and maximize public health safety. The SEMiNExt approach introduces the possibility of improving online health management systems by showing misinformation notifications in real time, enabling safer web-based searching on health-related issues.
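
For readers unfamiliar with the four reported metrics, the sketch below shows how accuracy, precision, recall and F1-score are derived from a binary classifier's predictions. The labels and predictions are made-up toy values, not data from the paper, and the function names are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: 1 = query flagged as misinformation, 0 = not flagged (hypothetical labels).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(*confusion_counts(y_true, y_pred))
print(metrics)
```

Precision penalizes false alarms while recall penalizes missed misinformation, so reporting both alongside accuracy gives a fuller picture than accuracy alone.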

Application of Machine Learning in Google Services- A Case Study

International Journal of Case Studies in Business, IT, and Education ◽ 10.47992/ijcsbe.2581.6942.0117 ◽ 2021 ◽ pp. 24-37

Author(s): Siji Jose Pulluparambil ◽ Subrahmanya Bhat

Keyword(s): Machine Learning ◽ Search Engine ◽ Case Study Research ◽ Secondary Data ◽ Important Variable ◽ Machine Learning Algorithms ◽ Survey Method ◽ Future Outcomes ◽ Google Search

Purpose: Google Search is currently the most preferred search engine worldwide, making it one of the websites with the highest traffic. It helps people discover the content they are searching for in the large repository of the World Wide Web. Google has become so dominant in the search engine market that it is the single most important variable to consider when optimizing a website for search. Google uses many ranking algorithms to make the search process more precise. Google has the vision “to provide access to the world's information in one click”. Machine learning is the most popular methodology applied in predicting future outcomes or organizing information to assist people in making required decisions. ML algorithms are trained over instances or examples, through which they analyze the available historical data and learn from past experience. By repeatedly training over the samples, the patterns in the data can be identified in order to make predictions about the future. Google, as an organization, can be a pioneer in ML and, as a technology product, can be a use case for machine learning. Here, a case analysis has been prepared on a few applications of machine learning in the products and services of Google. Within this paper, we highlight their technological history, services with machine learning applications, financial plans, and challenges. The paper also examines the various Google products that apply ML, such as Google Maps, Gmail, Google Photos, and Google Assistant, and reviews the algorithms used in each service. Approach: A detailed survey method on secondary data is used for analysing the data. Findings: Based on the developed case study, it is evident that Google uses machine learning algorithms with a few artificial intelligence features to enhance the quality of the services it provides. Originality: A new way of analysis was performed to identify the methods used in the organization’s services.
Paper Type: Descriptive Case Study Research
