Using Machine Learning for Web Page Classification in Search Engine Optimization

2021 ◽  
Vol 13 (1) ◽  
pp. 9
Author(s):  
Goran Matošević ◽  
Jasminka Dobša ◽  
Dunja Mladenić

This paper presents a novel approach of using machine learning algorithms based on experts’ knowledge to classify web pages into three predefined classes according to the degree of content adjustment to the search engine optimization (SEO) recommendations. In this study, classifiers were built and trained to classify an unknown sample (web page) into one of the three predefined classes and to identify important factors that affect the degree of page adjustment. The data in the training set are manually labeled by domain experts. The experimental results show that machine learning can be used for predicting the degree of adjustment of web pages to the SEO recommendations—classifier accuracy ranges from 54.59% to 69.67%, which is higher than the baseline accuracy of classification of samples in the majority class (48.83%). Practical significance of the proposed approach is in providing the core for building software agents and expert systems to automatically detect web pages, or parts of web pages, that need improvement to comply with the SEO guidelines and, therefore, potentially gain higher rankings by search engines. Also, the results of this study contribute to the field of detecting optimal values of ranking factors that search engines use to rank web pages. Experiments in this paper suggest that important factors to be taken into consideration when preparing a web page are page title, meta description, H1 tag (heading), and body text—which is aligned with the findings of previous research. Another result of this research is a new data set of manually labeled web pages that can be used in further research.
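The comparison against a majority-class baseline reported above can be sketched as follows. This is a minimal illustration, not the authors' code: the class names ("low", "medium", "high"), the toy labels, and both function names are assumptions made for the example.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    if not labels:
        raise ValueError("labels must be non-empty")
    _, top_count = Counter(labels).most_common(1)[0]
    return top_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the expert labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical expert labels for ten pages (class names are assumptions).
y_true = ["low"] * 5 + ["medium"] * 3 + ["high"] * 2
# Hypothetical classifier output for the same ten pages.
y_pred = ["low"] * 5 + ["medium"] * 2 + ["low"] + ["high"] * 2

baseline = majority_baseline_accuracy(y_true)
model_acc = accuracy(y_true, y_pred)
```

A classifier is only informative if `model_acc` exceeds `baseline`, which is the comparison the abstract makes between the 54.59% to 69.67% range and the 48.83% majority-class figure.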

2014 ◽  
Vol 2 (2) ◽  
pp. 103-112 ◽  
Author(s):  
Taposh Kumar Neogy ◽  
Harish Paruchuri

The essence of a web page is inherently predisposed: it is built on behaviors, interests, and intelligence. There are many reasons why web pages are critical to the modern world, and their importance cannot be overemphasized. The meteoric growth of the internet is one of the most potent factors making it hard for search engines to provide actionable results. Search engines store web pages in classified directories. To build these directories, some engines rely on the expertise of real people. Most directories are classified by automated means, but the human factor remains dominant in their success. From experimental results, we can deduce that the most effective way to automate the classification of web pages for search engines is to integrate machine learning.


2013 ◽  
Vol 9 (1) ◽  
pp. 926-931 ◽  
Author(s):  
Parveen Rani ◽  
Er. Sukhpreet Singh

SEO stands for Search Engine Optimization. It is a technique that searches various web pages for specified keywords and ranks these web pages according to some parameters. SEO techniques are used to feed pages to search engines. The main importance of SEO is that it helps to find relevant data and increases the rank of a web page in search engines’ results. In our paper, we develop a new algorithm, M-HITS (Modified HITS), to compute the page rank. M-HITS is a new version of the HITS algorithm, developed by extending the properties of HITS.
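For orientation, the standard HITS algorithm that M-HITS extends can be sketched as below. This is plain HITS with L2 normalization, not M-HITS; the abstract does not describe the modifications. The graph representation (a dict from page to outgoing links), the function name, and the iteration count are assumptions made for illustration.

```python
import math


def hits(graph, iterations=50):
    """Standard HITS on a dict mapping each page to the pages it links to.

    Returns (hubs, authorities), each a dict of L2-normalized scores.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking to each page.
        new_auths = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        norm = math.sqrt(sum(v * v for v in new_auths.values())) or 1.0
        auths = {n: v / norm for n, v in new_auths.items()}

        # Hub update: sum the authority scores of pages each page links to.
        new_hubs = {n: sum(auths[d] for d in graph.get(n, [])) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        hubs = {n: v / norm for n, v in new_hubs.items()}

    return hubs, auths


# Toy link graph: pages "a" and "b" both link to "c".
toy_graph = {"a": ["c"], "b": ["c"], "c": []}
toy_hubs, toy_auths = hits(toy_graph)
```

In the toy graph, "c" ends up as the only authority, while "a" and "b" share equal hub scores.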


2010 ◽  
Vol 44-47 ◽  
pp. 4041-4049 ◽  
Author(s):  
Hong Zhao ◽  
Chen Sheng Bai ◽  
Song Zhu

Search engines can bring a lot of benefit to a website. For a site, each page’s search engine ranking is very important. Search engine optimization (SEO) is used to move a web page higher in search engine rankings. To use SEO, a web page needs to set its keywords as “keywords”. The paper focuses on the content of a given word and extracts the keywords of each page by calculating the word frequency. The algorithm is implemented in C#. Keyword settings of a web page are of great importance for the information and products
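Keyword extraction by word frequency, as described above, can be sketched as follows. The original work is implemented in C#; this Python version is only an illustration. The tokenization rule, the stopword list, and the function name `extract_keywords` are assumptions and are not taken from the paper.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}
)


def extract_keywords(text, top_n=5):
    """Return up to top_n most frequent non-stopword terms in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 1)
    return [word for word, _ in counts.most_common(top_n)]


sample_text = (
    "Search engine ranking depends on keywords. "
    "Keywords guide the search engine."
)
sample_keywords = extract_keywords(sample_text, top_n=3)
```

The resulting terms could then be placed in the page's keyword declaration. A production version would likely need language-specific tokenization and a fuller stopword list.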


2019 ◽  
Vol 16 (9) ◽  
pp. 3712-3716
Author(s):  
Kailash Kumar ◽  
Abdulaziz Al-Besher

This paper examines the overlap among the results retrieved by three major search engines, namely Google, Yahoo, and Bing. A rigorous analysis of overlap among these search engines was conducted on 100 random queries. The first ten web page results, i.e., a hundred results from each search engine, were considered, and only non-sponsored results were taken into account. Search engines have their own frequency of updates and rank results based on their relevance. Moreover, sponsored search advertisers differ between search engines. A single search engine cannot index all web pages. In this research paper, the overlap analysis was carried out from October 1, 2018 to October 31, 2018 among Google, Yahoo, and Bing. A framework was built in Java to analyze the overlap among these search engines. The framework eliminates the common results and merges the remainder into a unified list. It also uses a ranking algorithm to re-rank the search engine results and displays them back to the user.
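The overlap, de-duplication, and re-ranking steps described above can be sketched as follows. The original framework is written in Java and its ranking algorithm is not specified in the abstract; this Python sketch substitutes a simple sum-of-reciprocal-ranks score as a stand-in. The function names and the toy result lists are assumptions.

```python
from collections import defaultdict


def overlap_count(results_a, results_b):
    """Number of distinct URLs that appear in both result lists."""
    return len(set(results_a) & set(results_b))


def merge_and_rerank(ranked_lists):
    """Merge ranked lists into one de-duplicated list.

    Each URL scores the sum of 1/rank over the lists that contain it
    (a stand-in scoring rule, not the paper's algorithm). Ties are
    broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        seen = set()
        for rank, url in enumerate(results, start=1):
            if url in seen:
                continue  # count each URL once per engine
            seen.add(url)
            scores[url] += 1.0 / rank
    return sorted(scores, key=lambda url: (-scores[url], url))


# Toy result lists; single letters stand in for URLs.
google = ["a", "b", "c"]
bing = ["b", "a", "d"]
yahoo = ["b", "e", "a"]

merged = merge_and_rerank([google, bing, yahoo])
```

Results returned near the top by several engines rise in the merged list, while results unique to one engine remain but rank lower.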


2019 ◽  
Vol 12 (2) ◽  
pp. 110-119 ◽  
Author(s):  
Jayaraman Sethuraman ◽  
Jafar A. Alzubi ◽  
Ramachandran Manikandan ◽  
Mehdi Gheisari ◽  
Ambeshwar Kumar

Background: The World Wide Web houses an abundance of information that is used every day by billions of users across the world to find relevant data. Website owners employ webmasters to ensure their pages are ranked at the top of search engine result pages. However, understanding how a search engine ranks a website, which comprises numerous web pages, among the top ten or twenty websites is a major challenge. Although systems have been developed to understand the ranking process, a specialized tool-based approach has not been tried. Objective: This paper develops a new framework and system that processes website contents to determine search engine optimization factors. Methods: To analyze the web page dynamically by assessing the website content based on specific keywords, an elimination method was used in an attempt to reveal various search engine optimization techniques. Conclusion: Our results lead us to conclude that the developed system is able to perform a deeper analysis and find factors that play a role in bringing the site to the top of the list.


2002 ◽  
Vol 63 (4) ◽  
pp. 354-365 ◽  
Author(s):  
Susan Augustine ◽  
Courtney Greene

Have Internet search engines influenced the way students search library Web pages? The results of this usability study reveal that students consistently and frequently use the library Web site’s internal search engine to find information rather than navigating through pages. If students are searching rather than navigating, library Web page designers must make metadata and powerful search engines priorities. The study also shows that students have difficulty interpreting library terminology, experience confusion discerning difference amongst library resources, and prefer to seek human assistance when encountering problems online. These findings imply that library Web sites have not alleviated some of the basic and long-range problems that have challenged librarians in the past.


Author(s):  
Renée Ridgway

Search engines have become the technological and organizational means to navigate, filter, and rank online information for users. During the seventeenth to nineteenth centuries in Europe, the ‘pre-history’ of search engines was the ‘bureau d’adresse’ or ‘address office’ that provided information and services to clients as they gathered data. Registers, censuses, and archives eventually shifted to relational databases owned by commercial platforms, advertising agencies cum search engines that provide non-neutral answers in exchange for user data. With ‘cyberorganization’, personalized advertisement, machine-learning algorithms, and ‘surveillance capitalism’ organize the user through their ‘habit’ of search. However, there are alternatives such as the p2p search engine YaCy and anonymous browsing with Tor.


2011 ◽  
Vol 3 (4) ◽  
pp. 62-70 ◽  
Author(s):  
Stephen O’Neill ◽  
Kevin Curran

Search engine optimization (SEO) is the process of improving the visibility, volume and quality of traffic to a website or a web page in search engines via the natural search results. SEO can also target other areas of a search, including image search and local search. SEO is one of many different strategies used for marketing a website but SEO has been proven the most effective. An Internet marketing campaign may drive organic search results to websites or web pages but can be involved with paid advertising on search engines. All search engines have a unique way of ranking the importance of a website. Some search engines focus on the content while others review Meta tags to identify who and what a web site’s business is. Most engines use a combination of Meta tags, content, link popularity, click popularity and longevity to determine a site’s ranking. To make it even more complicated, they change their ranking policies frequently. This paper provides an overview of search engine optimisation strategies and pitfalls.


Author(s):  
Anuradha T ◽  
Tayyaba Nousheen

The web is a huge collection of data sources. Search engines are used for retrieving information from the World Wide Web (WWW). They search for user keywords and provide accurate results in a fraction of a second. This paper proposes a machine learning based search engine that returns more relevant results for user searches in the form of web pages. The search engine serves as the basic interface for displaying results for the query entered by the user. Every site comprises a large number of web pages that are created and hosted on a server.


2019 ◽  
Vol 8 (S2) ◽  
pp. 35-38
Author(s):  
Mu. Annalakshmi ◽  
A. Padmapriya

Finding the required information on the vast web has become increasingly difficult in recent years, since the web is overloaded with enormous amounts of content in the form of text, images, audio and video. Search engines help in this context to some extent, but they have difficulties of their own. This paper proposes an XML framework for the web pages in search engine results, which helps in information filtering and search engine optimization.

