Search Engine as a mediating technology of organization

Author(s):  
Renée Ridgway

Search engines have become the technological and organizational means of navigating, filtering, and ranking online information for users. In Europe, from the seventeenth to the nineteenth century, the 'pre-history' of the search engine was the 'bureau d'adresse' or 'address office', which provided information and services to clients while gathering data about them. Registers, censuses, and archives eventually gave way to relational databases owned by commercial platforms, advertising agencies cum search engines that provide non-neutral answers in exchange for user data. With 'cyberorganization', personalized advertising, machine-learning algorithms, and 'surveillance capitalism' organize the user through the 'habit' of search. However, alternatives exist, such as the p2p search engine YaCy and anonymous browsing with Tor.

2021, Vol 13 (1), pp. 9
Author(s):  
Goran Matošević ◽  
Jasminka Dobša ◽  
Dunja Mladenić

This paper presents a novel approach that uses machine learning algorithms, based on experts' knowledge, to classify web pages into three predefined classes according to the degree to which their content is adjusted to search engine optimization (SEO) recommendations. In this study, classifiers were built and trained to assign an unknown sample (web page) to one of the three predefined classes and to identify the important factors that affect the degree of page adjustment. The data in the training set were manually labeled by domain experts. The experimental results show that machine learning can be used to predict the degree of adjustment of web pages to the SEO recommendations: classifier accuracy ranges from 54.59% to 69.67%, which is higher than the baseline accuracy of assigning samples to the majority class (48.83%). The practical significance of the proposed approach lies in providing the core for building software agents and expert systems that automatically detect web pages, or parts of web pages, that need improvement to comply with the SEO guidelines and therefore potentially gain higher rankings by search engines. The results of this study also contribute to the field of detecting optimal values of the ranking factors that search engines use to rank web pages. The experiments in this paper suggest that the important factors to consider when preparing a web page are the page title, meta description, H1 tag (heading), and body text, which is aligned with the findings of previous research. A further result of this research is a new data set of manually labeled web pages that can be used in further research.
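A minimal sketch of how such a classifier could look in practice is given below; the on-page features, labels, and toy data are illustrative assumptions, not the feature set or data set used in the study.

```python
# Illustrative sketch only: trains a classifier that maps simple on-page SEO
# features to one of three adjustment classes. Features, labels, and data are
# assumptions for illustration, not the study's actual material.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [title_length, has_meta_description, has_h1, body_word_count]
X = np.array([
    [62, 1, 1, 850],   # well-adjusted page
    [15, 0, 1, 120],   # partially adjusted page
    [ 0, 0, 0,  40],   # poorly adjusted page
    # ... more expert-labeled pages would go here
])
y = np.array([2, 1, 0])  # 0 = low, 1 = medium, 2 = high adjustment to SEO recommendations

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, y)

# Classify an unseen page described by the same features
unseen_page = np.array([[48, 1, 1, 600]])
print(clf.predict(unseen_page))
```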


Author(s):  
Suely Fragoso

This chapter proposes that search engines apply a verticalizing pressure on the many-to-many information distribution model of the WWW, forcing it to revert to a distribution model similar to that of the mass media. The argument starts with a critical, descriptive examination of the history of search mechanisms for the Internet, in parallel with a discussion of the increasing ties between search engines and the advertising market. The chapter then raises questions concerning the concentration of Web traffic around a small number of search engines, which are in the hands of an equally limited number of enterprises. This concentration is accentuated by the confidence that users place in search engines and by the ongoing acquisition of collaborative systems and smaller players by the large search engines. This scenario demonstrates the verticalizing pressure that search engines apply to the majority of WWW users, pushing the Web back toward the mass distribution model.


2010, Vol 55 (2), pp. 374-386
Author(s):  
Joan Miquel-Vergés ◽  
Elena Sánchez-Trigo

The use of the Internet as a source of health information is increasing greatly. However, identifying relevant and valid information can be problematic. This paper first analyses the efficiency of Internet search engines specialized in health, in order to then determine the quality of online information related to a specific medical subdomain, that of neuromuscular diseases. Our aim is to present a model for the development and use of a bilingual electronic corpus (MYOCOR) related to these neuromuscular diseases in order to: a) provide a quality health-information tool for health professionals, patients and relatives, as well as for translators, writers of specialized texts and software developers; and b) use the corpus as a basis for the implementation of a keyword- and semantics-based search engine, such as the ASEM (Federación Española Contra las Enfermedades Neuromusculares) search engine for neuromuscular diseases.


2021, Vol 1 (1), pp. 29-35
Author(s):  
Ismail Majid

A search system is an important component of any online information medium, but since the arrival of search engines such as Google, people have preferred these tools for finding information because their search methods have proven reliable. Can we achieve the same? This research shows that, by applying the Google Custom Search API method, we can build a search system that behaves like Google's search engine: the test results show that the returned results are highly relevant and, on average, ranked first. A further advantage of this method is that it includes correction of misspelled queries to refine the intended keywords.
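By way of illustration, a query against the Google Custom Search JSON API could be sketched as follows; the API key and search engine ID are placeholders, and error handling and pagination are omitted.

```python
# Minimal sketch of querying the Google Custom Search JSON API.
# API_KEY and SEARCH_ENGINE_ID are placeholders obtained from Google;
# this is an illustrative sketch, not the paper's implementation.
import requests

API_KEY = "YOUR_API_KEY"            # from the Google Cloud console
SEARCH_ENGINE_ID = "YOUR_CX_ID"     # from the Programmable Search Engine panel

def search(query, num=10):
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": API_KEY, "cx": SEARCH_ENGINE_ID, "q": query, "num": num},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # "spelling" is present when the API suggests a corrected query
    corrected = data.get("spelling", {}).get("correctedQuery")
    results = [(item["title"], item["link"]) for item in data.get("items", [])]
    return corrected, results

corrected, results = search("online information media")
```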


2021
Author(s):  
Felipe Cujar-Rosero ◽  
David Santiago Pinchao Ortiz ◽  
Silvio Ricardo Timaran Pereira ◽  
Jimmy Mateo Guerrero Restrepo

This paper presents the final results of a research project that aimed to build a Semantic Search Engine that uses an Ontology and a model trained with Machine Learning to support the semantic search of research projects in the Research System of the University of Nariño. For the construction of FENIX, as this engine is called, a methodology was used that includes the following stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the Semantic Search Engine. The main results of the work were three: a) the complete construction of the Ontology, with classes, object properties (predicates), data properties (attributes) and individuals (instances) in Protégé, SPARQL queries with Apache Jena Fuseki, and the corresponding coding with Owlready2 in Jupyter Notebook with Python inside an Anaconda virtual environment; b) the successful training of the model, for which Machine Learning and, specifically, Natural Language Processing tools and algorithms were used, such as spaCy, NLTK, Word2vec and Doc2vec, again in Jupyter Notebook with Python inside the Anaconda virtual environment and with Elasticsearch; and c) the creation of FENIX, managing and unifying the queries to the Ontology and to the Machine Learning model. The tests showed that FENIX returned satisfactory results for all the searches that were carried out.
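To make the machine-learning side of such a pipeline concrete, the sketch below trains a gensim Doc2Vec model on project abstracts and ranks projects by similarity to a free-text query; the corpus, identifiers and hyperparameters are assumptions for illustration, not the FENIX code.

```python
# Illustrative Doc2vec similarity search over research-project abstracts.
# The corpus, project identifiers and hyperparameters are assumptions.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

abstracts = {
    "PRJ-001": "machine learning for classifying agricultural soil images",
    "PRJ-002": "ontology based semantic search for institutional repositories",
    # ... one entry per research project
}

corpus = [TaggedDocument(words=text.lower().split(), tags=[pid])
          for pid, text in abstracts.items()]

model = Doc2Vec(corpus, vector_size=100, min_count=1, epochs=40)

query = "semantic search with ontologies"
query_vec = model.infer_vector(query.lower().split())

# Project identifiers ranked by similarity to the query, best match first
print(model.dv.most_similar([query_vec], topn=5))
```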


Phishing is a cyber-attack which is socially engineered to trick naive online users into revealing sensitive information such as user data, login credentials, social security numbers, and banking information. Attackers fool Internet users by posing as a legitimate webpage in order to retrieve personal information. This can also be done by sending emails posing as reputable companies or businesses. Phishing exploits several vulnerabilities effectively, and no single solution protects users from all of them. A classification/prediction model is designed based on heuristic features extracted from the website domain, URL, web protocol and source code, in order to eliminate the drawbacks of existing anti-phishing techniques. In the model we combine existing solutions such as blacklisting and whitelisting, heuristics and visual similarity, which provides a higher level of security. We use the model with different machine learning algorithms, namely Logistic Regression, Decision Trees, K-Nearest Neighbours and Random Forests, and compare the results to find the most efficient machine learning framework.
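A minimal sketch of such a comparison is given below; the URL-based heuristic features and the toy data set are illustrative assumptions, not the feature set or corpus used in this work.

```python
# Sketch only: compares the four classifiers named above on simple URL-based
# heuristic features. The features and toy data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

def url_features(url):
    return [
        len(url),                              # long URLs are often suspicious
        url.count("."),                        # many subdomains
        int("@" in url),                       # '@' can hide the real destination
        int(not url.startswith("https://")),   # no TLS
    ]

urls = [
    "https://www.bank.com/login", "https://shop.example.com/cart",
    "https://news.example.org/article", "http://secure-bank.com.verify-login.xyz/@update",
    "http://paypa1-account-confirm.top/login", "http://192.168.10.5/bank/signin",
]
labels = [0, 0, 0, 1, 1, 1]   # 0 = legitimate, 1 = phishing

X, y = np.array([url_features(u) for u in urls]), np.array(labels)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=3),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
suspect = np.array([url_features("http://login-verify-bank.xyz/@account")])
for name, model in models.items():
    model.fit(X, y)
    print(name, "->", "phishing" if model.predict(suspect)[0] else "legitimate")
```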


2019
Author(s):  
Jingchun Fan ◽  
Jean Craig ◽  
Na Zhao ◽  
Fujian Song

BACKGROUND Increasingly, people seek health information from the Internet, in particular health information on diseases that require intensive self-management, such as diabetes. However, the Internet is largely unregulated, and the quality of online health information may not be credible. OBJECTIVE To assess the quality of online information on diabetes identified from the Internet. METHODS We used the single term "diabetes", or the equivalent Chinese characters, to search Google and Baidu respectively. The first 50 websites retrieved from each of the two search engines were screened for eligibility using pre-determined inclusion and exclusion criteria. Included websites were assessed on four domains: accessibility, content coverage, validity and readability. RESULTS We included 26 websites from the Google search engine and 34 from the Baidu search engine. There were significant differences in website provider (P<0.0001), but not in targeted population (P=0.832) or publication type (P=0.378), between the two search engines. Website accessibility was not statistically significantly different between the two search engines, although there were significant differences in items concerning website content coverage. There was no statistically significant difference in website validity between the Google and Baidu search engines (mean DISCERN score 3.3 vs 2.9, P=0.156). The readability appraisal of the English-language websites showed that Flesch Reading Ease scores ranged from 23.1 to 73.0 and Flesch-Kincaid Grade Level scores ranged from 5.7 to 19.6. CONCLUSIONS The content coverage of health information for patients with diabetes from the English-language search engine tended to be more comprehensive than that from the Chinese search engine. There was a lack of websites provided by health organisations in China. The quality of online health information for people with diabetes needs to be improved to bridge the knowledge gap between website provision and public demand.
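For reference, the two readability measures reported above can be computed with their standard formulas, as in the sketch below; the syllable counter is a rough heuristic (an assumption), so the scores will only approximate those of dedicated readability tools.

```python
# Flesch Reading Ease and Flesch-Kincaid Grade Level, computed with their
# standard formulas. The syllable counter is a rough heuristic (an assumption).
import re

def count_syllables(word):
    # crude heuristic: count groups of vowels
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    w, s = len(words), sentences
    flesch_reading_ease = 206.835 - 1.015 * (w / s) - 84.6 * (syllables / w)
    fk_grade_level = 0.39 * (w / s) + 11.8 * (syllables / w) - 15.59
    return flesch_reading_ease, fk_grade_level

sample = "Diabetes is a chronic disease. It requires careful daily self-management."
print(readability(sample))
```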


Author(s):  
Fatama Sharf Al-deen ◽  
Fadl Mutaher Ba-Alwi

Due to the rapid development of information technology, Big Data has become one of its prominent features and has had a great impact on other technologies dealing with data, such as machine learning. K-means is one of the most important machine learning algorithms. The algorithm was first developed as a clustering technique for relational databases. However, the advent of Big Data has strongly affected its performance. Therefore, many researchers have proposed approaches to improve K-means accuracy in the Big Data environment. In this paper, we present a literature review of the different techniques proposed for developing the K-means algorithm for Big Data. We compare them according to several criteria, including the proposed algorithm, the database used, the Big Data tools, and the K-means applications. This paper helps researchers to see the most important challenges and trends of the K-means algorithm in the Big Data environment.
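As a point of reference for the surveyed work, the sketch below shows a compact baseline K-means in NumPy; real Big Data deployments would instead rely on distributed implementations (for example, Spark MLlib), and the toy data are illustrative.

```python
# Compact baseline K-means: alternate assignment and update steps until the
# centroids stop moving. Toy two-cluster data are generated for illustration.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    for _ in range(n_iter):
        # assignment step: nearest centroid for every point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each centroid to the mean of its assigned points
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

X = np.vstack([np.random.randn(50, 2) + [0, 0], np.random.randn(50, 2) + [5, 5]])
centers, labels = kmeans(X, k=2)
print(centers)
```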

