Chapter 2 The Many Ways of Searching the Web Together: A Comparison of Social Search Engines

Author(s):  
Manuel Burghardt ◽  
Markus Heckner ◽  
Christian Wolff


2007 ◽ 
Vol 17 (07) ◽  
pp. 2355-2361 ◽  
Author(s):  
MASSIMO MARCHIORI

The web landscape has undergone massive changes in the past years. Search engine technology, on the other hand, has not quite kept pace. In this article we look at the current scenario and argue how social flows can be harnessed to build a better generation of search engines. We consider how society and technological progress have changed the rules of the game, introducing good but also bad components, and examine how this situation could be modeled by search engines. Along this line of thinking, we show that the real components of interest are not just web pages but flows of information of any kind, which need to be merged: this opens up a wide range of improvements and far-reaching developments, towards a new horizon of social search.


2017 ◽  
pp. 030-050
Author(s):  
J.V. Rogushina ◽  

Problems associated with the improvement of information retrieval for the open environment are considered, and the need for its semantization is grounded. The current state and development prospects of semantic search engines focused on processing Web information resources are analysed, and criteria for the classification of such systems are reviewed. In this analysis, significant attention is paid to the use of ontologies in semantic search, which contain knowledge about the subject area and the search users. The sources of ontological knowledge and the methods of processing them to improve search procedures are considered. Examples of semantic search systems that use structured query languages (e.g., SPARQL), lists of keywords, and queries in natural language are given. Classification criteria for semantic search engines such as architecture, coupling, transparency, user context, query modification, ontology structure, etc. are considered. Different ways of supporting semantic, ontology-based modification of user queries that improve the completeness and accuracy of search are analyzed. On the basis of an analysis of the properties of existing semantic search engines in terms of these criteria, areas for further improvement of these systems are identified: the development of metasearch systems, semantic modification of user queries, the determination of a user-acceptable level of transparency in the search procedures, flexibility of domain knowledge management tools, and increased productivity and scalability. In addition, the development of semantic Web search tools requires the use of an external knowledge base containing knowledge about the domain of the user's information needs, and it requires providing users with the ability to independently select the knowledge used in the search process.
It is also necessary to take into account the history of the user's interaction with the retrieval system and the search context, in order to personalize query results and order them in accordance with the user's information needs. All these aspects were taken into account in the design and implementation of the semantic search engine "MAIPS", which is based on an ontological model of cooperation between users and resources on the Web.
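The ontology-based query modification described above can be illustrated with a minimal sketch. The toy ontology and all term names below are invented for illustration and are not drawn from MAIPS or any real knowledge base; a real system would consult a domain ontology (e.g. via SPARQL) rather than a dictionary:

```python
# Hedged sketch: expanding a keyword query with synonyms and narrower
# concepts taken from a (toy, hypothetical) ontology, one of the query
# modification techniques the survey discusses.

TOY_ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "narrower": ["sedan", "hatchback"]},
    "engine": {"synonyms": ["motor"], "narrower": []},
}

def expand_query(terms, ontology):
    """Return the original terms plus ontology-derived synonyms
    and narrower concepts, improving recall of the search."""
    expanded = []
    for term in terms:
        expanded.append(term)
        entry = ontology.get(term, {})
        expanded.extend(entry.get("synonyms", []))
        expanded.extend(entry.get("narrower", []))
    return expanded

print(expand_query(["car", "engine"], TOY_ONTOLOGY))
# ['car', 'automobile', 'sedan', 'hatchback', 'engine', 'motor']
```

Expanding with narrower concepts raises completeness (recall); a real engine would weight the added terms lower than the user's own to protect accuracy (precision).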


Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has created the need for fact-checking organizations, groups that seek out claims and comment on their veracity, to spawn worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations that are currently in operation, disinformation continues to run rampant throughout the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science to use natural language processing to automate fact checking. It follows the entire process of automated fact checking using natural language processing, from detecting claims to fact checking to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement prior to widespread use.
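The first stage of the pipeline described above, claim detection, can be sketched with a deliberately naive heuristic. Real systems use trained classifiers over richer features; the reporting-verb list and the rule itself are illustrative assumptions, not the paper's method:

```python
# Hedged sketch: flagging "check-worthy" sentences, the claim-detection
# step of an automated fact-checking pipeline. This toy rule flags any
# sentence containing a digit or a reporting verb.

import re

REPORTING_VERBS = {"said", "claimed", "reported", "announced", "stated"}

def is_check_worthy(sentence):
    """Return True if the sentence looks like a factual claim."""
    has_number = bool(re.search(r"\d", sentence))
    has_verb = any(v in sentence.lower().split() for v in REPORTING_VERBS)
    return has_number or has_verb

sentences = [
    "Unemployment fell to 3.5% last quarter.",
    "The senator claimed the bill would cut taxes.",
    "What a lovely morning it is.",
]
print([s for s in sentences if is_check_worthy(s)])
# flags the first two sentences only
```

Downstream stages would then retrieve evidence for each flagged claim and classify it as supported or refuted, which is where the generalization problems noted above arise.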


2001 ◽  
Vol 1 (3) ◽  
pp. 28-31 ◽  
Author(s):  
Valerie Stevenson

Looking back to 1999, there were a number of search engines which performed equally well. I recommended defining the search strategy very carefully, using Boolean logic and field search techniques, and always running the search in more than one search engine. Numerous articles and Web columns comparing the performance of different search engines came to different conclusions on the ‘best’ search engines. Over the last year, however, all the speakers at conferences and seminars I have attended have recommended Google as their preferred tool for locating all kinds of information on the Web. I confess that I have now abandoned most of my carefully worked out search strategies and comparison tests, and use Google for most of my own Web searches.


2018 ◽  
Vol 3 (1) ◽  
pp. 36
Author(s):  
Weiling Liu

It has been a decade since Tim Berners-Lee coined Linked Data in 2006. More and more Linked Data datasets have been made available for information retrieval on the Web.  It is essential for librarians, especially academic librarians, to keep up with the state of Linked Data.  There is so much information about Linked Data that one may wonder where to begin when they want to join the Linked Data community. With this in mind, the author compiled this annotated bibliography as a starter kit.  Due to the many resources available, this list focuses on literature in English only and of specific projects, case studies, research studies, and tools that may be helpful to academic librarians, in addition to the overview of Linked Data concept and the current state of Linked Data evolution and adoption.


Tripodos ◽  
2021 ◽  
pp. 41-61
Author(s):  
Carlos Lopezosa ◽  
Lluís Codina ◽  
Mario Pérez-Montoro

This paper undertakes a comparative analysis of the visibility, and of other SEO indicators, of the culture sections of Spain's leading digital newspapers, specifically elmundo.es, elpais.com, lavanguardia.com, abc.es, elconfidencial.com and 20minutos.es, based on data collected by the media analytics company comScore and the web traffic metric Alexa Rank. The analysis employs a set of positioning indicators: namely, a visibility index, keywords, social signals, keyword profiles, URLs, SERP snippets, reference domains and best anchor texts, as made available by SISTRIX, an SEO analytics audit toolbox. Thus, we were able to determine which of the digital newspapers' culture sections has the best visibility. Likewise, we were able to identify which of these media are best positioned on Google, presumably as a result of more effective positioning strategies. We conclude with a discussion of our results and, on the basis of these findings, recommend ways in which the visibility of journalistic information can be optimised in search engines.


Author(s):  
Daniele Besomi

This paper surveys the economic dictionaries available on the internet, both free and on subscription, addressed to various kinds of audiences from schoolchildren to research students and academics. The focus is not so much on content as on whether and how the possibilities opened by electronic editing, and by the modes of distribution and interaction offered by the internet, are exploited in the organization and presentation of the materials. The upshot is that although a number of web dictionaries have taken advantage of some of the innovations offered by the internet (in particular the possibility of regular updating, of turning cross-references into hyperlinks, of adding links to external materials, and of adding more or less complex search engines), the observation that internet lexicography has mostly produced more efficient dictionaries without fundamentally altering the traditional paper structure can be confirmed for this particular subset of reference works. In particular, what is scarcely explored is the possibility of visualizing the relationships between entries, thus abandoning the project of the early encyclopedists just when the technology provides the means of accomplishing it.


2016 ◽  
pp. 1157-1172
Author(s):  
Jonathan Bishop ◽  
Lisa Mannay

Wales is the "land of the poets so soothing to me," according to its national anthem. The political and economic landscape does not, on the whole, provide for the many creative people in Welsh communities. Social media websites like MySpace and YouTube, as well as websites like MTV.com, eJay, and PeopleSound, provide space for artists to share their works but do not usually consider the needs of local markets, such as Welsh language provision, the acknowledgement of Welsh place names, and Wales's status as a country. The chapter finds that there are distinct issues in presenting information via Web- or tablet-based devices and suggests some of the considerations needed when designing multi-platform environments.


Author(s):  
Punam Bedi ◽  
Neha Gupta ◽  
Vinita Jindal

The World Wide Web is a part of the Internet that provides data dissemination facility to people. The contents of the Web are crawled and indexed by search engines so that they can be retrieved, ranked, and displayed as a result of users' search queries. These contents that can be easily retrieved using Web browsers and search engines comprise the Surface Web. All information that cannot be crawled by search engines' crawlers falls under Deep Web. Deep Web content never appears in the results displayed by search engines. Though this part of the Web remains hidden, it can be reached using targeted search over normal Web browsers. Unlike Deep Web, there exists a portion of the World Wide Web that cannot be accessed without special software. This is known as the Dark Web. This chapter describes how the Dark Web differs from the Deep Web and elaborates on the commonly used software to enter the Dark Web. It highlights the illegitimate and legitimate sides of the Dark Web and specifies the role played by cryptocurrencies in the expansion of Dark Web's user base.


Author(s):  
K.G. Srinivasa ◽  
Anil Kumar Muppalla ◽  
Varun A. Bharghava ◽  
M. Amulya

In this paper, the authors discuss MapReduce implementations of the crawler, indexer and ranking algorithms used in search engines to retrieve results from the World Wide Web. Running the crawler and the indexer in a MapReduce environment improves the speed of crawling and indexing. The proposed ranking algorithm is an iterative method that makes use of the link structure of the Web and is developed using the MapReduce framework to improve the speed of convergence when ranking Web pages. Categorization is used to retrieve and order the results according to the user's choice, personalizing the search. A new score is introduced in this paper, associated with each Web page and calculated from the user's query and the number of occurrences of the query terms in the document corpus. The experiments are conducted on Web graph datasets and the results are compared with the serial versions of the crawler, indexer and ranking algorithms.
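The iterative link-structure ranking described above can be sketched as paired map and reduce phases. This is a generic PageRank-style iteration written as plain Python, not the authors' implementation; the graph, damping factor and iteration count are illustrative assumptions:

```python
# Hedged sketch: one PageRank-style iteration expressed as a map phase
# (each page emits rank shares along its out-links) and a reduce phase
# (shares are summed per page, then damped), mirroring how such ranking
# is typically structured on a MapReduce framework.

from collections import defaultdict

def map_phase(ranks, links):
    """Map: emit (target, rank_share) for every out-link of every page."""
    contributions = []
    for page, outs in links.items():
        for target in outs:
            contributions.append((target, ranks[page] / len(outs)))
    return contributions

def reduce_phase(contributions, pages, damping=0.85):
    """Reduce: sum incoming shares per page and apply the damping factor."""
    sums = defaultdict(float)
    for page, share in contributions:
        sums[page] += share
    return {p: (1 - damping) / len(pages) + damping * sums[p] for p in pages}

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # toy Web graph
ranks = {p: 1 / 3 for p in links}                  # uniform initial ranks
for _ in range(20):                                # iterate toward convergence
    ranks = reduce_phase(map_phase(ranks, links), list(links))
print({p: round(r, 3) for p, r in sorted(ranks.items())})
```

On a real cluster the contribution list never materializes on one node; the framework shuffles the (target, share) pairs to reducers keyed by page, which is what lets each iteration scale to Web-sized graphs.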

