Information Retrieval in the Hidden Web

2021 ◽  
pp. 50-71
Author(s):  
Shakeel Ahmed ◽  
Shubham Sharma ◽  
Saneh Lata Yadav

Information retrieval is the task of finding unstructured material within large collections stored on computers. The surface web consists of indexed content accessible through traditional browsers, whereas deep or hidden web content cannot be found with traditional search engines and requires a password or network permissions. Within the deep web, the dark web, which is accessible with special software such as Tor, is also growing as new tools make hidden content easier to navigate. According to a study published in Nature, Google indexes no more than 16% of the surface web and misses all of the deep web, and any given search turns up just 0.03% of the information that exists online. A key part of the hidden web therefore remains inaccessible to users. This chapter poses research questions about this problem, gives detailed definitions and analogies, reviews related work, and sets out the advantages and limitations of the existing approaches proposed by researchers. It identifies the need for a system that processes both surface and hidden web data and returns integrated results to users.

Author(s):  
Punam Bedi ◽  
Neha Gupta ◽  
Vinita Jindal

The World Wide Web is a part of the Internet that provides a data dissemination facility to people. The contents of the Web are crawled and indexed by search engines so that they can be retrieved, ranked, and displayed in response to users' search queries. The contents that can be easily retrieved using Web browsers and search engines make up the Surface Web. All information that cannot be crawled by search engines' crawlers falls under the Deep Web. Deep Web content never appears in the results displayed by search engines. Though this part of the Web remains hidden, it can be reached using targeted search over normal Web browsers. Unlike the Deep Web, there exists a portion of the World Wide Web that cannot be accessed without special software. This is known as the Dark Web. This chapter describes how the Dark Web differs from the Deep Web and elaborates on the software commonly used to enter the Dark Web. It highlights the illegitimate and legitimate sides of the Dark Web and specifies the role played by cryptocurrencies in the expansion of the Dark Web's user base.


The Dark Web ◽  
2018 ◽  
pp. 114-137
Author(s):  
Dilip Kumar Sharma ◽  
A. K. Sharma

Web crawlers specialize in downloading, analyzing, and indexing content from the surface web, which consists of interlinked HTML pages. Crawlers are limited when the data sits behind a query interface: the response depends on the querying party's context, which is needed to engage in dialogue and negotiate for the information. In this paper, the authors discuss deep web searching techniques. A survey of the technical literature on deep web searching contributes to the development of a general framework. The frameworks and mechanisms of existing web crawlers are taxonomically classified into four steps and analyzed to find their limitations in searching the deep web.


Author(s):  
Andrey Aleksandrov ◽  
Andrey Safronov

The article examines the concept, essence, specificity, and structural elements of the Surface Web as well as the so-called Deep Web. It discusses the distinctive feature of the Deep Web, in which content is available only through connections created with the help of special software. The article describes the type of network, separated from the rest of the public content, that forms the Darknet. It existed under the name ARPANET (Advanced Research Projects Agency Network) before the civilian Internet known today was separated from it. The creators of the Darknet did not foresee all its applications. The paper lists the software products used to connect to the Darknet. The purpose of these products is to give users maximum anonymity and so complicate the tracking of their identity, IP address, and location in the network. The study reveals the main types of Darknet crimes and outlines ways to improve law enforcement activities to tackle these crimes. In addition, it identifies the problem of the development and increasing use of the dark web for criminal purposes.


2021 ◽  
Vol 2021 ◽  
pp. 1-21
Author(s):  
Randa Basheer ◽  
Bassel Alkhatib

From proactive detection of cyberattacks to the identification of key actors, analyzing the contents of the Dark Web plays a significant role in deterring cybercrime and understanding criminal minds. Research on the Dark Web has proved to be an essential step in fighting cybercrime, whether as a standalone investigation of the Dark Web or an integrated one that includes content from the Surface Web and the Deep Web. In this review, we probe recent studies in the field of analyzing Dark Web content for Cyber Threat Intelligence (CTI), introducing a comprehensive analysis of their techniques, methods, tools, approaches, and results, and discussing their possible limitations. We also demonstrate the significance of studying the contents of different platforms on the Dark Web, leading new researchers through state-of-the-art methodologies. Furthermore, we discuss the technical challenges, ethical considerations, and future directions in the domain.


Author(s):  
Ramanujam Elangovan

The deep web (also called deepnet, the invisible web, dark web, or the hidden web) refers to World Wide Web content that is not part of the surface web, which is indexed by standard search engines. The more familiar "surface" web contains only a small fraction of the information available on the internet. The deep web contains much of the valuable data on the web but is largely invisible to standard web crawling techniques. Besides being a huge source of information, it also provides a platform for cybercrime, for example by offering download links for movies, music, and games without holding their copyrights. This article aims to provide context and policy recommendations pertaining to the dark web. The dark web's history, from its creation to the latest incidents, along with the ways to access it and its sub-forums, is briefly discussed from the user's perspective.


2020 ◽  
Vol 12 (24) ◽  
pp. 93-118
Author(s):  
Danilo Henrique Nunes ◽  
Lucas Souza Lehfeld ◽  
Jonatas Santos Silva
Keyword(s):  
Deep Web ◽  

Owing to technological expansion and progress, as well as the intense use of the Internet, various crimes committed in the virtual sphere have emerged, such as cyberterrorism. This monograph therefore proposes, in short, a study of cyberterrorist acts, analyzing how they are defined and carried out in the virtual sphere, whether on the Surface Web, Deep Web, or Dark Web, and how the Brazilian legal system should be applied when this crime occurs. Moreover, these terrorist attacks, a worldwide problem that once seemed distant from the Brazilian reality and that can now be carried out in cyberspace, have the potential to cause panic in society and even to threaten a State, holding it hostage to fear. In view of these aspects, the paper assesses the legitimacy of applying the Antiterrorism Law (Law No. 13.260, enacted on 16 March 2016), which is exceptional in character, and presents a case study of the Law's first effective application. The problems raised by the new technological reality are addressed through exploratory research that analyzes the real possibility of a cyberterrorist attack. The research method is bibliographic, drawing on articles, critiques, and reflections and weighing their impact on current criminal law. However, the possible destabilization of the State through the spread of terror caused by this cybercrime cannot be ruled out; an adequate approach is required given the degree of danger, the exceptional nature, and the scale of the damage, grounded in the fundamental rights set out in our Constitution.

