The Digital Library and the Archiving System for Educational Institutes

Author(s):  
Atta ur Rahman ◽  
Fahd Abdulsalam Alhaidari

At present, several formats exist through which data is distributed among online stakeholders. One example is XML, which, like other such formats, supports traditional inquiry methods and forms the foundation of query languages such as SPARQL and SQL. Richer representation of primary data calls for broader language support, so that every piece of data from any resource can answer the original search queries. Such models are useful for XML-based retrieval, and several cooperative XML search engines have already been developed. These search engines perform semantic analysis of XML files whose data is enclosed within the relevant fields; XML files are therefore used to store and index data for efficient retrieval. In this research, an attempt is made to fill the gap of customized representation and retrieval with a focus on the educational domain. An institute's repository of books, e-books, journals, articles and research theses is used as the retrieval corpus. A system has been proposed and developed to store the contents of the institute's data bank as objects of a Digital Library. A structured method organizes all the data, and the system extracts meaningful information from the data bank. The information repository is established, and all data is represented in terms of a unit called a Digital Object in the Digital Library. Each unit is described by recorded quantitative data about it, referred to as 'metadata'. The search extracts meaningful information from the repository by applying filtration strategies that return the information best matched to the query terms. Finally, a partitioning- and parallelism-focused architecture for archiving the information for sharing, back-up and collaboration is also proposed. The proposed scheme is compared with state-of-the-art schemes in terms of computational complexity and recall.
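As a rough illustration of the proposed organization, the sketch below (not the authors' implementation; the class and field names are assumptions) wraps a repository item as a Digital Object carrying metadata and answers a query with a simple term-filtering strategy of the kind described above.

    from dataclasses import dataclass, field


    @dataclass
    class DigitalObject:
        """One unit of the Digital Library: content plus quantitative metadata."""
        object_id: str
        title: str
        kind: str                                     # e.g. "book", "e-book", "journal", "thesis"
        metadata: dict = field(default_factory=dict)  # quantitative descriptors ("metadata")
        content: str = ""                             # extracted text used for indexing


    def search(repository, query):
        """Return objects matching all query terms, ranked by term frequency."""
        terms = [t.lower() for t in query.split()]
        hits = []
        for obj in repository:
            haystack = " ".join(
                [obj.title, obj.content] + [str(v) for v in obj.metadata.values()]
            ).lower()
            if all(t in haystack for t in terms):     # filtration step
                hits.append(obj)
        return sorted(hits,
                      key=lambda o: sum(o.content.lower().count(t) for t in terms),
                      reverse=True)


    repo = [DigitalObject("obj-1", "Digital Libraries", "e-book",
                          {"year": 2017, "pages": 250},
                          "indexing and retrieval of library data")]
    print([o.object_id for o in search(repo, "library retrieval")])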


2017 ◽  
pp. 030-050
Author(s):  
J.V. Rogushina ◽  

Problems associated with the improvement of information retrieval for the open environment are considered, and the need for its semantization is grounded. The current state and development prospects of semantic search engines focused on processing Web information resources are analysed, and criteria for the classification of such systems are reviewed. In this analysis, significant attention is paid to the use of ontologies in semantic search, which contain knowledge about the subject area and about the search users. The sources of ontological knowledge and the methods of processing them to improve search procedures are considered. Examples of semantic search systems that use structured query languages (e.g., SPARQL), lists of keywords and natural-language queries are presented. Classification criteria for semantic search engines such as architecture, coupling, transparency, user context, query modification, ontology structure, etc. are considered. Different ways of supporting semantic, ontology-based modification of user queries that improve the completeness and accuracy of the search are analyzed. Based on an analysis of the properties of existing semantic search engines with respect to these criteria, directions for further improvement of these systems are identified: the development of metasearch systems, semantic modification of user requests, determination of a user-acceptable level of transparency of the search procedures, flexibility of domain-knowledge management tools, and increased productivity and scalability. In addition, the development of semantic Web search tools requires the use of an external knowledge base that contains knowledge about the domain of the user's information needs, and should give users the ability to independently select the knowledge used in the search process. It is also necessary to take into account the history of user interaction with the retrieval system and the search context, in order to personalize query results and order them according to the user's information needs. All these aspects were taken into account in the design and implementation of the semantic search engine "MAIPS", which is based on an ontological model of the cooperation of users and resources on the Web.
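As a concrete illustration of ontology-based query modification, the following minimal sketch (not the MAIPS implementation; the ontology file name is an assumption) expands a keyword with the labels of narrower SKOS concepts retrieved via SPARQL.

    from rdflib import Graph

    g = Graph()
    g.parse("domain_ontology.ttl", format="turtle")   # assumed local domain ontology

    NARROWER = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?seed ?label WHERE {
        ?concept  skos:prefLabel ?seed .
        ?narrower skos:broader   ?concept ;
                  skos:prefLabel ?label .
    }
    """

    def expand_query(term):
        """Return the original term plus the preferred labels of narrower concepts."""
        expanded = {term}
        for seed, label in g.query(NARROWER):
            if str(seed).lower() == term.lower():
                expanded.add(str(label))
        return sorted(expanded)

    print(expand_query("information retrieval"))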



Database ◽  
2021 ◽  
Vol 2021 ◽  
Author(s):  
Valerio Arnaboldi ◽  
Jaehyoung Cho ◽  
Paul W Sternberg

Abstract Finding relevant information from newly published scientific papers is becoming increasingly difficult due to the pace at which articles are published every year as well as the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate through the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects from newly published papers, facilitating knowledge transfer. However, digesting the information returned by these systems, often a large number of documents, still requires considerable effort. In this paper, we present Wormicloud, a new tool that summarizes scientific articles in a graphical way through word clouds. This tool is aimed at facilitating the discovery of new experimental results not yet curated by model organism databases and is designed for both researchers and biocurators. Wormicloud is customized for the Caenorhabditis elegans literature and provides several advantages over existing solutions, including being able to perform full-text searches through Textpresso, which provides more accurate results than other existing literature search engines. Wormicloud is integrated through direct links from gene interaction pages in WormBase. Additionally, it allows analysis of the gene sets obtained from literature searches with other WormBase tools such as SimpleMine and Gene Set Enrichment. Database URL: https://wormicloud.textpressolab.com
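The core idea can be sketched in a few lines (an illustration only, not Wormicloud's own code; the hard-coded abstracts stand in for Textpresso search results):

    from collections import Counter
    from wordcloud import WordCloud

    abstracts = [
        "daf-2 insulin signaling extends lifespan in C. elegans",
        "daf-16 FOXO transcription factor mediates lifespan extension",
    ]

    # Simple whitespace tokenization and frequency count; Wormicloud applies its
    # own tokenization and weighting over full-text Textpresso results.
    frequencies = Counter(word.lower() for text in abstracts for word in text.split())

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file("word_cloud_sketch.png")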



2016 ◽  
Vol 49 (1) ◽  
pp. 302-310 ◽  
Author(s):  
Michael Kachala ◽  
John Westbrook ◽  
Dmitri Svergun

Recent advances in small-angle scattering (SAS) experimental facilities and data analysis methods have prompted a dramatic increase in the number of users and of projects conducted, causing an upsurge in the number of objects studied, experimental data available and structural models generated. To organize the data and models and make them accessible to the community, the Task Forces on SAS and hybrid methods for the International Union of Crystallography and the Worldwide Protein Data Bank envisage developing a federated approach to SAS data and model archiving. Within the framework of this approach, the existing databases may exchange information and provide independent but synchronized entries to users. At present, ways of exchanging information between the various SAS databases are not established, leading to possible duplication and incompatibility of entries, and limiting the opportunities for data-driven research for SAS users. In this work, a solution is developed to resolve these issues and provide a universal exchange format for the community, based on the use of the widely adopted crystallographic information framework (CIF). The previous version of the sasCIF format, implemented as an extension of the core CIF dictionary, has been available since 2000 to facilitate SAS data exchange between laboratories. The sasCIF format has now been extended to describe comprehensively the necessary experimental information, results and models, including relevant metadata for SAS data analysis and for deposition into a database. Processing tools for these files (sasCIFtools) have been developed, and these are available both as standalone open-source programs and integrated into the SAS Biological Data Bank, allowing the export and import of data entries as sasCIF files. Software modules to save the relevant information directly from beamline data-processing pipelines in sasCIF format are also developed. This update of sasCIF and the relevant tools are an important step in the standardization of the way SAS data are presented and exchanged, to make the results easily accessible to users and to promote further the application of SAS in the structural biology community.
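For orientation, the sketch below shows how a simple CIF-style data block can be serialized in the spirit of sasCIF; the tag names and values are illustrative placeholders rather than entries from the actual sasCIF dictionary, for which sasCIFtools should be consulted.

    def write_cif_block(block_name, items):
        """Serialize key/value pairs as a minimal CIF data block."""
        lines = ["data_" + block_name]
        for tag, value in items.items():
            text = "'" + str(value) + "'" if " " in str(value) else str(value)
            lines.append("_" + tag.ljust(30) + " " + text)
        return "\n".join(lines) + "\n"

    # Placeholder tag names and values, not the real sasCIF dictionary entries.
    entry = {
        "sample.name": "hen egg-white lysozyme",
        "experiment.radiation_type": "X-ray",
        "result.radius_of_gyration": 1.52,
    }
    print(write_cif_block("example_entry", entry))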



Author(s):  
Novario Jaya Perdana

The accuracy of search-engine results depends on the keywords that are used. Insufficiently informative keywords can reduce the accuracy of the results, which makes searching for information on the internet hard work. In this research, software has been built to create sequences of document keywords. The software uses Google Latent Semantic Distance to extract relevant information from a document. The information is expressed in the form of specific word sequences that can be used as keyword recommendations in search engines. The results show that the implemented method for creating document keyword recommendations achieves high accuracy and finds the most relevant information among the top search results.
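A closely related, well-documented measure is the Normalized Google Distance of Cilibrasi and Vitányi, which estimates semantic distance between terms from search-engine hit counts; the sketch below illustrates that calculation (the hit counts are hard-coded stand-ins, and this is not necessarily the exact measure used in the paper).

    import math

    def ngd(fx, fy, fxy, n):
        """NGD(x, y) = (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy))."""
        lfx, lfy, lfxy = math.log(fx), math.log(fy), math.log(fxy)
        return (max(lfx, lfy) - lfxy) / (math.log(n) - min(lfx, lfy))

    # Hypothetical page counts for f("horse"), f("rider"), f("horse rider") and index size n.
    print(round(ngd(46_700_000, 12_200_000, 2_630_000, 8_000_000_000), 3))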



2015 ◽  
pp. 466-489
Author(s):  
K. Palanivel ◽  
S. Kuppuswami

Cloud computing is an emerging computing model that has evolved as a result of the maturity of its underlying prerequisite technologies. There are differences in perspective as to when a set of underlying technologies becomes a "cloud" model. In order to categorize cloud computing services, and to expect some level of consistent characteristics to be associated with them, cloud adopters need a consistent frame of reference. The Cloud Computing Reference Architecture (CCRA) defines a standard reference architecture and consistent frame of reference for comparing cloud services from different service providers when selecting and deploying cloud services to support mission requirements. Cloud computing offers information retrieval systems, particularly digital libraries and search engines, a wide variety of options for growth, reduces maintenance needs and encourages efficient resource use. These features are particularly attractive for digital libraries, repositories, and search engines. The dynamic and elastic provisioning features of a cloud infrastructure allow rapid growth in collection size and support for a larger user base, while reducing management issues. Hence, the objective of this chapter is to investigate and design a reference architecture for Digital Library Systems using cloud computing, with scalability in mind. The proposed reference architecture is called CORADLS. This architecture accelerates the rate at which library users can get easy, efficient, fast and reliable services in the digital environment. Here, the end user does not have to worry about resources or disk space in the cloud.



2020 ◽  
pp. 624-650
Author(s):  
Luis Terán

With the introduction of Web 2.0, which turns users into content generators, finding relevant information has become even more complex. To tackle this problem of information overload, a number of different techniques have been introduced, including search engines, the Semantic Web, and recommender systems, among others. The use of recommender systems for e-Government is a research topic intended to improve the interaction among public administrations, citizens, and the private sector by reducing information overload on e-Government services. In this chapter, the use of recommender systems in eParticipation is presented. A brief description is given of the eGovernment framework used and of the participation levels proposed to enhance participation. The highest level of participation is known as eEmpowerment, where decision-making is placed in the hands of citizens. Finally, a set of examples for the different eParticipation types is presented to illustrate the use of recommender systems.
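The kind of recommendation involved can be sketched with a tiny collaborative-filtering example (purely illustrative data, not the chapter's own system): services used by similar citizens are suggested in order to reduce the information overload described above.

    import numpy as np

    # Rows: citizens, columns: e-participation services (1 = used, 0 = not used).
    usage = np.array([
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 1, 1],
    ])

    def recommend(user_idx, k=2):
        """Rank unused services by similarity-weighted usage of other citizens."""
        norms = np.linalg.norm(usage, axis=1)
        sims = usage @ usage[user_idx] / (norms * norms[user_idx] + 1e-9)
        sims[user_idx] = 0.0                       # ignore the citizen's own row
        scores = sims @ usage
        scores[usage[user_idx] > 0] = -np.inf      # do not re-recommend used services
        return list(np.argsort(scores)[::-1][:k])

    print(recommend(0))                            # [2, 3] for the first citizen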



Author(s):  
Christopher Yang ◽  
Kar W. Li

Structural and semantic interoperability were the focus of digital library research in the early 1990s, and much work has been done on searching and retrieving objects across variations in protocols, formats, and disciplines. As the World Wide Web has become more popular over the last ten years, information has become available in multiple languages in global digital libraries. Users search across language boundaries to identify relevant information that may not be available in their own language. Cross-lingual semantic interoperability has therefore become one of the focuses of digital library research since the late 1990s. In particular, research in cross-lingual information retrieval (CLIR) has been very active in recent conferences on information retrieval, digital libraries, knowledge management, and information systems. The major problem in CLIR is how to build a bridge between the representations of user queries and documents when they are in different languages.
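One common way of building that bridge is dictionary-based query translation, sketched below (a minimal illustration with a toy bilingual dictionary, not necessarily the approach taken by the authors):

    # Toy English -> Chinese dictionary; real CLIR systems use large bilingual
    # lexica, corpus statistics, or machine translation.
    bilingual_dict = {
        "digital": ["数字"],
        "library": ["图书馆", "文库"],
    }

    def translate_query(query):
        """Replace each source-language term with its dictionary translations."""
        translated = []
        for term in query.lower().split():
            # Out-of-vocabulary terms (names, acronyms) are kept untranslated.
            translated.extend(bilingual_dict.get(term, [term]))
        return translated

    print(translate_query("digital library"))   # ['数字', '图书馆', '文库']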



The Dark Web ◽  
2018 ◽  
pp. 359-374
Author(s):  
Dilip Kumar Sharma ◽  
A. K. Sharma

ICT plays a vital role in human development through information extraction, and includes computer networks and telecommunication networks. One of the important modules of ICT is computer networks, which are the backbone of the World Wide Web (WWW). Search engines are computer programs that browse and extract information from the WWW in a systematic and automatic manner. This paper examines the three main components of search engines: the Extractor, a web crawler that starts from a URL; the Analyzer, an indexer that processes words on the web page and stores the resulting index in a database; and the Interface Generator, a query handler that understands the needs and preferences of the user. This paper concentrates on the information available on the surface web through general web pages and on the hidden information behind query interfaces, called the deep web. It emphasizes the extraction of relevant information so as to present the content the user prefers as the first result of his or her search query. The paper also discusses aspects of the deep web along with an analysis of a few existing deep web search engines.
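The division of labour among the three components can be sketched as follows (a minimal surface-web illustration; real crawlers add link extraction, politeness policies and the deep-web form handling discussed in the paper):

    from collections import defaultdict
    import re
    import urllib.request

    def extract(seed_urls):
        """Extractor: fetch raw page text starting from seed URLs."""
        pages = {}
        for url in seed_urls:
            with urllib.request.urlopen(url, timeout=10) as resp:
                pages[url] = resp.read().decode("utf-8", errors="ignore")
        return pages

    def analyze(pages):
        """Analyzer: build an inverted index from words to the URLs they occur on."""
        index = defaultdict(set)
        for url, text in pages.items():
            for word in re.findall(r"[a-z]+", text.lower()):
                index[word].add(url)
        return index

    def answer(index, query):
        """Interface Generator: return URLs containing every query term."""
        terms = query.lower().split()
        return set.intersection(*(index.get(t, set()) for t in terms)) if terms else set()

    index = analyze(extract(["https://example.com"]))
    print(answer(index, "example domain"))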



Generally speaking, horizontal search engines are meant to deal with general web queries. In the context of this chapter, the authors investigate navigational resource identification in the light of horizontal web searching. State-of-the-art navigational resource identification does not account for the distinct characteristics of navigational queries or for the specific ways users approach different search tasks. Consequently, in this chapter the authors discuss a new mechanism for navigational resource identification based on previous findings.





