Development of the Software for the Electronic Library «Scientific Heritage of Russia»

Author(s):  
K. P. Pogorelko

The data structure and software of the Scientific Heritage of Russia electronic library were created in 2007 and no longer meet the needs of the system. The article describes the decisions made when implementing a new version of the software. These decisions affect both the organization of the database structure and the protocols for interacting with the system. Particular attention is paid to the development of information retrieval tools.

2021
Author(s):  
Konstantin Pogorelko

The information system “Scientific Heritage of Russia” has been developed in stages since 2007. The existing software no longer meets the needs of the system and complicates its further development. It was decided to implement the new version of the software in the cross-platform ASP.NET Core environment. The article describes the decisions made in implementing the software and modernizing the data structure. Particular attention is paid to the development of information retrieval tools.


Author(s):  
Mohammed Erritali

The growth over centuries in the volume of text data held by libraries, such as books and articles, has made effective mechanisms for locating them necessary. Early techniques such as abstracting, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems that facilitate access to a set of documents in electronic form (a corpus), allowing users to find the documents relevant to them, that is, the content that matches their information needs. Most information retrieval models index a corpus with a specific data structure called an "inverted file" or "inverted index". The inverted file collects information on every term in the corpus: the identifiers of the documents that contain the term, the frequency of the term in those documents, and the positions of its occurrences. In this paper we use an object-oriented database (db4o) instead of the inverted file; that is, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is a comparative study to determine whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption on a large volume of data.
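As an illustration of the inverted file described in this abstract, the following is a minimal sketch only, not the authors' implementation and not the db4o variant they evaluate. The function names `build_inverted_index` and `lookup`, the whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text.

    Tokenization is a plain lowercase whitespace split (an assumption;
    real systems normalize and stem terms).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def lookup(index, term):
    """Return (doc_id, frequency, positions) for every document containing term."""
    postings = index.get(term.lower(), {})
    return [(doc_id, len(positions), positions)
            for doc_id, positions in sorted(postings.items())]


# Hypothetical two-document corpus.
docs = {
    1: "the library indexes the corpus",
    2: "a corpus of documents",
}
index = build_inverted_index(docs)
print(lookup(index, "corpus"))  # document ids, term frequency, positions
```

Each posting carries the three pieces of information the abstract lists: the document identifier, the term frequency in that document, and the positions of the occurrences.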


2011
Vol 9 (3)
Author(s):
Nick Joint
Bob Kemp
Susan Ashworth

The GAELS project (Glasgow Allied Electronically with Strathclyde) is a library-based project intended to promote a culture shift among engineering researchers at Glasgow and Strathclyde Universities. Our intention is to decrease researchers' dependence on separately held local print collections in favour of collaboratively held networked electronic resources. To support this aim, GAELS (1999) has created a courseware package to teach researchers the different information retrieval skills required to use such networked resources. DOI: 10.1080/0968776010090304


Author(s):  
Charles Rodrigues
Angel Freddy Godoy Viera

Bibliographic indicators, which are based on statistical analysis of quantitative data found in technical and scientific output, have been used to measure scientific activity. The aim of this study was to outline the scientific production on Information and Communication Technologies in libraries, so as to identify: the most productive authors (Lotka's law); the historical evolution of the number of publications and the journals that published most on the topic (Bradford's law); and the main approaches covering the topic (Zipf's law). The methodology followed four steps: choosing Web of Science as the database to query; configuring the search strategy parameters and the coverage period; cleaning the results; and processing the research data. The results showed that the earliest works indexed in Web of Science date from 1988 and that 458 articles had been produced up to 2014. Scientific production has remained stable in recent years. The most productive authors were Gomez, Fourie and Aharony. No high concentration in a specific group was observed, but rather a wide spread of authors. Five journals were identified as the most productive, accounting for one third of total scientific production: Electronic Library, Program Electronic Library and Information Systems, Library Hi Tech, Library Trends and Libri. The main keywords indexed by Web of Science were: Academic Library, Internet, Digital Library, Information Retrieval, Librarians and Mobile Services. These approaches are very present in the development of products, services, software and applications in today's society.


2020
Author(s):  
Bernhard Rieder

This chapter investigates early attempts in information retrieval to tackle the full text of document collections. Underpinning a large number of contemporary applications, from search to sentiment analysis, the concepts and techniques pioneered by Hans Peter Luhn, Gerard Salton, Karen Spärck Jones, and others involve particular framings of language, meaning, and knowledge. They also introduce some of the fundamental mathematical formalisms and methods running through information ordering, preparing the extension to digital objects other than text documents. The chapter discusses the considerable technical expressivity that comes out of the sprawling landscape of research and experimentation that characterizes the early decades of information retrieval. This includes the emergence of the conceptual construct and intermediate data structure that is fundamental to most algorithmic information ordering: the feature vector.


Algorithms
2020
Vol 13 (11)
pp. 276
Author(s):
Paniz Abedin
Arnab Ganguly
Solon P. Pissis
Sharma V. Thankachan

Let $T[1,n]$ be a string of length $n$ and $T[i,j]$ be the substring of $T$ starting at position $i$ and ending at position $j$. A substring $T[i,j]$ of $T$ is a repeat if it occurs more than once in $T$; otherwise, it is a unique substring of $T$. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string $T$ as input, the Shortest Unique Substring problem is to find a shortest substring of $T$ that does not occur elsewhere in $T$. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over $T$ answering the following type of online queries efficiently. Given a range $[\alpha,\beta]$, return a shortest substring $T[i,j]$ of $T$ with exactly one occurrence in $[\alpha,\beta]$. We present an $O(n \log n)$-word data structure with $O(\log_w n)$ query time, where $w = \Omega(\log n)$ is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an $O(n)$-word data structure with $O(\sqrt{n} \log^{\epsilon} n)$ query time, where $\epsilon > 0$ is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
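To make the query in this abstract concrete, here is a brute-force sketch of a range shortest unique substring query. It is not the data structure from the paper: it rescans the range on every query and is far slower than the bounds quoted above. It also assumes that "exactly one occurrence in the range" means the substring occurs exactly once inside the slice from position alpha to position beta; the function names are hypothetical.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    return sum(
        1
        for k in range(len(text) - len(pattern) + 1)
        if text.startswith(pattern, k)
    )


def range_shortest_unique_substring(T, alpha, beta):
    """Return 1-indexed inclusive (i, j) of a shortest substring of T that
    occurs exactly once within T[alpha..beta], or None for an empty range.

    Brute force: tries every length from 1 upward and every start position,
    so the first hit is a shortest (leftmost) answer.
    """
    window = T[alpha - 1:beta]
    for length in range(1, len(window) + 1):
        for start in range(len(window) - length + 1):
            candidate = window[start:start + length]
            if count_occurrences(window, candidate) == 1:
                i = alpha + start
                return i, i + length - 1
    return None


# In "abab", every single character repeats and "ab" occurs twice,
# so the shortest unique substring over the full range is "ba".
print(range_shortest_unique_substring("abab", 1, 4))
```

A non-empty range always has an answer, because the whole range occurs exactly once within itself.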

