Information Retrieval
Recently Published Documents





Music is a widely used data format in the explosion of Internet information. Automatically identifying the style of online music is an important and active topic in music information retrieval and music production, and automatic music style recognition is already used in many real-life scenarios. The emergence of machine learning provides a good foundation for this task. This paper uses machine learning to build an automatic music style recognition system. First, the online music is processed by waveform analysis to remove noise. Second, the denoised music signals are represented as sample entropy features using empirical mode decomposition. Lastly, the extracted features are used to train a relative margin support vector machine model that predicts the style of new music. The experimental results demonstrate the effectiveness of the proposed framework.
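As a rough illustration of the sample entropy feature mentioned in this abstract, the sketch below computes SampEn(m, r) for a plain numeric sequence. It is not the authors' implementation: the function name `sample_entropy`, the defaults `m=2` and `r=0.2`, and the use of an absolute (unscaled) tolerance are assumptions, and the decomposition and classifier steps are left out entirely.

```python
import math

def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D sequence.

    Assumption: r is an absolute tolerance; it is often scaled by the
    standard deviation of the signal in practice.
    """
    n = len(signal)

    def count_matches(length):
        # Use the same n - m starting positions for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)       # matching pairs of length m
    a = count_matches(m + 1)   # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined ratio: no matches found
    return -math.log(a / b)
```

A perfectly regular sequence gives an entropy of zero, while less predictable sequences give larger values, which is what makes the quantity usable as a feature.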

Abdullah Saleh Alqahtani ◽  
P. Saravanan ◽  
M. Maheswari ◽  
Sami Alshmrany

2022 ◽  
Vol 12 (2) ◽  
pp. 706
Pengfei Li ◽  
Yin Zhang ◽  
Bin Zhang

In exploratory search, users sometimes combine two or more previously issued queries into new queries. We refer to this kind of search behavior as query combination behavior. We find that the combined queries usually meet users’ information needs better. We also observe that users combine queries for different motivations, which leads to different types of query combination behavior. Previous work on understanding exploratory search behavior has focused on how people reformulate queries, but not on how and why they combine them. Answering these questions is important for exploring how users search and learn during information retrieval and for developing better support for searchers. In this paper, we first describe a two-layer hierarchical structure for understanding the space of query combination behavior types. We manually classify query combination sessions from the AOL and Sogou search engines and explain the relationship between combining queries and search success. We then characterize some key aspects of this behavior and propose a classifier that automatically identifies types of query combination behavior using behavioral features. Finally, we summarize our findings and show how search engines can better assist searchers.
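To make the notion of a combined query concrete, here is a minimal sketch of a term-overlap heuristic. It is only an assumed working definition, not the paper's classifier: the function `find_combined_sources` is hypothetical, and it treats a new query as a combination when all of its terms come from two earlier queries in the session and each of those queries contributes at least one term the other lacks.

```python
from itertools import combinations

def find_combined_sources(session_queries, new_query):
    """Return the first pair of earlier queries that new_query appears to combine."""
    new_terms = set(new_query.lower().split())
    for q1, q2 in combinations(session_queries, 2):
        t1 = set(q1.lower().split())
        t2 = set(q2.lower().split())
        uses_q1 = bool(new_terms & (t1 - t2))   # a term found only in q1
        uses_q2 = bool(new_terms & (t2 - t1))   # a term found only in q2
        covered = new_terms <= (t1 | t2)        # no terms from elsewhere
        if uses_q1 and uses_q2 and covered:
            return (q1, q2)
    return None
```

For a session containing "cheap flights" and "paris hotels", the query "cheap paris hotels" would be flagged as a combination of the two, while an unrelated query would not.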

PeerJ ◽  
2022 ◽  
Vol 10 ◽  
pp. e12764
Raul Rodriguez-Esteban

Delays in the propagation of scientific discoveries across scientific communities have been an oft-maligned feature of scientific research for introducing a bias towards knowledge that is produced within a scientist’s closest community. The vastness of the scientific literature has been commonly blamed for this phenomenon, despite recent improvements in information retrieval and text mining. Its actual negative impact on scientific progress, however, has never been quantified. This analysis attempts to do so by exploring its effects on biomedical discovery, particularly in the discovery of relations between diseases, genes and chemical compounds. Results indicate that the probability that two scientific facts will enable the discovery of a new fact depends on how far apart these two facts were originally within the scientific landscape. In particular, the probability decreases exponentially with the citation distance. Thus, the direction of scientific progress is distorted based on the location in which each scientific fact is published, representing a path-dependent bias in which originally closely-located discoveries drive the sequence of future discoveries. To counter this bias, scientists should open the scope of their scientific work with modern information retrieval and extraction approaches.
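One way to write the reported exponential dependence, as a sketch only: let $d$ be the citation distance between two facts and $P(d)$ the probability that they jointly enable a new discovery. The symbols $P_0$ and $\lambda$ are assumed names for a baseline probability and a decay rate; the abstract gives no values for either.

```latex
P(d) \approx P_0 \, e^{-\lambda d}, \qquad \lambda > 0
```

Under this form, each additional step of citation distance multiplies the probability by the constant factor $e^{-\lambda}$.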

2022 ◽  
Vol 12 (2) ◽  
pp. 628
Fei Yang ◽  
Zhonghui Wang ◽  
Haowen Yan ◽  
Xiaomin Lu

Geometric similarity plays an important role in geographic information retrieval, map matching, and data updating. Many approaches have been developed to calculate the similarity between simple features. However, complex group objects are common in map and spatial database systems. With a micro scene that contains different types of geographic features, calculating similarity is difficult. In addition, few studies have paid attention to the changes in a scene’s geometric similarity in the process of generalization. In this study, we developed a method for measuring the geometric similarity of micro scene generalization based on shape, direction, and position. We calculated shape similarity using the hybrid feature description, and we constructed a direction Voronoi diagram and a position graph to measure the direction similarity and position similarity. The experiments involved similarity calculation and quality evaluation to verify the usability and effectiveness of the proposed method. The experiments showed that this approach can be used to effectively measure the geometric similarity between micro scenes. Moreover, the proposed method accounts for the relationships amongst the geometrical shape, direction, and position of micro scenes during cartographic generalization. The simplification operation leads to obvious changes in position similarity, whereas delete and merge operations lead to changes in direction and position similarity. In the process of generalization, the river + islands scene changed mainly in shape and position, the similarity change in river + lakes occurred due to the direction and location, and the direction similarity of rivers + buildings and roads + buildings changed little.

2022 ◽  
Sebastião Pais ◽  
João Cordeiro ◽  
Muhammad Jamil

The use of language corpora for many purposes has increased significantly. General corpora exist for numerous languages, but research often needs more specialized corpora. The Web’s rapid growth has significantly improved access to thousands of online documents, highly specialized texts, and comparable texts on the same subject covering several languages in electronic form. However, research has continued to concentrate on corpus annotation rather than on corpus creation tools. Consequently, many researchers create their own corpora, solve the same problems independently, and build project-specific systems. Corpus construction supports many NLP applications, including machine translation, information retrieval, and question answering. This paper presents a new NLP corpus and set of cloud services called HULTIG-C. HULTIG-C covers various languages and includes distinctive annotations such as keyword sets, sentence sets, named entity recognition sets, and multiword sets. Moreover, a framework incorporates the main components for license detection, language identification, boilerplate removal, and document deduplication to process HULTIG-C. Furthermore, this paper presents some potential issues related to constructing multilingual corpora from the Web.
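The abstract does not say how document deduplication is done in HULTIG-C. As a minimal sketch of one common approach, assuming exact duplicates after light normalization are the target, the hypothetical `deduplicate` function below hashes each normalized document and keeps the first occurrence of each hash.

```python
import hashlib
import re

def normalize(text):
    # Lowercase and collapse whitespace so trivial formatting
    # differences do not hide duplicates.
    return re.sub(r"\s+", " ", text.lower()).strip()

def deduplicate(documents):
    """Keep the first occurrence of each document, compared after normalization."""
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```

Near-duplicate pages that differ in more than case and whitespace would need a similarity-based method instead; this sketch does not attempt that.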

Meng Yuan ◽  
Justin Zobel ◽  
Pauline Lin

Clustering of the contents of a document corpus is used to create sub-corpora that are expected to consist of documents related to each other. However, while clustering is used in a variety of ways in document applications such as information retrieval, and a range of methods have been applied to the task, there has been relatively little exploration of how well it works in practice. Indeed, given the high dimensionality of the data, it is possible that clustering may not always produce meaningful outcomes. In this paper we use a well-known clustering method to explore a variety of techniques, existing and novel, to measure clustering effectiveness. Results with our new, extrinsic techniques based on relevance judgements or retrieved documents demonstrate that retrieval-based information can be used to assess the quality of clustering, and also show that clustering can succeed to some extent at gathering together similar material. Further, they show that intrinsic clustering techniques that have been shown to be informative in other domains do not work for information retrieval. Whether clustering is sufficiently effective to have a significant impact on practical retrieval is unclear, but, as the results show, our measurement techniques can effectively distinguish between clustering methods.
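The abstract does not define its extrinsic measures. As a hedged illustration of the general idea of scoring a clustering with relevance judgements, the sketch below computes, for one query, the share of its judged-relevant documents that land in a single cluster. The function `relevant_concentration` and its inputs are hypothetical and should not be read as the paper's measure.

```python
from collections import Counter

def relevant_concentration(cluster_of, relevant_docs):
    """Fraction of a query's relevant documents found in its best cluster.

    cluster_of: dict mapping document id -> cluster id
    relevant_docs: iterable of document ids judged relevant to one query
    """
    counts = Counter(cluster_of[d] for d in relevant_docs if d in cluster_of)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no judged-relevant document was clustered
    return max(counts.values()) / total
```

A value near 1.0 means the clustering gathers the relevant documents for that query together; averaging over many queries gives one number per clustering method, which allows methods to be compared.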

2022 ◽  
Dwaipayan Roy ◽  
Mandar Mitra ◽  
Philipp Mayr ◽  
Amritap Chowdhury

2022 ◽  
pp. 096100062110675
Abolfazl Asadnia ◽  
Mehrdad CheshmehSohrabi ◽  
Ahmad Shabani ◽  
Asefeh Asemi ◽  
Mohsen Taheri Demneh

Many organizations and businesses use futurology to keep pace with the ever-increasing changes in the world, since they need to stay up to date to grow and develop. A review of previous studies showed that no systematic research has yet been conducted on the future of information retrieval systems or on the role of library and information science experts in that future. Therefore, a qualitative study was conducted by reviewing resources, consulting experts, doing interaction analysis, and writing scenarios. The results identified 13 key factors affecting the future of information retrieval systems, grouped under two driving forces, social determinism and technological determinism, and four scenarios: Canopus star, Ursa major, Ursa minor, and single star. The results also showed the dominance of technology and social demand and their very important role in the future of information retrieval systems.
