Query Expansion based on Central Tendency and PRF for Monolingual Retrieval

2016 ◽  
Vol 6 (4) ◽  
pp. 30-50
Author(s):  
Rekha Vaidyanathan ◽  
Sujoy Das ◽  
Namita Srivastava

Query expansion is the process of selecting relevant words that are closest in meaning and context to the keyword(s) of a query. This paper proposes a statistical method that automatically selects contextually related words for expansion after identifying a pattern in their scores. Words appearing in the top 10 relevant documents are scored with respect to the partitions in which they appear. The proposed method identifies a pattern of central tendency in the high scores and selects the right group of words for query expansion. The objective is to keep the expanded query light, with a minimum number of words, while still giving statistically significant MAP values compared to the original query. Experimental results show a 17-21% improvement in MAP over the original unexpanded query as baseline, with performance similar to that of the state-of-the-art query expansion models Bo1 and KL. FIRE 2011 Adhoc English and Hindi data, with 50 topics each, were used for the experiments, with Terrier as the retrieval engine.
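
The abstract does not give the scoring formula or the partitioning scheme, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical score (the number of top-ranked documents containing a term) and a hypothetical cutoff (the larger of the mean and the median of all scores); the function name `select_expansion_terms` and the sample documents are likewise illustrative.

```python
from collections import Counter
from statistics import mean, median


def select_expansion_terms(top_docs, query_terms, max_terms=5):
    """Pick a small set of expansion terms from pseudo-relevant documents.

    Assumption: a term's score is the number of top documents containing it.
    The paper's partition-based score is not specified in the abstract.
    """
    query = {t.lower() for t in query_terms}
    doc_freq = Counter()
    for doc in top_docs:
        # Count each candidate term once per document, skipping query terms.
        doc_freq.update(set(doc.lower().split()) - query)
    if not doc_freq:
        return []
    scores = list(doc_freq.values())
    # Assumed central-tendency cutoff: keep terms above both mean and median.
    cutoff = max(mean(scores), median(scores))
    kept = [(term, s) for term, s in doc_freq.items() if s > cutoff]
    # Highest score first, ties broken alphabetically; cap to keep the query light.
    kept.sort(key=lambda ts: (-ts[1], ts[0]))
    return [term for term, _ in kept[:max_terms]]


docs = [
    "solar power plant energy",
    "solar energy panel cost",
    "wind energy turbine",
]
print(select_expansion_terms(docs, ["solar"]))  # ['energy']
```

Here only "energy" appears in enough documents to clear the cutoff, so the expanded query gains a single term.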


2021 ◽  
Vol 11 (23) ◽  
pp. 11344
Author(s):  
Wei Ke ◽  
Ka-Hou Chan

Paragraph-based datasets are hard to analyze with a simple RNN, because long sequences suffer from the problem of long-term dependencies. In this work, we propose a Multilayer Content-Adaptive Recurrent Unit (CARU) network for paragraph information extraction. In addition, we present a CNN-based model as an extractor to explore and capture useful features in the hidden state, which represent the content of the entire paragraph. In particular, we introduce Chebyshev pooling at the end of the CNN-based extractor in place of maximum pooling. This projects the features into a probability distribution and so provides an interpretable evaluation for the final analysis. Experimental results demonstrate the superiority of the proposed approach compared with state-of-the-art models.


Author(s):  
Rafi Shaik ◽  
H. Surya Prakash Rao

Hydroxychloroquine (HCQ) is an extremely important drug used for the treatment of various ailments. The WHO lists it as one of the essential drugs. The utility of HCQ as prophylaxis against COVID-19, although debated, is well known. We have reviewed synthetic strategies for the industrial and academic synthesis of HCQ and its key intermediates, such as 4,7-dichloroquinoline (4,7-DCQ) and 2-((4-aminopentyl)(ethyl)amino)ethan-1-ol 9 (also known as hydroxynovaldiamine; HNDA). The review is expected to provide the right perspective on the state-of-the-art knowledge in this field so that further developments are possible.


Author(s):  
Chia-Hu Chang ◽  
Ja-Ling Wu

With the aid of content-based multimedia analysis, virtual product placement opens up new opportunities for advertisers to monetize existing videos effectively and efficiently. A number of significant and challenging issues arise accordingly, such as how to insert the contextually relevant advertising message (what) less intrusively at the right place (where) and the right time (when) with an attractive representation (how) in the videos. In this chapter, domain knowledge that supports delivering and receiving the advertising message is introduced, including advertising theory, psychology, and computational aesthetics. We briefly review the state-of-the-art techniques for assisting virtual product placement in videos. In addition, we present a framework that serves virtual spotlighted advertising (ViSA) for virtual product placement and give an exploratory study of it. Observations about new trends and possible extensions in the design space of virtual product placement are also stated and discussed. We believe this will inspire researchers to develop more interesting and applicable multimedia advertising systems for virtual product placement.


Author(s):  
Elena B. Durán ◽  
Margarita Álvarez

Ubiquitous learning features intuitive ways of identifying appropriate learning collaborators and the right learning contents and services at the right place and at the right time. Consequently, many aspects must be considered when designing computing applications that support this kind of learning. In this chapter, ubiquitous learning is introduced and characterized, the challenges faced by those in charge of designing and developing such applications are reviewed, and the state of the art of this recently initiated line of research at the Informatics and Information System Research Institute of the National University of Santiago del Estero is presented. The developments achieved to date and the future guidelines are also shown.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Jiaxi Ye ◽  
Ruilin Li ◽  
Bin Zhang

Directed fuzzing is a practical technique that concentrates its testing energy on reaching the target code areas while spending little on other, unrelated components. It is a promising way to make better use of available resources, especially when testing large-scale programs. However, by observing the state-of-the-art directed fuzzing engine (AFLGo), we argue that there are two universal limitations: the balance problem between exploration and exploitation, and the blindness of mutation toward the target code areas. In this paper, we present a new prototype, RDFuzz, to address these two limitations. In RDFuzz, we first introduce a frequency-guided strategy in the exploration stage and improve its accuracy by adopting branch-level instead of path-level frequency. Then, we introduce an input-distance-based evaluation strategy in the exploitation stage and present an optimized mutation to distinguish and protect the distance-sensitive input content. Moreover, an intertwined testing schedule is used to perform exploration and exploitation in turn. We test RDFuzz on 7 benchmarks, and the experimental results demonstrate that RDFuzz is effective at driving the program toward the target code areas and is not easily stuck on the balance problem between exploration and exploitation.


2020 ◽  
Vol 10 (8) ◽  
pp. 2864
Author(s):  
Muhammad Asad ◽  
Ahmed Moustafa ◽  
Takayuki Ito

Artificial Intelligence (AI) has been applied to various real-world problems in recent years. However, the emergence of new AI technologies has brought several problems, especially with regard to communication efficiency, security threats, and privacy violations. To this end, Federated Learning (FL) has received widespread attention due to its ability to facilitate the collaborative training of local learning models without compromising the privacy of data. However, recent studies have shown that FL still consumes considerable amounts of communication resources, which are vital for updating the learning models. In addition, the privacy of data could still be compromised when the parameters of the local learning models are shared in order to update the global model. To address this, we propose a new approach, Federated Optimisation (FedOpt), to promote communication efficiency and privacy preservation in FL. To implement FedOpt, we design a novel compression algorithm, the Sparse Compression Algorithm (SCA), for efficient communication, and then integrate additively homomorphic encryption with differential privacy to prevent data from being leaked. Thus, the proposed FedOpt smoothly trades off communication efficiency and privacy preservation in order to adopt the learning task. We consider three evaluation criteria: model accuracy, communication efficiency, and computation overhead. We compare the proposed FedOpt with the baseline configurations and the state-of-the-art approaches, i.e., Federated Averaging (FedAvg) and Paillier-encryption-based privacy-preserving deep learning (PPDL), on all three criteria. The experimental results demonstrate that FedOpt outperforms the state-of-the-art FL approaches and is able to converge within fewer training epochs and a smaller privacy budget.
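
The abstract does not describe the internals of SCA, so the sketch below is not that algorithm. It illustrates a generic and widely used way to cut communication cost in federated learning, top-k sparsification, under the assumption that a client sends only the k largest-magnitude entries of its model update as (index, value) pairs. The function names `sparsify_top_k` and `densify` and the sample update are hypothetical.

```python
def sparsify_top_k(update, k):
    """Keep the k largest-magnitude entries of an update as (index, value) pairs.

    Generic top-k sparsification, shown for illustration only; it is not the
    SCA algorithm, whose details the abstract does not give.
    """
    if k <= 0:
        return []
    # Rank positions by absolute value, largest first.
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    # Return the survivors ordered by index so the payload is deterministic.
    return sorted((i, update[i]) for i in ranked[:k])


def densify(pairs, length):
    """Rebuild a dense update on the server, filling dropped entries with 0.0."""
    dense = [0.0] * length
    for i, value in pairs:
        dense[i] = value
    return dense


update = [0.02, -0.9, 0.1, 0.0, 0.75, -0.05]
sparse = sparsify_top_k(update, 2)
print(sparse)                        # [(1, -0.9), (4, 0.75)]
print(densify(sparse, len(update)))  # [0.0, -0.9, 0.0, 0.0, 0.75, 0.0]
```

In this toy case the client transmits two pairs instead of six values; any encryption or differential-privacy noise would be applied on top of such a compressed payload.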


Physics ◽  
2020 ◽  
Vol 2 (1) ◽  
pp. 49-66
Author(s):  
Vyacheslav I. Yukalov

The article presents the state of the art and reviews the literature on the long-standing problem of the possibility for a sample to be at the same time solid and superfluid. Theoretical models, numerical simulations, and experimental results are discussed.


2016 ◽  
Vol 26 (2) ◽  
pp. 99
Author(s):  
Juan Pablo Angelone

Between two demons and three violences: Alfonsín's administration and the senses of the state terrorism memory in contemporary Argentina

Upheld particularly during the presidency of Raúl Alfonsín (1983-1989), the "theory of the two demons" is considered the dominant, hegemonic memory of the last Argentine civic-military dictatorship (1976-1983). In turn, the report of the National Commission on the Disappearance of Persons (CONADEP), "Nunca Más" (Never Again), is usually considered an expression of that memory. According to our hypothesis, "Nunca Más" does not subscribe to the "theory of the two demons" but to a different, though not antithetical, memory. The aim of this paper is to characterize both sets of representations in order to point out the differences between them. Our corpus of analysis includes as primary sources Alfonsín's writings and statements on the subject, as well as the prologue of "Nunca Más", presented in 1984. Since the authorship of the prologue is attributed to Ernesto Sabato, president of CONADEP, some of his statements are also considered. These primary sources are historically contextualized through secondary sources, among which we include the state of the art on the "theory of the two demons". We conclude that although Alfonsín's position and the original prologue of "Nunca Más" coincide in rejecting violence as a means of political expression, Alfonsín places two actors on an equal footing, left-wing guerrilla activity and coup-mongering, while the prologue criticizes three modalities of violence: the guerrilla, right-wing parastatal terrorism (an actor not mentioned by Alfonsín), and dictatorial terrorism.

