fuzzy search
Recently Published Documents


Total documents: 100 (five years: 34)
H-index: 10 (five years: 3)

2022 ◽  
Vol 31 (1) ◽  
pp. 1-37
Author(s):  
Chao Liu ◽  
Xin Xia ◽  
David Lo ◽  
Zhiwei Liu ◽  
Ahmed E. Hassan ◽  
...  

To accelerate software development, developers frequently search for and reuse existing code snippets from large-scale codebases, e.g., GitHub. Over the years, researchers have proposed many information retrieval (IR)-based models for code search, but these fail to bridge the semantic gap between query and code. An early successful deep learning (DL)-based model, DeepCS, addressed this issue by learning the relationship between pairs of code methods and their corresponding natural language descriptions. Two major advantages of DeepCS are its ability to recognize irrelevant/noisy keywords and to capture sequential relationships between words in query and code. In this article, we propose an IR-based model, CodeMatcher, that inherits the advantages of DeepCS (i.e., the ability to understand the sequential semantics of important query words) while leveraging the indexing technique of IR-based models to substantially shorten search response time. CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with the important query words on a codebase indexed by Elasticsearch, and finally reranks the returned candidate code according to how well the tokens in each candidate snippet sequentially match the important words in the query. We verified its effectiveness on a large-scale codebase with ~41K repositories. Experimental results showed that CodeMatcher achieves an MRR (a widely used accuracy measure for code search) of 0.60, outperforming DeepCS, CodeHow, and UNIF by 82%, 62%, and 46%, respectively. Our proposed model is over 1.2K times faster than DeepCS. Moreover, CodeMatcher outperforms two existing online search engines (GitHub and Google search) by 46% and 33%, respectively, in terms of MRR. We also observed that fusing the advantages of IR-based and DL-based models is promising, and that improving the quality of method naming helps code search, since the method name plays an important role in connecting query and code.
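
The reranking step described in this abstract can be illustrated with a minimal Python sketch. It is not CodeMatcher's actual scoring formula, which the abstract does not give; the function names `split_identifier`, `ordered_match_score`, and `rerank` are hypothetical, and the score is simply the fraction of important query words found in a method name in the same order, using a greedy left-to-right scan.

```python
import re


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def ordered_match_score(query_words, tokens):
    """Fraction of query words found in `tokens` in query order.

    Greedy scan: each query word must appear after the previous match.
    This is an illustrative stand-in, not the scoring used in the paper.
    """
    if not query_words:
        return 0.0
    pos = 0
    matched = 0
    for word in query_words:
        try:
            idx = tokens.index(word, pos)
        except ValueError:
            continue  # word missing at or after the current position
        matched += 1
        pos = idx + 1
    return matched / len(query_words)


def rerank(query_words, candidates):
    """Sort candidate method names by descending ordered-match score."""
    return sorted(
        candidates,
        key=lambda name: ordered_match_score(query_words, split_identifier(name)),
        reverse=True,
    )


if __name__ == "__main__":
    query = ["read", "text", "file"]  # important words left after filtering
    names = ["fileReadText", "readTextFile", "writeFile"]
    print(rerank(query, names))
```

In this toy run the name whose tokens follow the query order ranks first, which mirrors the abstract's observation that method names play an important role in connecting query and code.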


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Xiaoyan Feng ◽  
Yanfang Zhou

To support language retrieval for English listening, this paper designs and implements a cross-language information retrieval system. Different implementation methods for cross-language information retrieval, query, and translation are analyzed. The system adopts cross-language information retrieval technology based on bilingual dictionaries. Building on existing cross-language retrieval systems that use bilingual and monolingual dictionaries, a dictionary lookup mechanism based on fuzzy search is designed, implemented, and analyzed. To address translation ambiguity in information retrieval systems based on bilingual dictionaries, a disambiguation algorithm based on co-occurrence is proposed. In continuous speech, speaking rate varies widely across speakers and contexts. Deviation from the normal speaking rate often leads to recognition errors, which degrades recognition performance. Because speaking rate lengthens or shortens speech units together, and the lengths of adjacent speech units are strongly correlated, an adaptive speaking-rate algorithm is proposed within the hidden Markov model framework, using information about the length of speech units. Experiments on number strings and large-vocabulary continuous speech recognition show that the algorithm is effective.
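
A fuzzy dictionary lookup of the kind this abstract mentions can be sketched in Python as follows. The abstract does not describe the paper's mechanism in detail, so this is only an assumption-laden illustration: the toy English-to-French dictionary `TOY_DICTIONARY` and the function `fuzzy_lookup` are hypothetical, and approximate matching uses the standard library's `difflib.get_close_matches`.

```python
import difflib

# Hypothetical toy bilingual dictionary (English -> French), for illustration only.
TOY_DICTIONARY = {
    "listening": ["écoute"],
    "speak": ["parler"],
    "dictionary": ["dictionnaire"],
    "retrieval": ["recherche", "récupération"],
}


def fuzzy_lookup(word, dictionary, cutoff=0.8):
    """Return (matched_headword, translations).

    Tries an exact match first, then falls back to the closest headword
    whose similarity ratio is at least `cutoff`. Returns (None, []) when
    nothing is close enough.
    """
    word = word.lower()
    if word in dictionary:
        return word, dictionary[word]
    close = difflib.get_close_matches(word, list(dictionary), n=1, cutoff=cutoff)
    if not close:
        return None, []
    return close[0], dictionary[close[0]]


if __name__ == "__main__":
    # A misspelled query still resolves to the intended headword.
    print(fuzzy_lookup("listenning", TOY_DICTIONARY))
    print(fuzzy_lookup("zzzz", TOY_DICTIONARY))
```

The `cutoff` parameter trades recall for precision: a lower value tolerates more misspelling but returns more spurious headwords, which would then need the kind of disambiguation the abstract describes.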


2021 ◽  
Author(s):  
Zhipeng Zeng

High Level Synthesis (HLS) has bridged the gap between the Electronic System Level (ESL) and its respective structural block at the Register Transfer Level (RTL). However, the most critical task during HLS is to assess and find a superior architecture from the design space that meets the design objectives. This thesis introduces a novel mechanism for efficient Design Space Exploration (DSE) based on a Priority Factor, using the fuzzy search technique to achieve the optimum result. This approach is more efficient than traditional DSE approaches and is capable of drastically reducing the number of architectural variants to be assessed for architecture selection. The proposed method, when applied to a number of benchmarks, yielded improved results with remarkable speedup compared to the existing approach. The HLS design flow shown in this thesis uses the proposed approach for DSE with optimization of three parameters: hardware area, execution time, and power consumption.


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Evgeny A. Bakin ◽  
Oksana V. Stanevich ◽  
Daria M. Danilenko ◽  
Dmitry A. Lioznov ◽  
Alexander N. Kulikov

Purpose: The COVID-19 pandemic showed an urgent need for decision support systems to help doctors at a time of stress and uncertainty. However, significant differences in hospital conditions, as well as skepticism of doctors about machine learning algorithms, limit their introduction into clinical practice. Our goal was to test and apply the principle of "patient-like-mine" decision support in rapidly changing conditions of a pandemic. Methods: In the developed system we implemented a fuzzy search that allows a doctor to compare their medical case with similar cases recorded in their medical center since the beginning of the pandemic. Various distance metrics were tried for obtaining clinically relevant search results. With the use of R programming language, we designed the first version of the system in approximately a week. A set of features for the comparison of the cases was selected with the use of random forest algorithm implemented in Caret. Shiny package was chosen for the design of GUI. Results: The deployed tool allowed doctors to quickly estimate the current conditions of their patients by means of studying the most similar previous cases stored in the local health information system. The extensive testing of the system during the first wave of COVID-19 showed that this approach helps not only to draw a conclusion about the optimal treatment tactics and to train medical staff in real-time but also to optimize patients' individual testing plans. Conclusions: This project points to the possibility of rapid prototyping and effective usage of "patient-like-mine" search systems at the time of a pandemic caused by a poorly known pathogen.
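
The "patient-like-mine" search described here amounts to ranking stored cases by distance to the current case. The authors built their system in R; the sketch below uses Python purely for illustration and is not their implementation. The case identifiers, the feature vectors, and the function `most_similar_cases` are hypothetical, and features are assumed to be already scaled to comparable ranges. The `metric` argument reflects the abstract's remark that various distance metrics were tried.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def most_similar_cases(query, cases, k=3, metric=euclidean):
    """Return the ids of the k stored cases closest to `query`.

    `cases` maps a case id to its feature vector. Features are assumed
    to be pre-scaled so that no single feature dominates the distance.
    """
    ranked = sorted(cases, key=lambda case_id: metric(query, cases[case_id]))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scaled features, e.g. (age, temperature, oxygen saturation).
    stored = {
        "case-A": [0.10, 0.20, 0.90],
        "case-B": [0.90, 0.80, 0.30],
        "case-C": [0.15, 0.25, 0.85],
    }
    current = [0.12, 0.22, 0.88]
    print(most_similar_cases(current, stored, k=2))
    print(most_similar_cases(current, stored, k=2, metric=manhattan))
```

Keeping the metric pluggable makes it cheap to compare which notion of similarity returns the most clinically relevant neighbours, which is the kind of experimentation the abstract reports.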


2021 ◽  
Vol 187 ◽  
pp. 365-370
Author(s):  
Mengmeng Li ◽  
Guijuan Wang ◽  
Suhui Liu ◽  
Jiguo Yu

2020 ◽  
Vol 13 (6) ◽  
pp. 1072-1085 ◽  
Author(s):  
Jing Chen ◽  
Kun He ◽  
Lan Deng ◽  
Quan Yuan ◽  
Ruiying Du ◽  
...  

2020 ◽  
Author(s):  
Evgeny Bakin ◽  
Oksana Stanevich ◽  
Dmitry Lioznov ◽  
Alexander Kulikov

Purpose. The COVID-19 pandemic has shown an urgent need for decision support systems to help doctors in a period of stress and uncertainty. However, significant differences in hospital conditions, as well as skepticism of doctors about machine learning algorithms, limit their introduction into clinical practice. Our goal was to test and apply a principle of "patient-like-mine" decision support in rapidly changing conditions of a pandemic. Methods. In the developed system we implemented a fuzzy search that allows a doctor to compare their medical case with similar cases recorded since the beginning of the pandemic in their medical center. Various distance metrics were tried for obtaining clinically relevant search results. With the use of R programming language, we designed the first version of the system in approximately a week. A set of features for the comparison of the cases was selected with the use of random forest algorithm implemented in Caret. Shiny package was chosen for the design of GUI. Results. The deployed tool allowed doctors to quickly estimate the current conditions of their patients by means of studying the most similar previous cases stored in the local health information system. Extensive testing of the system during the first wave of COVID-19 has shown that this approach helps not only to draw a conclusion about the optimal treatment tactics and to train medical staff in real-time but also to optimize patients' individual testing plans. Conclusions. This project points to the possibility of rapid prototyping and effective usage of "patient-like-mine" search systems at the time of a pandemic caused by a poorly known pathogen.

