A Heuristic Ranking of Different Characteristic Mining Based Mathematical Formulae Retrieval Models

The main difficulty at present is how to classify the math-related keywords in a given text file and group them into a single math file. In this article, a heuristic ranking model is developed and evaluated on different mathematical formulae retrieval algorithms based on characteristic mining. The proposed model is built from the performance metrics of existing retrieval algorithms such as NMF clustering, Levenshtein distance, SequenceMatcher, FuzzyWuzzy and TensorFlow. The metrics used to develop the ranking model are sensitivity, specificity, efficiency, accuracy and retrieval time.
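As a rough illustration of two of the string-similarity measures named above, the sketch below ranks candidate formula strings against a query by Levenshtein distance, breaking ties with the `SequenceMatcher` ratio from Python's standard `difflib` module. The function names `levenshtein` and `rank_formulae`, the tie-breaking rule and the sample strings are assumptions made for illustration; the abstract does not describe how the authors combine these measures.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion and substitution each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_formulae(query: str, candidates: list[str]) -> list[str]:
    """Hypothetical ranking: smallest edit distance first, then highest ratio."""
    def key(candidate: str):
        ratio = SequenceMatcher(None, query, candidate).ratio()
        return (levenshtein(query, candidate), -ratio)
    return sorted(candidates, key=key)


if __name__ == "__main__":
    print(rank_formulae("a^2+b^2", ["a^2+b^2=c^2", "x+y", "a^2+b^2"]))
```

Edit distance works on raw characters, so two formulae that are mathematically equivalent but written differently can still rank far apart.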

Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 109 ◽  
Author(s):  
Ukweh ◽  
Ugbem ◽  
Okeke ◽  
Ekpo

Background: Ultrasound is operator-dependent, and its value and efficacy in fetal morphology assessment in a low-resource setting are poorly understood. We assessed the value and efficacy of fetal morphology ultrasound assessment in a Nigerian setting. Materials and Methods: We surveyed fetal morphology ultrasound performed across five facilities and followed up each fetus to ascertain the outcome. Fetuses were surveyed in the second trimester (18th–22nd weeks) using the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG) guideline. Clinical and surgical reports were used as references to assess the diagnostic efficacy of ultrasound in livebirths, and autopsy reports to confirm anomalies in terminated pregnancies, spontaneous abortions, intrauterine fetal deaths, and stillbirths. We calculated sensitivity, specificity, positive and negative predictive values, area under the curve (AUC), Youden index, likelihood ratios, and post-test probabilities. Results: In total, 6520 fetuses of women aged 15–46 years (mean = 31.7 years) were surveyed. The overall sensitivity, specificity, and AUC were 77.1 (95% CI: 68–84.6), 99.5 (95% CI: 99.3–99.7), and 88.3 (95% CI: 83.7–92.2), respectively. Other performance metrics were: positive predictive value, 72.4 (95% CI: 64.7–79.0), negative predictive value, 99.6 (95% CI: 99.5–99.7), and Youden index (77.1%). Abnormality prevalence was 1.67% (95% CI: 1.37–2.01), and the positive and negative likelihood ratios were 254 (95% CI: 107.7–221.4) and 0.23 (95% CI: 0.16–0.33), respectively. The post-test probability for a positive test was 72% (95% CI: 65–79). Conclusion: Fetal morphology assessment is valuable in a poor economics setting; however, the variation in diagnostic efficacy across facilities and the limitations associated with the detection of circulatory system anomalies need to be addressed.
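The metrics listed in this abstract all follow from a 2×2 table of test result against reference outcome. The sketch below uses the standard textbook definitions and assumes every denominator is non-zero. The function name `diagnostic_metrics` and the counts in the usage example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard 2x2 diagnostic accuracy metrics (assumes non-zero denominators)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    prevalence = (tp + fn) / (tp + fp + fn + tn)
    lr_pos = sensitivity / false_positive_rate
    lr_neg = (1 - sensitivity) / specificity
    # Post-test probability of disease after a positive test, via odds.
    pre_test_odds = prevalence / (1 - prevalence)
    post_test_odds = pre_test_odds * lr_pos
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden_index": sensitivity + specificity - 1,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "prevalence": prevalence,
        "post_test_prob_pos": post_test_odds / (1 + post_test_odds),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=80, fp=10, fn=20, tn=890).items():
        print(f"{name}: {value:.4f}")
```

When prevalence is estimated from the same table, the post-test probability for a positive test equals the positive predictive value.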


2020 ◽  
Author(s):  
Jason G. Kralj ◽  
Stephanie L. Servetas ◽  
Samuel P. Forry ◽  
Scott A. Jackson

Abstract Evaluating the performance of metagenomics analyses has proven a challenge, due in part to limited ground-truth standards, broad application space, and numerous evaluation methods and metrics. Application of traditional clinical performance metrics (e.g., sensitivity and specificity) to taxonomic classifiers does not fit the “one-bug-one-test” paradigm. Ultimately, users need methods that evaluate fitness-for-purpose and identify their analyses’ strengths and weaknesses. Within a defined cohort, reporting performance metrics by taxon, rather than by sample, will clarify this evaluation. An estimated limit of detection, positive and negative control samples, and true positive and true negative results are necessary criteria for all investigated taxa. Use of summary metrics should be restricted to comparing results of similar cohorts and data, and should employ harmonic means and continuous products for each performance metric rather than the arithmetic mean. Such consideration will ensure meaningful comparisons and evaluation of fitness-for-purpose.
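A small numerical sketch may show why the choice of summary statistic matters. It summarizes a per-taxon metric with the arithmetic mean, the harmonic mean and the product of the values. Reading "continuous products" as a plain product over taxa is an assumption, and the taxon names and sensitivities are invented for illustration.

```python
from math import prod
from statistics import harmonic_mean, mean


def summarize(per_taxon: dict[str, float]) -> dict[str, float]:
    """Summarize one performance metric reported separately for each taxon."""
    values = list(per_taxon.values())
    return {
        "arithmetic_mean": mean(values),
        "harmonic_mean": harmonic_mean(values),
        "product": prod(values),
    }


if __name__ == "__main__":
    # Hypothetical per-taxon sensitivities; one taxon is detected poorly.
    sensitivities = {"taxon_A": 0.95, "taxon_B": 0.90, "taxon_C": 0.10}
    for name, value in summarize(sensitivities).items():
        print(f"{name}: {value:.4f}")
```

With these invented numbers the arithmetic mean is 0.65, while the harmonic mean (about 0.25) and the product (about 0.09) are pulled down by the one poorly detected taxon.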


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Shujun Ou ◽  
Weija Su ◽  
Yi Liao ◽  
Kapeel Chougule ◽  
Jireh R. A. Agda ◽  
...  

Background: Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. Results: We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. Conclusions: The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.


2017 ◽  
Vol 7 (1.5) ◽  
pp. 19
Author(s):  
Y. Madhu Sudhana Reddy ◽  
R. S. Ernest Ravindran ◽  
K. Hari Kishore

In this paper, recent advances in digital image processing for diabetic retinopathy (DR) are discussed. The major approaches in DR are categorized into four classes, namely preprocessing, optic disk detection, blood vessel extraction and supervised classification. Because the optic disk, blood vessels and exudates give more analytical detail about the retinal image, segmentation of these features is very important. These approaches are further classified into finer classes based on the methodologies used in the respective schemes. The databases used for testing the proposed algorithms are also described. The performance metrics accuracy, sensitivity, specificity, precision, recall and F-measure are discussed through their mathematical expressions.
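The abstract does not reproduce the expressions themselves. For reference, the usual definitions in terms of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) are given below. These are the standard textbook forms and are assumed, not confirmed, to match the paper's notation.

```latex
\begin{align*}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Sensitivity} = \text{Recall} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
F\text{-measure} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```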


Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 39
Author(s):  
Zhenyang Li ◽  
Guangluan Xu ◽  
Xiao Liang ◽  
Feng Li ◽  
Lei Wang ◽  
...  

In recent years, entity-based ranking models have led to exciting breakthroughs in the research of information retrieval. Compared with traditional retrieval models, entity-based representation enables a better understanding of queries and documents. However, the existing entity-based models neglect the importance of entities in a document. This paper attempts to explore the effects of the importance of entities in a document. Specifically, a dataset analysis is conducted, which verifies the correlation between the importance of entities in a document and document ranking. Then, this paper enhances two entity-based models, a toy model and the Explicit Semantic Ranking model (ESR), by considering the importance of entities. In contrast to the existing models, the enhanced models assign the weights of entities according to their importance. Experimental results show that the enhanced toy model and ESR can outperform the two baselines by as much as 4.57% and 2.74% on NDCG@20 respectively, and further experiments reveal that the strength of the enhanced models is more evident on long queries and the queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.
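The results above are reported as NDCG@20. A minimal sketch of NDCG@k follows, assuming graded relevance labels listed in ranked order, a linear gain and a log2 rank discount. Another common variant uses a gain of 2^rel − 1, and the abstract does not say which one the authors used. The function names are illustrative.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances: list[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels of the returned documents, in ranked order (made up).
    print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=20))
```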


Diagnostics ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 11
Author(s):  
Prasanalakshmi Balaji ◽  
Kumarappan Chidambaram

One of the most dangerous diseases that threaten people is cancer. If diagnosed in earlier stages, cancer, with its life-threatening consequences, has the possibility of eradication. In addition, accuracy in prediction plays a significant role. Hence, developing a reliable model that contributes much towards the medical community in the early diagnosis of biopsy images with perfect accuracy comes to the forefront. This article aims to develop better predictive models using multivariate data and high-resolution diagnostic tools in clinical cancer research. This paper proposes the social spider optimisation (SSO) algorithm-tuned neural network to classify microscopic biopsy images of cancer. The significance of the proposed model relies on the effective tuning of the weights of the neural network classifier by the SSO algorithm. The performance of the proposed strategy is analysed with performance metrics such as accuracy, sensitivity, specificity, and MCC measures, and the attained results are 95.9181%, 94.2515%, 97.125%, and 97.68%, respectively, which shows the effectiveness of the proposed method for cancer disease diagnosis.
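Of the metrics listed, the Matthews correlation coefficient (MCC) is the least commonly spelled out. The sketch below uses the standard binary-classification definition. Returning 0.0 when the denominator vanishes is a common convention and is adopted here as an assumption. The counts in the usage line are hypothetical.

```python
import math


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when any row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(mcc(tp=90, fp=5, fn=10, tn=95))
```

MCC ranges from −1 to 1, so a value quoted as a percentage is presumably this coefficient multiplied by 100.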


2021 ◽  
Author(s):  
Shana J ◽  
Venkatachalam T

Abstract Data mining enables classification of electrocardiographic (ECG) signals of the heart for diagnosing many cardiac diseases. ECG signals often contain unwanted noise, speckles and redundant features. Unwanted noise and redundant features degrade the quality of the ECG signal and may reduce classification accuracy. To overcome these challenges, this paper introduces the Optimize Discrete Kernel Vector (ODKV) classifier together with a pre-processing stage. To remove noise, an image processing filter, namely the Adaptive Notch Filter (ANF), is first used to remove power line interference from the ECG signals. Moreover, reducing the redundant features of the ECG signal plays a vital role in diagnosing cardiac disease. The ODKV classifier is therefore used to reduce the redundant features and to enhance the classification accuracy of the input ECG signal. The ODKV classifier thus identifies the Q wave, R wave and S wave in the input ECG signal. Finally, the performance metrics sensitivity, specificity, accuracy and mean square error (MSE) are calculated and compared with existing methods such as SVM-kNN, ANN-kNN, GB-SVNN, and CNN to demonstrate the improvement in classification.
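The abstract gives no detail on the adaptive notch filter, so the sketch below is only one textbook way to build one: a two-weight LMS noise canceller driven by a sine and cosine reference at the interference frequency. The function name `lms_notch`, the 50 Hz mains frequency, the step size `mu` and the synthetic test signal are all assumptions and are not taken from the paper.

```python
import math


def lms_notch(signal: list[float], fs: float, f0: float, mu: float = 0.01) -> list[float]:
    """Remove a sinusoid at f0 Hz using a two-weight LMS adaptive canceller."""
    w_cos, w_sin = 0.0, 0.0
    cleaned = []
    for n, sample in enumerate(signal):
        phase = 2 * math.pi * f0 * n / fs
        ref_cos, ref_sin = math.cos(phase), math.sin(phase)
        interference_estimate = w_cos * ref_cos + w_sin * ref_sin
        error = sample - interference_estimate  # error is the cleaned output
        # LMS update: move the weights toward cancelling the f0 component.
        w_cos += 2 * mu * error * ref_cos
        w_sin += 2 * mu * error * ref_sin
        cleaned.append(error)
    return cleaned


if __name__ == "__main__":
    fs = 500.0  # assumed sampling rate in Hz
    clean = [math.sin(2 * math.pi * 1.0 * n / fs) for n in range(2000)]
    noisy = [c + 0.5 * math.sin(2 * math.pi * 50.0 * n / fs + 0.3)
             for n, c in enumerate(clean)]
    out = lms_notch(noisy, fs=fs, f0=50.0)
    tail_mse = sum((o - c) ** 2 for o, c in zip(out[1000:], clean[1000:])) / 1000
    print(f"residual MSE after convergence: {tail_mse:.6f}")
```

A smaller `mu` gives a narrower notch but slower convergence, so the first samples of the output still contain some interference.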


Author(s):  
David Edmundson ◽  
Gerald Schaefer

Since there are few open image retrieval toolkits available, researchers in the field are often forced to re-implement existing algorithms in order to perform a comparative evaluation. None of the existing toolkits support retrieval of JPEG images directly in the compressed domain. The authors’ aim is therefore to facilitate the use of compressed domain image retrieval techniques as well as ease retrieval evaluation by fellow researchers. For this purpose, the authors present JIRL, an open source C++ software suite that allows content-based image retrieval in the JPEG compressed domain and provides tools for benchmarking retrieval accuracy and retrieval time. In total, twelve state-of-the-art JPEG retrieval algorithms are implemented, while for each method techniques for compressed domain feature extraction as well as feature comparison are provided in an object-oriented framework. An example image retrieval application is also provided to demonstrate how the library can be used. JIRL is made available to fellow researchers under the LGPL v.2.1 license.


Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1269 ◽  
Author(s):  
Timothy Gottwald ◽  
Gavin Poole ◽  
Earl Taylor ◽  
Weiqi Luo ◽  
Drew Posny ◽  
...  

For millennia humans have benefitted from application of the acute canine sense of smell to hunt, track and find targets of importance. In this report, canines were evaluated for their ability to detect the severe exotic phytobacterial arboreal pathogen Xanthomonas citri pv. citri (Xcc), which is the causal agent of Asiatic citrus canker (Acc). Since Xcc causes only local lesions, infections are non-systemic, limiting the use of serological and molecular diagnostic tools for field-level detection. This necessitates reliance on human visual surveys for Acc symptoms, which is highly inefficient at low disease incidence, and thus for early detection. In simulated orchards the overall combined performance metrics for a pair of canines were 0.9856, 0.9974, 0.9257 and 0.9970, for sensitivity, specificity, precision, and accuracy, respectively, with 1–2 s/tree detection time. Detection of trace Xcc infections on commercial packinghouse fruit resulted in 0.7313, 0.9947, 0.8750, and 0.9821 for the same performance metrics across a range of cartons with 0–10% Xcc-infected fruit despite the noisy, hot and potentially distracting environment. In orchards, the sensitivity of canines increased with lesion incidence, whereas the specificity and overall accuracy were >0.99 across all incidence levels; i.e., false positive rates were uniformly low. Canines also alerted to a range of 1–12-week-old infections with equal accuracy. When trained to either Xcc-infected trees or Xcc axenic cultures, canines inherently detected the homologous and heterologous targets, suggesting they can detect Xcc directly rather than only volatiles produced by the host following infection. Canines were able to detect the Xcc scent signature at very low concentrations (10,000× less than 1 bacterial cell per sample), which implies that the scent signature is composed of bacterial cell volatile organic compound constituents or exudates that occur at concentrations many fold that of the bacterial cells. The results imply that canines can be trained as viable early detectors of Xcc and deployed across citrus orchards, packinghouses, and nurseries.


2017 ◽  
Vol 146 (2) ◽  
pp. 168-176 ◽  
Author(s):  
C. SOUTY ◽  
R. JREICH ◽  
Y. LE STRAT ◽  
C. PELAT ◽  
P. Y. BOËLLE ◽  
...  

SUMMARY Influenza epidemics are monitored using influenza-like illness (ILI) data reported by health-care professionals. Timely detection of the onset of epidemics is often performed by applying a statistical method to weekly ILI incidence estimates, with a large range of methods used worldwide. However, performance evaluation and comparison of these algorithms is hindered by: (1) the absence of a gold standard regarding influenza epidemic periods and (2) the absence of consensual evaluation criteria. As of now, performance evaluation metrics are based only on sensitivity, specificity and timeliness of detection, since definitions are not clear for time-repeated measurements such as weekly epidemic detection. We aimed to evaluate several epidemic detection methods by comparing their alerts to a gold standard determined by international expert consensus. We introduced new performance metrics that meet an important objective of influenza surveillance in temperate countries: to detect accurately the start of the single epidemic period each year. Evaluations are presented using ILI incidence in France between 1995 and 2011. We found that the two performance metrics defined allowed discrimination between epidemic detection methods. In the context of evaluating detection performance, metrics other than the commonly used standard ones could better meet the needs of real-time influenza surveillance.

