confidence threshold
Recently Published Documents


TOTAL DOCUMENTS: 35 (FIVE YEARS: 23)

H-INDEX: 6 (FIVE YEARS: 2)

2021 ◽  
Vol 163 (1) ◽  
pp. 23
Author(s):  
Kaiming Cui ◽  
Junjie Liu ◽  
Fabo Feng ◽  
Jifeng Liu

Deep learning techniques have been well explored in the transiting exoplanet field; however, previous work mainly focuses on classification and inspection. In this work, we develop a novel detection algorithm based on a well-proven object detection framework in the computer vision field. Through training the network on the light curves of the confirmed Kepler exoplanets, our model yields about 90% precision and recall for identifying transits with signal-to-noise ratio higher than 6 (with the confidence threshold set to 0.6). Given a slightly lower confidence threshold, recall can exceed 95%. We also transfer the trained model to the TESS data and obtain similar performance. The results of our algorithm match the intuition of human visual perception, which makes it useful for finding single-transiting candidates. Moreover, the parameters of the output bounding boxes can also help to find multiplanet systems. Our network and detection functions are implemented in the Deep-Transit toolkit, which is an open-source Python package hosted on GitHub and PyPI.
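
The trade-off described in this abstract (lowering the confidence threshold raises recall, usually at some cost in precision) can be sketched as follows. This is a minimal illustration and not code from the Deep-Transit package; the function name `precision_recall` and the detection scores are made up for the example.

```python
def precision_recall(detections, n_true, threshold):
    """Precision and recall of detections kept at a confidence threshold.

    detections: list of (confidence, is_true_positive) pairs.
    n_true: number of real events that could have been detected.
    """
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    tp = sum(kept)  # True counts as 1, False as 0
    precision = tp / len(kept) if kept else 1.0
    recall = tp / n_true if n_true else 0.0
    return precision, recall


# Made-up scores for illustration only, not data from the paper.
detections = [(0.95, True), (0.90, True), (0.72, True), (0.65, False),
              (0.55, True), (0.52, False), (0.30, False)]
n_true = 4

for t in (0.6, 0.5):
    p, r = precision_recall(detections, n_true, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

With these toy numbers, moving the threshold from 0.6 to 0.5 recovers the one missed event but also admits an extra false positive.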


2021 ◽  
Vol 51 (5) ◽  
pp. 58-67
Author(s):  
G. M. Goncharenko ◽  
N. B. Grishina ◽  
M. A. Shishkina ◽  
T. S. Khoroshilova ◽  
O. L. Khalina ◽  
...  

The results of studies on the productivity and genotypic structure of cows of the leading lines of the Sibiryachka cattle breed, and on the associations of CSN3, BLG, LALBA and LEP genotypes with economically important traits, are presented. Comparative evaluation showed that cows of the Reflection Sovering bull line had the highest milk yield of 6851 kg, fat content of 4.05% and protein content of 3.15%. The emerging Siberian bull lines Frank 937, Uragan 27 and Kursa 1949 are inferior to them in milk yield, fat content and protein content, with values of 5246-5504 kg, 3.92-3.94% and 3.10-3.12%, respectively. The genotypic structure of the herd and the leading lines is identified. The Vis Back Aydiala bull line is characterized by a higher frequency of the CSN3AA and LEPTT genotypes, by 18.2 and 11.0% respectively, in comparison with the Reflection Sovering line. For other genotypes, the differences do not reach the confidence threshold. The average level of homozygosity for the genes studied varies from 51.2% to 73.4%. The highest homozygosity was found for the CSN3 gene in the Vis Back Aydiala line at 79.6%. The number of effectively acting alleles is 1.66-1.72; the degree of genetic variability is 40.2-42.7%. Cows with the CSN3AB genotype had a 544.0 kg higher milk yield than animals homozygous for the A allele (p < 0.05). The highest milk yield was observed in BLGAA animals, 6790.1 kg, which is 947.2 kg higher than in cows with the alternative BLGBB genotype (p < 0.01). Animals with the LEPCC genotype outperformed LEPTT cows in milk yield by 718.7 kg. No priority genotypes were identified for the LALBA gene. Also, no connection was established between genotypes and the quality indicators of milk.


2021 ◽  
Vol 8 ◽  
Author(s):  
S.Y. Wang ◽  
T. Guo

The identification of fatigue crack initiation sites (FCISs) is routinely performed in the field of engineering failure analyses; this process is not only time-consuming but also knowledge-intensive. The emergence of convolutional neural networks (CNNs) has inspired numerous innovative solutions for image analysis problems in interdisciplinary fields. As an explorative study, we trained models based on the principle of transfer learning using three state-of-the-art CNNs, namely VGG-16, ResNet-101, and feature pyramid network (FPN), as feature extractors, and Faster R-CNN as the backbone to establish models for FCIS detection. The models showed application-level detection performance, with the highest precision reaching 95.9% at a confidence threshold of 0.6. Among the three models, the ResNet model exhibited the highest accuracy and lowest training cost. The performance of the FPN model closely followed that of the ResNet model, with an advantage in terms of recall.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Abhishek Dixit ◽  
Akhilesh Tiwari ◽  
R. K. Gupta

The present paper proposes a new model for the exploration of hesitated patterns from multiple levels of conceptual hierarchy in a transactional dataset. The usual practice of mining patterns has focused on identifying frequent patterns (i.e., those which occur together) in the transactional dataset but overlooks vital information about patterns which are almost frequent (but not exactly frequent), called “hesitated patterns.” The proposed model uses a reduced minimum support threshold (containing two values: attractiveness and hesitation) and a constant minimum confidence threshold with the top-down progressive deepening approach for generating patterns, utilizing the apriori property. To validate the model, an online scenario of purchasing books through e-commerce platforms such as Amazon is considered, showing how various factors contribute to hesitation at the time of purchasing a book. The present work suggests a novel way of deriving hesitated patterns from multiple levels in the conceptual hierarchy with respect to the target dataset. Moreover, it is observed that the concepts and theories in the existing related work of Lu and Ng (2007) focus only on the introductory aspect of vague set theory-based hesitation association rule mining, which is not useful for handling patterns from multiple levels of granularity. The proposed model, in contrast, is complete in nature and addresses the significant and untouched problem of mining “multilevel hesitated patterns”; it is useful for exploring hesitated patterns from multiple levels of granularity based on the considered hesitation status in a transactional dataset.
These hesitated patterns can be further utilized by decision makers and business analysts to build a strategy for increasing the attraction level of such hesitated items (appearing in a particular transaction or set of transactions in a given dataset) so as to convert their state from hesitated to preferred items.


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4228
Author(s):  
Yan Xu ◽  
Mingyu Wang ◽  
Wen Fan

The fault data of the secondary system of smart substations hide some information that the association analysis algorithm can mine. The convergence speed of the Apriori algorithm and FP-growth algorithm is slow, and there is a lack of indicators to evaluate the correlation of association rules and the method to determine the parameter threshold. In this paper, the H-mine algorithm is used to realize the fast mining of fault data. The algorithm can traverse data faster by using the data structure of the H-struct. This paper also sets the lift and CF value to screen the association rules with good correlation. When setting the three key parameters of association analysis, namely, support threshold, confidence threshold, and lift threshold, an objective function composed of weighted average lift, CF value, and data coverage rate was selected, and the adaptive fireworks algorithm was used to optimize the parameters in the association analysis. In particular, the rule screening strategy is introduced in fault cause analysis in this paper. By eliminating rules with high similarity, derived signals in association rules are eliminated to the greatest extent to improve the readability of rules and ensure easy understanding of results.
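
The three thresholds named in this abstract (support, confidence and lift) apply to standard association-rule measures, which can be sketched for a single rule as below. This is only an illustration of those textbook definitions: it does not implement the H-mine algorithm, the CF value or the fireworks optimization, and the signal names and threshold values are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # records containing A
    n_c = sum(1 for t in transactions if c <= t)         # records containing C
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # records containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical fault records; each set holds the signals seen in one event.
transactions = [
    {"link_down", "goose_timeout"},
    {"link_down", "goose_timeout", "sv_loss"},
    {"link_down"},
    {"sv_loss"},
    {"goose_timeout"},
]

support, confidence, lift = rule_metrics(
    transactions, {"link_down"}, {"goose_timeout"})

# Illustrative thresholds only; the paper tunes these with an optimizer.
keep_rule = support >= 0.3 and confidence >= 0.6 and lift >= 1.0
print(support, confidence, lift, keep_rule)
```

A lift above 1 indicates that the consequent appears more often together with the antecedent than it does overall, which is why lift is used here to screen rules that pass the support and confidence thresholds.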


2021 ◽  
Vol 14 (11) ◽  
pp. 2642-2654
Author(s):  
Abdulrahman Alsaudi ◽  
Yasser Altowim ◽  
Sharad Mehrotra ◽  
Yaming Yu

Social media analysis over blogs (such as tweets) often requires determining the top-k mentions of a certain category (e.g., movies) in a collection (e.g., tweets collected over a given day). Such queries require executing an entity linking (EL) function, which is often expensive. We propose TQEL, a framework that minimizes the joint cost of EL calls and top-k query processing. The paper presents two variants, TQEL-exact and TQEL-approximate, which retrieve the exact and approximate top-k results, respectively. TQEL-approximate, using a weaker stopping condition, achieves significantly improved performance (at a fraction of the cost of TQEL-exact) while providing strong probabilistic guarantees (over 2 orders of magnitude fewer EL calls with a 95% confidence threshold compared to TQEL-exact). TQEL-exact itself is orders of magnitude better than a naive approach that calls EL functions on the entire dataset.


Author(s):  
Alessandro Rovetta

Background: Alongside the COVID-19 pandemic, government authorities around the world have had to face a growing infodemic capable of causing serious damage to public health and the economy. In this context, the use of infoveillance tools has become a primary necessity. Objective: The aim of this study is to test the reliability of a widely used infoveillance tool, Google Trends. In particular, the paper focuses on the analysis of relative search volumes (RSVs), quantifying their dependence on the day they are collected. Methods: RSVs of the query coronavirus + covid during February 1 to December 4, 2020 (period 1), and February 20 to May 18, 2020 (period 2), were collected daily from Google Trends from December 8 to 27, 2020. The survey covered Italian regions and cities, and countries and cities worldwide. The search category was set to all categories. Each dataset was analyzed to observe any dependence of RSVs on the day they were gathered. To do this, calling i the country, region, or city under investigation and j the day its RSV was collected, a Gaussian distribution $X_i = X(\sigma_i, \bar{x}_i)$ was used to represent the trend of daily variations of $x_{ij} = \mathrm{RSV}_{ij}$. When a missing value was revealed (anomaly), the affected country, region or city was excluded from the analysis. When the anomalies exceeded 20% of the sample size, the whole sample was excluded from the statistical analysis. Pearson and Spearman correlations between RSVs and the number of COVID-19 cases were calculated day by day to highlight any variations related to the day RSVs were collected. Welch’s t-test was used to assess the statistical significance of the differences between the average RSVs of the various countries, regions, or cities of a given dataset. Two RSVs were considered statistically confident when t < 1.5. A dataset was deemed unreliable if the confident data exceeded 20% (confidence threshold). The percentage increase Δ was used to quantify the difference between two values. Results: Google Trends was subject to an acceptable quantity of anomalies only for the RSVs of Italian regions (0% in both periods 1 and 2) and countries worldwide (9.7% during period 1 and 10.9% during period 2). However, the correlations between RSVs and COVID-19 cases underwent significant variations even in these two datasets (max |Δ| = +625% for Italian regions, and max |Δ| = +175% for countries worldwide). Furthermore, only RSVs of countries worldwide did not exceed the confidence threshold. Finally, the large number of anomalies registered in Italian and international cities’ RSVs made these datasets unusable for any kind of statistical inference. Conclusion: In the considered timespans, Google Trends proved to be reliable only for surveys concerning RSVs of countries worldwide. Since RSV values showed a high dependence on the day they were gathered, it is essential for future research that authors collect query data for several consecutive days and work with RSV averages instead of daily RSVs, trying to minimize the standard errors until an established confidence threshold is respected. Further research is needed to evaluate the effectiveness of this method.
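
The Welch’s t statistic used in this abstract to compare average RSVs can be sketched with the standard library alone. The sample values are invented, and applying the abstract's 1.5 cutoff to the absolute value of t is an assumption made for the example rather than something the abstract states.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Uses sample variances (n - 1 denominator); each sample needs at least
    two values, and both samples must not have zero variance at once.
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Invented daily RSVs of two places, collected on five different days.
rsv_place_1 = [50, 52, 48, 51, 49]
rsv_place_2 = [47, 49, 50, 46, 48]

t = welch_t(rsv_place_1, rsv_place_2)
# Assumption: compare |t| with the 1.5 cutoff quoted in the abstract.
below_cutoff = abs(t) < 1.5
print(t, below_cutoff)
```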


2021 ◽  
Author(s):  
Tavish Eenjes ◽  
Yiheng Hu ◽  
Laszlo Irinyi ◽  
Minh Thuy Hoang ◽  
Lean Smith ◽  
...  

The increased usage of long-read sequencing for metabarcoding has not been matched with public databases suited for error-prone long-reads. We address this gap and present a proof-of-concept study for classifying fungal species using linked machine learning classifiers. We demonstrate its capability for accurate classification using a labelled fungal sequencing dataset of 44 species. We show the advantage of our approach for closely related species over current alignment and k-mer methods and suggest a confidence threshold of 0.85 to maximise accurate target species identification from complex samples of unknown composition. We suggest future use of this approach in medicine, agriculture, and biosecurity.
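
Applying a confidence threshold such as the suggested 0.85 to a classifier's output amounts to a reject option: a read is assigned to a species only when the top predicted probability clears the threshold. The sketch below assumes the classifier returns a mapping from label to probability; the function name and labels are placeholders, not the authors' implementation.

```python
def classify_with_threshold(probabilities, threshold=0.85):
    """Return the top label, or None when its probability is below threshold."""
    if not probabilities:
        return None
    label = max(probabilities, key=probabilities.get)
    return label if probabilities[label] >= threshold else None


# Placeholder labels and probabilities for illustration only.
confident_read = {"species_a": 0.91, "species_b": 0.06, "species_c": 0.03}
ambiguous_read = {"species_a": 0.55, "species_b": 0.40, "species_c": 0.05}

print(classify_with_threshold(confident_read))  # assigned to species_a
print(classify_with_threshold(ambiguous_read))  # left unassigned (None)
```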


2021 ◽  
Author(s):  
Juan Chamie-Quintero ◽  
Jennifer A. Hibberd ◽  
David Scheim

Introduction. On May 8, 2020, Peru’s Ministry of Health approved ivermectin (IVM), a drug of Nobel Prize-honored distinction, for inpatient and outpatient treatment of COVID-19. As IVM treatments proceeded in that nation of 33 million residents, excess deaths decreased 14-fold over four months through December 1, 2020, consistent with clinical benefits of IVM for COVID-19 found in several RCTs. But after IVM use was sharply restricted under a new president, excess deaths then increased 13-fold. Methods. To evaluate possible IVM treatment effects suggested by these aggregate trends, excess deaths were analyzed by state for ages ≥ 60 in Peru’s 25 states. To identify potential confounding factors, Google mobility data, population densities, SARS-CoV-2 genetic variations and seropositivity rates were also examined. Results. The 25 states of Peru were grouped by extent of IVM distributions: maximal (mass IVM distributions through operation MOT, a broadside effort led by the army); medium (locally managed IVM distributions); and minimal (restrictive policies in one state, Lima). The mean reduction in excess deaths 30 days after peak deaths was 74% for the maximal IVM distribution group, 53% for the medium group and 25% for Lima. Reduction of excess deaths is correlated with extent of IVM distribution by state with a p value of 0.002 using the Kendall τb test, well below the confidence threshold of 0.05 for an established clinical effect. Conclusion. Mass treatments with IVM, a drug safely used in 3.7 billion doses worldwide since 1987, most likely caused the 14-fold reductions in excess deaths in Peru, prior to their 13-fold increase under reversed IVM policy. This strongly suggests that IVM treatments can likewise effectively complement immunizations to help eradicate COVID-19. The indicated biological mechanism of IVM, competitive binding with SARS-CoV-2 spike protein, is likely non-epitope specific, possibly yielding full efficacy against emerging viral mutant strains.

