Harnessing population-specific protein truncating variants to improve the annotation of loss-of-function alleles

2020
Author(s):
Rostislav K. Skitchenko
Julia S. Kornienko
Evgeniia M. Maksiutenko
Andrey S. Glotov
Alexander V. Predeus
...

Abstract: Accurate annotation of putative loss-of-function (pLoF) variants is an important problem in human genomics and disease that has recently drawn substantial attention. Since such variants in disease-related genes are under strong negative selection, their frequency across major ancestral groups is expected to be highly similar. In this study, we tested this assumption by systematically assessing the presence of highly population-specific protein-truncating variants (PTVs) in human genes using the latest population-scale data. We discovered an unexpectedly high incidence of population-specific PTVs in all major ancestral groups. This does not conform to a recently proposed model, indicating either systemic differences in disease penetrance in different human populations, or a failure of current annotation criteria to accurately predict the loss-of-function potential of PTVs. We show that low-confidence pLoF variants are enriched in genes with non-uniform PTV count distribution, and we developed a computational tool called LoFfeR that can efficiently predict tolerated pLoF variants. To evaluate the performance of LoFfeR, we use a set of known pathogenic and benign PTVs from the ClinVar database, and show that LoFfeR allows for a more accurate annotation of low-confidence pLoF variants compared to existing methods. Notably, only 4.4% of protein-truncating gnomAD SNPs in canonical transcripts can be filtered out using a recommended threshold value of the recently proposed pext score, while up to 10.9% of such variants are filtered using LoFfeR with the same false positive rate. Hence, we believe that LoFfeR can be used for additional filtering of low-confidence pLoF variants in population genomics and medical genetics studies.

2014
Author(s):
Sune Pletscher-Frankild
Albert Pallejà
Kalliopi Tsafou
Janos X Binder
Lars Juhl Jensen

Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease–gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease–gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a user-friendly web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download.
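The idea of scoring co-occurrences both within and between sentences can be illustrated with a minimal sketch. It is not the DISEASES scoring scheme itself: the weights `W_DOCUMENT` and `W_SENTENCE`, the naive sentence splitter, and the simple dictionary lookup are all assumptions made for illustration, and the example terms are hypothetical inputs.

```python
import re
from collections import defaultdict
from itertools import product

# Hypothetical weights, not taken from the paper.
W_DOCUMENT = 1.0  # gene and disease both occur somewhere in the same abstract
W_SENTENCE = 2.0  # gene and disease occur in the same sentence


def split_sentences(text):
    """Naive sentence splitter, adequate only for this illustration."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_terms(text, dictionary):
    """Return the dictionary terms found in the text as whole words (case-insensitive)."""
    lowered = text.lower()
    return {
        term
        for term in dictionary
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    }


def cooccurrence_scores(abstracts, genes, diseases):
    """Sum weighted co-occurrence counts for every (gene, disease) pair."""
    scores = defaultdict(float)
    for abstract in abstracts:
        # Abstract-level co-occurrence: counted once per abstract.
        for pair in product(find_terms(abstract, genes), find_terms(abstract, diseases)):
            scores[pair] += W_DOCUMENT
        # Sentence-level co-occurrence: counted once per sentence containing both terms.
        for sentence in split_sentences(abstract):
            for pair in product(find_terms(sentence, genes), find_terms(sentence, diseases)):
                scores[pair] += W_SENTENCE
    return dict(scores)
```

A pair mentioned in the same sentence therefore scores higher than a pair that only shares an abstract, which is the qualitative behaviour the abstract describes.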


2015
Vol 2015
pp. 1-11
Author(s):
Jian Kang
Mei Yang
Junyao Zhang

We propose using multiple observed features of network traffic to identify new highly distributed low-rate quality of service (QoS) violations so that detection accuracy may be further improved. For the multiple observed features, we choose the F feature in the TCP packet header as a microscopic feature, and the P feature and D feature of network traffic as macroscopic features. Based on these features, we establish a multistream fused hidden Markov model (MF-HMM) to detect stealthy low-rate denial of service (LDoS) attacks hidden in legitimate network background traffic. In addition, the threshold value is dynamically adjusted using the Kaufman algorithm. Our experiments show that the additive effect of combining multiple features effectively reduces the false-positive rate. The average detection rate of MF-HMM improves by 23.39% and 44.64% over the typical power spectrum density (PSD) algorithm and the nonparametric cumulative sum (CUSUM) algorithm, respectively.


2013
Vol 5 (2)
pp. 94-97
Author(s):
Dr. Vinod Kumar
Mr Sandeep Agarwal
Mr Avtar Singh

In this paper, we propose a cross-layer intrusion detection technique for wireless networks. In this technique, a combined weight value is computed from the Received Signal Strength (RSS) and the Time Taken for the RTS-CTS handshake between sender and receiver (TT). Since an attacker cannot exactly guess the RSS that a receiver measures for a given sender, RSS is a useful measure for intrusion detection. We propose developing a dynamic profile for the communicating nodes based on their RSS values by periodically monitoring, from a server, the RSS values for a specific Mobile Station (MS) or Base Station (BS). Monitoring observed TT values at the server provides a reliable passive detection mechanism for session hijacking attacks, since TT is an unspoofable parameter tied to its measuring entity. If the weight value is greater than a threshold value, the corresponding node is considered an attacker. By suitably adjusting the threshold value and the weight constants, we can significantly reduce the false positive rate. Simulation results show that our proposed technique attains a low misdetection ratio and false positive rate while increasing the packet delivery ratio.
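The threshold rule described in this abstract can be sketched as follows. The abstract gives no formula for the combined weight, so the weighted sum of relative deviations, the `NodeProfile` structure, and the default weights and threshold below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeProfile:
    """Baseline values for a node under normal operation (hypothetical structure)."""
    mean_rss_dbm: float  # expected received signal strength, must be non-zero
    mean_tt_ms: float    # expected RTS-CTS handshake time, must be positive


def combined_weight(profile, rss_dbm, tt_ms, w_rss=0.5, w_tt=0.5):
    """Weighted sum of relative deviations from the profile (assumed formula)."""
    rss_dev = abs(rss_dbm - profile.mean_rss_dbm) / abs(profile.mean_rss_dbm)
    tt_dev = abs(tt_ms - profile.mean_tt_ms) / profile.mean_tt_ms
    return w_rss * rss_dev + w_tt * tt_dev


def is_attacker(profile, rss_dbm, tt_ms, threshold=0.2, w_rss=0.5, w_tt=0.5):
    """Flag the node when the combined weight exceeds the threshold."""
    return combined_weight(profile, rss_dbm, tt_ms, w_rss, w_tt) > threshold
```

Raising `threshold` or lowering the weight constants makes the rule less sensitive, which trades fewer false positives for more missed detections.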


2020
Vol 117 (24)
pp. 13626-13636
Author(s):
Antonio Rausell
Yufei Luo
Marie Lopez
Yoann Seeleuthner
Franck Rapaport
...

Humans homozygous or hemizygous for variants predicted to cause a loss of function (LoF) of the corresponding protein do not necessarily present with overt clinical phenotypes. We report here 190 autosomal genes with 207 predicted LoF variants, for which the frequency of homozygous individuals exceeds 1% in at least one human population from five major ancestry groups. No such genes were identified on the X and Y chromosomes. Manual curation revealed that 28 variants (15%) had been misannotated as LoF. Of the 179 remaining variants in 166 genes, only 11 alleles in 11 genes had previously been confirmed experimentally to be LoF. The set of 166 dispensable genes was enriched in olfactory receptor genes (41 genes). The 41 dispensable olfactory receptor genes displayed a relaxation of selective constraints similar to that observed for other olfactory receptor genes. The 125 dispensable nonolfactory receptor genes also displayed a relaxation of selective constraints consistent with greater redundancy. Sixty-two of these 125 genes were found to be dispensable in at least three human populations, suggesting possible evolution toward pseudogenes. Of the 179 LoF variants, 68 could be tested for two neutrality statistics, and 8 displayed robust signals of positive selection. These latter variants included a known FUT2 variant that confers resistance to intestinal viruses, and an APOL3 variant involved in resistance to parasitic infections. Overall, the identification of 166 genes for which a sizeable proportion of humans are homozygous for predicted LoF alleles reveals both redundancies and advantages of such deficiencies for human survival.


2014
Vol 519-520
pp. 245-249
Author(s):
Mei Yang
Jian Kang

To maintain high network QoS (quality of service) against new highly distributed low-rate QoS violations, this paper proposes a novel recognition scheme that considers multiple network features at both the macro and micro levels. The scheme uses a Multi-stream Fused Hidden Markov Model (MF-HMM) for automatic low-rate QoS violation recognition, integrating multiple features simultaneously. The features include the I-I-P triple and the TCP header control Flag in a data packet at the micro level, and the R feature in network flow at the macro level. In addition, building on the successful experience of Load-Shedding, the Kaufman algorithm is used to adjust and update the threshold value dynamically. Our experiments show that our approach effectively reduces the false-positive and false-negative rates. Moreover, it has a high recognition rate specifically for new QoS violations caused by High-Distributed Low-rate Denial of Service attacks.


2020
Author(s):
David S.M. Lee
Joseph Park
Andrew Kromer
Daniel J. Rader
Marylyn D. Ritchie
...

Abstract: Ribosome profiling has uncovered pervasive translation in 5’UTRs; however, the biological significance of this phenomenon remains unclear. Using genetic variation from 71,702 human genomes, we assess patterns of selection in translated upstream open reading frames (uORFs) in 5’UTRs. We show that uORF variants introducing new stop codons, or strengthening existing stop codons, are under strong negative selection comparable to protein-coding missense variants. Using these variants, we map and validate new gene-disease associations in two independent biobanks containing exome sequencing from 10,900 and 32,268 individuals, respectively, and demonstrate their impact on gene expression in human cells. Our results establish new mechanisms relating uORF variation to loss-of-function of downstream genes, and demonstrate that translated uORFs are genetically constrained regulatory elements in 40% of human genes.


2021
Author(s):
Jonas Meisner
Anders Albrechtsen
Kristian Hanghøj

Abstract: Identification of selection signatures between populations is often an important part of a population genetic study. With high-throughput DNA sequencing, larger sample sizes of populations with similar ancestries have become increasingly common. This has led to the need for methods capable of identifying signals of selection in populations with a continuous cline of genetic differentiation. Individuals from continuous populations are inherently challenging to group into meaningful units, which is why existing methods rely on principal components analysis for inference of the selection signals. These existing methods require called genotypes as input, which is problematic for studies based on low-coverage sequencing data. Here, we present two selection statistics which we have implemented in the PCAngsd framework. These methods account for genotype uncertainty, opening the opportunity to conduct selection scans in continuous populations from low and/or variable coverage sequencing data. To illustrate their use, we applied the methods to low-coverage sequencing data from human populations of East Asian and European ancestries and show that the implemented selection statistics can control the false positive rate and that they identify the same signatures of selection from low-coverage sequencing data as state-of-the-art software using high quality called genotypes. Moreover, we show that PCAngsd outperforms selection statistics obtained from called genotypes from low-coverage sequencing data.


2002
Vol 41 (01)
pp. 37-41
Author(s):
S. Shung-Shung
S. Yu-Chien
Y. Mei-Due
W. Hwei-Chung
A. Kao

Summary. Aim: Even with careful observation, the overall false-positive rate of laparotomy remains 10-15% when acute appendicitis is suspected. Therefore, the clinical efficacy of Tc-99m HMPAO labeled leukocyte (TC-WBC) scan for the diagnosis of acute appendicitis in patients presenting with atypical clinical findings is assessed. Patients and Methods: Eighty patients presenting with acute abdominal pain and possible acute appendicitis but atypical findings were included in this study. After intravenous injection of TC-WBC, serial anterior abdominal/pelvic images at 30, 60, 120 and 240 min with 800k counts were obtained with a gamma camera. Any abnormal localization of radioactivity in the right lower quadrant of the abdomen, equal to or greater than bone marrow activity, was considered a positive scan. Results: 36 out of 49 patients showing positive TC-WBC scans underwent appendectomy. They all proved to have positive pathological findings. Five positive TC-WBC scans were not related to acute appendicitis but to other pathological lesions. Eight patients were not operated on, and clinical follow-up after one month revealed no acute abdominal condition. Three of 31 patients with negative TC-WBC scans underwent appendectomy. They also presented positive pathological findings. The remaining 28 patients did not undergo operations and showed no evidence of appendicitis after at least one month of follow-up. The overall sensitivity, specificity, accuracy, positive and negative predictive values for TC-WBC scan to diagnose acute appendicitis were 92, 78, 86, 82, and 90%, respectively. Conclusion: TC-WBC scan provides a rapid and highly accurate method for the diagnosis of acute appendicitis in patients with equivocal clinical examination. It proved useful in reducing the false-positive rate of laparotomy and in shortening the time necessary for clinical observation.
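The five summary statistics quoted in this abstract follow from a standard 2x2 confusion matrix, as the sketch below shows. The abstract does not state how the five positive scans attributed to other lesions were classified; the counts used here assume, purely for illustration, that those five patients are set aside, leaving 36 true positives, 8 false positives, 3 false negatives, and 28 true negatives. This should not be read as the authors' own calculation.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all diseased
        "specificity": tn / (tn + fp),   # true negatives among all non-diseased
        "accuracy": (tp + tn) / total,   # correct calls among all patients
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Counts read from the abstract under the assumption stated above
# (five positive scans with other lesions excluded).
metrics = diagnostic_metrics(tp=36, fp=8, fn=3, tn=28)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this assumption the sensitivity, specificity, and predictive values round to the reported 92, 78, 82, and 90%, while accuracy comes out near 85% rather than the reported 86%, so the authors' exact classification of the excluded cases may differ.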


1993
Vol 32 (02)
pp. 175-179
Author(s):
B. Brambati
T. Chard
J. G. Grudzinskas
M. C. M. Macintosh

Abstract: The analysis of the clinical efficiency of a biochemical parameter in the prediction of chromosome anomalies is described, using a database of 475 cases including 30 abnormalities. A comparison was made of two different approaches to the statistical analysis: the use of Gaussian frequency distributions and likelihood ratios, and logistic regression. Both methods computed that for a 5% false-positive rate approximately 60% of anomalies are detected on the basis of maternal age and serum PAPP-A. The logistic regression analysis is appropriate where the outcome variable (chromosome anomaly) is binary and the detection rates refer to the original data only. The likelihood ratio method is used to predict the outcome in the general population. The latter method depends on the data or some transformation of the data fitting a known frequency distribution (Gaussian in this case). The precision of the predicted detection rates is limited by the small sample of abnormals (30 cases). Varying the means and standard deviations (to the limits of their 95% confidence intervals) of the fitted log Gaussian distributions resulted in a detection rate varying between 42% and 79% for a 5% false-positive rate. Thus, although the likelihood ratio method is potentially the better method in determining the usefulness of a test in the general population, larger numbers of abnormal cases are required to stabilise the means and standard deviations of the fitted log Gaussian distributions.


2019
Author(s):
Amanda Kvarven
Eirik Strømland
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.

