Author(s):  
Ze Zhang ◽  
Qingzhao Zhang ◽  
Brandon Nguyen ◽  
Sanjay Sri Vallabh Singapuram ◽  
Z. Morley Mao ◽  
...  

GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Abstract. Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates. However, efficient and appropriate selection of α can be challenging. This becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and correlations across predictors, it is also not straightforwardly interpretable. Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for analysis of large complex datasets.
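The reparameterization described in this abstract can be illustrated with a minimal sketch. This is not the fracridge implementation or its API: the function name `ridge_for_fraction`, its parameters, and the use of plain bisection to find α are assumptions made for illustration. It relies only on the fact that, for a full-rank design matrix, the L2-norm of the ridge solution shrinks monotonically as α grows, so a target ratio γ between 0 and 1 corresponds to a single α.

```python
import numpy as np


def ridge_for_fraction(X, y, frac, tol=1e-10, max_iter=200):
    """Hypothetical helper: find the ridge solution whose L2-norm is `frac`
    times the L2-norm of the unregularized (OLS) solution.

    Assumes X has full rank (all singular values > 0) and 0 < frac <= 1.
    Returns (coef, alpha).
    """
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must be in (0, 1]")

    # SVD lets us evaluate the ridge solution for any alpha cheaply.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    ols_rotated = uty / s                  # OLS coefficients in the rotated basis
    ols_norm = np.linalg.norm(ols_rotated)

    def norm_ratio(alpha):
        # ||beta_ridge(alpha)|| / ||beta_ols||; rotation by V preserves norms.
        return np.linalg.norm(s * uty / (s**2 + alpha)) / ols_norm

    if frac == 1.0:
        return Vt.T @ ols_rotated, 0.0

    # Bracket the target: grow the upper bound until the ratio drops below frac.
    lo, hi = 0.0, 1.0
    while norm_ratio(hi) > frac:
        hi *= 10.0

    # Bisection on alpha (the ratio is monotonically decreasing in alpha).
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if norm_ratio(mid) > frac:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(hi, 1.0):
            break

    alpha = 0.5 * (lo + hi)
    coef = Vt.T @ (s * uty / (s**2 + alpha))
    return coef, alpha
```

Calling `ridge_for_fraction(X, y, 0.3)` would return coefficients whose norm is roughly 30% of the OLS norm, together with the α that produces them. Sweeping `frac` over an evenly spaced grid gives solutions that are guaranteed to differ from one another, which is the property the abstract highlights.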


Cancers ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 1045
Author(s):  
Marta B. Lopes ◽  
Eduarda P. Martins ◽  
Susana Vinga ◽  
Bruno M. Costa

Network science has long been recognized as a well-established discipline across many biological domains. In the particular case of cancer genomics, network discovery is challenged by the multitude of available high-dimensional heterogeneous views of data. Glioblastoma (GBM) is an example of such a complex and heterogeneous disease that can be tackled by network science. Identifying the architecture of molecular GBM networks is essential to understanding the information flow and better informing drug development and pre-clinical studies. Here, we review network-based strategies that have been used in the study of GBM, along with the available software implementations for reproducibility and further testing on newly available datasets. Promising results have been obtained from both bulk and single-cell GBM data, placing network discovery at the forefront of developing molecularly informed personalized medicine.


2015 ◽  
Vol 6 (1) ◽  
pp. 59-66 ◽  
Author(s):  
Jianbo Wang ◽  
Zhenqing Ye ◽  
Tim H.-M. Huang ◽  
Huidong Shi ◽  
Victor Jin

Abstract. Alternative splicing is widely recognized for its roles in regulating genes and creating gene diversity. Consequently, the identification and quantification of differentially spliced transcripts is pivotal for transcriptome analysis. Here, we review the currently available computational approaches for the analysis of RNA-sequencing data, with a focus on exon-skipping events of alternative splicing, and discuss the novelties as well as the challenges faced in performing differential splicing analyses. In accordance with operational needs, we have classified the software tools that may be instrumental for a specific analysis based on the experimental objectives and expected outcomes. In addition, we propose a framework for future directions by pinpointing more extensive experimental validation to assess the accuracy of the software predictions and improvements that would facilitate visualizations, data processing, and downstream analyses along with their associated software implementations.


2010 ◽  
Vol 278 (1704) ◽  
pp. 474-479 ◽  
Author(s):  
Dan Dediu

Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present-day linguistic diversity and a window into the emergence and evolution of language. In particular, it has recently been proposed that linguistic tone (the usage of voice pitch to convey lexical and grammatical meaning) is biased by two genes involved in brain growth and development, ASPM and Microcephalin. This hypothesis predicts that tone is a stable characteristic of language because of its ‘genetic anchoring’. The present paper tests this prediction using a Bayesian phylogenetic framework applied to a large set of linguistic features and language families, using multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices. The results of these different methods and datasets show a large agreement, suggesting that this approach produces reliable estimates of the stability of linguistic data. Moreover, linguistic tone is found to be stable across methods and datasets, providing suggestive support for the hypothesis of genetic influences on its distribution.


2008 ◽  
Vol 8 (4) ◽  
pp. 789-794 ◽  
Author(s):  
J. Vila ◽  
R. Ortiz ◽  
M. Tárraga ◽  
R. Macià ◽  
A. García ◽  
...  

Abstract. This paper presents the development and applications of a software-based quality control system that monitors volcano activity in near-real time. On the premise that external seismic manifestations provide information directly related to the internal status of a volcano, we analyzed variations in background seismic noise. By continuous analysis of variations in seismic waveforms, we detected clear indications of changes in the internal status. The application of this method to data recorded at the Villarrica (Chile) and Tungurahua (Ecuador) volcanoes demonstrates that it is suitable as a forecasting tool. A recent application of this software-based quality control system to the real-time monitoring of the Teide – Pico Viejo volcanic complex (Spain) anticipated external episodes of volcanic activity, thus corroborating the advantages and capacity of the methodology when implemented as an automatic real-time procedure.

