Network-based cancer gene relationship prediction method reveals perturbations in the cancer gene network

2020 ◽  
Author(s):  
Jiajun Qiu ◽  
Kui Chen ◽  
Chunlong Zhong ◽  
Sihao Zhu ◽  
Xiao Ma

Abstract: The landscape of gene relationships and networks (such as activation, expression, phosphorylation, and binding) in cancer differs from that in the general (non-disease) situation, and gene network perturbations are thought to be the main cause of cancer. Thus, it makes no sense to use a regular gene relationship prediction method to map the cancer gene network. Here, we established a novel prediction method, which we dubbed network-based cancer gene relationship (NECARE), that achieved high performance, with a Matthews correlation coefficient (MCC) = 0.71±0.01 and an F1 = 89±0.7%. We then used NECARE to investigate the cancer interactome atlas and revealed a large-scale perturbation of the gene network in cancer. We found 2287 genes, which we named cancer hub genes, that were enriched with gene interaction perturbations, and over 56% of cancer treatment-related genes were hub genes. We further assessed the association of hub genes with the prognosis of 32 types of cancer and found that hub genes were significantly related to cancer outcomes. Furthermore, mutations occurring at residues that bind to macromolecules were overrepresented in cancer hub genes. By coimmunoprecipitation (co-IP), we confirmed that the NECARE prediction method was highly reliable, with 90% accuracy. NECARE is available at: https://github.com/JiajunQiu/NECARE.
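
The two scores quoted above, MCC and F1, are standard functions of a binary confusion matrix. The following is a minimal sketch of how they are computed. The function name `mcc_and_f1` and the example counts are hypothetical and are not taken from NECARE or its evaluation data.

```python
import math


def mcc_and_f1(tp, fp, tn, fn):
    """Return (MCC, F1) for binary confusion-matrix counts."""
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # F1: harmonic mean of precision and recall, in [0, 1].
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    return mcc, f1


# Hypothetical counts, chosen only to illustrate the arithmetic.
mcc, f1 = mcc_and_f1(tp=90, fp=10, tn=80, fn=20)
print(f"MCC = {mcc:.3f}, F1 = {f1:.3f}")
```

Unlike F1, MCC also takes true negatives into account, which is one reason the two are often reported together.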

2015 ◽  
Vol 24 (05) ◽  
pp. 1550074 ◽  
Author(s):  
Ali A. El-Moursy ◽  
Wael S. Afifi ◽  
Fadi N. Sibai ◽  
Salwa M. Nassar

STRIKE is an algorithm that predicts protein–protein interactions (PPIs) by determining that proteins interact if they contain similar substrings of amino acids. STRIKE achieves a reasonable improvement over existing PPI prediction methods. Despite its high accuracy as a PPI prediction method, STRIKE requires a long execution time and is therefore considered a compute-intensive application. In this paper, we develop and implement a parallel STRIKE algorithm for high-performance computing (HPC) systems. Using a large-scale cluster, the execution time of the parallel implementation of this bioinformatics algorithm was reduced from about a week on a serial uniprocessor machine to about 16.5 h on 16 computing nodes, and down to about 2 h on 128 parallel nodes. Communication overheads between nodes are thoroughly studied.
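
The reported timings can be turned into rough speedup and parallel-efficiency figures. The sketch below assumes that "about a week" means 168 h, which is an approximation rather than a number stated in the abstract. The helper name `speedup_and_efficiency` is hypothetical.

```python
def speedup_and_efficiency(serial_hours, parallel_hours, nodes):
    """Return (speedup, efficiency) relative to the serial run."""
    speedup = serial_hours / parallel_hours
    efficiency = speedup / nodes  # 1.0 would be ideal linear scaling
    return speedup, efficiency


SERIAL_HOURS = 168.0  # assumption: "about a week" taken as 7 * 24 h

for nodes, hours in [(16, 16.5), (128, 2.0)]:
    s, e = speedup_and_efficiency(SERIAL_HOURS, hours, nodes)
    print(f"{nodes:>3} nodes: speedup ~{s:.1f}x, efficiency ~{e:.2f}")
```

Under that assumption, efficiency stays at roughly two thirds at both node counts, so the approximate timings suggest near-constant scaling efficiency between 16 and 128 nodes.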


2021 ◽  
Vol 22 (15) ◽  
pp. 8027
Author(s):  
Yang Yang ◽  
Lianjie Zeng ◽  
Mauno Vihinen

Genetic variations have a multitude of effects on proteins. A substantial number of variations affect protein–solvent interactions, either aggregation or solubility. Aggregation is often related to structural alterations, whereas solubilizable proteins in the solid phase can be made soluble again by dilution. Solubility is a central protein property and, when reduced, can lead to diseases. We developed a prediction method, PON-Sol2, to identify amino acid substitutions that increase, decrease, or have no effect on protein solubility. The method is a machine learning tool that uses a gradient boosting algorithm. It was trained on a large dataset of variants with different outcomes, after features were selected from a large number of tested properties. The method is fast and has high performance. The normalized correct prediction rate for three states is 0.656, and the normalized GC2 score is 0.312 in 10-fold cross-validation. The corresponding numbers in the blind test were 0.545 and 0.157. The performance was superior to that of previous methods. The PON-Sol2 predictor is freely available. It can be used to predict the solubility effects of variants for any organism, even in large-scale projects.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jorge Francisco Cutigi ◽  
Adriane Feijo Evangelista ◽  
Rui Manuel Reis ◽  
Adenilso Simao

Abstract: Identifying significantly mutated genes in cancer is essential for understanding the mechanisms of tumor initiation and progression. This task is a key challenge, since large-scale genomic studies have reported a very large number of genes mutated at low frequency. To uncover infrequently mutated genes, gene interaction networks combined with mutation data have been explored. This work proposes Discovering Significant Cancer Genes (DiSCaGe), a computational method for discovering significant genes for cancer. DiSCaGe computes a mutation score for each gene based on the types of mutations it has. The influence that genes receive from their neighbors in the network is also considered and is obtained through an asymmetric spreading strength applied to a consensus gene network. DiSCaGe produces a ranking of prioritized possible cancer genes. An experimental evaluation with six types of cancer revealed the potential of DiSCaGe for discovering known and possibly novel significant cancer genes.


Author(s):  
C.K. Wu ◽  
P. Chang ◽  
N. Godinho

Recently, the use of refractory metal silicides as low resistivity, high temperature and high oxidation resistance gate materials in large scale integrated circuits (LSI) has become an important approach in advanced MOS process development (1). This research is a systematic study on the structure and properties of molybdenum silicide thin film and its applicability to high performance LSI fabrication.


Author(s):  
В.В. ГОРДЕЕВ ◽  
В.Е. ХАЗАНОВ

The choice of the type and size of a milking installation must take into account the maximum planned number of dairy cows, the size of a technological group, the number of milkings per day, the duration of one milking, and the length of the operators' working shift. An analysis of the technical and economic indicators of the currently most common types of milking installations of the same technical level shows that the Carousel installation (1) has the best specific indicators, while the Herringbone installation (2) requires higher labour and cost inputs. The Parallel installation (3) occupies an intermediate position. Based on throughput and the required number of operators, Herringbone is recommended for farms with up to 600 dairy cows, Parallel for farms with no more than 1200 dairy cows, and Carousel for farms with more than 1200 dairy cows. Carousel is the most rational, high-performance, easily automated and, therefore, promising milking system for milking parlours, especially on large dairy farms.


Author(s):  
Mark Endrei ◽  
Chao Jin ◽  
Minh Ngoc Dinh ◽  
David Abramson ◽  
Heidi Poxon ◽  
...  

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), Large-scale Atomic Molecular Massively Parallel Simulator, and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
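
The idea of Pareto-optimal trade-offs between runtime and energy can be illustrated with a simple filter over measured configurations. This is only a sketch of the Pareto concept. It does not reproduce the statistical or machine learning models described above, and the function name `pareto_front` and the sample numbers are hypothetical.

```python
def pareto_front(points):
    """Return the (runtime, energy) points not dominated by any other.

    Both coordinates are minimized. A point is dominated if another point
    is no worse in both coordinates and strictly better in at least one.
    """
    front = []
    best_energy = float("inf")
    # After sorting by runtime, a point is on the front only if it uses
    # strictly less energy than every faster point seen so far.
    for runtime, energy in sorted(points):
        if energy < best_energy:
            front.append((runtime, energy))
            best_energy = energy
    return front


# Hypothetical (runtime in s, energy in kJ) measurements per configuration.
configs = [(100, 50), (110, 40), (120, 45), (130, 30), (105, 60)]
print(pareto_front(configs))
```

Each point on the resulting front is one trade-off option: moving along it buys lower energy at the cost of longer runtime.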


Radiation ◽  
2021 ◽  
Vol 1 (2) ◽  
pp. 79-94
Author(s):  
Peter K. Rogan ◽  
Eliseos J. Mucaki ◽  
Ben C. Shirley ◽  
Yanxin Li ◽  
Ruth C. Wilkins ◽  
...  

The dicentric chromosome (DC) assay accurately quantifies exposure to radiation; however, manual and semi-automated assignment of DCs has limited its use for a potential large-scale radiation incident. The Automated Dicentric Chromosome Identifier and Dose Estimator (ADCI) software automates unattended DC detection and determines radiation exposures, fulfilling IAEA criteria for triage biodosimetry. This study evaluates the throughput of high-performance ADCI (ADCI-HT) to stratify exposures of populations in 15 simulated population-scale radiation exposures. ADCI-HT streamlines dose estimation using a supercomputer by optimal hierarchical scheduling of DC detection for varying numbers of samples and metaphase cell images in parallel on multiple processors. We evaluated processing times and accuracy of estimated exposures across census-defined populations. Image processing of 1744 samples on 16,384 CPUs required 1 h 11 min 23 s, and radiation dose estimation based on DC frequencies required 32 s. Processing of 40,000 samples at 10 exposures from five laboratories required 25 h and met IAEA criteria (dose estimates were within 0.5 Gy; median = 0.07). Geostatistically interpolated radiation exposure contours of simulated nuclear incidents were defined by samples exposed to clinically relevant exposure levels (1 and 2 Gy). Analysis of all exposed individuals with ADCI-HT required 0.6–7.4 days, depending on the population density of the simulation.


Antioxidants ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 843
Author(s):  
Tamara Ortiz ◽  
Federico Argüelles-Arias ◽  
Belén Begines ◽  
Josefa-María García-Montes ◽  
Alejandra Pereira ◽  
...  

The best conservation method for native Chilean berries has been investigated in combination with a large-scale extraction of maqui berry, rich in total polyphenols and anthocyanins, to be tested in intestinal epithelial and immune cells. The methanolic extract was obtained from lyophilized maqui berries and analyzed using the Folin–Ciocalteu method to quantify the total polyphenol content, as well as 2,2-diphenyl-1-picrylhydrazyl (DPPH), ferric reducing antioxidant power (FRAP), and oxygen radical absorbance capacity (ORAC) assays to measure the antioxidant capacity. The anthocyanin profile of maqui was determined by ultra-high-performance liquid chromatography (UHPLC-MS/MS). Viability, cytotoxicity, and percent oxidation in epithelial colon cells (HT-29) and macrophage cells (RAW 264.7) were evaluated. Preservation studies confirmed that the properties and composition of maqui are preserved in fresh or frozen conditions, and a more efficient and convenient extraction methodology was achieved. In vitro studies of epithelial cells have shown that this extract has a powerful antioxidant strength exhibiting a dose-dependent behavior. When macrophages were activated with lipopolysaccharide (LPS), noncytotoxic effects were observed, and a relationship between oxidative stress and the inflammatory response was demonstrated. The maqui extract and 5-aminosalicylic acid (5-ASA) have a synergistic effect. All of the compiled data point to the use of this extract as a potential nutraceutical agent with physiological benefits for the treatment of inflammatory bowel disease (IBD).


Author(s):  
Jianglin Feng ◽  
Nathan C Sheffield

Summary: Databases of large-scale genome projects now contain thousands of genomic interval datasets. These data are a critical resource for understanding the function of DNA. However, our ability to examine and integrate interval data of this scale is limited. Here, we introduce the integrated genome database (IGD), a method and tool for searching genome interval datasets more than three orders of magnitude faster than existing approaches, while using only one hundredth of the memory. IGD uses a novel linear binning method that allows us to scale analysis to billions of genomic regions. Availability: https://github.com/databio/IGD
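
One plausible reading of linear binning is that the genome is cut into fixed-size bins, each interval is stored in every bin it touches, and a query inspects only the bins it overlaps. The sketch below illustrates that general idea under stated assumptions. It is not IGD's actual data layout or API, and the class name `BinnedIntervals`, the bin size, and the sample intervals are all hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 16_384  # hypothetical bin width in base pairs


class BinnedIntervals:
    """Toy fixed-size-bin index over half-open intervals [start, end)."""

    def __init__(self, bin_size=BIN_SIZE):
        self.bin_size = bin_size
        self.bins = defaultdict(list)  # (chrom, bin number) -> intervals

    def _bin_range(self, start, end):
        # Bins touched by [start, end); assumes end > start.
        return range(start // self.bin_size, (end - 1) // self.bin_size + 1)

    def add(self, chrom, start, end, label):
        # Store the interval in every bin it touches.
        for b in self._bin_range(start, end):
            self.bins[(chrom, b)].append((start, end, label))

    def query(self, chrom, start, end):
        # Scan only the bins the query touches; the set removes duplicates
        # of intervals that span several bins.
        hits = set()
        for b in self._bin_range(start, end):
            for s, e, label in self.bins.get((chrom, b), ()):
                if s < end and start < e:
                    hits.add((s, e, label))
        return sorted(hits)


index = BinnedIntervals(bin_size=100)
index.add("chr1", 50, 250, "a")
index.add("chr1", 300, 320, "b")
index.add("chr2", 60, 70, "c")
print(index.query("chr1", 240, 310))
```

The design trades some duplicated storage for lookups that touch a number of bins proportional to the query length rather than to the size of the dataset.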

