gene subset
2021 ◽  
Vol 11 ◽  
Author(s):  
Jiahao Gao ◽  
Fangdie Ye ◽  
Fang Han ◽  
Xiaoshuang Wang ◽  
Haowen Jiang ◽  
...  

Purpose: To construct a novel radiogenomics biomarker based on a hypoxic-gene subset for accurate prognostic prediction of clear cell renal cell carcinoma (ccRCC). Materials and Methods: Initially, we screened for the desired hypoxic-gene subset using the GSEA database. Survival-related hypoxia genes were identified through univariate and multivariate Cox regression hazard ratio analysis, and a genomics signature was constructed in the TCGA database. Building on this, a hypoxia-gene-related radiogenomics biomarker (prediction of the hypoxia-gene signature by contrast-enhanced CT radiomics) was constructed in the TCIA-KIRC database by extracting features from the venous phase of contrast-enhanced CT images, selecting features using the mRMR and LASSO algorithms, and building logistic regression models. Finally, we validated the prognostic capability of the new biomarker for patients with ccRCC in an independent validation cohort at Huashan Hospital of Fudan University, Shanghai, China. Results: The hypoxia-related genomics signature, consisting of five genes (IFT57, PABPN1, RNF10, RNF19B and UBE2T), was significantly associated with survival for patients with ccRCC in the TCGA database, with patients grouped by signature expression as either low- or high-risk. In the TCIA database, we constructed a radiogenomics biomarker consisting of 13 radiomics features that were optimal predictors of hypoxia-gene signature expression levels (low- or high-risk) in patients at each institution; it demonstrated AUC values of 0.91 and 0.91 in the training and validation groups, respectively. In the independent validation cohort at Huashan Hospital, our radiogenomics biomarker was significantly associated with prognosis in patients with ccRCC (p=0.0059). Conclusions: The novel prognostic radiogenomics biomarker achieved excellent correlation with prognosis in both the TCGA/TCIA-KIRC cohort and the independent validation cohort of Huashan Hospital patients with ccRCC. It is anticipated that this work may assist in clinical preferential treatment decisions and promote the process of precision theranostics in the future.
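A gene signature of this kind is often turned into low- and high-risk groups by computing a linear risk score per patient and splitting at a cutoff. The following is a minimal sketch of that idea only. The coefficients, the median cutoff, the patient values, and the helper names `risk_score` and `split_by_median` are all assumptions for illustration; the abstract reports none of them.

```python
from statistics import median

# Hypothetical coefficients for the five signature genes. The abstract does
# not report the fitted Cox coefficients, so these are placeholders.
COEFS = {"IFT57": -0.4, "PABPN1": 0.3, "RNF10": -0.2, "RNF19B": 0.5, "UBE2T": 0.6}


def risk_score(expr):
    """Weighted sum of expression values over the signature genes."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def split_by_median(patients):
    """Label each patient 'high' or 'low' relative to the median score.

    The median cutoff is an assumption; other cutoffs are possible.
    """
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for four patients.
patients = {
    "P1": {"IFT57": 2.0, "PABPN1": 1.0, "RNF10": 2.0, "RNF19B": 0.5, "UBE2T": 0.2},
    "P2": {"IFT57": 0.5, "PABPN1": 2.0, "RNF10": 0.5, "RNF19B": 2.0, "UBE2T": 2.5},
    "P3": {"IFT57": 1.0, "PABPN1": 1.0, "RNF10": 1.0, "RNF19B": 1.0, "UBE2T": 1.0},
    "P4": {"IFT57": 1.5, "PABPN1": 0.5, "RNF10": 1.5, "RNF19B": 0.8, "UBE2T": 0.4},
}
groups = split_by_median(patients)
```

With these placeholder numbers, P2 and P3 fall in the high-risk group and P1 and P4 in the low-risk group.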


2021 ◽  
Vol 16 ◽  
Author(s):  
Yueling Xiong ◽  
Qingqing Li ◽  
Peipei Wang ◽  
Mingquan Ye

Background: Informative gene selection is an essential step in tumor classification. However, it is difficult to select tumor-related informative genes from large-scale gene expression profiles because of their high dimensionality, relatively small sample sizes, class imbalance, and the presence of superfluous and irrelevant genes. Objective: Many researchers analyze and process gene expression data with machine learning methods to obtain gene subsets for classification. However, tumor gene expression profiles often pose massive computational challenges. In addition, traditional feature selection algorithms often ignore cost estimation when improving feature importance and classification accuracy, which makes tumor classification more difficult. Method: In this study, a novel informative gene selection method based on cost-sensitive fast correlation-based feature selection (CS-FCBF) is proposed. Results: First, the symmetric uncertainty index is used to evaluate the correlation between informative genes and class labels, and a large number of irrelevant and redundant genes are then quickly filtered out according to importance, generating a candidate gene subset. Second, cost-sensitive learning, which introduces the misclassification cost matrix and support vector machine attribute evaluation, is used to obtain the top-ranked gene subset with minimum misclassification loss. Finally, the candidate gene subset is optimized. Conclusion: The method was validated on eight independent tumor datasets. Comparing CS-FCBF with three other hybrids of typical gene selection algorithms combined with cost-sensitive learning, we found that the proposed method achieved better classification performance with fewer selected genes, which might provide guidance in tumor diagnosis and research.
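The symmetric uncertainty index mentioned above has a standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below shows only the relevance-filtering step on discretized expression values. The threshold, the toy data, and the helper name `filter_genes` are assumptions, and the redundancy-removal and cost-sensitive steps of CS-FCBF are not covered.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def filter_genes(genes, labels, threshold):
    """Keep genes whose SU with the class label reaches the threshold,
    ordered from most to least relevant (hypothetical helper)."""
    ranked = sorted(
        ((symmetric_uncertainty(values, labels), name) for name, values in genes.items()),
        reverse=True,
    )
    return [name for su, name in ranked if su >= threshold]


# Toy data: expression already discretized into "lo"/"hi" (an assumption).
labels = [0, 0, 1, 1]
genes = {
    "geneA": ["lo", "lo", "hi", "hi"],  # tracks the label exactly
    "geneB": ["lo", "hi", "lo", "hi"],  # independent of the label
}
selected = filter_genes(genes, labels, threshold=0.5)
```

Here `geneA` has SU = 1 with the label and is kept, while `geneB` has SU = 0 and is filtered out.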


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Giulio Testone ◽  
Anatoly Petrovich Sobolev ◽  
Giovanni Mele ◽  
Chiara Nicolodi ◽  
Maria Gonnella ◽  
...  

Abstract: Endive (Cichorium endivia L.), a vegetable consumed as fresh or packaged salads, is mostly cultivated outdoors and known to be sensitive to waterlogging in terms of yield and quality. Phenotypic, metabolic and transcriptomic analyses were used to study variations in curly- (‘Domari’, ‘Myrna’) and smooth-leafed (‘Flester’, ‘Confiance’) cultivars grown in short-term waterlog due to rainfall excess before harvest. After recording loss of head weights in all cultivars (6–35%), which was minimal in ‘Flester’, NMR untargeted profiling revealed variations as influenced by genotype, environment and interactions, and included drop of total carbohydrates (6–50%) and polyols (3–37%), gain of organic acids (2–30%) and phenylpropanoids (98–560%), and cultivar-specific fluctuations of amino acids (−37 to +15%). The analysis of differentially expressed genes showed GO term enrichment consistent with waterlog stress and included the carbohydrate metabolic process. The loss of sucrose, kestose and inulin recurred in all cultivars and the sucrose-inulin route was investigated by covering over 50 genes of sucrose branch and key inulin synthesis (fructosyltransferases) and catabolism (fructan exohydrolases) genes. The lowered expression of a sucrose gene subset together with that of SUCROSE:SUCROSE-1-FRUCTOSYLTRANSFERASE (1-SST) may have accounted for sucrose and kestose contents drop in the leaves of waterlogged plants. Two anti-correlated modules harbouring candidate hub-genes, including 1-SST, were identified by weighted gene correlation network analysis, and proposed to control positively and negatively kestose levels. In silico analysis further pointed at transcription factors of GATA, DOF, WRKY types as putative regulators of 1-SST.
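Weighted gene correlation network analysis starts from a soft-thresholded adjacency, which for an unsigned network is a_ij = |cor(x_i, x_j)|^β. The sketch below illustrates just that first step in plain Python. The power β = 6, the gene names, and the expression values are assumptions; the abstract does not state the settings used, and module detection itself is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_adjacency(profiles, beta=6):
    """Unsigned soft-threshold adjacency: |correlation| raised to the power beta."""
    names = list(profiles)
    return {
        (a, b): abs(pearson(profiles[a], profiles[b])) ** beta
        for a in names
        for b in names
        if a != b
    }


# Made-up expression profiles across four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with g1
    "g3": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with g1
}
adjacency = soft_adjacency(profiles)
```

Raising the correlation to a power keeps strong links close to 1 while pushing moderate ones toward 0: a correlation of 0.8 becomes an adjacency of about 0.26 at β = 6.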


2020 ◽  
Author(s):  
Qian Gao ◽  
Huifang Zhang ◽  
Ximei Que ◽  
Yanfeng Xi ◽  
Tong Wang

Background: Gene expression profiling (GEP) is considered the gold standard for cell-of-origin classification of diffuse large B-cell lymphoma (DLBCL). The high dimensionality of GEP limits its application in clinical practice. Penalized regression is commonly used to determine the optimal gene subset for classification in high-dimensional gene data. However, the results of penalized regression methods are affected by the tuning parameters. Results: To address the instability of penalized regression methods, we proposed a strategy to measure the importance of variables with an aggregated index. This strategy was applied to six penalized methods to identify a small gene subset for DLBCL classification. Using a training dataset of 350 DLBCL patients, we identified six genes (MYBL1, TNFRSF13B, MAML3, CYB5R2, BATF, and S1PR2) as the optimal gene subset for DLBCL classification. The AUC was 0.9986 (95%CI 0.9967–1) and the discrimination slope (DS) was 0.9442 (95%CI 0.9203–0.9661) in the training dataset. The discriminative performance was further validated in the external dataset with an AUC of 0.9455 (95%CI 0.9298–0.9612) and DS of 0.6211 (95%CI 0.5824–0.6591). Additionally, the calibration and clinical usefulness were apt in both datasets. Subgroups of patients characterized by these six genes showed significantly different prognosis. Furthermore, model comparisons demonstrated that the six-gene model outperformed models constructed by typical penalized regression methods. Conclusions: The six genes had considerable clinical usefulness in DLBCL classification and prognosis. Penalized variable importance analysis is an efficient strategy to identify an optimal gene subset with good predictive performance.
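The two discrimination measures reported above can be computed directly from predicted probabilities. The AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and the discrimination slope is the mean predicted probability among positives minus the mean among negatives. The sketch below uses made-up labels and probabilities, not data from the study.

```python
def auc(y_true, scores):
    """AUC by pairwise comparison: the share of (positive, negative) pairs
    in which the positive scores higher, counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def discrimination_slope(y_true, probs):
    """Mean predicted probability in positives minus that in negatives."""
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Made-up labels and predicted probabilities for six cases.
y_true = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc_value = auc(y_true, probs)
ds_value = discrimination_slope(y_true, probs)
```

In this toy example eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9, and the discrimination slope is 0.7 minus 0.2667, or about 0.43.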


Author(s):  
Qingfeng Zhao ◽  
Yulin Zhang

In this paper, we propose a novel ensemble gene selection method to obtain a gene subset. We then provide a method for reverse construction of a gene network from the expression profile data of the gene subset. The uncertainty coefficient, based on information entropy, is used to define the existence of logical relations among these genes. If the uncertainty coefficient between genes exceeds a predefined threshold, the gene nodes are connected by directed edges. The resulting network is defined as a gene logical network. The method is applied to breast cancer data comprising a control group and an experimental group, and the networks are compared in terms of 2nd-order logic type distribution, average degree, and average path length. The structures of the different networks are found to be quite distinct. Key genes are selected by comparing the degree difference between the control group and the experimental group. By defining dynamic evolution rules for state transitions based on the logical regulation among the key genes in the network, the dynamic behaviors of normal breast cells and of cancer cells at different stages are simulated numerically. A literature search indicates that some of the key genes are closely related to the development of breast cancer. The study may provide useful insight into the biological mechanisms of the formation and development of cancer.
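The uncertainty coefficient is asymmetric: U(X|Y) = I(X; Y) / H(X) is the fraction of the uncertainty in X that is removed by knowing Y. That asymmetry is what allows directed edges. The sketch below covers only a pairwise version of the thresholding idea. The edge direction convention, the threshold, the toy states, and the helper name `build_edges` are assumptions, and the higher-order logic types described in the abstract are not modeled.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def uncertainty_coefficient(x, y):
    """U(X|Y) = I(X; Y) / H(X): share of X's uncertainty explained by Y."""
    hx = entropy(x)
    if hx == 0:
        return 0.0
    mutual_info = hx + entropy(y) - entropy(list(zip(x, y)))
    return mutual_info / hx


def build_edges(states, threshold):
    """Add a directed edge src -> dst when knowing src explains more than
    `threshold` of dst's uncertainty (assumed convention)."""
    edges = []
    for src in states:
        for dst in states:
            if src != dst and uncertainty_coefficient(states[dst], states[src]) > threshold:
                edges.append((src, dst))
    return edges


# Toy discretized expression states across four samples.
states = {
    "A": [0, 1, 2, 3],
    "B": [0, 0, 1, 1],  # fully determined by A
    "C": [0, 1, 0, 1],  # fully determined by A, independent of B
}
edges = build_edges(states, threshold=0.8)
```

Here A determines both B and C, so U(B|A) = U(C|A) = 1, while B or C alone explains only half of A's uncertainty. With a threshold of 0.8 the resulting edges are A to B and A to C.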


2019 ◽  
Vol 14 (4) ◽  
pp. 353-358 ◽  
Author(s):  
Mohamed Nisper Fathima Fajila

Background: Cancer subtype identification is an active research field that supports the diagnosis and proper treatment of various cancers. Leukemia is one such cancer with various subtypes. High-throughput technologies such as deoxyribonucleic acid (DNA) microarrays are widely used in cancer detection and classification. Objective: Precise analysis is important in microarray data applications because microarray experiments produce huge amounts of data. Gene selection techniques promote microarray usage in medicine. The objective of gene selection is to select a small subset of genes that are the most informative for classification. Method: In this study, a multi-objective evolutionary algorithm is used for gene subset selection in Leukemia classification. An initial removal of redundant and irrelevant genes is followed by multi-objective evolutionary gene subset selection. Gene subset selection strongly influences classification accuracy, so selecting an appropriate algorithm for subset selection is important. Results: The performance of the proposed method is compared against the standard genetic algorithm and evolutionary algorithm. Three Leukemia microarray datasets were used to evaluate the performance of the proposed method. Perfect classification was achieved for all the datasets with only a few significant genes using the proposed approach. Conclusion: The proposed method perfectly classified Leukemia in these datasets with only a few significant genes.
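Multi-objective selection keeps the candidates that no other candidate beats on every objective at once, the Pareto front. The sketch below shows only this dominance test, assuming two objectives to minimize: classification error and subset size. The abstract does not name the objectives or the algorithm's operators, so the objectives, the candidate values, and the helper names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly
    better somewhere (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, objectives in candidates.items()
        if not any(dominates(other, objectives) for other in candidates.values())
    ]


# Hypothetical gene subsets scored as (classification error, number of genes).
candidates = {
    "S1": (0.10, 5),
    "S2": (0.05, 12),
    "S3": (0.10, 8),   # same error as S1 but more genes
    "S4": (0.00, 20),
    "S5": (0.05, 15),  # same error as S2 but more genes
}
front = pareto_front(candidates)
```

S3 and S5 are dominated and drop out. The remaining subsets S1, S2, and S4 represent different trade-offs between accuracy and subset size, which an evolutionary algorithm would carry forward into the next generation.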

