The method for breast cancer grade prediction and pathway analysis based on improved multiple kernel learning

2017 ◽  
Vol 15 (01) ◽  
pp. 1650037 ◽  
Author(s):  
Tianci Song ◽  
Yan Wang ◽  
Wei Du ◽  
Sha Cao ◽  
Yuan Tian ◽  
...  

Breast cancer histologic grade represents the morphological assessment of the tumor’s malignancy and aggressiveness, which is vital in planning clinical treatment and estimating patient prognosis. Accurate prediction of breast cancer grade can therefore markedly improve the detection of early breast cancer and efficiently guide its treatment. With the advent of high-throughput profiling technology, large amounts of data of different types are being generated rapidly, and each data type provides its own biological insight. Although many studies have focused on cancer grade prediction, few of them have attempted to integrate multiple data types, an approach that can not only improve the results obtained from a learning method but also give a better understanding or explanation of the underlying biology. In this paper, we take advantage of a sophisticated supervised learning method called multiple kernel learning (MKL) to design a breast cancer grade predictor that fuses heterogeneous data for classification of breast cancer histopathology. Furthermore, we modify our model by incorporating biological pathway information. The new model can evaluate the significance of the various pathways into which genes differentially expressed between breast cancer grades fall. The merits of the novel model lie in bridging omics data and the various phenotypes of breast cancer grades, and in providing an auxiliary method for integrating omics data in research on cancer mechanisms. In experiments, the proposed method outperforms other state-of-the-art methods and offers rich biological interpretation of the differences between breast cancer grades.

Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 200 ◽  
Author(s):  
Mingxin Tao ◽  
Tianci Song ◽  
Wei Du ◽  
Siyu Han ◽  
Chunman Zuo ◽  
...  

It is very important to explore the intrinsic differences between breast cancer subtypes. These intrinsic differences are closely related to clinical diagnosis and the design of treatment plans. With the accumulation of biological and medical datasets, many different omics data are available that can be viewed from different aspects. Combining these multiple omics data can improve the accuracy of prediction. Meanwhile, many different databases are available from which different types of omics data can be downloaded. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) to define breast cancer subtypes and classify any two breast cancer subtypes using the SMO-MKL algorithm. We collected mRNA data, methylation data and copy number variation (CNV) data from TCGA to classify breast cancer subtypes. Multiple kernel learning (MKL) is employed to use these omics data distinctly. The result of using three omics data with multiple kernels is better than that of using single omics data with multiple kernels. Furthermore, the significant genes and pathways discovered in the feature selection process are also analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and has abundant biological interpretations.
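As a rough illustration of how several omics views can enter one kernel classifier, the sketch below builds one linear kernel per view, sums them with fixed non-negative weights, and fits a kernel ridge classifier on the combined kernel. Everything in it is an assumption made for illustration: the data are synthetic, the view names and weights are hypothetical, and kernel ridge with hand-picked weights stands in for the SMO-MKL algorithm named in the abstract, which learns the kernel weights jointly with the classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80
y = np.repeat([-1.0, 1.0], n // 2)  # two hypothetical subtypes

def make_view(n_features, signal):
    """Synthetic stand-in for one omics matrix (samples x features)."""
    X = rng.normal(size=(n, n_features))
    X[:, :5] += signal * y[:, None]  # only the first 5 features carry signal
    return X

# Hypothetical views and hand-picked weights (not learned, unlike real MKL).
views = {
    "mrna": make_view(50, 1.0),
    "methylation": make_view(40, 0.5),
    "cnv": make_view(30, 0.5),
}
weights = {"mrna": 0.5, "methylation": 0.3, "cnv": 0.2}

idx = rng.permutation(n)
train, test = idx[:60], idx[60:]

def view_kernel(X, rows, cols):
    """Linear kernel on features standardized with training statistics,
    divided by the feature count so views are on a comparable scale."""
    Z = (X - X[train].mean(axis=0)) / X[train].std(axis=0)
    return Z[rows] @ Z[cols].T / X.shape[1]

# A non-negative weighted sum of valid kernels is itself a valid kernel.
K_train = sum(weights[m] * view_kernel(views[m], train, train) for m in views)
K_test = sum(weights[m] * view_kernel(views[m], test, train) for m in views)

# Kernel ridge classifier: solve (K + lambda * I) alpha = y, predict by sign.
ridge = 1.0
alpha = np.linalg.solve(K_train + ridge * np.eye(len(train)), y[train])
pred = np.sign(K_test @ alpha)
accuracy = float(np.mean(pred == y[test]))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In a genuine MKL setting the entries of `weights` would be optimization variables, and their learned values would indicate how much each omics view contributes to the decision.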


Author(s):  
Peiyan Wang ◽  
Dongfeng Cai

Multiple kernel learning (MKL) aims at learning an optimal combination of base kernels with which an appropriate hypothesis is determined on the training data. MKL is flexible because the kernel is learned automatically, and it also reflects the fact that typical learning problems often involve multiple, heterogeneous data sources. The target kernel is one of the most important parts of many MKL methods. These methods find the kernel weights by maximizing the similarity or alignment between the weighted kernel and the target kernel. Existing target kernels are defined in a global manner, which (1) assigns the same target value to closer and farther sample pairs, and so inappropriately neglects the variation of samples; and (2) is independent of the training data, and is hard to approximate with base kernels. As a result, maximizing the similarity to the global target kernel can make the pre-specified kernels less effectively utilized, which in turn reduces classification performance. In this paper, instead of defining a global target kernel, a localized target kernel is calculated for each sample pair from the training data, which is flexible and able to handle sample variations well. A new target kernel, named the empirical target kernel, is proposed in this research to implement this idea, and three corresponding algorithms are designed to efficiently utilize the proposed empirical target kernel. Experiments are conducted on four challenging MKL problems. The results show that our algorithms outperform other methods, verifying the effectiveness and superiority of the proposed methods.
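To make the alignment idea concrete, the sketch below scores base kernels against the conventional global target kernel yy^T using centered kernel alignment, then turns the scores into weights. It is a minimal illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and the proportional weighting is a simple heuristic. It does not implement the empirical (localized) target kernel proposed in the paper, whose definition the abstract does not give.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Frobenius inner product of two matrices, normalized by their norms."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def alignment_weights(kernels, y):
    """Weight each base kernel by its centered alignment with yy^T.

    The global target assigns +1 to every same-class pair and -1 to every
    different-class pair, regardless of how close the samples are.
    """
    target = center(np.outer(y, y))
    scores = np.array([max(alignment(center(K), target), 0.0) for K in kernels])
    total = scores.sum()
    if total == 0.0:
        # No kernel aligns positively: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

# Synthetic example: one feature set carries class signal, the other is noise.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
X_informative = 2.0 * y[:, None] + rng.normal(size=(40, 2))
X_noise = rng.normal(size=(40, 2))

kernels = [rbf_kernel(X_informative, 0.5), rbf_kernel(X_noise, 0.5)]
weights = alignment_weights(kernels, y)
print("kernel weights:", np.round(weights, 3))
```

The kernel built from the informative features receives most of the weight. A localized target would replace `np.outer(y, y)` with pair-specific target values derived from the training data.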


2017 ◽  
Author(s):  
Nisar Wani ◽  
Khalid Raza

ABSTRACT: Computer-aided diagnosis is gradually making its way into the domain of medical research and clinical diagnosis. With the fields of radiology and diagnostic imaging producing petabytes of image data, machine learning tools, particularly kernel-based algorithms, seem to be an obvious choice for processing and analyzing these high-dimensional and heterogeneous data. In this chapter, after presenting a brief description of the nature of medical images, image features, and the basics of machine learning and kernel methods, we present the application of multiple kernel learning algorithms to medical image analysis.

