Machine Learning for Gastric Cancer Detection

2020 ◽  
Vol 9 (2) ◽  
pp. 48-58
Author(s):  
Abraham Pouliakis ◽  
Periklis Foukas ◽  
Konstantinos Triantafyllou ◽  
Niki Margari ◽  
Efrossyni Karakitsou ◽  
...  

The objective of this study is to investigate the potential value of a logistic regression model for the classification of gastric cytological data. The model was based on the morphological features of cell nuclei. The aim was to discriminate benign from malignant nuclei and, subsequently, patients. Cytological images of gastric smears were analyzed by an image analysis system capable of extracting cell nuclear features. Measurements from 50% of the patients were selected as a training set for model creation, while the measurements from the remaining patients were used as a test set to validate the results. Furthermore, a model for the classification of individual patients, based on the classification of their cell nuclei, was developed. This approach gave a correct classification rate of 90% at the nuclear level on both the training and test sets. In conclusion, the application of morphometric feature selection in combination with logistic regression may offer useful and complementary information about the potential malignancy of gastric nuclei and patient cases.
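The two-level design described above (a split by patient, then a patient-level call derived from nucleus-level calls) can be sketched as follows. This is a minimal illustration only: the abstract does not say how nuclear results were combined, so the fraction-of-malignant-nuclei rule, the `threshold` value, and the function names are all assumptions.

```python
import random
from collections import defaultdict


def split_patients(patient_ids, train_fraction=0.5, seed=0):
    """Split at the patient level so that all nuclei from one patient
    land in the same set (hypothetical helper, not from the paper)."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return set(ids[:cut]), set(ids[cut:])


def classify_patients(nucleus_predictions, threshold=0.5):
    """Label a patient malignant (1) when the fraction of nuclei predicted
    malignant exceeds `threshold` (an assumed rule, not the authors')."""
    counts = defaultdict(lambda: [0, 0])  # patient -> [malignant nuclei, all nuclei]
    for patient_id, label in nucleus_predictions:
        counts[patient_id][0] += label
        counts[patient_id][1] += 1
    return {pid: int(m / n > threshold) for pid, (m, n) in counts.items()}
```

Splitting by patient rather than by nucleus keeps measurements from the same smear out of both sets, which would otherwise inflate the test accuracy.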

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 3044-3044
Author(s):  
David Haan ◽  
Anna Bergamaschi ◽  
Yuhong Ning ◽  
William Gibb ◽  
Michael Kesling ◽  
...  

3044 Background: Epigenomics assays have recently become popular tools for identification of molecular biomarkers, both in tissue and in plasma. In particular, 5-hydroxymethyl-cytosine (5hmC) has been shown to enable the epigenomic regulation of gene expression and subsequent gene activity, with different patterns across several tumor and normal tissue types. In this study we show that 5hmC profiles enable discrete classification of tumor and normal tissue for breast, colorectal, lung, ovary and pancreas. Such classification was also recapitulated in cfDNA from patients with breast, colorectal, lung, ovarian and pancreatic cancers. Methods: DNA was isolated from 176 fresh frozen tissues from breast, colorectal, lung, ovary and pancreas (44 per tumor per tissue type and up to 11 tumor tissues for each stage (I-IV)) and up to 10 normal tissues per tissue type. cfDNA was isolated from plasma from 783 non-cancer individuals and 569 cancer patients. Plasma-isolated cfDNA and tumor genomic DNA were enriched for the 5hmC fraction using chemical labelling, sequenced, and aligned to a reference genome to construct feature sets of 5hmC patterns. Results: 5hmC multinomial logistic regression analysis was employed across tumor and normal tissues and identified a set of specific and discrete tumor and normal tissue gene-based features. This indicates that we can classify samples regardless of source, with a high degree of accuracy, based on tissue of origin and also distinguish between normal and tumor status. Next, we applied a stacked ensemble machine learning algorithm, combining multiple logistic regression models across diverse feature sets, to the cfDNA dataset composed of 783 non-cancers and 569 cancers comprising 67 breast, 118 colorectal, 210 lung, 71 ovarian and 100 pancreatic cancers. We identified a genomic signature that enables the classification of non-cancer versus cancer with an outer-fold cross-validation sensitivity of 49% (CI 45%-53%) at 99% specificity. Further, individual cancer outer-fold cross-validation sensitivity at 99% specificity was measured as follows: breast 30% (CI 19%-42%); colorectal 41% (CI 32%-50%); lung 49% (CI 42%-56%); ovarian 72% (CI 60%-82%); pancreatic 56% (CI 46%-66%). Conclusions: This study demonstrates that 5hmC profiles can distinguish cancer and normal tissues based on their origin. Further, 5hmC changes in cfDNA enable detection of several cancer types: breast, colorectal, lung, ovarian and pancreatic cancers. Our technology provides a non-invasive tool for cancer detection with low-risk sample collection, enabling better compliance than current screening methods. Among other utilities, we believe our technology could be applied to asymptomatic high-risk individuals, thus enabling enrichment for those subjects that most need a diagnostic imaging follow-up.
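Reporting sensitivity at a fixed 99% specificity means choosing the score cutoff from the non-cancer samples first and only then counting how many cancers exceed it. The sketch below shows one way to do that with plain lists of classifier scores. The function name and the tie-handling rule are assumptions, and it does not reproduce the nested cross-validation or the confidence intervals in the abstract.

```python
import math


def sensitivity_at_specificity(neg_scores, pos_scores, specificity=0.99):
    """Pick the lowest cutoff that keeps at least `specificity` of the
    negatives at or below it, then return (cutoff, sensitivity).
    A sample is called positive only if its score is strictly above the cutoff."""
    neg = sorted(neg_scores)
    # Number of negatives that must be called negative; the small epsilon
    # guards against floating-point noise in the product.
    k = max(1, math.ceil(specificity * len(neg) - 1e-9))
    cutoff = neg[k - 1]
    sensitivity = sum(s > cutoff for s in pos_scores) / len(pos_scores)
    return cutoff, sensitivity
```

Because the cutoff depends only on the non-cancer scores, the same cutoff can be reused to read off a per-cancer-type sensitivity by passing each cancer type's scores separately.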


1998 ◽  
Vol 18 (2) ◽  
pp. 229-235 ◽  
Author(s):  
Jack V. Tu ◽  
Milton C. Weinstein ◽  
Barbara J. McNeil ◽  
C. David Naylor

Objective. To compare the abilities of artificial neural network and logistic regression models to predict the risk of in-hospital mortality after coronary artery bypass graft (CABG) surgery. Methods. Neural network and logistic regression models were developed using a training set of 4,782 patients undergoing CABG surgery in Ontario, Canada, in 1991, and they were validated in two test sets of 5,309 and 5,517 patients having CABG surgery in 1992 and 1993, respectively. Results. The probabilities predicted from a fully trained neural network were similar to those of a “saturated” regression model, with both models detecting all possible interactions in the training set and validating poorly in the two test sets. A second neural network was developed by cross-validating a network against a new set of data and terminating network training early to create a more generalizable model. A simple “main effects” regression model without any interaction terms was also developed. Both of these models validated well, with areas under the receiver operating characteristic curves of 0.78 and 0.77 (p > 0.10) in the 1993 test set. The predictions from the two models were very highly correlated (r = 0.95). Conclusions. Artificial neural networks and logistic regression models learn similar relationships between patient characteristics and mortality after CABG surgery.
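The areas under the receiver operating characteristic curve quoted above can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise definition follows. The function name is illustrative, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    fraction of (positive, negative) pairs ranked correctly, ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Applying the same function to the predictions of two models on the same test set gives directly comparable values, as in the 0.78 versus 0.77 comparison reported for the 1993 test set.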


1997 ◽  
Vol 87 (2) ◽  
pp. 203-211 ◽  
Author(s):  
P.J.D. Weeks ◽  
I.D. Gauld ◽  
K.J. Gaston ◽  
M.A. O'Neill

Abstract: In this paper we describe a semi-automated digital image analysis system which is capable of discriminating five closely related species of Ichneumonidae. Specimens were distinguished by differences in their wings. The system functions by (a) extracting the significant variation (principal components) among a training set of images of the same species, (b) using these principal components to efficiently represent the morphology of wings of that species, and (c) exploiting the fact that images of the same species will share characteristic principal components, while images of different species will not. Such an approach allows the construction of modular species classifiers, to which like species correlate strongly, while dissimilar species do not. A recognition accuracy of 94% was achieved when the system was tested on 175 images of wings of the five ichneumonids. The wing images were caricatured to accentuate their venation and pigmentation patterns.


2019 ◽  
Vol 8 (4) ◽  
pp. 38-54 ◽  
Author(s):  
Abraham Pouliakis ◽  
Niki Margari ◽  
Effrosyni Karakitsou ◽  
George Valasoulis ◽  
Nektarios Koufopoulos ◽  
...  

The objective of this study is to investigate the potential of an artificial intelligence (AI) technique, based on competitive learning, for the discrimination of benign from malignant endometrial nuclei and lesions. For this purpose, 416 liquid-based cytological smears with histological confirmation were collected; each smear corresponded to one patient. Nuclear morphometric features were extracted from each smear by the application of an image analysis system. Subsequently, nuclear measurements from 50% of the cases were used to train the AI system to classify each individual nucleus as benign or malignant. The remaining measurements, from the unused 50% of the cases, were used for AI system performance evaluation. Based on the results of nucleus classification, the patients were discriminated as having benign or malignant disease by a secondary subsystem specifically trained for this purpose. Based on the results, it was concluded that AI-based computerized systems have the potential for the classification of both endometrial nuclei and lesions.
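The abstract does not say which competitive learning variant was used. As an illustration only, the sketch below uses learning vector quantization (LVQ1), a supervised competitive scheme in which the prototype closest to each sample "wins" and is moved toward the sample if their classes agree and away from it otherwise. All names, the learning rate, and the epoch count are assumptions.

```python
def nearest(prototypes, x):
    """Index of the prototype with the smallest squared Euclidean distance to x."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(prototypes[i][0], x)),
    )


def lvq1_train(samples, labels, prototypes, lr=0.1, epochs=20):
    """prototypes: list of (weight_vector, class_label) pairs.
    Returns updated copies; the input prototypes are not modified."""
    protos = [(list(w), c) for w, c in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            w, c = protos[nearest(protos, x)]  # the winning prototype
            sign = 1.0 if c == y else -1.0     # attract on match, repel on mismatch
            for d in range(len(w)):
                w[d] += sign * lr * (x[d] - w[d])
    return protos


def lvq1_predict(protos, x):
    """Class label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]
```

In a nucleus-level setting, each sample would be one nucleus's morphometric feature vector and each prototype would carry a benign or malignant label.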


2012 ◽  
Vol 35 (4) ◽  
pp. 297-303
Author(s):  
Magdalena Styczeń ◽  
Joanna Szpor ◽  
Sergiusz Demczuk ◽  
Krzysztof Okoń

Background: Marginal zone lymphomas are indolent B-cell lymphomas associated with autoimmunity and chronic inflammation. The two most frequent variants are mucosa-associated lymphoid tissue marginal zone lymphomas and splenic marginal zone lymphomas. The aim of the study was to determine if it is possible to classify splenic and gastric lymphomas according to karyometric features. Methods: The material consisted of 16 splenic and 14 gastric lymphomas. The measurements were done with the AnalySIS image analysis system. In each case at least 100 nuclei were selected, and 19 different geometric parameters were measured. Results: On statistical analysis, the nuclei of splenic and gastric lymphomas showed differences in most parameters, but significant overlap of the values was present. Neural networks were trained and used for classification of the data. By this method, the nuclei were properly classified with a sensitivity of 0.75 and specificity of 0.71. In addition, in all the cases the majority of the nuclei were properly classified, thus allowing correct classification of all the cases into “splenic” or “gastric”. Conclusion: These results support the view that mucosa-associated lymphoid tissue lymphomas and splenic marginal-zone lymphomas are separate entities.


2020 ◽  
Vol 8 (5) ◽  
pp. 4685-4690

Logistic regression is one of the most popular techniques in traditional statistics. It is usually applicable when the dependent variable is binary and categorical. In statistics and machine learning, classification is the task of determining to which of a set of groups a new observation belongs, on the basis of a training set of data containing observations whose group membership is known. In this paper, we focus on the concepts of logistic regression and classification trees. A large dataset taken from the UCI Machine Learning Repository was used for this research. The aim of the study is to compare the results obtained from logistic regression and a decision tree. In the end, the decision tree gave better results than logistic regression.
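For readers unfamiliar with the binary case mentioned above, a minimal logistic regression fitted by batch gradient descent might look like the following. It is a teaching sketch under assumed hyperparameters (`lr`, `epochs`), not the procedure used in the paper, and it omits regularization and feature scaling.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by minimizing the mean log-loss
    with plain batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

A decision tree, by contrast, partitions the feature space with a sequence of threshold tests instead of a single linear boundary, which is one reason the two methods can rank differently on the same data.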


1979 ◽  
Vol 27 (1) ◽  
pp. 613-620 ◽  
Author(s):  
J H Tucker

An experimental computer/image analysis system has been used to investigate cytology automation techniques based on nuclear DNA measurement and morphological artefact rejector tests. The system automatically measures and normalizes the integrated optical density of cell nuclei in specially prepared cervical cytology specimens, and selects any objects with abnormally high values for further analysis. These are then analyzed by morphological and densitometric tests designed to eliminate false positive signals caused by non-nuclear artefacts. The coordinates of the remaining abnormal nuclei are recorded so that they can subsequently be relocated and examined by a cytotechnician. Preliminary results are given showing the measurement accuracy of the system and the performance of the artefact rejection tests.
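The abstract gives no formulas, so the following is only a sketch of the general idea under stated assumptions. Optical density per pixel is taken as the usual -log10 of transmitted over background intensity, integrated optical density (IOD) is its sum over the nucleus, and an object is flagged when its IOD exceeds a reference value by a hypothetical `ratio_cutoff`. The function names, the cutoff, and the clamp on dark pixels are all illustrative.

```python
import math


def integrated_optical_density(pixel_intensities, background_intensity):
    """Sum of per-pixel optical densities, OD = -log10(I / I0)."""
    total = 0.0
    for intensity in pixel_intensities:
        intensity = max(intensity, 1e-6)  # avoid log of zero for fully dark pixels
        total += -math.log10(intensity / background_intensity)
    return total


def flag_abnormal(iod_by_object, reference_iod, ratio_cutoff=1.25):
    """Return the objects whose normalized IOD exceeds the (assumed) cutoff."""
    return [
        obj for obj, iod in iod_by_object.items()
        if iod / reference_iod > ratio_cutoff
    ]
```

In the workflow described above, the flagged objects would then go through the morphological and densitometric artefact rejection tests before their coordinates are recorded for review.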

