A Survey of Crowdsourcing in Medical Image Analysis

Human Computation ◽

10.15346/hc.v7i1.1 ◽

2020 ◽

Vol 7 ◽

pp. 1-26

Author(s):

Silas Nyboe Ørting ◽

Andrew Doyle ◽

Arno Van Hilten ◽

Matthias Hirth ◽

Oana Inel ◽

...

Keyword(s):

Machine Learning ◽

Image Processing ◽

Image Analysis ◽

Medical Image ◽

Large Scale ◽

Medical Image Analysis ◽

Machine Learning Algorithms ◽

Imaging Analysis ◽

High Quality ◽

Comprehensive Literature Review

Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to "big-data". However, within the realm of medical image analysis, advances have been curtailed, in part, due to the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with producing large amounts of high-quality meta-data. Recently, there has been growing interest in the application of crowdsourcing for this purpose; a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to provide guidance to researchers considering using crowdsourcing methodologies in their own medical imaging analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing guidance of utility to researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain.


Federated Learning and Differential Privacy for Medical Image Analysis

10.21203/rs.3.rs-1005694/v1 ◽

2021 ◽

Author(s):

Mohammed Adnan ◽

Shivam Kalra ◽

Jesse C. Cresswell ◽

Graham W. Taylor ◽

Hamid Tizhoosh

Keyword(s):

Machine Learning ◽

Image Analysis ◽

Medical Image ◽

Large Scale ◽

Differential Privacy ◽

Medical Image Analysis ◽

External Validation ◽

The Cancer Genome Atlas ◽

Distributed Training ◽

Histopathology Images

The artificial intelligence revolution has been spurred forward by the availability of large-scale datasets. In contrast, the paucity of large-scale medical datasets hinders the application of machine learning in healthcare. The lack of publicly available multi-centric and diverse datasets mainly stems from confidentiality and privacy concerns around sharing medical data. To demonstrate a feasible path forward in medical imaging, we conduct a case study of applying a differentially private federated learning framework for analysis of histopathology images, the largest and perhaps most complex medical images. We study the effects of IID and non-IID distributions along with the number of healthcare providers, i.e., hospitals and clinics, and the individual dataset sizes, using The Cancer Genome Atlas (TCGA) dataset, a public repository, to simulate a distributed environment. We empirically compare the performance of private, distributed training to conventional training and demonstrate that distributed training can achieve similar performance with strong privacy guarantees. We also study the effect of different source domains for histopathology images by evaluating the performance using external validation. Our work indicates that differentially private federated learning is a viable and reliable framework for the collaborative development of machine learning models in medical image analysis.
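The abstract does not spell out the aggregation mechanism, so the following is only a minimal sketch of one common pattern for differentially private federated averaging: each provider's model update is clipped to a maximum L2 norm, the clipped updates are averaged, and Gaussian noise scaled to the clipping bound is added. The function names, the `noise_multiplier` parameter, and the toy updates are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale one client's update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_federated_average(client_updates, max_norm, noise_multiplier, rng=None):
    """Average clipped client updates and add Gaussian noise to the result.

    Hypothetical helper: noise standard deviation is
    noise_multiplier * max_norm / number_of_clients, a common choice when the
    noise is calibrated to the clipping bound.
    """
    rng = rng or random.Random(0)
    clipped = [clip_update(u, max_norm) for u in client_updates]
    n = len(clipped)
    dim = len(clipped[0])
    sigma = noise_multiplier * max_norm / n
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]


# Toy example: two "hospitals", each sending a 2-parameter update.
updates = [[3.0, 4.0], [0.3, 0.4]]
noiseless = dp_federated_average(updates, max_norm=1.0, noise_multiplier=0.0)
noisy = dp_federated_average(updates, max_norm=1.0, noise_multiplier=1.0)
```

This sketch does not compute the resulting privacy budget; turning a noise level into a formal guarantee requires a separate privacy accounting step.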


57 Precision neoantigen discovery using novel algorithms and expanded HLA-ligandome datasets

Journal for ImmunoTherapy of Cancer ◽

10.1136/jitc-2020-sitc2020.0057 ◽

2020 ◽

Vol 8 (Suppl 3) ◽

pp. A62-A62

Author(s):

Dattatreya Mellacheruvu ◽

Rachel Pyke ◽

Charles Abbott ◽

Nick Phillips ◽

Sejal Desai ◽

...

Keyword(s):

Machine Learning ◽

Cell Lines ◽

Antigen Processing ◽

Large Scale ◽

Prediction Models ◽

K562 Cells ◽

Machine Learning Algorithms ◽

Training Data ◽

High Quality ◽

Tissue Samples

Background: Accurately identified neoantigens can be effective therapeutic agents in both adjuvant and neoadjuvant settings. A key challenge for neoantigen discovery has been the availability of accurate prediction models for MHC peptide presentation. We have shown previously that our proprietary model based on (i) large-scale, in-house mono-allelic data, (ii) custom features that model antigen processing, and (iii) advanced machine learning algorithms has strong performance. We have extended our work by systematically integrating large quantities of high-quality, publicly available data, implementing new modelling algorithms, and rigorously testing our models. These extensions lead to substantial improvements in performance and generalizability. Our algorithm, named Systematic HLA Epitope Ranking Pan Algorithm (SHERPA™), is integrated into the ImmunoID NeXT Platform®, our immuno-genomics and transcriptomics platform specifically designed to enable the development of immunotherapies. Methods: In-house immunopeptidomic data was generated using stably transfected HLA-null K562 cell lines that express a single HLA allele of interest, followed by immunoprecipitation using W6/32 antibody and LC-MS/MS. Public immunopeptidomics data was downloaded from repositories such as MassIVE and processed uniformly using in-house pipelines to generate peptide lists filtered at 1% false discovery rate. Other metrics (features) were either extracted from source data or generated internally by re-processing samples utilizing the ImmunoID NeXT Platform. Results: We have generated large-scale and high-quality immunopeptidomics data by using approximately 60 mono-allelic cell lines that unambiguously assign peptides to their presenting alleles to create our primary models. Briefly, our primary ‘binding’ algorithm models MHC-peptide binding using peptide and binding pockets while our primary ‘presentation’ model uses additional features to model antigen processing and presentation. Both primary models have significantly higher precision across all recall values in multiple test data sets, including mono-allelic cell lines and multi-allelic tissue samples. To further improve the performance of our model, we expanded the diversity of our training set using high-quality, publicly available mono-allelic immunopeptidomics data. Furthermore, multi-allelic data was integrated by resolving peptide-to-allele mappings using our primary models. We then trained a new model using the expanded training data and a new composite machine learning architecture. The resulting secondary model further improves performance and generalizability across several tissue samples. Conclusions: Improving technologies for neoantigen discovery is critical for many therapeutic applications, including personalized neoantigen vaccines, and neoantigen-based biomarkers for immunotherapies. Our new and improved algorithm (SHERPA) has significantly higher performance compared to a state-of-the-art public algorithm and furthers this objective.


TransMed: Transformers Advance Multi-Modal Medical Image Classification

Diagnostics ◽

10.3390/diagnostics11081384 ◽

2021 ◽

Vol 11 (8) ◽

pp. 1384

Author(s):

Yin Dai ◽

Yifan Gao ◽

Fayu Liu

Keyword(s):

Image Analysis ◽

Image Classification ◽

Long Range ◽

Medical Image ◽

Large Scale ◽

Medical Images ◽

Medical Image Analysis ◽

Lesion Detection ◽

Tumor Segmentation ◽

Medical Image Classification

Over the past decade, convolutional neural networks (CNNs) have shown very competitive performance in medical image analysis tasks, such as disease classification, tumor segmentation, and lesion detection. CNNs have great advantages in extracting local features of images. However, due to the locality of the convolution operation, they cannot deal with long-range relationships well. Recently, transformers have been applied to computer vision and achieved remarkable success on large-scale datasets. Compared with natural images, multi-modal medical images have explicit and important long-range dependencies, and effective multi-modal fusion strategies can greatly improve the performance of deep models. This prompts us to study transformer-based structures and apply them to multi-modal medical images. Existing transformer-based network architectures require large-scale datasets to achieve better performance. However, medical imaging datasets are relatively small, which makes it difficult to apply pure transformers to medical image analysis. Therefore, we propose TransMed for multi-modal medical image classification. TransMed combines the advantages of CNNs and transformers to efficiently extract low-level features of images and establish long-range dependencies between modalities. We evaluated our model on two datasets, parotid gland tumor classification and knee injury classification. Combining our contributions, we achieve an improvement of 10.1% and 1.9% in average accuracy, respectively, outperforming other state-of-the-art CNN-based models. The results of the proposed method are promising and have tremendous potential to be applied to a large number of medical image analysis tasks. To the best of our knowledge, this is the first work to apply transformers to multi-modal medical image classification.


Review on Machine Learning Models Used in Medical Image Analysis

SSRN Electronic Journal ◽

10.2139/ssrn.3793999 ◽

2020 ◽

Author(s):

Jubin Dipakkumar Kothari

Keyword(s):

Machine Learning ◽

Image Analysis ◽

Medical Image ◽

Medical Image Analysis ◽

Learning Models ◽

Machine Learning Models


An Unsupervised Machine Learning Approach for Medical Image Analysis

Advances in Intelligent Systems and Computing - Advances in Information and Communication ◽

10.1007/978-3-030-73103-8_58 ◽

2021 ◽

pp. 813-830

Author(s):

Mauro Mazzei

Keyword(s):

Machine Learning ◽

Image Analysis ◽

Medical Image ◽

Medical Image Analysis ◽

Learning Approach ◽

Unsupervised Machine Learning ◽

Machine Learning Approach


Resolving Clinicians’ Queries Across a Grid’s Infrastructure

Methods of Information in Medicine ◽

10.1055/s-0038-1633936 ◽

2005 ◽

Vol 44 (02) ◽

pp. 149-153 ◽

Cited By ~ 2

Author(s):

F. Estrella ◽

C. del Frate ◽

T. Hauer ◽

M. Odeh ◽

D. Rogulin ◽

...

Keyword(s):

Image Analysis ◽

Data Storage ◽

Medical Image ◽

Large Scale ◽

Medical Image Analysis ◽

Large Data ◽

Computing Power ◽

European Database ◽

Order Of Magnitude ◽

Network Speed

Objectives: The past decade has witnessed order-of-magnitude increases in computing power, data storage capacity and network speed, giving birth to applications which may handle large data volumes of increased complexity, distributed over the internet. Methods: Medical image analysis is one of the areas for which this unique opportunity likely brings revolutionary advances both for the scientist’s research study and the clinician’s everyday work. Grid computing [1] promises to resolve many of the difficulties in facilitating medical image analysis to allow radiologists to collaborate without having to co-locate. Results: The EU-funded MammoGrid project [2] aims to investigate the feasibility of developing a Grid-enabled European database of mammograms and provide an information infrastructure which federates multiple mammogram databases. This will enable clinicians to develop new common, collaborative and co-operative approaches to the analysis of mammographic data. Conclusion: This paper focuses on one of the key requirements for large-scale distributed mammogram analysis: resolving queries across a grid-connected federation of images.


Survey of Image Processing Techniques in Medical Image Analysis: Challenges and Methodologies

Advances in Intelligent Systems and Computing - Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2016) ◽

10.1007/978-3-319-60618-7_45 ◽

2017 ◽

pp. 460-471 ◽

Cited By ~ 1

Author(s):

P. Chinmayi ◽

L. Agilandeeswari ◽

M. Prabukumar

Keyword(s):

Image Processing ◽

Image Analysis ◽

Medical Image ◽

Medical Image Analysis ◽

Image Processing Techniques ◽

Processing Techniques


Erratum to: Combining semi-automated image analysis techniques with machine learning algorithms to accelerate large-scale genetic studies

GigaScience ◽

10.1093/gigascience/giy043 ◽

2018 ◽

Vol 7 (7) ◽

Author(s):

Jonathan A Atkinson ◽

Guillaume Lobet ◽

Manuel Noll ◽

Patrick E Meyer ◽

Marcus Griffiths ◽

...

Keyword(s):

Machine Learning ◽

Image Analysis ◽

Large Scale ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Automated Image Analysis ◽

Genetic Studies ◽

Analysis Techniques ◽

Image Analysis Techniques


Efficient CNN for Lung Cancer Detection

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b2921.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 3499-3505

Keyword(s):

Machine Learning ◽

Lung Cancer ◽

Image Analysis ◽

Cancer Detection ◽

Medical Image ◽

Medical Image Analysis ◽

Filter Width ◽

Early Disease Detection ◽

Fully Connected ◽

Lung Cancer Detection

Machine learning based solutions for medical image analysis are successful in detecting a wide variety of anomalies in imaging procedures. The aim of medical image analysis systems based on machine learning methods is to improve accuracy and minimize detection time. This in turn contributes to early disease detection and to extending patients' lives. This paper presents an efficient CNN (EFFI-CNN) for lung cancer detection. EFFI-CNN consists of seven CNN layers (i.e. Convolution layer, Max-Pool layer, Convolution layer, Max-Pool layer, fully connected layer, fully connected layer and Soft-Max layer). EFFI-CNN uses lung CT scan images from the LIDC-IDRI and Mendeley data sets. EFFI-CNN has a unique combination of CNN layers with parameters (depth, height, width, filter height and filter width).
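To make the role of the listed parameters (depth, height, width, filter height, filter width) concrete, the sketch below traces how a feature map's shape changes through a Convolution, Max-Pool, Convolution, Max-Pool stack before it reaches the fully connected layers. The abstract gives no layer sizes, so the input size, filter counts, and filter dimensions here are hypothetical; unpadded, stride-1 convolutions and non-overlapping pooling are also assumptions.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window):
    """Spatial output size of non-overlapping max pooling along one dimension."""
    return (size - window) // window + 1


def trace_shapes(depth, height, width, layers):
    """Return the (depth, height, width) shape after each layer.

    Each layer is ("conv", out_depth, filter_height, filter_width)
    or ("pool", window).
    """
    shapes = []
    for layer in layers:
        if layer[0] == "conv":
            _, out_depth, fh, fw = layer
            depth = out_depth
            height = conv_output_size(height, fh)
            width = conv_output_size(width, fw)
        elif layer[0] == "pool":
            _, window = layer
            height = pool_output_size(height, window)
            width = pool_output_size(width, window)
        else:
            raise ValueError(f"unknown layer type: {layer[0]}")
        shapes.append((depth, height, width))
    return shapes


# Hypothetical configuration: a 1 x 64 x 64 CT slice and made-up filter sizes.
layers = [("conv", 16, 5, 5), ("pool", 2), ("conv", 32, 3, 3), ("pool", 2)]
shapes = trace_shapes(1, 64, 64, layers)
final_depth, final_height, final_width = shapes[-1]
# Number of inputs to the first fully connected layer after flattening.
flattened = final_depth * final_height * final_width
```

With these assumed sizes, the flattened length is what the first fully connected layer would take as input; changing any filter height or width changes that length accordingly.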
