Training a deep learning model for single-cell segmentation without manual annotation

Advances in artificial neural networks have made machine learning techniques increasingly important in image analysis tasks. Recently, convolutional neural networks (CNNs) have been applied to the problem of cell segmentation from microscopy images. However, previous methods used a supervised training paradigm to create an accurate segmentation model. This strategy requires a large amount of manually labeled cellular images, in which accurate pixel-level segmentations were produced by human operators. Generating training data is expensive and a major hindrance to the wider adoption of machine-learning-based methods for cell segmentation. Here we present an alternative strategy that trains CNNs without any human-labeled data. We show that our method produces accurate segmentation models, is applicable to both fluorescence and bright-field images, and requires little to no prior knowledge of the signal characteristics.

Unsupervised deep learning method for cell segmentation

10.1101/2021.05.17.444529 ◽

2021 ◽

Author(s):

Nizam Ud Din ◽

Ji Yu

Keyword(s):

Machine Learning ◽

Training Data ◽

Machine Learning Techniques ◽

Cell Segmentation ◽

Learning Techniques ◽

Signal Characteristics ◽

Unsupervised Deep Learning ◽

Human Operators ◽

Microscopy Images ◽

Segmentation Models

Advances in artificial neural networks have made machine learning techniques increasingly important in image analysis tasks. More recently, convolutional neural networks (CNNs) have been applied to the problem of cell segmentation from microscopy images. However, previous methods used a supervised training paradigm to create an accurate segmentation model. This strategy requires a large amount of manually labeled cellular images, in which accurate pixel-level segmentations were produced by human operators. Generating training data is expensive and a major hindrance to the wider adoption of machine-learning-based methods for cell segmentation. Here we present an alternative strategy that uses unsupervised learning to train CNNs without any human-labeled data. We show that our method produces accurate segmentation models. More importantly, the algorithm is applicable to both fluorescence and bright-field images, requires no prior knowledge of signal characteristics, and requires no tuning of parameters.

Comparative Analysis of Machine Learning Techniques Using Predictive Modeling

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200904164539 ◽

2020 ◽

Vol 13 ◽

Author(s):

Ritu Khandelwal ◽

Hemlata Goyal ◽

Rajveer Singh Shekhawat

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Data Science ◽

Training Data ◽

Machine Learning Techniques ◽

Future Trends ◽

Data Set ◽

Learning Stage ◽

Learning Techniques ◽

Different Types

Introduction: Machine learning is an intelligent technology that works as a bridge between businesses and data science. With the involvement of data science, the business goal focuses on extracting valuable insights from available data. A large part of Indian cinema is Bollywood, a multi-million dollar industry. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average or Flop. For this, machine learning techniques (classification and prediction) will be applied. The first step in building a classifier or prediction model is the learning stage, in which a training data set is used to train the model with some technique or algorithm; rules are then generated that help to build a model and predict future trends in different types of organizations. Methods: Classification and prediction techniques such as Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN will be applied in search of efficient and effective results. All these functionalities can be applied with GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate. Result: The first step in building a classifier or prediction model is the learning stage, in which a training data set is used to train the model with some technique or algorithm; rules are then generated that help to build a model and predict future trends in different types of organizations. Conclusion: This paper focuses on a comparative analysis based on parameters such as accuracy and the confusion matrix, to identify the best possible model for predicting movie success. Using advertisement propaganda, production houses can plan the best time to release a movie according to the predicted success rate to gain higher benefits.
Discussion: Data mining is the process of discovering patterns in large data sets; the relationships discovered along the way help to solve business problems and to predict forthcoming trends. This prediction can help production houses with advertisement propaganda and cost planning, and by securing these factors they can make the movie more profitable.
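
The comparison criteria named in this abstract (accuracy and the confusion matrix) can be illustrated with a minimal sketch. The five class names come from the abstract; the `actual` and `predicted` lists are made-up toy values, not results from the paper, and the helper names are hypothetical.

```python
from collections import Counter

# Class labels taken from the abstract.
CLASSES = ["Blockbuster", "Superhit", "Hit", "Average", "Flop"]

def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Toy labels for illustration only (assumed, not from the paper).
actual = ["Hit", "Flop", "Flop", "Average", "Blockbuster", "Hit"]
predicted = ["Hit", "Flop", "Average", "Average", "Superhit", "Hit"]

matrix = confusion_matrix(actual, predicted)
print(accuracy(matrix))
```

Computing one such matrix per classifier on the same held-out movies would give a like-for-like basis for the comparison the abstract describes.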

A sentiment analysis system for social media using machine learning techniques: Social enablement

Digital Scholarship in the Humanities ◽

10.1093/llc/fqy037 ◽

2018 ◽

Vol 34 (3) ◽

pp. 569-581 ◽

Cited By ~ 1

Author(s):

Sujata Rani ◽

Parteek Kumar

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Media Analysis ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Tool ◽

Data Set ◽

Learning Techniques

In this article, an innovative approach to sentiment analysis (SA) is presented. The proposed system handles Romanized or abbreviated text and spelling variations when performing the sentiment analysis. The training data set of 3,000 movie reviews and tweets was manually labeled by native speakers of Hindi in three classes: positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques: Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system was tested on 100 movie reviews and tweets. SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural/social benefits like predicting/fighting human riots.

Artificially Generated Training Data-sets for Supervised Machine Learning Techniques in Magnetic Resonance Imaging: An Example in Myocardial Segmentation

2019 Computing in Cardiology Conference (CinC) ◽

10.22489/cinc.2019.220 ◽

2019 ◽

Author(s):

Christos Xanthis ◽

Kostas Haris ◽

Dimitrios Filos ◽

Anthony Aletras

Keyword(s):

Magnetic Resonance Imaging ◽

Machine Learning ◽

Magnetic Resonance ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Data Sets ◽

Resonance Imaging ◽

Learning Techniques ◽

Myocardial Segmentation

Application of multi-sensor unmanned aerial system for identification of hydrothermal alteration zones

10.5194/egusphere-egu2020-12546 ◽

2020 ◽

Author(s):

Yosoon Choi ◽

Jieun Baek ◽

Jangwon Suh ◽

Sung-Min Kim

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Training Data ◽

Sensor Data ◽

Machine Learning Techniques ◽

Integrated Analysis ◽

Unmanned Aerial System ◽

Data Sets ◽

Learning Techniques ◽

Hydrothermal Alteration Zones

In this study, we proposed a method to utilize a multi-sensor Unmanned Aerial System (UAS) for exploration of hydrothermal alteration zones. This study selected an area (10 m × 20 m) composed mainly of andesite and located on the coast, with wide outcrops and well-developed structural and mineralization elements. Multi-sensor (visible, multispectral, thermal, magnetic) data were acquired in the study area using the UAS and were studied using machine learning techniques. To apply the machine learning techniques, we used stratified random sampling to select 1000 training samples in the hydrothermal zone and 1000 training samples in the non-hydrothermal zone identified through the field survey. The 2000 samples created for supervised learning were first split into 1500 for training and 500 for testing. The 1500 training samples were then split into 1200 for training and 300 for validation. The training and validation data were generated in five sets to enable cross-validation. Five types of machine learning techniques were applied to the training data sets: k-Nearest Neighbors (k-NN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). In the integrated analysis of multi-sensor data using these five techniques, RF and SVM showed high classification accuracy of about 90%. Moreover, integrated analysis using multi-sensor data showed higher classification accuracy in all five machine learning techniques than analyzing magnetic sensing data or single optical sensing data alone.
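
The split arithmetic in this abstract (2000 samples into 1500/500, then 1500 into 1200/300, in five sets) can be reproduced with a small sketch. It assumes that the 500-sample test split is stratified by class and that the "five sets" are the folds of a standard 5-fold cross-validation; the abstract states neither, and the function names are hypothetical.

```python
import random

def stratified_split(labels, n_test_per_class, seed=0):
    """Hold out n_test_per_class samples of each class; return (train_idx, test_idx)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        test_idx.extend(members[:n_test_per_class])
        train_idx.extend(members[n_test_per_class:])
    return train_idx, test_idx

def k_fold_sets(indices, k=5, seed=0):
    """Return k (train, validation) pairs; each validation fold is 1/k of the data."""
    rng = random.Random(seed)
    shuffled = indices[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    sets = []
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        sets.append((train, folds[i]))
    return sets

# 1 = hydrothermal zone, 0 = non-hydrothermal zone (1000 samples each, as in the abstract).
labels = [1] * 1000 + [0] * 1000
train_idx, test_idx = stratified_split(labels, n_test_per_class=250)
cv_sets = k_fold_sets(train_idx, k=5)

print(len(train_idx), len(test_idx))            # sizes of the first split
print([(len(t), len(v)) for t, v in cv_sets])   # sizes of the five sets
```

Each of the five sets then trains on 1200 samples and validates on 300, while the 500 test samples stay untouched until the final evaluation.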

Arabic tweets sentiment analysis – a hybrid scheme

Journal of Information Science ◽

10.1177/0165551515610513 ◽

2016 ◽

Vol 42 (6) ◽

pp. 782-797 ◽

Cited By ~ 42

Author(s):

Haifa K. Aldayel ◽

Aqil M. Azmi

Keyword(s):

Machine Learning ◽

Saudi Arabia ◽

Hybrid Approach ◽

Training Data ◽

Machine Learning Techniques ◽

Good Source ◽

Learning Classifier ◽

Learning Techniques ◽

Semantic Orientation ◽

F Measure

The fact that people freely express their opinions and ideas in no more than 140 characters makes Twitter one of the most prevalent social networking websites in the world. Since Twitter is popular in Saudi Arabia, we believe that tweets are a good source to capture the public’s sentiment, especially since the country is in a fractious region. After going over the challenges and difficulties that Arabic tweets present, using Saudi Arabia as a basis, we propose our solution. A typical problem is the practice of tweeting in dialectical Arabic. Based on our observation we recommend a hybrid approach that combines semantic orientation and machine learning techniques. In this approach, the lexical-based classifier labels the training data, a time-consuming task that is often done manually. The output of the lexical classifier is then used as training data for the SVM machine learning classifier. The experiments show that our hybrid approach improved the F-measure of the lexical classifier by 5.76% while the accuracy jumped by 16.41%, achieving an overall F-measure and accuracy of 84 and 84.01% respectively.
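
The hybrid scheme described in this abstract (a lexicon-based classifier produces the labels, and a machine learning classifier is trained on them) can be sketched as follows. Everything concrete here is an assumption for illustration: the English toy lexicon and tweets stand in for the Arabic data, and a small hinge-loss linear classifier trained by stochastic subgradient descent stands in for the SVM used in the paper.

```python
# Toy polarity lexicon (hypothetical; the paper works on Arabic tweets).
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}

def lexical_label(tokens):
    """Return +1, -1, or 0 (no decision) from summed lexicon polarities."""
    score = sum(LEXICON.get(t, 0) for t in tokens)
    return (score > 0) - (score < 0)

def train_linear_svm(samples, epochs=20, lr=0.1, reg=0.01):
    """Hinge-loss linear classifier over bag-of-words features, no bias term."""
    weights = {}
    for _ in range(epochs):
        for tokens, y in samples:
            margin = y * sum(weights.get(t, 0.0) for t in tokens)
            for t in weights:              # L2 regularisation (weight shrinkage)
                weights[t] *= 1.0 - lr * reg
            if margin < 1.0:               # hinge loss is active: move toward y
                for t in tokens:
                    weights[t] = weights.get(t, 0.0) + lr * y
    return weights

def predict(weights, tokens):
    return 1 if sum(weights.get(t, 0.0) for t in tokens) > 0 else -1

unlabeled = [
    "good food friendly staff",
    "great friendly service",
    "bad food slow service",
    "awful slow staff",
]
tokenized = [text.split() for text in unlabeled]

# Step 1: the lexical classifier labels the training data automatically.
auto_labeled = [(tok, lexical_label(tok)) for tok in tokenized if lexical_label(tok) != 0]

# Step 2: the machine learning classifier is trained on those labels.
weights = train_linear_svm(auto_labeled)

# The trained model can score text that contains no lexicon word at all.
print(lexical_label("friendly people".split()), predict(weights, "friendly people".split()))
```

In this sketch the learned model picks up words such as "friendly" and "slow" that co-occur with lexicon entries, which is one way the second stage can extend coverage beyond the lexicon itself.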

Development and Optimization of VGF-GaAs Crystal Growth Process Using Data Mining and Machine Learning Techniques

Crystals ◽

10.3390/cryst11101218 ◽

2021 ◽

Vol 11 (10) ◽

pp. 1218

Author(s):

Natasha Dropka ◽

Klaus Böttcher ◽

Martin Holena

Keyword(s):

Machine Learning ◽

Data Mining ◽

Crystal Growth ◽

Decision Trees ◽

Growth Process ◽

Training Data ◽

Machine Learning Techniques ◽

Interface Position ◽

Crystal Growth Process ◽

Learning Techniques

The aim of this study was to assess the ability of the various data mining and supervised machine learning techniques: correlation analysis, k-means clustering, principal component analysis and decision trees (regression and classification), to derive, optimize and understand the factors influencing VGF-GaAs growth. Training data were generated by Computational Fluid Dynamics (CFD) simulations and consisted of 130 datasets with 6 inputs (growth rate and power of 5 heaters) and 5 outputs (interface position and deflection, and temperatures at various positions in GaAs). Data mining results confirmed a good dispersion of the training data without the feasibility of a dimensionality reduction. Data clustering was observed in relation to the position of the crystallization front relative to the side heaters. Based on the statistical performance criteria and training results, decision trees identified the most decisive inputs and their ranges for a favorable interface shape and to keep GaAs temperature beyond limits for heavy arsenic evaporation. Decision trees are a recommendable machine learning technique with short training times and acceptable predictive accuracy based on small volume of CFD training data, capable of providing guidelines for understanding the crystal growth process, which is a prerequisite for the growth of low-cost, high-quality bulk crystals.

A Survey on Diagnosis and Analysis of Diabetic Retinopathy using Feature Selection

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207132 ◽

2020 ◽

pp. 170-176

Author(s):

Amalu Michael ◽

Deepa S S

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Diabetic Retinopathy ◽

High Ratio ◽

Training Data ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Diabetic Eye Disease ◽

Selection Mechanisms ◽

Feature Selection Techniques

Diabetic retinopathy (DR) is one of the common forms of diabetic eye disease. DR occurs due to a high ratio of glucose in the blood, which causes alterations in the retinal vessels. Machine learning (ML) is a broad multidisciplinary field that has its roots in statistics, algebra, data processing, and information analytics. ML is used to discover patterns in medical data and provides an efficient way to predict diseases. As an application of artificial intelligence, it learns from training data. Several machine learning techniques are used for the diagnosis of diabetic retinopathy. This paper surveys such techniques along with various feature selection mechanisms. It provides a basic categorization of feature selection techniques and discusses their use.

Rotor Unbalance Kind and Severity Identification by Current Signature Analysis with Adaptative Update to Multiclass Machine Learning Algorithms

Studies in Engineering and Technology ◽

10.11114/set.v8i1.5213 ◽

2021 ◽

Vol 8 (1) ◽

pp. 28

Author(s):

S. L. Ávila ◽

H. M. Schaberle ◽

S. Youssef ◽

F. S. Pacheco ◽

C. A. Penz

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Signature Analysis ◽

Data Set ◽

Learning Techniques ◽

Environmental Variations ◽

Current Signature

The health of a rotating electric machine can be evaluated by monitoring electrical and mechanical parameters. As more information becomes available, diagnosing the machine's operational condition becomes easier. We built a laboratory test bench to study rotor unbalance issues according to ISO standards. Using harmonic analysis of the electric stator current, this paper presents a comparison study among Support-Vector Machines, Decision Tree classifiers, and the One-vs-One strategy to identify the kind and severity of rotor unbalance, a nonlinear multiclass task. Moreover, we propose a methodology to update the classifier so that it deals better with changes produced by environmental variations and natural machinery usage. The adaptative update means updating the training data set with an amount of recent data while keeping the entire original historical data. It is relevant for engineering maintenance. Our results show that current signature analysis is appropriate for identifying the type and severity of the rotor unbalance problem. Moreover, we show that machine learning techniques can be effective for an industrial application.
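
The adaptative update described in this abstract (add recent data to the training set while keeping all historical data) might look like the sketch below. The bounded window for recent data, the single scalar feature, and the nearest-centroid classifier are all assumptions made to keep the example short; the paper itself compares Support-Vector Machines and Decision Trees.

```python
from collections import deque

class AdaptiveTrainingSet:
    """Historical data is kept in full; recent data lives in a bounded window."""

    def __init__(self, historical, recent_capacity):
        self.historical = list(historical)           # never discarded
        self.recent = deque(maxlen=recent_capacity)  # oldest recent sample drops out

    def add_recent(self, sample):
        self.recent.append(sample)

    def training_data(self):
        return self.historical + list(self.recent)

def fit_centroids(data):
    """Stand-in classifier: mean feature value per class."""
    sums, counts = {}, {}
    for value, label in data:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical harmonic amplitudes (arbitrary units) with their condition labels.
historical = [(0.10, "healthy"), (0.12, "healthy"), (0.50, "unbalanced"), (0.52, "unbalanced")]
dataset = AdaptiveTrainingSet(historical, recent_capacity=2)

before = classify(fit_centroids(dataset.training_data()), 0.33)

# Environmental drift raises the healthy baseline; recent labeled samples capture it.
dataset.add_recent((0.30, "healthy"))
dataset.add_recent((0.32, "healthy"))
after = classify(fit_centroids(dataset.training_data()), 0.33)

print(before, after)
```

Retraining on `training_data()` after each batch of recent samples lets the decision boundary follow slow drift without losing the original laboratory measurements.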

Classification and Generation of Microscopy Images With Plasmodium Falciparum Via Artificial Neural Networks

10.21203/rs.3.rs-46179/v1 ◽

2020 ◽

Author(s):

Rija Tonny Christian Ramarolahy ◽

Esther Opoku Gyasi ◽

Alessandro Crimi

Keyword(s):

Machine Learning ◽

Blood Smear ◽

Data Augmentation ◽

Small Sample ◽

Automated Detection ◽

Machine Learning Techniques ◽

Malaria Parasites ◽

Learning Techniques ◽

Synthetic Images ◽

Microscopy Images

Background: Recent studies use machine learning techniques to detect parasites in microscopy images automatically. However, these tools are trained and tested on specific datasets. Even if over-fitting is avoided during the development of computer vision applications, large differences between datasets are expected. These differences might be related to camera settings (exposure, white balance, etc.) and to different preparation of blood film slides. Moreover, generative adversarial networks offer new opportunities in microscopy: data homogenization, and increasing the number of images in the case of imbalanced or small samples. Methods: Taking all those aspects into consideration, in this paper we describe a more complete view including both detection and generation of synthetic images: i) automated detection of malaria parasites on stained blood smear images using machine learning techniques, tested on several datasets; ii) an investigation of transfer learning and further testing on different unseen datasets with different staining, microscope, resolution, etc.; iii) a generative approach to create synthetic images that can deceive experts. Results: The tested architecture achieved 0.98 and 0.95 area under the ROC curve in classifying images with thin and thick smears, respectively. Moreover, the generated images proved to be very similar to the originals and difficult for an expert microscopist to distinguish: the expert correctly identified the real data for one dataset but had 50% misclassification for another dataset of images. Conclusion: The proposed deep learning architecture performed well on the classification of malaria parasites. Automated detection of malaria can reduce technicians' workload and does not require the presence of experts. Moreover, generative networks can also be applied to blood smear images to generate useful images for microscopists, opening new ways to data augmentation, translation and homogenization.