scholarly journals Sickle cell segmentation and classification for thalassemia aid diagnosis

F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1185
Author(s):  
Yen-Siang Leow ◽  
Kok-Why Ng ◽  
Yih-Jian Yoong ◽  
Seng-Beng Ng

Background: Thalassemia is a hereditary blood disease in which abnormal red blood cells (RBCs) carry insufficient oxygen throughout the body. Conventional methods of thalassemia detection through a complete blood count (CBC) test and peripheral blood smear image still possess a lot of weaknesses. Methods: This paper proposes a hybrid segmentation method to segment the RBCs. It incorporates adaptive thresholding and canny edge method to segment the RBCs. Morphological operations are performed to clean the leftovers. Shape and texture features are extracted using the segmented masks and the gray level co-occurrence matrix. Data imbalance treatment is used for solving the imbalance cell type class in distribution. In the data resampling layer, the synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling (ADASYN), and random over sampling (ROS) are performed and evaluated using the decision tree and logistic regression. In the classification layer, the decision tree, random forest classifier and support vector machine (SVM) are assessed and compared for the best performance in classification. Results:The proposed method outperforms the other methods in the image segmentation layer with the structural similarity index measure (SSIM) of 89.88%. In the data resampling layer, ADASYN is employed as it is more accurate than the SMOTE and ROS. The random forest classifier is chosen at the classification layer as it is more accurate than the decision tree and support vector machine (SVM). Conclusions:The proposed method is tested on the latest dataset of erythrocyteIDB3 and it solves the issues of imbalanced data due to the insufficient cell classes.

2021 ◽  
Vol 23 (08) ◽  
pp. 532-537
Author(s):  
Cherlakola Abhinav Reddy ◽  
◽  
Sai Nitesh Gadiraju ◽  
Dr. Samala Nagaraj ◽  
◽  
...  

Online media has progressively obtained integral to the route billions of individuals experience news and occasions, frequently bypassing writers—the conventional guardians of breaking news. Occasions,in reality, make a relating spike of posts (tweets) on Twitter. This projects a great deal of significance on the validity of data found via online media stages like Twitter. We have utilized different managed learning techniques like Naïve Bayes, Decision Trees, and Support Vector Machines on the information to separate tweets among genuine and counterfeit news. For our AI models, we have utilized tweet and client highlights as our indicators. We accomplished a precision of 88% utilizing the Random Forest classifier and 88% utilizing the Decision tree. Notwithstanding, we accept that breaking down client records would build the accuracy of our models.


Sebatik ◽  
2020 ◽  
Vol 24 (2) ◽  
Author(s):  
Anifuddin Azis

Indonesia merupakan negara dengan keanekaragaman hayati terbesar kedua di dunia setelah Brazil. Indonesia memiliki sekitar 25.000 spesies tumbuhan dan 400.000 jenis hewan dan ikan. Diperkirakan 8.500 spesies ikan hidup di perairan Indonesia atau merupakan 45% dari jumlah spesies yang ada di dunia, dengan sekitar 7.000an adalah spesies ikan laut. Untuk menentukan berapa jumlah spesies tersebut dibutuhkan suatu keahlian di bidang taksonomi. Dalam pelaksanaannya mengidentifikasi suatu jenis ikan bukanlah hal yang mudah karena memerlukan suatu metode dan peralatan tertentu, juga pustaka mengenai taksonomi. Pemrosesan video atau citra pada data ekosistem perairan yang dilakukan secara otomatis mulai dikembangkan. Dalam pengembangannya, proses deteksi dan identifikasi spesies ikan menjadi suatu tantangan dibandingkan dengan deteksi dan identifikasi pada objek yang lain. Metode deep learning yang berhasil dalam melakukan klasifikasi objek pada citra mampu untuk menganalisa data secara langsung tanpa adanya ekstraksi fitur pada data secara khusus. Sistem tersebut memiliki parameter atau bobot yang berfungsi sebagai ektraksi fitur maupun sebagai pengklasifikasi. Data yang diproses menghasilkan output yang diharapkan semirip mungkin dengan data output yang sesungguhnya.  CNN merupakan arsitektur deep learning yang mampu mereduksi dimensi pada data tanpa menghilangkan ciri atau fitur pada data tersebut. Pada penelitian ini akan dikembangkan model hybrid CNN (Convolutional Neural Networks) untuk mengekstraksi fitur dan beberapa algoritma klasifikasi untuk mengidentifikasi spesies ikan. Algoritma klasifikasi yang digunakan pada penelitian ini adalah : Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbor (KNN),  Random Forest, Backpropagation.


2021 ◽  
Vol 12 (2) ◽  
pp. 28-55
Author(s):  
Fabiano Rodrigues ◽  
Francisco Aparecido Rodrigues ◽  
Thelma Valéria Rocha Rodrigues

Este estudo analisa resultados obtidos com modelos de machine learning para predição do sucesso de startups. Como proxy de sucesso considera-se a perspectiva do investidor, na qual a aquisição da startup ou realização de IPO (Initial Public Offering) são formas de recuperação do investimento. A revisão da literatura aborda startups e veículos de financiamento, estudos anteriores sobre predição do sucesso de startups via modelos de machine learning, e trade-offs entre técnicas de machine learning. Na parte empírica, foi realizada uma pesquisa quantitativa baseada em dados secundários oriundos da plataforma americana Crunchbase, com startups de 171 países. O design de pesquisa estabeleceu como filtro startups fundadas entre junho/2010 e junho/2015, e uma janela de predição entre junho/2015 e junho/2020 para prever o sucesso das startups. A amostra utilizada, após etapa de pré-processamento dos dados, foi de 18.571 startups. Foram utilizados seis modelos de classificação binária para a predição: Regressão Logística, Decision Tree, Random Forest, Extreme Gradiente Boosting, Support Vector Machine e Rede Neural. Ao final, os modelos Random Forest e Extreme Gradient Boosting apresentaram os melhores desempenhos na tarefa de classificação. Este artigo, envolvendo machine learning e startups, contribui para áreas de pesquisa híbridas ao mesclar os campos da Administração e Ciência de Dados. Além disso, contribui para investidores com uma ferramenta de mapeamento inicial de startups na busca de targets com maior probabilidade de sucesso.   


2018 ◽  
Vol Volume 11 ◽  
pp. 253-269 ◽  
Author(s):  
Yunfei He ◽  
Jun Ma ◽  
An Wang ◽  
Weiheng Wang ◽  
Shengchang Luo ◽  
...  

2021 ◽  
Author(s):  
Hemalatha N ◽  
Akhil Wilson ◽  
Akhil Thankachan

Plastic pollution is one of the challenging problems in the environment. But a life without plastic we cannot imagine. This paper deals with the prediction of plastic degrading microbes using Machine Learning. Here we have used Decision Tree, Random Forest, Support vector Machine and K Nearest Neighbor algorithms in order to predict the plastic degrading microbes. Among the four classifiers, Random Forest model gave the best accuracy of 99.1%.


MATICS ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 21-27
Author(s):  
Via Ardianto Nugroho ◽  
Derry Pramono Adi ◽  
Achmad Teguh Wibowo ◽  
MY Teguh Sulistyono ◽  
Agustinus Bimo Gumelar

Pada industri jasa pelayanan peti kemas, Terminal Nilam merupakan pelanggan dari PT. BIMA, yang secara khusus bergerak dibidang jasa perbaikan dan perawatan alat berat. Terminal ini menjadi sentral tempat untuk melakukan aktifitas bongkar muat peti kemas domestik yang memiliki empat buah container crane untuk melayani dua kapal. Proses perawatan alat berat seperti container crane yang selama ini beroperasi, agaknya kurang memperhatikan data pengelompokkan atau klasifikasi jenis perawatan yang dibutuhkan oleh alat berat tersebut. Di kemudian hari, alat berat dapat menunjukkan kinerja yang tidak maksimal bahkan dapat berujung pada kecelakaan kerja. Selain itu, kelalaian perawatan container crane juga dapat menyebabkan pembengkakan biaya perawatan lanjut. Target produksi bongkar muat dapat berkurang dan juga keterlambatan jadwal kapal sandar sangat mungkin terjadi. Metode pembelajaran menggunakan mesin atau biasa disebut dengan Machine Learning (ML), dengan mudah dapat melenyapkan kemungkinan-kemungkinan tersebut. ML dalam penelitian ini, kami rancang agar bekerja dengan mengidentifikasi lalu mengelompokkan jenis perawatan container crane yang sesuai, yaitu ringan atau berat. Metode ML yang pilih untuk digunakan dalam penelitian ini yaitu Random Forest, Support Vector Machine, k-Nearest Neighbor, Naïve Bayes, Logistic Regression, J48, dan Decision Tree. Penelitian ini menunjukkan keberhasilan ML model tree dalam melakukan pembelajaran jenis data perawatan container crane (numerik dan kategoris), dengan J48 menunjukkan performa terbaik dengan nilai akurasi dan nilai ROC-AUC mencapai 99,1%. Pertimbangan klasifikasi kami lakukan dengan mengacu kepada tanggal terakhir perawatan, hour meter, breakdown, shutdown, dan sparepart.


2021 ◽  
Vol 2096 (1) ◽  
pp. 012190
Author(s):  
E V Bunyaeva ◽  
I V Kuznetsov ◽  
Y V Ponomarchuk ◽  
P S Timosh

Abstract The paper considers comparative analysis results of the machine learning methods used for the gesture recognition based on the surface single-channel electromyography (sEMG) data. The data were processed using multilayer perceptron, support vector machine, decision tree ensemble (Random Forest) and logistic regression for the chosen four gesture types. The conclusion was derived on the analysis efficiency of these methods using commonly recommended accuracy metrics.


2020 ◽  
Vol 5 (1) ◽  
pp. 88-95
Author(s):  
Álvaro Farias Pinheiro ◽  
João Alberto Da Silva Amaral ◽  
Geraldo Torres Galindo Neto ◽  
José Nilo Martins Sampaio ◽  
Wedson Lino Soares

Application of data mining (DM) techniques to optimize the process of collection of Active Debt (AD) of the State of Pernambuco, Brazil. We apply the following data mining techniques: Decision Tree (DT), Logistic regression (LR), Nayve bayes (NB), Support vector machine (SVM), also applied to the Random Forest technique which is considered an essemble method. We observed that the RF technique obtained better results than all the techniques of classification, reaching higher values in all metrics analyzed. We note that the creation of a data mining model to choose which debts can succeed in the collection process can bring benefits to the pernambuco government. With the application of RF technique, we obtained indexes above 85% in the evaluation of the metrics.


Sign in / Sign up

Export Citation Format

Share Document