fold cross validation
Recently Published Documents


TOTAL DOCUMENTS

870
(FIVE YEARS 615)

H-INDEX

25
(FIVE YEARS 11)

Author(s):  
Nermeen Elmenabawy ◽  
Mervat El-Seddek ◽  
Hossam El-Din Moustafa ◽  
Ahmed Elnakib

A pipelined framework is proposed for accurate, automated, simultaneous segmentation of the liver and hepatic tumors from computed tomography (CT) images. The framework is composed of three pipelined levels. First, two different transfer deep convolutional neural networks (CNNs) are applied to extract high-level compact features from the CT images. Second, a pixel-wise classifier is used to obtain an output classification map for each CNN model. Finally, a fusion neural network (FNN) is used to integrate the two maps. Experiments on the MICCAI’2017 database of the liver tumor segmentation (LITS) challenge yield a Dice similarity coefficient (DSC) of 93.5% for liver segmentation and 74.40% for lesion segmentation, using a 5-fold cross-validation scheme. Comparison with state-of-the-art techniques on the same data shows that the proposed framework is competitive for simultaneous liver and tumor segmentation.
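The Dice similarity coefficient quoted in this abstract is commonly defined as twice the overlap between the predicted and reference masks divided by the sum of their sizes. The snippet below is a minimal sketch of that standard definition on flattened binary masks; the function name, the toy masks, and the empty-mask convention are assumptions for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical six-pixel masks, not data from the LITS challenge.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(predicted, reference)
print(f"DSC = {score:.3f}")
```

In a cross-validated setting, a score like this would typically be computed per test case and then averaged within and across folds.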


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 649
Author(s):  
David Ferreira ◽  
Samuel Silva ◽  
Francisco Curado ◽  
António Teixeira

Speech is our most natural and efficient form of communication and offers a strong potential to improve how we interact with machines. However, speech communication can sometimes be limited by environmental (e.g., ambient noise), contextual (e.g., need for privacy), or health conditions (e.g., laryngectomy), preventing the use of audible speech. In this regard, silent speech interfaces (SSI) have been proposed as an alternative, relying on technologies that do not require the production of acoustic signals (e.g., electromyography and video). Unfortunately, despite their plentitude, many still face limitations regarding their everyday use, e.g., being intrusive, non-portable, or raising technical (e.g., lighting conditions for video) or privacy concerns. In line with this necessity, this article explores contactless continuous-wave radar to assess its potential for SSI development. A corpus of 13 European Portuguese words was acquired for four speakers, three of whom took part in a second acquisition session three months later. For the speaker-dependent models, trained and tested with data from each speaker using 5-fold cross-validation, average accuracies of 84.50% and 88.00% were obtained from Bagging (BAG) and Linear Regression (LR) classifiers, respectively. Additionally, recognition accuracies of 81.79% and 81.80% were achieved for the session-independent and speaker-independent experiments, respectively, establishing promising grounds for further exploring this technology towards silent speech recognition.


2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Mahesh Babu Mariappan ◽  
Kanniga Devi ◽  
Yegnanarayanan Venkataraman ◽  
Ming K. Lim ◽  
Panneerselvam Theivendren

Purpose: This paper aims to address the pressing problem of prediction concerning shipment times of therapeutics, diagnostics and vaccines during the ongoing COVID-19 pandemic using a novel artificial intelligence (AI) and machine learning (ML) approach. Design/methodology/approach: The present study used organic real-world therapeutic supplies data of over 3 million shipments collected during the COVID-19 pandemic through a large real-world e-pharmacy. The researchers built various ML multiclass classification models, namely, random forest (RF), extra trees (XRT), decision tree (DT), multilayer perceptron (MLP), XGBoost (XGB), CatBoost (CB), linear stochastic gradient descent (SGD) and the linear Naïve Bayes (NB) and trained them on striped datasets of (source, destination, shipper) triplets. The study stacked the base models and built stacked meta-models. Subsequently, the researchers built a model zoo with a combination of the base models and stacked meta-models trained on these striped datasets. The study used 10-fold cross-validation (CV) for performance evaluation. Findings: The findings reveal that the turn-around-time provided by therapeutic supply logistics providers is only 62.91% accurate when compared to reality. In contrast, the solution provided in this study is up to 93.5% accurate compared to reality, resulting in up to 48.62% improvement, with a clear trend of more historic data and better performance growing each week. Research limitations/implications: The implication of the study has shown the efficacy of ML model zoo with a combination of base models and stacked meta-models trained on striped datasets of (source, destination and shipper) triplets for predicting the shipment times of therapeutics, diagnostics and vaccines in the e-pharmacy supply chain. Originality/value: The novelty of the study is on the real-world e-pharmacy supply chain under post-COVID-19 lockdown conditions and has come up with a novel ML ensemble stacking based model zoo to make predictions on the shipment times of therapeutics. Through this work, it is assumed that there will be greater adoption of AI and ML techniques in shipment time prediction of therapeutics in the logistics industry in the pandemic situations.
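The "up to 48.62% improvement" figure is most naturally read as a gain relative to the 62.91% baseline rather than a difference in percentage points. The sketch below works through that arithmetic under this assumed reading; the helper name is hypothetical, and because the inputs are the rounded values quoted in the abstract, the result only approximately reproduces the reported figure (about 48.6%).

```python
def relative_improvement(baseline, improved):
    """Percentage gain of `improved` over `baseline`, relative to the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (improved - baseline) / baseline


baseline_accuracy = 62.91  # provider turn-around-time accuracy quoted in the abstract (%)
model_accuracy = 93.5      # best model accuracy quoted in the abstract (%)

relative_gain = relative_improvement(baseline_accuracy, model_accuracy)
point_gain = model_accuracy - baseline_accuracy  # difference in percentage points

print(f"relative gain: {relative_gain:.1f}%, absolute gain: {point_gain:.2f} points")
```

The absolute difference, roughly 30.6 percentage points, is a separate quantity and should not be confused with the relative figure.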


Stat ◽  
2022 ◽  
Author(s):  
Jerzy Wieczorek ◽  
Cole Guerin ◽  
Thomas McMahon

2022 ◽  
Vol 23 (1) ◽  
pp. 68-81
Author(s):  
Syahroni Hidayat ◽  
Muhammad Tajuddin ◽  
Siti Agrippina Alodia Yusuf ◽  
Jihadil Qudsi ◽  
Nenet Natasudian Jaya

Speaker recognition is the process of recognizing a speaker from their speech. It can be used in many aspects of life, such as remotely accessing a personal device, securing access to voice control, and conducting forensic investigations. In speaker recognition, extracting features from the speech is the most critical step. The features represent the speech in a way that distinguishes speech samples from one another. In this research, we propose a combination of Wavelet and Mel Frequency Cepstral Coefficient (MFCC), Wavelet-MFCC, as the feature extraction method, and a Hidden Markov Model (HMM) as the classifier. The speech signal is first decomposed with a Wavelet transform to one level of decomposition; only the detail sub-band coefficients are then used as input for further feature extraction with MFCC. The modeled system was applied to a dataset of 300 speech recordings from 30 speakers uttering “HADIR” in the Indonesian language. K-fold cross-validation is implemented with five folds. In each fold, 80% of the data was used for training, and the rest was used for testing. Based on the testing, the accuracy of the system using the Wavelet-MFCC combination is 96.67%.
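The five-fold scheme described above, with 80% of the data used for training and 20% for testing in each fold, can be sketched with a few lines of index bookkeeping. This is an illustrative sketch only: the function name, the shuffling seed, and the round-robin fold assignment are assumptions, and the paper does not say how its folds were drawn (for example, whether they were stratified by speaker).

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 300 recordings and 5 folds, matching the sizes quoted in the abstract.
splits = list(k_fold_indices(300, 5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train / {len(test)} test")
```

With 300 samples and five folds, every split has 240 training and 60 test samples, which is the 80/20 division the abstract mentions.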


2022 ◽  
Vol 10 (2) ◽  
pp. 217
Author(s):  
I Wayan Santiyasa ◽  
Gede Putra Aditya Brahmantha ◽  
I Wayan Supriana ◽  
I GA Gede Arya Kadyanan ◽  
I Ketut Gede Suhartana ◽  
...  

Information is now very easy to obtain and can spread quickly to all corners of society. However, not all of the information that spreads is true: false information, commonly called a hoax, is also easily spread by the public, who tend to assume that everything circulating on the internet is true. It is not possible to tell directly whether a news item published on the internet is a hoax or valid. The test uses 740 randomly selected content/issue items that have been verified by an institution, of which 370 are hoaxes and 370 are valid. Classification uses the K-Nearest Neighbor algorithm: a preprocessing stage is performed first, the TF-IDF equation is used to obtain the weight of each feature, the items are classified with K-Nearest Neighbor, and the results are evaluated using 10-fold cross-validation. Values of k from 2 to 10 were tested. The optimal value was k = 4, with precision, recall, and F-measure of 0.764856, 0.757583, and 0.751944, respectively, and an accuracy of 75.4%.
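As a rough illustration of the TF-IDF plus K-Nearest Neighbor pipeline this abstract describes, the sketch below weights tokens with TF-IDF fitted on the training documents only and classifies a query by majority vote among its most similar neighbors. The smoothed IDF variant, the cosine similarity measure, the tie-breaking behavior, and the toy documents are all assumptions; the paper does not specify these details.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Inverse document frequency per term, fitted on training documents only."""
    n_docs = len(train_docs)
    doc_freq = Counter(term for doc in train_docs for term in set(doc))
    # log(N / df) + 1 is one common variant (an assumption of this sketch).
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}


def tfidf(doc, idf):
    """Sparse TF-IDF vector for one tokenized document; unseen terms are dropped."""
    counts = Counter(doc)
    return {t: (c / len(doc)) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(query_vec, train_vecs, train_labels, k):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query_vec, train_vecs[i]), reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    # With even k a tie is possible; most_common then keeps the label seen first.
    return votes.most_common(1)[0][0]


# Hypothetical, already-tokenized toy documents (not the 740 items from the study).
train_docs = [
    ["miracle", "cure", "shocking"], ["shocking", "secret", "cure"],
    ["miracle", "secret", "revealed"], ["ministry", "report", "data"],
    ["official", "report", "ministry"], ["official", "data", "released"],
]
train_labels = ["hoax", "hoax", "hoax", "valid", "valid", "valid"]

idf = fit_idf(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]
query = tfidf(["shocking", "miracle", "cure", "report"], idf)
prediction = knn_predict(query, train_vecs, train_labels, k=3)
print(prediction)
```

To choose k as the study did, this prediction step would be repeated for each k from 2 to 10 inside a 10-fold loop, refitting the IDF weights on each training split so that the held-out documents do not influence them.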


2022 ◽  
Vol 8 ◽  
Author(s):  
Bin Wang ◽  
Xiong Han ◽  
Zongya Zhao ◽  
Na Wang ◽  
Pan Zhao ◽  
...  

Objective: Antiseizure medicine (ASM) is the first choice for patients with epilepsy. The choice of ASM is determined by the type of epilepsy or epileptic syndrome, which may not be suitable for certain patients. This initial choice of a particular drug affects the long-term prognosis of patients, so it is critical to select the appropriate ASMs based on the individual characteristics of a patient at the early stage of the disease. The purpose of this study is to develop a personalized prediction model to predict the probability of achieving seizure control in patients with focal epilepsy, which will help in providing a more precise initial medication to patients. Methods: Based on response to oxcarbazepine (OXC), enrolled patients were divided into two groups: seizure-free (52 patients) and not seizure-free (NSF; 22 patients). We created models to predict patients' response to OXC monotherapy by combining electroencephalogram (EEG) complexities and 15 clinical features. The prediction models were gradient boosting decision tree-Kolmogorov complexity (GBDT-KC) and gradient boosting decision tree-Lempel-Ziv complexity (GBDT-LZC). We also constructed two additional prediction models, support vector machine-Kolmogorov complexity (SVM-KC) and SVM-LZC, and compared them with the GBDT models. The performance of the models was evaluated by calculating their accuracy, precision, recall, F1-score, sensitivity, specificity, and area under the curve (AUC). Results: The mean accuracy, precision, recall, F1-score, sensitivity, specificity, and AUC of the GBDT-LZC model after five-fold cross-validation were 81%, 84%, 91%, 87%, 91%, 64%, and 81%, respectively. The average accuracy, precision, recall, F1-score, sensitivity, specificity, and AUC of the GBDT-KC model with five-fold cross-validation were 82%, 84%, 92%, 88%, 83%, 92%, and 83%, respectively. We used the rank of absolute weights to separately calculate the features that have the most significant impact on the classification of the two models. Conclusion: (1) The GBDT-KC model has the potential to be used in the clinic to predict seizure-free status with OXC monotherapy. (2) Electroencephalogram complexity, especially Kolmogorov complexity (KC), may be a potential biomarker in predicting the treatment efficacy of OXC in newly diagnosed patients with focal epilepsy.
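The evaluation metrics listed in this abstract all derive from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions, under which recall and sensitivity are the same quantity; the function name and the counts are hypothetical and are not reconstructed from the study's results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics for a binary classifier."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to sensitivity by definition
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for one test fold, chosen only to make the arithmetic easy.
metrics = binary_metrics(tp=8, fp=2, fn=1, tn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a five-fold setting, these values would usually be computed on each test fold and then averaged, which is one reason reported means can differ slightly from metrics computed on pooled predictions.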


Chemosensors ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 18
Author(s):  
Nuno Ferreiro ◽  
Nuno Rodrigues ◽  
Ana C. A. Veloso ◽  
Conceição Fernandes ◽  
Helga Paiva ◽  
...  

The impact of the covering vegetable oil (sunflower oil, refined olive oil and extra virgin olive oil, EVOO) on the physicochemical and sensory profiles of canned tuna (Katsuwonus pelamis species) was evaluated, using analytical techniques and a sensory panel. The results showed that canned tuna covered with EVOO possesses a higher content of total phenols and an enhanced antioxidant capacity. This covering medium also increased the appreciated redness-yellowness color of the canned tuna, which showed a higher chromatic and intense color. Olfactory and kinesthetic sensations were significantly dependent on the type of oil used as covering medium. Tuna succulence and adhesiveness were promoted by the use of EVOO, which also contributed to decreasing the tuna-related aroma sensations. The tuna sensory data could be successfully used to identify the type of vegetable oil used. Moreover, a potentiometric electronic tongue allowed discriminating between the canned tuna samples according to the vegetable oil used (mean sensitivity of 96 ± 8%; repeated K-fold cross-validation) and the fruity intensity of the EVOO (mean sensitivity of 100%; repeated K-fold cross-validation). Thus, the taste sensor device could be a practical tool to verify the authenticity of the declared covering medium in canned tuna and to perceive the differences in consumers' taste.
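Repeated K-fold cross-validation, as named in this abstract, reruns the K-fold split several times with different shuffles and summarizes the per-fold scores, which is how a figure such as "96 ± 8%" can be produced. The sketch below shows the idea with a toy threshold classifier on synthetic, well-separated data; the classifier, the data, the number of folds and repeats, and all function names are assumptions, since the abstract gives none of these details.

```python
import random
import statistics


def repeated_k_fold(n_samples, k, repeats, seed=0):
    """Yield (train, test) index lists for `repeats` independent shuffles of k folds."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for i in range(k):
            test = order[i::k]
            held_out = set(test)
            train = [j for j in order if j not in held_out]
            yield train, test


def fold_sensitivity(train, test, x, y):
    """Toy rule: threshold halfway between the class means of the training part."""
    pos_mean = statistics.mean(x[i] for i in train if y[i] == 1)
    neg_mean = statistics.mean(x[i] for i in train if y[i] == 0)
    threshold = (pos_mean + neg_mean) / 2
    positives = [i for i in test if y[i] == 1]
    if not positives:
        return None  # sensitivity is undefined without positive test samples
    return sum(1 for i in positives if x[i] > threshold) / len(positives)


# Synthetic one-dimensional data with two clearly separated classes.
x = [i / 10 for i in range(10)] + [2 + i / 10 for i in range(10)]
y = [0] * 10 + [1] * 10

splits = list(repeated_k_fold(len(x), k=4, repeats=3))
scores = [s for s in (fold_sensitivity(tr, te, x, y) for tr, te in splits) if s is not None]
mean_sensitivity = statistics.mean(scores)
spread = statistics.pstdev(scores)
print(f"mean sensitivity: {100 * mean_sensitivity:.0f} ± {100 * spread:.0f}%")
```

Because the synthetic classes do not overlap, every fold is classified perfectly here; on real measurements the spread across folds and repeats is what the ± term would report.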


Author(s):  
Acep Saepulrohman ◽  
Sudin Saepudin ◽  
Dudih Gustian

Information and communication technology is currently developing very rapidly, including chat or instant messaging applications such as WhatsApp, Line and Telegram. In October 2020, the majority of instant messaging users were WhatsApp users, with a total of 2 billion users. Although WhatsApp ranks at the top and has the highest score, this cannot be used as a measure of satisfaction, because negative views of the application remain: some users report that WhatsApp often produces errors during use, and other problems arise, such as unstable user network connections. Analyzing this requires a sentiment analysis approach to categorize user comments as positive or negative. This study uses the Naïve Bayes and Support Vector Machine algorithms to analyze positive and negative comments regarding user satisfaction with the WhatsApp application on the Google Play Store. In tests on 1,500 user comments, model evaluation using 10-fold cross-validation showed an accuracy for user satisfaction of 70.40% for Naïve Bayes and 77.00% for Support Vector Machine, with AUC values of 0.585 for Naïve Bayes and 0.876 for Support Vector Machine. Based on these results, the Support Vector Machine algorithm can be used for research on data with the same characteristics.
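The AUC values reported alongside accuracy in this abstract measure how well a classifier's scores rank positive examples above negative ones. A minimal sketch of the standard pairwise (Mann-Whitney) formulation follows; the function name and the toy scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six comments (1 = positive sentiment).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC near 0.5 corresponds to ranking no better than chance, which is why a model can show a moderate accuracy and still have a low AUC, as with the Naïve Bayes figures quoted here.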


2021 ◽  
Vol 2 (2) ◽  
pp. 112-122
Author(s):  
Novanto Yudistira ◽  
Aldi Fianda Putra

A heart attack, known medically as myocardial infarction, is a very serious heart disorder. Detection in this study uses the complications suffered by patients. The algorithms evaluated are Naive Bayes, Decision Tree, and Support Vector Machine. However, the evaluation cannot be carried out immediately. Before evaluating the three algorithms, the dataset is repaired, because it contains missing values. The repair is done by imputing the data, with each value estimated from the mean of the cluster members in the same class. After imputation, the data are normalized with the MinMax method so that the range of the features, especially continuous numeric data, is not too large. Once this preprocessing is complete, the evaluation can be carried out using K-fold cross-validation. However, a further problem was found: the training data were imbalanced. Oversampling was therefore applied to balance the data. After balancing, the evaluation was repeated, and the algorithm found most suitable for classifying data such as the Myocardial Infarction Complications dataset was Decision Tree with an accuracy of 98%, followed by Support Vector Machine with 91% and Naïve Bayes with the lowest accuracy, 49%.
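Two of the preprocessing steps named in this abstract, MinMax normalization and oversampling, can be sketched briefly. The example below applies random oversampling to the training indices of a single fold only, which is a common precaution against duplicated samples leaking into the test fold; whether the study oversampled before or after splitting is not stated, so this ordering, the random duplication strategy, and all names are assumptions.

```python
import random
from collections import Counter


def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]


def random_oversample(indices, labels, seed=0):
    """Duplicate minority-class indices at random until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Hypothetical imbalanced labels: eight negatives followed by two positives.
labels = [0] * 8 + [1] * 2
train_indices = [0, 1, 2, 3, 4, 5, 8]  # training part of one hypothetical fold
balanced = random_oversample(train_indices, labels)
class_counts = Counter(labels[i] for i in balanced)
scaled = min_max_scale([2.0, 4.0, 6.0])
print(class_counts, scaled)
```

In a full pipeline the scaling bounds would likewise be taken from the training part of each fold and then applied to the held-out part.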

