A Machine Learning Method for Detecting the Trace of Seam Carving

Elektronika ir Elektrotechnika ◽

10.5755/j02.eie.29050 ◽

2021 ◽

Author(s):

Zehra Karapinar Senturk ◽

Devrim Akgun

Keyword(s):

Local Binary Pattern ◽

Cross Validation ◽

Detection Method ◽

State Of The Art ◽

Evaluation Process ◽

Support Vector ◽

Blind Detection ◽

Seam Carving ◽

Image Retargeting ◽

Fold Cross Validation

Image retargeting resizes an image while aiming to keep visible distortion low. Detecting retargeting matters in image forensics and, in some cases, in verifying that an image is original. This paper introduces a new blind detection method for identifying images retargeted by seam carving. The method combines stripes in various numbers, the Local Binary Pattern (LBP) transform, and an energy map. For feature extraction, sub-images in the form of stripes were taken from the square root of the energy map of the LBP transform, and several statistical features were computed from them. The features extracted from both natural and seam-carved images were used to train a Support Vector Machine (SVM) as a binary classifier. Experimental results were obtained using four-fold cross validation to improve the validity of the results during the evaluation process. According to the experiments, the proposed method achieves higher accuracy than state-of-the-art solutions for detecting seam-carving-based image retargeting.
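The feature pipeline named in this abstract (LBP transform, square root of an energy map, stripes, statistical features) can be illustrated with a minimal sketch. The abstract does not give the energy function, the stripe orientation, or the statistics used, so the gradient-magnitude energy, the vertical stripes, and the mean/standard-deviation pair below are all assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def lbp_transform(img):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D grayscale image."""
    # Neighbour offsets, clockwise from the top-left corner (bit 0 .. bit 7).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out


def sqrt_energy_map(img):
    """Square root of |dI/dx| + |dI/dy| (assumed energy; forward differences, edge-clamped)."""
    h, w = len(img), len(img[0])
    return [
        [
            sqrt(
                abs(img[r][min(c + 1, w - 1)] - img[r][c])
                + abs(img[min(r + 1, h - 1)][c] - img[r][c])
            )
            for c in range(w)
        ]
        for r in range(h)
    ]


def stripe_features(emap, n_stripes):
    """Split the map into vertical stripes; return mean and std per stripe.

    Assumes the map is at least n_stripes columns wide.
    """
    w = len(emap[0])
    feats = []
    for s in range(n_stripes):
        lo, hi = s * w // n_stripes, (s + 1) * w // n_stripes
        vals = [row[c] for row in emap for c in range(lo, hi)]
        feats.extend([mean(vals), pstdev(vals)])
    return feats
```

A feature vector for one image would then be `stripe_features(sqrt_energy_map(lbp_transform(img)), n_stripes)`, which could be passed to any binary SVM implementation.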


Human Face Recognition Based on the Local Binary Pattern Algorithm (original title: Pengenalan Wajah Manusia berbasis Algoritma Local Binary Pattern)

Emitor: Jurnal Teknik Elektro ◽

10.23917/emitor.v17i2.6232 ◽

2017 ◽

Vol 17 (2) ◽

pp. 29-38

Author(s):

Ratih Purwati ◽

Gunawan Ariyanto

Keyword(s):

Computer Vision ◽

Support Vector Machine ◽

Face Recognition ◽

Local Binary Pattern ◽

Cross Validation ◽

Support Vector ◽

Fold Cross Validation

Face recognition is a computer technology for identifying human faces from digital images stored in a database. A human face changes shape according to its expression. Facial expressions resemble one another across people, so determining whose expression a given one is can be somewhat difficult. Face recognition remains an active topic in computer vision research. Face-based features are common in social media applications such as Snapchat, Snapgram from Instagram, and many others that use this technology. This study analyzes human facial expression recognition using Local Binary Pattern features and seeks the best-performing extension of the basic Local Binary Pattern algorithm by combining it with Histogram Equalization, a Support Vector Machine, and K-fold cross validation, so as to obtain the best face recognition results. Several face databases were used as input: JAFFE, which contains images of 10 Japanese women with 7 emotional expressions (angry, sad, happy, disgusted, surprised, afraid, and neutral); YALE, which contains face images of Americans; and CALTECH, which consists of 450 images of 896 x 592 pixels stored in JPEG format. The data were then adjusted to the facial texture of each subject.
From the combination of the three methods above and the experiments performed, the best face recognition results were obtained with the JAFFE dataset at a resolution of 92 x 112 pixels; in addition, a high level of processor usage can affect the computation time when running the system, thereby producing more accurate predictions.


Combination of Support Vector Machine and K-Fold cross-validation for prediction of long-term degradation of the compressive strength of marine concrete

International Journal of Computational Physics Series ◽

10.29167/a1i1p120-130 ◽

2018 ◽

Vol 1 (1) ◽

pp. 120-130 ◽

Cited By ~ 1

Author(s):

Chunxiang Qian ◽

Wence Kang ◽

Hao Ling ◽

Hua Dong ◽

Chengyao Liang ◽

...

Keyword(s):

Support Vector Machine ◽

Environmental Factors ◽

Cross Validation ◽

Concrete Strength ◽

Simulation Method ◽

Support Vector ◽

Svm Model ◽

Artificial Neural Network Ann ◽

Influence Degree ◽

Fold Cross Validation

A Support Vector Machine (SVM) model optimized by K-fold cross-validation was built to predict and evaluate the degradation of concrete strength in a complicated marine environment. Several other models, such as an Artificial Neural Network (ANN) and a Decision Tree (DT), were also built and compared with the SVM to determine which made the most accurate predictions. Both material and environmental factors that influence the results were considered. The material factors mainly involved the original concrete strength and the amounts of cement replaced by fly ash and slag. The environmental factors consisted of the concentrations of Mg²⁺, SO₄²⁻, and Cl⁻, the temperature, and the exposure time. The prediction results indicated that the optimized SVM model performed better than the other models in predicting concrete strength. Based on the SVM model, a simulation method of variable limitation was used to determine the sensitivity of the various factors and their degree of influence on the degradation of concrete strength.


Illumination Invariant Face Recognition

Jurnal Telekomunikasi dan Komputer ◽

10.22441/incomtech.v10i3.8466 ◽

2020 ◽

Vol 10 (3) ◽

pp. 129

Author(s):

Regina Lionnie ◽

Mochamad Miftakhul Huda ◽

Mudrik Alaydrus

Keyword(s):

Face Recognition ◽

Local Binary Pattern ◽

Cross Validation ◽

Graph Matching ◽

Illumination Invariant ◽

Fold Cross Validation

Face recognition is a research field that consistently attracts very strong interest. It has many potential applications, from personal security systems to control and surveillance systems. Many researchers have proposed face recognition algorithms, and well-performing methods include eigenfaces, fisherfaces, artificial neural networks, elastic bunch graph matching, Laplacianfaces, and others. These algorithms were initially tested on face images collected in well-controlled environments, under studio conditions with regulated lighting; consequently, most have difficulty with natural images, which may be captured under widely varying lighting, pose, and facial expression. The situation becomes more challenging when combinations of these variations must be handled simultaneously. Differing illumination conditions pose a vital obstacle for recognition systems because they strongly affect the appearance of face images and increase the variation between classes. In this study, a face recognition system was built using Local Binary Pattern (LBP), with a database of 400 images taken from 25 classes/respondents. Using 2-fold cross validation and Euclidean distance, the highest precision achieved by the system was 87.98%, obtained with the histogram equalization variant without LBP.


Abstract 473: Identification of Apolipoproteins Using Feature Selection Technique

Arteriosclerosis Thrombosis and Vascular Biology ◽

10.1161/atvb.36.suppl_1.473 ◽

2016 ◽

Vol 36 (suppl_1) ◽

Author(s):

Hua Tang ◽

Hao Lin

Keyword(s):

Support Vector Machine ◽

Cross Validation ◽

Support Vector ◽

Feature Subset ◽

Risk Markers ◽

Dipeptide Composition ◽

Accurate Identification ◽

Feature Selection Technique ◽

Physiological Importance ◽

Fold Cross Validation

Objective: Apolipoproteins are of great physiological importance and are associated with diseases such as dyslipidemia, thrombogenesis and angiocardiopathy. Apolipoproteins have therefore emerged as key risk markers and important research targets, yet the types of apolipoproteins have not been fully elucidated. Accurate identification of apolipoproteins is crucial to understanding cardiovascular diseases and to drug design. The aim of this study is to develop a powerful model to precisely identify apolipoproteins. Approach and Results: We manually collected from UniProt a non-redundant dataset of 53 apolipoproteins and 136 non-apolipoproteins with sequence identity of less than 40%. After formulating the protein sequence samples with g-gap dipeptide composition (here g = 1 to 10), analysis of variance (ANOVA) was adopted to find the feature subset that achieves the best accuracy. A Support Vector Machine (SVM) was then used to perform classification. The predictive model was evaluated using five-fold cross-validation, which yielded a sensitivity of 96.2%, a specificity of 99.3%, and an accuracy of 98.4%. The study indicated that the proposed method could be a feasible means of conducting preliminary analyses of apolipoproteins. Conclusion: We demonstrated that apolipoproteins can be predicted from their primary sequences. We also discovered a special dipeptide distribution in apolipoproteins. These findings open new perspectives for improving apolipoprotein prediction by considering the specific dipeptides. We expect that these findings will help to improve drug development for angiocardiopathy. Key words: Apolipoproteins, Angiocardiopathy, Support Vector Machine
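The g-gap dipeptide composition mentioned in this abstract can be sketched as follows. The abstract does not define the encoding, so this assumes a common convention: count every residue pair separated by g intervening positions and normalise by the number of counted pairs, giving a 400-dimensional vector over the 20 standard amino acids. The function name and the handling of non-standard residues (skipped) are illustrative choices, not the authors' stated procedure.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(seq, g):
    """Return the 400 frequencies of residue pairs with g residues between them."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # Pair the residue at position i with the one at position i + g + 1.
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]
```

Under this convention, g = 0 reduces to the ordinary adjacent dipeptide composition, and one vector per value of g could be concatenated or ranked by a feature selector before classification.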


The Animal Classification: An Evaluation of Different Transfer Learning Pipeline

Mekatronika ◽

10.15282/mekatronika.v3i1.6680 ◽

2021 ◽

Vol 3 (1) ◽

pp. 27-31

Author(s):

Ken-ji Ee ◽

Ahmad Fakhri Bin Ab. Nasir ◽

Anwar P. P. Abdul Majeed ◽

Mohd Azraai Mohd Razman ◽

Nur Hafieza Ismail

Keyword(s):

Transfer Learning ◽

Classification System ◽

Cross Validation ◽

Support Vector ◽

Svm Classifier ◽

Average Classification Accuracy ◽

Validation Technique ◽

Search Approach ◽

Fold Cross Validation

The animal classification system is a technology that automatically classifies the class (type) of an animal and is useful in many applications. Many types of learning models have recently been applied to this technology. Nonetheless, extracting and classifying animal features is non-trivial, particularly in the deep learning approach, for a successful animal classification system. Transfer Learning (TL) has been demonstrated to be a powerful tool for extracting essential features. However, the use of such a method in animal classification applications is somewhat limited. The present study aims to determine a suitable pipeline of TL and a conventional classifier for animal classification. VGG16 and VGG19 were used to extract features and were then coupled with either a k-Nearest Neighbour (k-NN) or a Support Vector Machine (SVM) classifier. Prior to that, a total of 4000 images were gathered across five classes: cows, goats, buffalos, dogs, and cats. The data were split 80:20 into training and test sets. The classifiers' hyperparameters were tuned by the Grid Search approach using five-fold cross-validation. The study demonstrated that the best TL pipeline identified is VGG16 with an optimised SVM, which yielded an average classification accuracy of 0.975. The findings of the present investigation could facilitate animal classification applications, e.g. for monitoring animals in the wild.


A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs

BMC Bioinformatics ◽

10.1186/s12859-020-03906-7 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Yubin Xiao ◽

Zheng Xiao ◽

Xiang Feng ◽

Zhiping Chen ◽

Linai Kuang ◽

...

Keyword(s):

Computational Model ◽

Cross Validation ◽

State Of The Art ◽

Prediction Methods ◽

Good Prediction ◽

Average Case ◽

Comparison Results ◽

Disease Associations ◽

Fold Cross Validation

Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and identifying the relationships between lncRNAs and diseases is useful for diagnosis and treatment. Due to the high cost and time complexity of traditional bio-experiments, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods have various limitations as well. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which guarantees that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation results show that FVTLDA with MLR achieves reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN achieves reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV, and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.


Predictor Selection for Bacterial Vaginosis Diagnosis Using Decision Tree and Relief Algorithms

Applied Sciences ◽

10.3390/app10093291 ◽

2020 ◽

Vol 10 (9) ◽

pp. 3291

Author(s):

Jesús F. Pérez-Gómez ◽

Juana Canul-Reich ◽

José Hernández-Torruco ◽

Betania Hernández-Ocaña

Keyword(s):

Feature Selection ◽

Decision Tree ◽

Bacterial Vaginosis ◽

Cross Validation ◽

Performance Comparison ◽

Support Vector ◽

Ongoing Research ◽

Selection For ◽

Comparison Of The Results ◽

Fold Cross Validation

Requiring only a few relevant characteristics from patients when diagnosing bacterial vaginosis is highly useful for physicians, as it makes collecting these data less time consuming. This would result in a dataset of patients that can be diagnosed more accurately using only a subset of informative or relevant features rather than the entire set of features. As such, this is a feature selection (FS) problem. In this work, decision tree and Relief algorithms were used as feature selectors. Experiments were conducted on a real dataset for bacterial vaginosis with 396 instances and 252 features/attributes. The dataset was obtained from universities located in Baltimore and Atlanta. The FS algorithms produced feature rankings, from which the top fifteen features formed a new dataset that was used as input for both support vector machine (SVM) and logistic regression (LR) classifiers. For performance evaluation, averages of 30 runs of 10-fold cross-validation were reported, with balanced accuracy, sensitivity, and specificity as performance measures. The results obtained with all features were compared against those obtained with the top fifteen. Our rankings identified attributes similar to those reported in the literature. This study is part of ongoing research that is investigating a range of feature selection and classification methods.
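The evaluation protocol described here (averages over 30 runs of 10-fold cross-validation, reporting sensitivity, specificity, and balanced accuracy) can be sketched in plain Python. This is an illustrative outline only: `fit_predict` is a hypothetical callable standing in for any classifier, the per-run seeding is an assumption, and pooling the held-out predictions within each run before computing the metrics is one of several reasonable ways to aggregate.

```python
import random


def k_fold_indices(n, k, seed):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, balanced accuracy) for 0/1 labels.

    Assumes both classes occur in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2


def repeated_cv(X, y, fit_predict, runs=30, k=10):
    """Average the three metrics over `runs` repetitions of k-fold CV."""
    totals = [0.0, 0.0, 0.0]
    for run in range(runs):
        y_pred = [None] * len(y)
        # A different seed per run gives a different partition each time.
        for test_idx in k_fold_indices(len(y), k, seed=run):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            for i, p in zip(test_idx, preds):
                y_pred[i] = p
        # Every sample is held out exactly once per run, so y_pred is complete.
        for j, m in enumerate(binary_metrics(y, y_pred)):
            totals[j] += m
    return tuple(t / runs for t in totals)
```

Balanced accuracy is the mean of sensitivity and specificity, which makes it less sensitive to class imbalance than plain accuracy.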


Global observation-based climatology of precipitation occurrence and peak intensity

10.5194/egusphere-egu2020-7837 ◽

2020 ◽

Author(s):

Hylke Beck ◽

Seth Westra ◽

Eric Wood

Keyword(s):

Land Surface ◽

Regression Models ◽

Cross Validation ◽

Climate Models ◽

Daily Precipitation ◽

State Of The Art ◽

Coefficient Of Determination ◽

Peak Intensity ◽

Uncertainty Estimates ◽

Fold Cross Validation

We introduce a unique set of global observation-based climatologies of daily precipitation (P) occurrence (related to the lower tail of the P distribution) and peak intensity (related to the upper tail of the P distribution). The climatologies were produced using Random Forest (RF) regression models trained with an unprecedented collection of daily P observations from 93,138 stations worldwide. Five-fold cross-validation was used to evaluate the generalizability of the approach and to quantify uncertainty globally. The RF models were found to provide highly satisfactory performance, yielding cross-validation coefficient of determination (R²) values from 0.74 for the 15-year return-period daily P intensity to 0.86 for the >0.5 mm d⁻¹ daily P occurrence. The performance of the RF models was consistently superior to that of state-of-the-art reanalysis (ERA5) and satellite (IMERG) products. The highest P intensities over land were found along the western equatorial coast of Africa, in India, and along coastal areas of Southeast Asia. Using a 0.5 mm d⁻¹ threshold, P was estimated to occur on 23.2% of days on average over the global land surface (excluding Antarctica). The climatologies, including uncertainty estimates, will be released as the Precipitation DISTribution (PDIST) dataset via www.gloh2o.org/pdist. We expect the dataset to be useful for numerous purposes, such as the evaluation of climate models, the bias correction of gridded P datasets, and the design of hydraulic structures in poorly gauged regions.


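Sentiment Analysis of Airlines on the Twitter Platform Using the Support Vector Machine (SVM) Algorithm (original title: Analisis Sentimen Pada Maskapai Penerbangan di Platform Twitter Menggunakan Algoritma Support Vector Machine (SVM))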

Teknika ◽

10.34148/teknika.v10i1.311 ◽

2021 ◽

Vol 10 (1) ◽

pp. 18-26

Author(s):

Hendry Cipta Husada ◽

Adi Suryaputra Paramita

Keyword(s):

Machine Learning ◽

Social Media ◽

Support Vector Machine ◽

Cross Validation ◽

Support Vector ◽

Learning Approach ◽

Social Media Platform ◽

Machine Learning Approach ◽

Media Platform ◽

Fold Cross Validation

Current technological developments have made it easy for many people to obtain and share information on various social media platforms. Twitter is one medium frequently used to express opinions as a reaction to something. Opinions on Twitter can be used by airline companies as a key parameter for gauging public satisfaction and as evaluation material for the company. Accordingly, a method is needed that can automatically classify opinions into positive, negative, or neutral categories through sentiment analysis. The sentiment analysis process consists of data preprocessing, term weighting using the TF-IDF method, application of the algorithm, and discussion of the classification results. Opinions are classified with a machine learning approach using the multi-class Support Vector Machine (SVM) algorithm. The data used in this study are English-language opinions from Twitter users about airlines. Based on the tests performed, the best classification results were obtained using an SVM with an RBF kernel at parameter values C (complexity) = 10 and γ (gamma) = 1, with an accuracy of 84.37%, and 80.41% when using 10-fold cross validation.
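The TF-IDF term weighting step named in this abstract can be illustrated with a small sketch. TF-IDF has several variants and the abstract does not say which was used; the version below assumes relative term frequency multiplied by the natural logarithm of (number of documents / document frequency), with no smoothing. The function name and the pre-tokenised input format are illustrative.

```python
from collections import Counter
from math import log


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append(
            {
                term: (c / len(doc)) * log(n_docs / doc_freq[term])
                for term, c in counts.items()
            }
        )
    return weighted
```

With this variant, a term that appears in every document receives weight zero, so words shared by all tweets contribute nothing to the classifier input.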


A Novel Computational Model for Predicting Potential LncRNA-Disease Associations based on Both Direct and Indirect Features of LncRNA-Disease Pairs

10.21203/rs.2.18937/v3 ◽

2020 ◽

Author(s):

Yubin Xiao ◽

Zheng Xiao ◽

Xiang Feng ◽

Zhiping Chen ◽

Linai Kuang ◽

...

Keyword(s):

Computational Model ◽

Cross Validation ◽

State Of The Art ◽

Prediction Methods ◽

Good Prediction ◽

Average Case ◽

Comparison Results ◽

Disease Associations ◽

Fold Cross Validation

Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and identifying the relationships between lncRNAs and diseases is useful for diagnosis and treatment. Due to the high cost and time complexity of traditional bio-experiments, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods have various limitations as well. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which guarantees that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation results show that FVTLDA with MLR achieves reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (5-fold CV), 10-Fold Cross Validation (10-fold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN achieves reliable AUCs of 0.8766, 0.8830 and 0.8807 in 5-fold CV, 10-fold CV, and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.
