A Universal Steganalysis Algorithm for JPEG Image Based on Selective SVMs Ensemble

2012 ◽  
Vol 532-533 ◽  
pp. 1548-1552 ◽  
Author(s):  
Da Ya Chen ◽  
Shang Ping Zhong

Universal steganalysis comprises feature extraction and steganalyzer design, and most universal steganalysis methods use a Support Vector Machine (SVM) as the steganalyzer. However, most SVM-based methods are not very effective at low embedding rates. This paper analyzes why a selective SVMs ensemble improves generalization ability and, based on the selective ensemble theory that "many could be better than all", proposes an algorithm that selects a subset of the individual SVMs according to their mutual differences to build the ensemble classifier. The selective SVMs ensemble algorithm is used to construct a strong steganalyzer that improves steganographic detection performance. Twenty-five experiments on a benchmark of 2000 images of different types show that, for popular steganography methods and under different embedding rates, the average detection rate of the proposed method exceeds the maximum average detection rate of steganalysis based on a single SVM by 3.05%-12.05%, and that of steganalysis based on Bagging SVM by 0.2%-1.3%.
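A minimal sketch of the selective-ensemble idea ("many could be better than all"): train several SVMs on bootstrap samples, keep only a subset, and combine the survivors by majority vote. The data is synthetic and the selection rule here uses validation accuracy as a simple stand-in for the paper's difference-based criterion, so this illustrates the ensemble structure rather than the exact algorithm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
members = []
for _ in range(15):
    idx = rng.integers(0, len(X_tr), len(X_tr))   # bootstrap resample
    members.append(SVC(kernel="rbf").fit(X_tr[idx], y_tr[idx]))

# Selection step: keep only members at or above the median validation score
# (the paper instead selects by the differences between individual SVMs).
scores = [m.score(X_val, y_val) for m in members]
selected = [m for m, s in zip(members, scores) if s >= np.median(scores)]

# Majority vote over the selected subset only.
votes = np.stack([m.predict(X_val) for m in selected])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("ensemble accuracy:", (ensemble_pred == y_val).mean())
```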

Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 495
Author(s):  
Imayanmosha Wahlang ◽  
Arnab Kumar Maji ◽  
Goutam Saha ◽  
Prasun Chakrabarti ◽  
Michal Jasinski ◽  
...  

This article experiments with deep learning methodologies in echocardiography (echo), a promising and vigorously researched imaging technique. The paper addresses two different classification tasks in echo. First, classification into normal (absence of abnormalities) or abnormal (presence of abnormalities) is performed, using 2D echo images, 3D Doppler images, and videographic images. Second, videographic echo images are classified by type of regurgitation, namely, Mitral Regurgitation (MR), Aortic Regurgitation (AR), Tricuspid Regurgitation (TR), or a combination of the three. Two deep-learning methodologies are used for these purposes: a Recurrent Neural Network (RNN) based methodology (Long Short-Term Memory (LSTM)) and an autoencoder-based methodology (Variational AutoEncoder (VAE)). The use of videographic images distinguishes this work from existing work using Support Vector Machines (SVMs), and the application of deep-learning methodologies is among the first in this particular field. It was found that the deep-learning methodologies perform better than the SVM methodology in normal-or-abnormal classification. Overall, the VAE performs better on 2D and 3D Doppler images (static images), while the LSTM performs better on videographic images.


2014 ◽  
Vol 2014 ◽  
pp. 1-17 ◽  
Author(s):  
Loris Nanni ◽  
Alessandra Lumini ◽  
Sheryl Brahnam

Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position-specific scoring matrix (PSSM) of the proteins, those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of protein descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by the sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors provide a performance that works well across all tested datasets, in some cases performing better than the state-of-the-art.
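The sum-rule fusion described above can be sketched as follows: one SVM per descriptor, with the per-class posterior scores summed across descriptors. Here two disjoint feature subsets of a synthetic dataset stand in for two real protein descriptors (e.g. PSSM-based vs. sequence-based), so this shows only the fusion mechanics, not the paper's actual descriptors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=30, n_informative=10,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# Two "descriptors": disjoint column slices standing in for two different
# protein representations, each with its own SVM.
views = [slice(0, 15), slice(15, 30)]
fused = np.zeros(len(X_te))
for v in views:
    clf = SVC(probability=True, random_state=1).fit(X_tr[:, v], y_tr)
    fused += clf.predict_proba(X_te[:, v])[:, 1]   # sum rule over posteriors

pred = (fused / len(views) >= 0.5).astype(int)
print("fused accuracy:", (pred == y_te).mean())
```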


2018 ◽  
Vol 7 (3) ◽  
pp. 1114
Author(s):  
Lakshmi K S ◽  
G Vadivu ◽  
Suja Subramanian

Advancement in medical technology has resulted in the bulk creation of electronic medical health records. These health records contain valuable data which are not fully utilized. Efficient use of data mining techniques helps in discovering potentially relevant facts from medical records. Classification plays an important role in disease prediction. In this paper we developed a prediction model for hyperlipidemia based on ensemble classification. A Support Vector Machine, a Naïve Bayes classifier, a KNN classifier, and a decision tree are combined to form the ensemble classifier, and the performance of each classifier is also evaluated separately. An overall accuracy of 97.07% has been obtained with the ensemble approach, which is better than the performance of any individual classifier.
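The four base learners named above, combined by majority vote, can be sketched with scikit-learn's `VotingClassifier`. Synthetic data stands in for the medical-records dataset, which is not public, so the accuracy printed here is only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=12, random_state=42)

# Hard-voting ensemble of the four classifiers used in the paper.
ensemble = VotingClassifier([
    ("svm", SVC(random_state=42)),
    ("nb", GaussianNB()),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=42)),
], voting="hard")

cv_acc = cross_val_score(ensemble, X, y, cv=5).mean()
print("cross-validated accuracy:", cv_acc)
```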


2020 ◽  
Vol 27 (4) ◽  
pp. 329-336 ◽  
Author(s):  
Lei Xu ◽  
Guangmin Liang ◽  
Baowen Chen ◽  
Xu Tan ◽  
Huaikun Xiang ◽  
...  

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause serious drug-resistance problems in pathogenic bacteria, so they are a good choice for curing bacterial infections, and the study of cell wall lytic enzymes aims at finding an efficient way to do so. Cell lytic enzymes include endolysins and autolysins, which differ in the purpose for which they break the cell wall, and identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: Our motivation is to predict the type of a cell lytic enzyme. However, detecting the type by experimental methods is time-consuming, so an efficient computational method for predicting the type of cell lytic enzyme is proposed in this work. Method: We propose a computational method for discriminating endolysins from autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Each protein is then represented by its tripeptide composition, and the features with larger confidence degree are selected. Finally, a support vector machine classifier is trained on the labeled vectors and used to predict the type of cell lytic enzyme. Results: Following the proposed method, the overall accuracy reaches 97.06% when 44 features are selected; compared with Ding's method, this improves the overall accuracy by nearly 4.5% ((97.06-92.9)/92.9%). The performance of the proposed method is stable when the number of selected features ranges from 40 to 70. The overall accuracy of the tripeptide optimal feature set is 94.12%, while that of Chou's amphiphilic PseAAC method is 76.2%; the tripeptide optimal feature set thus improves the overall accuracy by nearly 18%. Conclusion: This paper proposes an efficient support vector machine method for identifying endolysins and autolysins, with an overall accuracy of 94.12% on the tripeptide optimal feature set, which is better than some existing methods. The selected 44 features improve the overall accuracy for identifying the type of cell lytic enzyme, and the support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
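The tripeptide-composition representation used above is straightforward to compute: the normalized count of each overlapping 3-mer over the 20 standard amino acids, giving an 8000-dimensional feature vector. A minimal sketch, with a made-up example sequence (the handling of non-standard residues here is an assumption, not necessarily the paper's):

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TRIPEPTIDES = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]  # 20^3 = 8000

def tripeptide_composition(seq: str) -> list[float]:
    """Return the 8000-dimensional tripeptide frequency vector of seq."""
    n = len(seq) - 2                      # number of overlapping tripeptides
    counts = {t: 0 for t in TRIPEPTIDES}
    for i in range(n):
        tri = seq[i:i + 3]
        if tri in counts:                 # skip tripeptides with non-standard residues
            counts[tri] += 1
    return [counts[t] / n for t in TRIPEPTIDES]

vec = tripeptide_composition("MKVLAAGLLALA")   # illustrative sequence
print(len(vec), round(sum(vec), 6))            # 8000 features, frequencies sum to 1
```

Each protein becomes one such vector; a feature selector then keeps the informative tripeptides (44 in the paper) before SVM training.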


2020 ◽  
Vol 13 (5) ◽  
pp. 884-892
Author(s):  
Sartaj Ahmad ◽  
Ashutosh Gupta ◽  
Neeraj Kumar Gupta

Background: People love online shopping, but before buying they always want feedback or reviews, which help customers decide whether to buy a product or avail a service. In a country like India, this trend of online shopping is growing very rapidly because awareness and use of the internet are increasing day by day. As a result, the numbers of customers and of their reviews are also increasing, creating the problem of how to read all reviews manually; some computerized mechanism should therefore provide customers a summary without their spending time reading feedback. Besides the large number of reviews, another problem is that reviews are not structured. Objective: In this paper, we design, implement, and compare two algorithms against a manual approach for cross-domain product reviews. Methods: A lexicon-based model is used, and different types of reviews are tested and analyzed to check the performance of the algorithms. Results: An opinion-based algorithm and a feature-based-opinion algorithm were designed, implemented, applied, and compared with the manual results; algorithm #2 performs better than algorithm #1 and comes close to the manual results. Conclusion: Algorithm #2 was found better on the different product reviews and has still to be applied to other products' reviews to broaden its scope. Finally, it will help automate the existing manual process.
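A minimal lexicon-based review labeler in the spirit of this approach: count positive and negative opinion words and assign each review a polarity. The tiny lexicon and the one-word negation handling are illustrative assumptions, not the paper's algorithms.

```python
# Toy opinion lexicon (illustrative; real systems use lexicons with thousands of entries).
POSITIVE = {"good", "great", "excellent", "fast", "love"}
NEGATIVE = {"bad", "poor", "slow", "broken", "hate"}
NEGATORS = {"not", "never", "no"}

def score_review(text: str) -> str:
    """Label a review positive/negative/neutral by counting opinion words."""
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        polarity = (w in POSITIVE) - (w in NEGATIVE)
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity              # "not good" counts as negative
        score += polarity
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(score_review("battery life is great and delivery was fast"))   # positive
print(score_review("screen is not good and support is slow"))        # negative
```

Aggregating such labels over all reviews of a product yields the kind of summary the paper aims at, sparing the customer from reading each review.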


2018 ◽  
Vol 7 (1) ◽  
pp. 57-72
Author(s):  
H.P. Vinutha ◽  
Poornima Basavaraju

Day by day, network security is becoming a more challenging task. Intrusion detection systems (IDSs) are one of the methods used to monitor network activities, and data mining algorithms play a major role in the field of IDS. The NSL-KDD'99 dataset is used to study network traffic patterns, which helps identify possible attacks on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L, or U2R. The proposed methodology reduces the false positive rate and improves the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes in detection is not good practice. Four feature selection methods, namely Chi-Square, Symmetrical Uncertainty (SU), Gain Ratio, and Information Gain, are used to evaluate the attributes, and unimportant features are removed to reduce the dimension of the data. Ensemble classification techniques, namely Boosting, Bagging, Stacking, and Voting, are then used to observe the detection rate separately with three base algorithms: Decision Stump, J48, and Random Forest.
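The pipeline described, rank attributes with a statistical test, keep the top-k, then train a bagged ensemble of decision trees, can be sketched as below. Synthetic data with 41 features stands in for NSL-KDD (not bundled here), scikit-learn's `DecisionTreeClassifier` stands in for Weka's J48, and the choice of k is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=800, n_features=41, n_informative=8,
                           random_state=7)
X = MinMaxScaler().fit_transform(X)          # chi2 requires non-negative inputs

# Feature selection: keep the 15 attributes with the highest chi-squared score.
X_sel = SelectKBest(chi2, k=15).fit_transform(X, y)

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.3, random_state=7)

# Bagging ensemble over decision-tree base learners.
bagged = BaggingClassifier(DecisionTreeClassifier(random_state=7),
                           n_estimators=25, random_state=7).fit(X_tr, y_tr)
print("detection accuracy:", bagged.score(X_te, y_te))
```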


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2503
Author(s):  
Taro Suzuki ◽  
Yoshiharu Amano

This paper proposes a method for detecting non-line-of-sight (NLOS) multipath, which causes large positioning errors in global navigation satellite systems (GNSS). We use the GNSS signal correlation output, the most primitive GNSS signal processing output, to detect NLOS multipath with machine learning. The shape of the multi-correlator output is distorted by NLOS multipath, and features of that shape are used to discriminate NLOS signals. We implement two supervised learning methods, a support vector machine (SVM) and a neural network (NN), and compare their performance. In addition, we propose an automated method of collecting LOS and NLOS training data for machine learning. Evaluation of the proposed NLOS detection method in an urban environment confirmed that the NN performed better than the SVM and that 97.7% of NLOS signals were correctly discriminated.
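The SVM-vs-NN comparison can be sketched on synthetic "correlator-shape" features: LOS signals get a clean triangular correlation peak, NLOS signals a peak distorted by a delayed echo. The tap layout, echo model, and noise level are illustrative assumptions, not the paper's signal model.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(3)
taps = np.linspace(-1, 1, 11)                 # multi-correlator tap offsets (chips)
peak = 1 - np.abs(taps)                       # ideal triangular correlation peak

n = 400
los = peak + rng.normal(0, 0.05, (n, 11))
nlos = (peak + 0.4 * np.maximum(0, 1 - np.abs(taps - 0.5))   # delayed echo
        + rng.normal(0, 0.05, (n, 11)))      # distorts the peak shape
X = np.vstack([los, nlos])
y = np.array([0] * n + [1] * n)               # 0 = LOS, 1 = NLOS

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)
results = {}
for name, clf in [("SVM", SVC()),
                  ("NN", MLPClassifier(max_iter=2000, random_state=3))]:
    results[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(name, results[name])
```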


AI ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 261-273
Author(s):  
Mario Manzo ◽  
Simone Pellino

COVID-19 has been a great challenge for humanity since 2020. The whole world has made a huge effort to find an effective vaccine in order to save those not yet infected. The alternative is early diagnosis, carried out through real-time polymerase chain reaction (RT-PCR) tests or thoracic computed tomography (CT) scan images. Deep learning algorithms, specifically convolutional neural networks, are a methodology for image analysis: they optimize the classification design task, which is essential for an automatic approach with different types of images, including medical ones. In this paper, we adopt pretrained deep convolutional neural network architectures in order to diagnose COVID-19 disease from CT images. Our idea is inspired by what all of humanity is achieving, as a set of multiple contributions is better than any single one in the fight against the pandemic. First, we adapt, and subsequently retrain for our task, several neural architectures that have been adopted in other application domains. Second, we combine the knowledge extracted from images by these neural architectures in an ensemble classification context. The experimental phase is performed on a CT image dataset, and the results show the effectiveness of the proposed approach with respect to state-of-the-art competitors.


Author(s):  
Xiaoting Zhou ◽  
Weicheng Wu ◽  
Ziyu Lin ◽  
Guiliang Zhang ◽  
Renxiang Chen ◽  
...  

Landslides are one of the major geohazards threatening human society. The objective of this study was to conduct a landslide hazard susceptibility assessment for Ruijin, Jiangxi, China, and to provide technical support to the local government for implementing disaster reduction and prevention measures. Machine learning approaches, e.g., random forests (RFs) and support vector machines (SVMs), were employed, and multiple geo-environmental factors such as land cover, NDVI, landform, rainfall, lithology, and proximity to faults, roads, and rivers, etc., were utilized to achieve our purposes. For categorical factors, three processing approaches were proposed: simple numerical labeling (SNL), weight assignment (WA)-based, and frequency ratio (FR)-based. The 19 geo-environmental factors were converted into raster form under each of the three processing approaches to constitute three 19-band datasets, i.e., DS1, DS2, and DS3. Next, 155 landslides observed in the past decades were vectorized; 70% were randomly selected to compose a training set (TS1) and the remaining 30% to form a validation set (VS1). A number of non-landslide (no-risk) samples distributed over the whole study area were identified in low-slope (<1–3°) zones such as urban areas and croplands, and added to TS1 and VS1 in the same ratio. For comparison, we used the FR approach to identify no-risk samples in both flat and non-flat areas and merged them with the field-observed landslides to constitute another pair of training and validation sets (TS2 and VS2) using the same 7:3 ratio. The RF algorithm was applied to model the probability of landslide occurrence using DS1, DS2, and DS3 as predictive variables and TS1 and TS2 for training, yielding the SNL-based, WA-based, and FR-based RF models, respectively.
Verified against VS1 and VS2, the three models have similar overall accuracy (OA) and Kappa coefficient (KC): 89.61%, 91.47%, and 94.54%, and 0.7926, 0.8299, and 0.8908, respectively. All of them are much better than the three models obtained by the SVM algorithm, with OA of 81.79%, 82.86%, and 83%, and KC of 0.6337, 0.655, and 0.660. Verification against 26 recent landslide events from 2017–2020 revealed that the landslide susceptibility map from the WA-based RF model properly identified the high and very high susceptibility zones where 23 of the new landslides occurred, performing better than the SNL-based and FR-based RF models, though the latter has a slightly higher OA and KC. Hence, we conclude that all three RF models achieve reasonable risk prediction, but the WA-based and FR-based RF models deserve recommendation for application elsewhere. The results of this study may serve as a reference for the local authorities in the prevention and early warning of landslide hazards.
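The frequency-ratio (FR) encoding of a categorical factor works as follows: the FR of a class is the share of landslide cells falling in that class divided by the share of all cells in that class, so values above 1 mark landslide-prone classes. A minimal sketch with a made-up land-cover table (the class names and counts are illustrative):

```python
from collections import Counter

# Illustrative cell inventory for one categorical factor (land cover).
cells = ["forest"] * 50 + ["cropland"] * 30 + ["urban"] * 20   # all map cells
slides = ["forest"] * 8 + ["cropland"] * 2                     # landslide cells

cell_n, slide_n = Counter(cells), Counter(slides)

def frequency_ratio(cls: str) -> float:
    """FR = (landslide share of class) / (area share of class)."""
    class_share = cell_n[cls] / len(cells)
    slide_share = slide_n[cls] / len(slides)
    return slide_share / class_share

for cls in cell_n:
    print(cls, round(frequency_ratio(cls), 2))
```

Replacing each class label with its FR value turns the categorical band into a numeric raster band, which is what distinguishes the FR-based dataset from simple numerical labeling.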


Molecules ◽  
2019 ◽  
Vol 24 (13) ◽  
pp. 2506 ◽  
Author(s):  
Yunfeng Chen ◽  
Yue Chen ◽  
Xuping Feng ◽  
Xufeng Yang ◽  
Jinnuo Zhang ◽  
...  

The feasibility of using Fourier transform infrared (FTIR) spectroscopy with a stacked sparse auto-encoder (SSAE) to identify orchid varieties was studied. Spectral data of 13 orchid varieties covering the spectral range of 4000–550 cm−1 were acquired to establish discriminant models and to select optimal spectral variables. K-nearest neighbors (KNN), support vector machine (SVM), and SSAE models were built using the full spectra. The SSAE model performed better than the KNN and SVM models, obtaining a classification accuracy of 99.4% in the calibration set and 97.9% in the prediction set. Then, three algorithms, principal component analysis loading (PCA-loading), competitive adaptive reweighted sampling (CARS), and stacked sparse auto-encoder guided backward (SSAE-GB), were used to select 39, 300, and 38 optimal wavenumbers, respectively, and KNN and SVM models were built on the selected wavenumbers. Most of the optimal-wavenumber models performed slightly better than the full-spectrum models, and SSAE-GB outperformed the other two algorithms in both the accuracy of the discriminant models and the number of optimal wavenumbers. The results of this study showed that FTIR spectroscopy combined with the SSAE algorithm could be adopted for the identification of orchid varieties.

