An Expanded Feature Extraction of E-Mail Header for Spam Recognition

2013 ◽  
Vol 846-847 ◽  
pp. 1672-1675 ◽  
Author(s):  
Yuan Ning Liu ◽  
Ye Han ◽  
Xiao Dong Zhu ◽  
Fei He ◽  
Li Yan Wei

One current spam filtering method extracts attributes from the e-mail header and uses machine learning methods to classify the sample sets. Over time, however, spammers change the ways they send spam, which results in substantial changes to spam headers, so attributes defined in the past cannot deal with this change sufficiently. This paper extracts attributes from all header fields that can possibly be forged in order to expand the feature set, then uses rough set theory to classify the sample sets. Experiments indicate that including more attributes in the feature set may lead to better performance than other algorithms, in terms of higher recall and precision and lower false recognition.
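To make the idea of header attributes concrete, here is a minimal Python sketch that maps a raw message to a small feature dictionary using only the standard library. The specific attributes (`received_hops`, `reply_to_differs`, and so on) and the `header_features` helper are illustrative assumptions; the abstract does not list the attributes the paper actually extracts.

```python
from email import message_from_string
from email.utils import parseaddr


def header_features(raw: str) -> dict:
    """Map one raw e-mail to a few illustrative header attributes.

    The attribute names are hypothetical examples, not the paper's feature set.
    """
    msg = message_from_string(raw)
    from_addr = parseaddr(msg.get("From", ""))[1].lower()
    reply_addr = parseaddr(msg.get("Reply-To", ""))[1].lower()
    return {
        # Number of relay hops recorded in the header.
        "received_hops": len(msg.get_all("Received", [])),
        # Fields that legitimate mail software normally sets.
        "has_message_id": msg.get("Message-ID") is not None,
        "has_date": msg.get("Date") is not None,
        # A Reply-To that points somewhere other than From.
        "reply_to_differs": bool(reply_addr) and reply_addr != from_addr,
        # Domain part of the sender address.
        "from_domain": from_addr.rpartition("@")[2],
    }


# Made-up sample message for demonstration only.
SAMPLE = (
    "Received: from mail.example.org by mx.example.com\n"
    "Received: from unknown by mail.example.org\n"
    "From: Alice <alice@example.com>\n"
    "Reply-To: offers@example.net\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

features = header_features(SAMPLE)
print(features)
```

Each dictionary produced this way would be one row of a decision table; the classification step with rough set theory is not shown here.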

2014 ◽  
Vol 543-547 ◽  
pp. 2017-2023
Author(s):  
Qing Guan ◽  
Jian He Guan

This paper proposes a new extension of fuzzy rough theory that uses a partition of interval set-valued data for granular computing during knowledge discovery. The natural intervals of attribute values in a decision system are transformed by normalization into multiple sub-intervals of [0,1]. Some characteristics of interval set-valued decision systems in fuzzy rough set theory are discussed. Experiments show the correctness and effectiveness of the approach. The approach can also be used as a data preprocessing step for symbolic knowledge discovery or machine learning methods other than rough set theory.
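The normalization step mentioned in this abstract can be illustrated with a short Python sketch. It assumes plain min-max scaling over the endpoints of one attribute's intervals; the abstract does not state which normalization the authors use, and the `normalize_intervals` helper and the sample values are hypothetical.

```python
def normalize_intervals(intervals):
    """Min-max scale (low, high) intervals of one attribute into [0, 1].

    Assumes min-max scaling; the paper's exact normalization is not given.
    """
    lo = min(a for a, _ in intervals)
    hi = max(b for _, b in intervals)
    span = hi - lo
    if span == 0:
        # All intervals collapse to a single point.
        return [(0.0, 0.0) for _ in intervals]
    return [((a - lo) / span, (b - lo) / span) for a, b in intervals]


# Made-up interval-valued attribute, for example temperature ranges.
temps = [(10, 20), (15, 30), (20, 50)]
scaled = normalize_intervals(temps)
print(scaled)
```

After scaling, every interval is a sub-interval of [0,1], so attributes with different natural ranges become comparable before any fuzzy rough computation.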


2021 ◽  
Author(s):  
Ghazaala Yasmin ◽  
Asit Kumar Das ◽  
Janmenjoy Nayak ◽  
S Vimal ◽  
Soumi Dutta

Speech is one of the most delicate media through which the gender of a speaker can easily be identified. Although related research has shown very good progress in machine learning, deep learning has recently opened a very good research area for exploring the deficiencies of gender discrimination with traditional machine learning techniques. In deep learning techniques, the speech features are generated automatically by reinforcement learning from the raw data and have more discriminating power than human-generated features. In some practical situations such as gender recognition, however, a combination of both types of features is observed to sometimes provide comparatively better performance. In the proposed work, we first extract and select informative and precise acoustic features relevant to gender recognition using entropy-based information theory and Rough Set Theory (RST). Next, the audio speech signals are fed directly into a deep neural network model consisting of a Convolution Neural Network (CNN) and a Gated Recurrent Unit network (GRUN) to extract features useful for gender recognition. The RST selects precise and informative features, the CNN extracts the locally encoded important features, and the GRUN reduces the vanishing gradient and exploding gradient problems. Finally, a hybrid gender recognition system is developed by combining both generated feature vectors. The developed model has been tested on five benchmark datasets and a simulated dataset to evaluate its performance, and the combined feature vector is observed to provide a more effective gender recognition system, especially when transgender is considered as a gender type together with male and female.


2011 ◽  
Vol 230-232 ◽  
pp. 625-628
Author(s):  
Lei Shi ◽  
Xin Ming Ma ◽  
Xiao Hong Hu

E-business has grown rapidly in the last decade, and a massive amount of data on customer purchases, browsing patterns and preferences has been generated. Classification of electronic data plays a pivotal role in mining valuable information and has thus become one of the most important applications of E-business. Support Vector Machines are popular and powerful machine learning techniques that offer state-of-the-art performance. Rough set theory is a formal mathematical tool for dealing with incomplete or imprecise information, and one of its important applications is feature selection. In this paper, rough set theory and support vector machines are combined to construct a classification model that classifies E-business data effectively.
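As a rough illustration of how rough set theory supports feature selection, the following Python sketch computes indiscernibility classes, lower and upper approximations, and the dependency degree of a decision attribute on a set of condition attributes. These are the standard textbook definitions; the toy table and its attribute names are invented for the example, and the paper's own selection procedure and the SVM stage are not shown.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def lower_approximation(rows, attrs, target):
    """Union of classes wholly contained in the target set of row indices."""
    result = set()
    for block in partition(rows, attrs):
        if block <= target:
            result |= block
    return result


def upper_approximation(rows, attrs, target):
    """Union of classes that overlap the target set of row indices."""
    result = set()
    for block in partition(rows, attrs):
        if block & target:
            result |= block
    return result


def dependency(rows, attrs, decision):
    """Fraction of rows in the positive region of decision given attrs."""
    positive = set()
    for decision_class in partition(rows, [decision]):
        positive |= lower_approximation(rows, attrs, decision_class)
    return len(positive) / len(rows)


# Hypothetical customer table; all values are made up.
rows = [
    {"visits": "high", "cart": "yes", "buy": "yes"},
    {"visits": "high", "cart": "no", "buy": "no"},
    {"visits": "low", "cart": "yes", "buy": "yes"},
    {"visits": "low", "cart": "no", "buy": "no"},
    {"visits": "high", "cart": "yes", "buy": "yes"},
]

buyers = {i for i, r in enumerate(rows) if r["buy"] == "yes"}
print(dependency(rows, ["visits", "cart"], "buy"))
print(dependency(rows, ["cart"], "buy"))
print(dependency(rows, ["visits"], "buy"))
```

In this toy table `cart` alone preserves the full dependency while `visits` contributes nothing, so a selection step would keep only `cart` before passing the data to a classifier.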


2021 ◽  
Vol 23 (4) ◽  
pp. 695-708
Author(s):  
Katarzyna Antosz ◽  
Małgorzata Jasiulewicz-Kaczmarek ◽  
Łukasz Paśko ◽  
Chao Zhang ◽  
Shaoping Wang

The lean maintenance concept is crucial for increasing the reliability and availability of maintenance equipment in manufacturing companies. By eliminating losses in maintenance processes, this concept reduces the number of unplanned downtimes and unexpected failures and, at the same time, influences a company’s operational and economic performance. Despite the widespread use of lean maintenance, there is no structured approach to support the choice of methods and tools used to improve the maintenance function. Therefore, this paper proposes a new approach that uses machine learning methods and rough set theory. This approach supports decision makers in selecting methods and tools for the effective implementation of lean maintenance.


Author(s):  
Hiroshi Sakai ◽  
Kazuhiro Koba ◽  
Michinori Nakata

Rough set theory has mainly been applied to data with categorical values. To handle data with numerical values in this theory, the familiar concept of ‘wildcards’ was employed, and a new framework for rough set-based rule generation was proposed. Two characters, @ and #, were introduced into this framework, and numerical patterns were defined for numerical values. The concepts of ‘coarse’ and ‘fine’ rules were explicitly defined according to numerical patterns. This paper enhances the previous framework and describes the implementation of a utility program. The utility program is applied to data in the UCI Machine Learning Repository, and some useful rules are obtained.


2019 ◽  
Vol 11 (23) ◽  
pp. 6803
Author(s):  
Jiwoo Kim ◽  
Sanghun Shin ◽  
Hee Soo Lee ◽  
Kyong Joo Oh

An initial public offering (IPO) is a type of public offering in which a company’s shares are sold to institutional and individual investors. While the majority of studies on IPOs have focused on the efficiency of raising capital and price adequacy in IPOs, studies on portfolio allocation strategies for IPO stocks are relatively scarce. This paper develops a machine learning investment strategy for IPO stocks based on rough set theory and a genetic algorithm (GA-rough set theory). To reduce issues of information asymmetry, we use nonfinancial data that are publicly available to individual and institutional investors in the IPO process. Based on the rule sets generated from the training sets, we conduct 120 tests with various conditions involving the target days and the partition of the training and testing sets, and we find excess returns of the constructed portfolios compared to the benchmark portfolios. Investors in IPO stocks can formulate more efficient investment strategies using our system. In this sense, the system developed in this paper contributes to the efficiency of financial markets and helps achieve sustained economic growth.


2020 ◽  
Vol 1529 ◽  
pp. 052048
Author(s):  
Touhid Mohammad Hossain ◽  
Junzo Watada ◽  
Maman Hermana ◽  
Izzatdin A Aziz

2015 ◽  
Vol 142 (1-4) ◽  
pp. 53-86 ◽  
Author(s):  
Sarah Vluymans ◽  
Lynn D’eer ◽  
Yvan Saeys ◽  
Chris Cornelis
