Garment Categorization Using Data Mining Techniques

The apparel industry houses a huge amount and variety of data. At every step of the supply chain, data is collected and stored by each supply chain actor. This data, when used intelligently, can help solve a good many of the industry's problems. In this regard, this article is devoted to the application of data mining to the industry’s product data, i.e., data related to a garment, such as fabric, trim, print, shape, and form. The purpose of this article is to use data mining and symmetry-based learning techniques on product data to create a classification model that consists of two subsystems: (1) one for predicting the garment category and (2) one for predicting the garment sub-category. Classification techniques such as Decision Trees, Naïve Bayes, Random Forest, and Bayesian Forest were applied to the ‘Deep Fashion’ open-source database. The data contain three garment categories, 50 garment sub-categories, and 1000 garment attributes. The two subsystems were first trained individually and then integrated using soft classification. The random forest classifier performed comparatively better, with accuracies of 86% for the garment category and 73%, 82%, and 90% for the sub-categories of upper-body, lower-body, and whole-body garments, respectively.
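
The abstract does not spell out how the two subsystems are integrated, so the following is only a minimal sketch of one common reading of "soft classification": the category model's probabilities are multiplied by each sub-category model's probabilities instead of committing to a single category first. The function name `combine_soft`, the label names, and all probabilities are hypothetical and are not taken from the article.

```python
def combine_soft(category_probs, subcategory_probs):
    """Return joint scores P(category) * P(sub-category | category)."""
    joint = {}
    for category, p_cat in category_probs.items():
        for sub, p_sub in subcategory_probs[category].items():
            joint[(category, sub)] = p_cat * p_sub
    return joint


# Hypothetical outputs of the two subsystems for a single garment.
category_probs = {"upper": 0.55, "lower": 0.05, "whole": 0.40}
subcategory_probs = {
    "upper": {"blouse": 0.40, "tee": 0.35, "jacket": 0.25},
    "lower": {"jeans": 0.70, "skirt": 0.30},
    "whole": {"dress": 0.90, "jumpsuit": 0.10},
}

joint = combine_soft(category_probs, subcategory_probs)
best = max(joint, key=joint.get)  # pair with the highest joint score
print(best, round(joint[best], 4))
```

With these made-up numbers, a hard two-step decision would pick "upper" and then "blouse" (joint score 0.22), whereas the soft combination prefers the pair ("whole", "dress") with joint score 0.36, because the sub-category model is far more confident there.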

Proximate Breast Cancer Factors Using Data Mining Classification Techniques

International Journal of Big Data and Analytics in Healthcare ◽

10.4018/ijbdah.2019010104 ◽

2019 ◽

Vol 4 (1) ◽

pp. 47-56

Author(s):

Alice Constance Mensah ◽

Isaac Ofori Asare

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Random Forest ◽

Cancer Patients ◽

Classification Model ◽

Tree Model ◽

Cancer Data ◽

Data Mining Approach ◽

Learning Techniques ◽

Using Data

Breast cancer is the most common of all cancers and is the leading cause of cancer deaths in women worldwide. The classification of breast cancer data can be useful to predict the outcome of some diseases or discover the genetic behavior of tumors. Data mining helps to classify cancer patients and to identify potential cancer patients by analyzing their data. This study examines the determinant factors of breast cancer and uses breast cancer patient data to build a classification model with a data mining approach. Of the 2397 women in the study, 1022 (42.64%) were diagnosed with breast cancer. Four learning techniques were used: Random Forest, Naive Bayes, Classification and Regression Trees (CART), and Boosted Tree. Random Forest had the best accuracy, 0.9892 (95% CI, 0.9832 to 0.9935), with a sensitivity of about 92%. This suggests that Random Forest is the best of the four models for classifying and predicting breast cancer from the associated factors.
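
As a rough illustration of the reported metrics, the sketch below computes accuracy, sensitivity, and a 95% confidence interval for accuracy from confusion-matrix counts. The abstract does not say how its interval was obtained; the Wilson score interval used here is an assumption, chosen because it needs only the standard library. The counts are hypothetical and are not the study's data.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Share of true positive cases that the model detected (recall)."""
    return tp / (tp + fn)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical confusion-matrix counts (not taken from the study).
tp, fp, tn, fn = 90, 5, 95, 10
acc = accuracy(tp, fp, tn, fn)
sens = sensitivity(tp, fn)
low, high = wilson_interval(tp + tn, tp + fp + tn + fn)
print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

Reporting sensitivity alongside accuracy matters in this setting because a missed cancer case (a false negative) is usually costlier than a false alarm.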

Hybrid classification model to detect advanced intrusions using data mining techniques

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.4.10031 ◽

2018 ◽

Vol 7 (2.4) ◽

pp. 10

Author(s):

V Mala ◽

K Meena

Keyword(s):

Data Mining ◽

Learning Algorithm ◽

Detection System ◽

Classification Model ◽

Detection Methods ◽

Data Mining Techniques ◽

Detection Systems ◽

Intruder Detection ◽

Hybrid Classification ◽

Using Data

Traditional signature-based approaches fail to detect advanced malware such as Stuxnet, Flame, and Duqu. Signature-based comparison and correlation are not adequate for detecting such attacks, so it is crucial to detect them as early as possible. In this research, a novel data mining based approach was applied to detect such attacks. The main innovation lies in misuse (signature) detection systems based on a supervised learning algorithm. In the learning phase, labeled examples of network packets or system calls are provided, from which the algorithm learns about attacks; this is fast and reliable for known attacks. To detect advanced attacks, unsupervised learning methodologies were employed to detect the presence of zero-day or new attacks. The main objectives are to review different intruder detection methods and to study the role of data mining techniques in intruder detection systems. A hybrid classification model is used to detect advanced attacks.

Predicting heart ailment in patients with varying number of features using data mining techniques

International Journal of Informatics and Communication Technology (IJ-ICT) ◽

10.11591/ijict.v8i1.pp56-62 ◽

2019 ◽

Vol 8 (1) ◽

pp. 56

Author(s):

T R Stella Mary ◽

Shoney Sebastian

Keyword(s):

Data Mining ◽

Heart Disease ◽

Random Forest ◽

Naive Bayes ◽

Heart Diseases ◽

Naïve Bayes ◽

Bayes Classifier ◽

Data Mining Techniques ◽

Using Data ◽

Almost All

Data mining can be defined as a process of extracting unknown, verifiable and possibly helpful information from data. Among the various ailments, heart ailment is one of the primary reasons behind the death of individuals around the globe; to help curb this, a detailed analysis is done using data mining. We often limit ourselves to the minimal attributes required to predict heart disease in a patient, and in doing so miss many important attributes that are main causes of heart disease. Hence, this research aims to consider almost all the important features affecting heart disease and performs the analysis step by step, from a minimal to a maximal set of attributes, using data mining techniques to predict heart ailments. The classification methods used are the Naïve Bayes classifier, Random Forest and Random Tree, applied to three datasets with different numbers of attributes but a common class label. The analysis shows a gradual increase in prediction accuracy as the number of attributes increases, irrespective of the classifier used, and the Naïve Bayes and Random Forest algorithms comparatively outperform on these datasets.

Performance Analysis and Prediction Student Performance to build effective student Using Data Mining Techniques

UHD Journal of Science and Technology ◽

10.21928/uhdjst.v3n2y2019.pp10-15 ◽

2019 ◽

Vol 3 (2) ◽

pp. 10

Author(s):

Ardalan Husin Awlla

Keyword(s):

Data Mining ◽

Performance Analysis ◽

Student Performance ◽

Fraud Detection ◽

Classification Model ◽

Data Mining Techniques ◽

Incipient Stage ◽

The Everyday ◽

Using Data

In this era of computerization, education has also transformed itself and is no longer limited to the traditional lecture method. The search is on every day for better approaches that make it more successful and productive for students. Nowadays, large amounts of data are gathered in educational databases, but they remain unutilized. Effective tools are required to obtain the benefits of such large volumes of data. Data mining is an emerging, powerful tool for analysis and prediction. It has been applied effectively in fraud detection, marketing, forecasting and loan assessment. However, it is at an incipient stage in the area of education. In this paper, data mining techniques have been applied to construct a classification model to predict the performance of students.

A Framework for Modeling Efficient Demand Forecasting Using Data Mining in Supply Chain of Food Products Export Industry

Advances in Intelligent and Soft Computing - Proceedings of the 6th CIRP-Sponsored International Conference on Digital Enterprise Technology ◽

10.1007/978-3-642-10430-5_106 ◽

2010 ◽

pp. 1387-1397

Author(s):

Pongsak Holimchayachotikul ◽

Nuanlaor Phanruangrong

Keyword(s):

Data Mining ◽

Supply Chain ◽

Demand Forecasting ◽

Food Products ◽

Export Industry ◽

Using Data

Email Worm Detection Using Data Mining

Techniques and Applications for Advanced Information Privacy and Security ◽

10.4018/978-1-60566-210-7.ch002 ◽

2011 ◽

pp. 20-34

Author(s):

Mohammad M. Masud ◽

Latifur Khan ◽

Bhavani Thuraisingham

Keyword(s):

Data Mining ◽

Feature Selection ◽

Principal Component ◽

Classification Model ◽

Support Vector ◽

Two Phase ◽

Feature Selection Technique ◽

Worm Detection ◽

Phase Selection ◽

Using Data

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features, such as the total number of words in the message body or subject, the presence or absence of binary attachments, the type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature selection and dimension reduction. This step is necessary to reduce noise and redundancy in the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of decision tree and greedy selection algorithm. The dimension reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.
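
The abstract gives no details of TPS, so the sketch below is not that algorithm. It only illustrates the generic greedy (forward) selection idea it builds on: features are added one at a time for as long as a classifier's score improves. A nearest-centroid classifier scored on the training data stands in for the SVM purely to keep the example dependency-free, and the three features (word count, binary-attachment flag, an unrelated noise value) and all rows are hypothetical.

```python
def centroid_accuracy(rows, labels, features):
    """Training accuracy of a nearest-centroid classifier on the chosen features."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(r[f] for r in members) / len(members) for f in features]

    def distance(row, centroid):
        # Squared Euclidean distance restricted to the chosen features.
        return sum((row[f] - centroid[i]) ** 2 for i, f in enumerate(features))

    correct = 0
    for row, label in zip(rows, labels):
        predicted = min(centroids, key=lambda k: distance(row, centroids[k]))
        correct += predicted == label
    return correct / len(rows)


def greedy_forward_selection(rows, labels, max_features):
    """Add the best-scoring feature each round; stop when the score stops improving."""
    selected, best_score = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        scored = [(centroid_accuracy(rows, labels, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical emails: [word_count, has_binary_attachment, noise]; label 1 = worm.
rows = [[120, 0, 3], [95, 0, 7], [110, 0, 5], [100, 1, 4], [115, 1, 6], [105, 1, 5]]
labels = [0, 0, 0, 1, 1, 1]
selected, score = greedy_forward_selection(rows, labels, max_features=2)
print(selected, score)
```

On this toy data only the attachment flag (feature index 1) is selected, since it already separates the two classes and no further feature improves the score. A real evaluation would score candidates on held-out data rather than on the training set.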

Analysis and Prediction of Higher Education Learners’ Mindset Using Data Mining Tool and Techniques

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9074 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4344-4349

Author(s):

Safira Begum ◽

Sunita S. Padmannavar

Keyword(s):

Data Mining ◽

Undergraduate Students ◽

Classification Model ◽

Training Dataset ◽

Text Analytics ◽

Descriptive Research ◽

Making Sense ◽

Rural And Urban ◽

Mining Tool ◽

Using Data

With the advent of smartphones and an internet-first approach, large amounts of data are generated and collected every day, which is considered Big Data. Analyzing and making sense of such data is very important but also challenging because of its complexity. Knowledge is hidden in the data and can be extracted through data mining techniques. The purpose of this descriptive research study is to evaluate and predict the mindset of rural and urban students with the text analytics and visualization capabilities of the Orange tool. As part of the study, a group of undergraduate students from different regional backgrounds participated in a survey that analyzed their mindset as growth mindset, fixed mindset, or a range in between. The respondents’ data was used as the training dataset for the classification model, which was then used to train the prediction model in the Orange tool. The results showed 75% accuracy in predicting the mindset.

Predicting Heart Ailment in Patients with Varying number of Features using Data Mining Techniques

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i4.pp2675-2681 ◽

2019 ◽

Vol 9 (4) ◽

pp. 2675

Author(s):

T R Stella Mary ◽

Shoney Sebastian

Keyword(s):

Data Mining ◽

Heart Disease ◽

Random Forest ◽

Naive Bayes ◽

Heart Diseases ◽

Naïve Bayes ◽

Bayes Classifier ◽

Data Mining Techniques ◽

Using Data ◽

Almost All

Data mining can be defined as a process of extracting unknown, verifiable and possibly helpful information from data. Among the various ailments, heart ailment is one of the primary reasons behind the death of individuals around the globe; to help curb this, a detailed analysis is done using data mining. We often limit ourselves to the minimal attributes required to predict heart disease in a patient, and in doing so miss many important attributes that are main causes of heart disease. Hence, this research aims to consider almost all the important features affecting heart disease and performs the analysis step by step, from a minimal to a maximal set of attributes, using data mining techniques to predict heart ailments. The classification methods used are the Naïve Bayes classifier, Random Forest and Random Tree, applied to three datasets with different numbers of attributes but a common class label. The analysis shows a gradual increase in prediction accuracy as the number of attributes increases, irrespective of the classifier used, and the Naïve Bayes and Random Forest algorithms comparatively outperform on these datasets.

Differential privacy based classification model for mining medical data stream using adaptive random forest

Acta Universitatis Sapientiae Informatica ◽

10.2478/ausi-2021-0001 ◽

2021 ◽

Vol 13 (1) ◽

pp. 1-20

Author(s):

Hayder K. Fatlawi ◽

Attila Kiss

Keyword(s):

Data Mining ◽

Random Forest ◽

Data Stream ◽

Differential Privacy ◽

Medical Data ◽

The Other ◽

Classification Model ◽

Mining Operations ◽

Typical Data ◽

Stable Performance

Most typical data mining techniques are developed for training on batch data, which makes mining a data stream a significant challenge. On the other hand, providing a mechanism to perform data mining operations without revealing the patient’s identity is of increasing importance in the data mining field. In this work, a classification model with differential privacy is proposed for mining medical data streams using Adaptive Random Forest (ARF). The experimental results of applying the proposed model to four medical datasets show that ARF mostly has more stable performance than the other six techniques.
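
The abstract does not state which privacy mechanism the model uses. As a minimal sketch of the general idea of differential privacy, the example below applies the standard Laplace mechanism to a count query: noise with scale sensitivity / epsilon is added before the value is released. The function names, the example count, and the epsilon value are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count under epsilon-differential privacy (Laplace mechanism).

    sensitivity is the most one patient's record can change the count (1 here).
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
true_count = 128         # hypothetical number of patients matching a query
released = private_count(true_count, epsilon=0.5, rng=rng)
print(round(released, 2))
```

A smaller epsilon means a larger noise scale and therefore stronger privacy at the cost of accuracy, which is the trade-off any differentially private classifier has to manage.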

Prognosis on Stratification of Breast Cancer using Data Mining Models

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f3406.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 650-653

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Random Forest ◽

Prediction Model ◽

Prediction Models ◽

Sequential Minimal Optimization ◽

Fine Needle Aspirate ◽

Fine Needle ◽

Physical Attributes ◽

Using Data

Breast cancer classification can be useful for discovering the genetic behavior of tumors and envisioning the outcome of some diseases. In this paper we predict the noxious behavior of a tumor. The prediction models used are Random Forest, Naïve Bayes, IBK (Instance Based Learner), SMO (Sequential Minimal Optimization), and Multi Class Classifier. The prediction model, which can potentially be used as a biomarker of breast cancer, is based on physical attributes of a breast mass gathered from a digitized image of a Fine Needle Aspirate (FNA). These can be helpful in the prediction and reduction of invasive tumors.
