Garment Categorization Using Data Mining Techniques

Symmetry ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 984
Author(s):  
Sheenam Jain ◽  
Vijay Kumar

The apparel industry houses a huge amount and variety of data. At every step of the supply chain, data is collected and stored by each supply chain actor. This data, when used intelligently, can help with solving a good deal of problems for the industry. In this regard, this article is devoted to the application of data mining on the industry’s product data, i.e., data related to a garment, such as fabric, trim, print, shape, and form. The purpose of this article is to use data mining and symmetry-based learning techniques on product data to create a classification model that consists of two subsystems: (1) for predicting the garment category and (2) for predicting the garment sub-category. Classification techniques, such as Decision Trees, Naïve Bayes, Random Forest, and Bayesian Forest were applied to the ‘Deep Fashion’ open-source database. The data contain three garment categories, 50 garment sub-categories, and 1000 garment attributes. The two subsystems were first trained individually and then integrated using soft classification. It was observed that the performance of the random forest classifier was comparatively better, with an accuracy of 86%, 73%, 82%, and 90%, respectively, for the garment category, and sub-categories of upper body garment, lower body garment, and whole-body garment.
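
The abstract does not spell out how the two subsystems are integrated, so the following is only a minimal sketch of one common reading of "soft classification": multiply the category probability by the sub-category probability conditional on that category, then pick the highest joint score. The function names, garment labels, and probability values are hypothetical and are not taken from the paper or the Deep Fashion data.

```python
def combine_soft(category_probs, subcategory_probs):
    """Return joint scores P(category) * P(sub-category | category).

    category_probs:    {category: probability}
    subcategory_probs: {category: {sub_category: probability}}
    """
    joint = {}
    for category, p_cat in category_probs.items():
        for sub, p_sub in subcategory_probs[category].items():
            joint[(category, sub)] = p_cat * p_sub
    return joint


def predict(category_probs, subcategory_probs):
    """Pick the (category, sub-category) pair with the highest joint score."""
    joint = combine_soft(category_probs, subcategory_probs)
    return max(joint, key=joint.get)


# Hypothetical classifier outputs for a single garment (illustrative only).
category_probs = {"upper": 0.6, "lower": 0.1, "whole": 0.3}
subcategory_probs = {
    "upper": {"tee": 0.7, "blouse": 0.3},
    "lower": {"jeans": 0.9, "skirt": 0.1},
    "whole": {"dress": 0.8, "jumpsuit": 0.2},
}

best = predict(category_probs, subcategory_probs)
print(best)  # ('upper', 'tee')
```

Keeping the probabilities rather than hard labels means an uncertain category prediction can still be outweighed by a confident sub-category prediction.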

Author(s):  
Alice Constance Mensah ◽  
Isaac Ofori Asare

Breast cancer is the most common of all cancers and is the leading cause of cancer deaths in women worldwide. The classification of breast cancer data can be useful to predict the outcome of some diseases or discover the genetic behavior of tumors. Data mining technology helps classify cancer patients and identify potential cancer patients by analyzing their data. This study examines the determinant factors of breast cancer and uses breast cancer patient data to build a useful classification model with a data mining approach. Of the 2397 women in the study, 1022 (42.64%) were diagnosed with breast cancer. Four learning techniques were used: Random Forest, Naive Bayes, Classification and Regression Tree (CART), and Boosted Tree. The Random Forest technique had the best accuracy, 0.9892 (95% CI, 0.9832-0.9935), and a sensitivity of about 92%. This suggests that, of the four models, Random Forest is the best for classifying and predicting breast cancer based on associated factors.
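
For readers unfamiliar with the two metrics quoted above, here is a small sketch of how accuracy and sensitivity are computed from a confusion matrix. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's results. The only figures taken from the abstract are the cohort size (2397) and the number diagnosed (1022).

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of actual positive cases that were detected (recall)."""
    return tp / (tp + fn)


# Figures from the abstract: 1022 of 2397 women were diagnosed.
share_diagnosed = 1022 / 2397 * 100
print(round(share_diagnosed, 2))  # 42.64

# Hypothetical confusion-matrix counts (NOT from the study).
tp, fn, tn, fp = 92, 8, 95, 5
print(accuracy(tp, fp, tn, fn))
print(sensitivity(tp, fn))
```

A model can have high accuracy and still miss many positive cases, which is why the abstract reports both numbers.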


2018 ◽  
Vol 7 (2.4) ◽  
pp. 10
Author(s):  
V Mala ◽  
K Meena

The traditional signature-based approach fails to detect advanced malware such as Stuxnet, Flame, and Duqu. Signature-based comparison and correlation are not up to the mark in detecting such attacks. Hence, it is crucial to detect these kinds of attacks as early as possible. In this research, a novel data-mining-based approach was applied to detect such attacks. The main innovation lies in misuse (signature) detection systems based on a supervised learning algorithm. In the learning phase, labeled examples of network packets or system calls are provided, from which the algorithm learns about attacks; this is fast and reliable for known attacks. To detect advanced attacks, unsupervised learning methods were employed to detect the presence of zero-day (new) attacks. The main objectives are to review different intruder detection methods and to study the role of data mining techniques in intruder detection systems. A hybrid classification model is used to detect advanced attacks.


Author(s):  
T R Stella Mary ◽  
Shoney Sebastian

Data mining can be defined as a process of extracting unknown, verifiable and possibly helpful data from information. Among the various ailments, heart disease is one of the primary causes of death worldwide; to help curb this, a detailed analysis is carried out using data mining. Prediction is often limited to the minimal set of attributes required to identify a patient with heart disease, which leaves out many important attributes that are main causes of heart disease. Hence, this research aims to consider almost all the important features affecting heart disease and performs the analysis step by step, from a minimal to a maximal set of attributes, using data mining techniques to predict heart ailments. The classification methods used are the Naïve Bayes classifier, Random Forest, and Random Tree, applied to three datasets with different numbers of attributes but a common class label. The analysis shows a gradual increase in prediction accuracy as the number of attributes increases, irrespective of the classifier used, and the Naïve Bayes and Random Forest algorithms comparatively outperform on these datasets.


2019 ◽  
Vol 3 (2) ◽  
pp. 10
Author(s):  
Ardalan Husin Awlla

In this era of computerization, education has also transformed itself and is no longer restricted to the traditional lecture method. There is an ongoing search for better approaches to make it more effective and productive for students. Nowadays, large amounts of data are collected in educational databases, but they remain unutilized. To obtain the required benefits from such large data, effective tools are needed. Data mining is an emerging, powerful tool for analysis and prediction. It has been applied effectively in fraud detection, marketing, promotion, forecasting, and loan assessment. However, it is still at an incipient stage in education. In this paper, data mining techniques are applied to construct a classification model to predict student performance.


Author(s):  
Mohammad M. Masud ◽  
Latifur Khan ◽  
Bhavani Thuraisingham

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features, such as the total number of words in the message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature selection and dimension reduction. This step is necessary to reduce noise and redundancy in the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of a decision tree and a greedy selection algorithm. The dimension reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes, and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.
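
The abstract does not describe TPS in detail, so the sketch below is not TPS; it only illustrates the generic greedy (forward) selection idea that the abstract names as one ingredient. The feature names, the usefulness weights, the redundancy penalty, and the `toy_score` function are all hypothetical stand-ins for a real evaluation such as cross-validated classifier accuracy.

```python
def greedy_forward_selection(features, score, max_features):
    """Add one feature at a time, keeping each addition only if it
    improves the score of the selected subset."""
    selected = []
    best_score = score(selected)
    remaining = list(features)
    while remaining and len(selected) < max_features:
        # Score every candidate extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best_score:
            break  # no candidate improves the subset; stop early
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


# Hypothetical per-feature usefulness and a redundancy penalty.
USEFULNESS = {
    "word_count": 0.30,
    "has_binary_attachment": 0.50,
    "attachment_type": 0.45,
    "subject_length": 0.05,
}
REDUNDANT = {frozenset({"has_binary_attachment", "attachment_type"}): 0.40}


def toy_score(subset):
    """Toy objective: summed usefulness, minus redundancy, minus a
    small cost per feature (assumed values, for illustration only)."""
    total = sum(USEFULNESS[f] for f in subset)
    for pair, penalty in REDUNDANT.items():
        if pair <= set(subset):
            total -= penalty
    return total - 0.1 * len(subset)


chosen = greedy_forward_selection(list(USEFULNESS), toy_score, max_features=3)
print(chosen)  # ['has_binary_attachment', 'word_count']
```

In this toy run the redundant `attachment_type` feature is skipped even though it looks useful on its own, which is the kind of noise and redundancy reduction the abstract refers to.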


2020 ◽  
Vol 17 (9) ◽  
pp. 4344-4349
Author(s):  
Safira Begum ◽  
Sunita S. Padmannavar

With the advent of smartphones and the internet-first approach, large amounts of data are generated and collected every day; this is regarded as Big Data. Analyzing and making sense of such data is very important but also challenging because of its complexity. Knowledge is hidden in the data and can be extracted through data mining techniques. The purpose of this descriptive research study is to evaluate and predict the mindset of rural and urban students using the text analytics and visualization capabilities of the Orange tool. As part of the study, a group of undergraduate students from different regional backgrounds participated in a survey to analyze their mindset, from growth mindset to fixed mindset with a range in between. The respondents’ data was used as the training dataset for the classification model, which was then used to train the prediction model in the Orange tool. The results showed 75% accuracy in predicting the mindset.




2021 ◽  
Vol 13 (1) ◽  
pp. 1-20
Author(s):  
Hayder K. Fatlawi ◽  
Attila Kiss

Most typical data mining techniques are developed for training on batch data, which makes mining a data stream a significant challenge. In addition, providing a mechanism to perform data mining operations without revealing the patient’s identity is increasingly important in the data mining field. In this work, a classification model with differential privacy is proposed for mining medical data streams using Adaptive Random Forest (ARF). The experimental results of applying the proposed model to four medical datasets show that ARF mostly has more stable performance than the other six techniques.
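
The abstract does not say which differential privacy mechanism the model uses. As a hedged illustration of the general idea, the sketch below applies the standard Laplace mechanism to a simple count query: noise with scale sensitivity/epsilon is added so that any single patient's presence has a bounded effect on the released value. The function names, the count of 120, and epsilon = 0.5 are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two independent
    exponential variables with rate 1/scale."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity/epsilon.

    A count changes by at most 1 when one record is added or removed,
    so its sensitivity is 1.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = [private_count(120, epsilon=0.5, rng=rng) for _ in range(10000)]
mean_noisy = sum(noisy) / len(noisy)
print(round(mean_noisy))  # close to the true count on average
```

A smaller epsilon gives stronger privacy but noisier answers, which is the trade-off any private classifier over patient data has to manage.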


Breast cancer classification can be useful for discovering the genetic behavior of tumors and predicting the outcome of some diseases. In this paper, we predict the noxious behavior of a tumor. The prediction models used are Random Forest, Naïve Bayes, IBK (Instance Based Learner), SMO (Sequential Minimal Optimization), and Multi Class Classifier. The prediction model, which can potentially be used as a biomarker of breast cancer, is based on physical attributes of a breast mass gathered from a digitized image of a Fine Needle Aspirate (FNA). These can be helpful in the prediction and reduction of invasive tumors.

