Feature extraction and prediction of Dengue Outbreaks

Author(s):  
Kunal Parikh ◽  
Tanvi Makadia ◽  
Harshil Patel

Dengue is one of the biggest health concerns in India and in many other developing countries, and it has claimed many lives. Every year, approximately 390 million dengue infections occur worldwide; of these, about 500,000 are severe and about 25,000 are fatal. Many factors contribute to dengue outbreaks, including temperature, humidity, precipitation, and inadequate public health. In this paper, we propose a method for performing predictive analytics on a dengue dataset using k-nearest neighbors (KNN), a machine-learning algorithm. Such analysis could help predict future cases and thereby save many lives.
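As a rough illustration of the kind of KNN prediction this abstract describes, here is a minimal sketch. The abstract does not give the dataset, the feature set, or the value of k, so the records, the labels, the function name `knn_predict`, and `k=3` are all hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    distances = [
        (math.dist(x, query), label)
        for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical weekly records (invented for illustration only):
# (temperature in °C, relative humidity in %, precipitation in mm)
train_X = [
    (31.0, 85.0, 120.0),
    (30.0, 80.0, 100.0),
    (29.5, 82.0, 110.0),
    (22.0, 40.0, 5.0),
    (21.0, 45.0, 10.0),
    (23.0, 38.0, 0.0),
]
train_y = [
    "outbreak", "outbreak", "outbreak",
    "no_outbreak", "no_outbreak", "no_outbreak",
]

prediction = knn_predict(train_X, train_y, (30.5, 83.0, 115.0), k=3)
print(prediction)
```

In practice the features would likely need to be scaled before computing distances, since precipitation in millimetres would otherwise dominate temperature in degrees.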

2021 ◽  
Vol 6 (22) ◽  
pp. 51-59
Author(s):  
Mustazzihim Suhaidi ◽  
Rabiah Abdul Kadir ◽  
Sabrina Tiun

Extracting features from input data is vital for successful classification and machine learning tasks. Classification is the process of assigning an object to one of a set of predefined categories. Many feature selection and feature extraction methods exist and are widely used. Feature extraction is the transformation of large input data into a low-dimensional feature vector, which then serves as input to a classification or other machine learning algorithm. Feature extraction poses major challenges, which are discussed in this paper; chief among them is learning and extracting knowledge from text datasets in order to make correct decisions. The objective of this paper is to give an overview of methods used in feature extraction for various applications, using a dataset of texts collected from social media.
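To make the idea of turning raw text into a fixed-length feature vector concrete, the sketch below computes TF-IDF weights. TF-IDF is one common text representation chosen here for illustration; the abstract does not say which methods the paper surveys. The sample posts and the helper names `tokenize` and `tfidf_vectors` are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors

# Hypothetical social media posts (invented for illustration only)
posts = [
    "great phone great camera",
    "terrible battery",
    "great battery life",
]
vocabulary, vectors = tfidf_vectors(posts)
for post, vec in zip(posts, vectors):
    print(post, [round(v, 3) for v in vec])
```

Each post becomes a vector with one entry per vocabulary term; terms that appear in fewer posts receive higher weights, which is what lets a downstream classifier tell the posts apart.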


Author(s):  
Sercan Demirci ◽  
Durmuş Özkan Şahin ◽  
Ibrahim Halil Toprak

Skin cancer, one of the most common types of cancer in the world, is a malignant growth on the skin that arises for various reasons. The number of skin cancer cases increased by nearly 200% between 2004 and 2009. Because the ozone layer is being depleted, harmful rays from the sun are not filtered out. As a result, the likelihood of skin cancer will increase over the years and pose a greater risk to human beings. As with all types of cancer, early diagnosis is very important. In this study, a mobile application is developed that uses machine learning to detect whether photographed skin spots are suspected of being skin cancer, in support of early diagnosis. The result is an auxiliary decision support system that can be used by both clinicians and individuals. When the machine learning algorithm predicts a risk above a certain threshold, the patient can consult a physician so that early diagnosis can begin.


2020 ◽  
Vol 44 (1) ◽  
pp. 231-269
Author(s):  
Rong Chen

Plural marking reaches most corners of languages. When a noun occurs with another linguistic element, called an associate in this paper, plural marking on the two-component structure has four logically possible patterns: doubly unmarked, noun-marked, associate-marked, and doubly marked. These four patterns are not distributed homogeneously across the world's languages, because they are shaped by two competing motivations: iconicity and economy. Some patterns are preferred over others, and this preference is found consistently in languages across the world. In other words, there exists a universal distribution of the four plural marking patterns. Furthermore, holding the view that plural marking on associates expresses the plurality of nouns, I propose a hypothetical universal that uses the number of pluralized associates to predict plural marking on nouns. A data set collected from a sample of 100 languages is used to test the hypothetical universal with the machine learning algorithm logistic regression.
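The test described here, predicting plural marking on the noun from the number of pluralized associates, can be sketched as a one-feature logistic regression. The data below are invented for illustration and are not the paper's 100-language sample. The names `fit_logistic` and `sigmoid`, the learning rate, and the epoch count are likewise assumptions.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit w and b for P(y=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean negative log-likelihood
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical observations (invented for illustration only):
# x = number of pluralized associates, y = 1 if the noun is plural-marked
xs = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
ys = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
for x in range(4):
    print(x, round(sigmoid(w * x + b), 3))
```

A positive fitted weight `w` would mean that, in this toy data, more pluralized associates go with a higher predicted probability of plural marking on the noun.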


2020 ◽  
Vol 1 (2) ◽  
pp. 1-6
Author(s):  
Shamik Kumar Roy ◽  
Sahitya Mondal

Climate change and environmental hazards have been burning issues all around the world. Air pollution is a major contributor to environmental pollution. Big Data and machine learning algorithms can be used to formulate a solution to this global issue, applying techniques from the IoT (Internet of Things) and data analytics to predict and substantially prevent air pollution. The main concern of this paper is to assess different works related to air pollution and its prevention mechanisms, which should help researchers in this domain.


Author(s):  
Nilesh Kumar Sahu ◽  
Manorama Patnaik ◽  
Itu Snigdh

The precision of any machine learning algorithm depends on the data set, its suitability, and its volume. Therefore, data and its characteristics have become the predominant components of any predictive or precision-based domain such as machine learning. Feature engineering refers to the process of changing and preparing this input data so that it is ready for training machine learning models. Several feature types, such as categorical, numerical, mixed, date, and time, must be considered for feature extraction in feature engineering. Dataset characteristics such as cardinality, missing data, and rare labels for categorical features, as well as distribution, outliers, and magnitude, are also currently treated as features. This chapter discusses various data types and the feature engineering techniques that apply to them. It also focuses on the implementation of various data techniques for feature extraction.
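Three of the concerns named above (missing data, rare labels, and categorical features) can be illustrated with a short sketch. The chapter's own techniques are not given in this abstract, so median imputation, rare-label grouping, and one-hot encoding are shown here as common choices. The sample columns and all function names are hypothetical.

```python
from collections import Counter
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def group_rare_labels(labels, min_count=2, other="Other"):
    """Collapse labels seen fewer than `min_count` times into one bucket."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else other for lab in labels]

def one_hot(labels):
    """Return (categories, rows), each row being a 0/1 indicator vector."""
    categories = sorted(set(labels))
    rows = [[1 if lab == cat else 0 for cat in categories] for lab in labels]
    return categories, rows

# Hypothetical columns (invented for illustration only)
ages = [34, None, 29, 41, None, 38]
cities = ["Delhi", "Delhi", "Pune", "Mumbai", "Delhi", "Mumbai"]

ages_filled = impute_median(ages)
cities_grouped = group_rare_labels(cities, min_count=2)
categories, encoded = one_hot(cities_grouped)

print(ages_filled)
print(cities_grouped)
print(categories, encoded)
```

Grouping rare labels before one-hot encoding keeps the number of indicator columns small, which is one way to limit the effect of high cardinality.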


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 656-656
Author(s):  
Youngjun Kim ◽  
Uchechukuwu David ◽  
Yeonsik Noh

A new surface electromyography (sEMG) feature extraction approach that combines Empirical Mode Decomposition (EMD) and Dispersion Entropy (DisEn) is proposed for classifying aggressive and normal behaviors from sEMG data. In this study, we used the sEMG physical action dataset from the UC Irvine Machine Learning Repository. The raw sEMG was decomposed with EMD to obtain a set of Intrinsic Mode Functions (IMFs). The IMF that includes the most discriminant feature for each action was selected based on analysis with the Hilbert Transform (HT) in the time-frequency domain. Next, the DisEn of the selected IMF was calculated as the corresponding feature. Finally, the DisEn value was tested for the classification task using five different classifiers: LDA, Quadratic DA, k-NN, SVM, and Extreme Learning Machine (ELM). Among these ML algorithms, ELM achieved classification accuracy, sensitivity, and specificity of 98.44%, 100%, and 96.72%, respectively.
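The DisEn step can be sketched as follows, using the standard definition of dispersion entropy with a normal-CDF mapping. This covers only the entropy computation: the EMD decomposition and the Hilbert-based IMF selection are omitted. The parameters `m`, `c`, and `delay`, and the test signals, are assumptions not stated in the abstract.

```python
import math
import random
from collections import Counter
from statistics import mean, pstdev

def dispersion_entropy(signal, m=2, c=3, delay=1):
    """Normalized dispersion entropy of a 1-D signal, in the range [0, 1]."""
    mu, sigma = mean(signal), pstdev(signal)
    if sigma == 0:
        return 0.0  # a constant signal produces a single pattern
    # 1) Map samples to (0, 1) with the normal cumulative distribution function
    y = [0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2)))) for x in signal]
    # 2) Quantize into integer classes 1..c (clamped to stay in range)
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]
    # 3) Build embedded vectors of length m and count each dispersion pattern
    n_patterns = len(z) - (m - 1) * delay
    patterns = Counter(
        tuple(z[i + j * delay] for j in range(m)) for i in range(n_patterns)
    )
    # 4) Shannon entropy of the pattern frequencies, normalized by ln(c^m)
    entropy = -sum(
        (k / n_patterns) * math.log(k / n_patterns) for k in patterns.values()
    )
    return entropy / math.log(c ** m)

# Hypothetical test signals (not sEMG data): a regular one and a noisy one
periodic = [0.0, 1.0] * 50
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

de_periodic = dispersion_entropy(periodic)
de_noise = dispersion_entropy(noise)
print(round(de_periodic, 3), round(de_noise, 3))
```

A regular signal uses only a few dispersion patterns and so scores low, while an irregular signal spreads across many patterns and scores closer to 1. That difference is what makes the value usable as a single feature for a classifier.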

