Analytical method for selection an informative set of features with limited resources in the pattern recognition problem

Feature selection is one of the most important issues in data mining and pattern recognition. A correctly selected feature or set of features ultimately determines the success of further work, in particular the solution of classification and forecasting problems. This work is devoted to the development and study of an analytical method for determining informative attribute sets (IAS), taking resource limits into account, for criteria based on a scattering measure of the classified objects. The regions in which a solution exists are determined. Statements and properties are proved for the Fisher-type informativeness criterion, under which the proposed analytical method for determining IAS guarantees optimal results in the sense of maximizing the selected functional. The relevance of choosing this type of informativeness criterion is substantiated. The universality of the method with respect to the type of features is shown. An algorithm implementing the method is presented. In addition, the paper discusses the dynamics of the growth of information in the world, problems associated with big data, and the problems and tasks of data preprocessing. The relevance of reducing the dimension of the attribute space, so that data can be processed and visualized without unnecessary difficulty, is substantiated. The disadvantages of existing methods and algorithms for choosing an informative set of attributes are shown.
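To make the idea of a Fisher-type informativeness criterion under a resource limit concrete, here is a minimal sketch in Python. It is not the analytical method of the paper: it assumes two classes, a per-feature score equal to the squared gap between class means divided by the sum of within-class variances, and a resource limit expressed simply as a maximum number of features (`budget`). The function names and the sample data are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_scores(class_a, class_b):
    """Return one Fisher-type score per feature for two classes of row vectors."""
    n_features = len(class_a[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row in class_a]
        b = [row[j] for row in class_b]
        between = (fmean(a) - fmean(b)) ** 2   # separation of the class means
        within = pvariance(a) + pvariance(b)   # scatter inside the classes
        if within > 0:
            scores.append(between / within)
        else:
            # Zero scatter: perfectly separating if the means differ, useless otherwise.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_features(class_a, class_b, budget):
    """Pick the `budget` highest-scoring feature indices (assumed resource limit)."""
    scores = fisher_scores(class_a, class_b)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:budget])


# Hypothetical data: feature 0 separates the classes, features 1 and 2 do not.
class_a = [[1.0, 5.0, 0.0], [2.0, 3.0, 1.0], [3.0, 4.0, -1.0]]
class_b = [[7.0, 5.0, 0.0], [8.0, 3.0, 1.0], [9.0, 4.0, -1.0]]
print(select_features(class_a, class_b, budget=1))
```

Ranking features independently ignores interactions between them, so this greedy choice is only a baseline against which a criterion that scores whole feature sets could be compared.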


Big Data: Technologies, Challenges, Data Analytics and Management: A Review

IMS Manthan (The Journal of Innovations) ◽

10.18701/imsmanthan.v12i01.10341 ◽

2017 ◽

Vol 12 (01) ◽

Author(s):

Shweta Kaushik

Keyword(s):

Data Mining ◽

Big Data ◽

Data Analytics ◽

High Speed ◽

Information Base ◽

Information Order ◽

The World ◽

Big Data Technologies ◽

Business Profitability ◽

The Web

The Internet plays an essential part in providing various knowledge sources to the world, which enables many applications to offer quality service to customers. As the years go on, the web is overloaded with information, and it becomes difficult to extract the relevant data from it. This has given rise to Big Data, and the volume of data keeps growing rapidly day by day. Big Data has gained much attention from academia and the IT industry. In the digital and computing world, data is generated and collected at a rate that rapidly exceeds the available capacity. Data mining techniques are used to find hidden information in big data. These techniques are used to store, manage, and analyze high-velocity data, which can be in any form, structured or unstructured. It is hard to process large volumes of data using database techniques such as an RDBMS. On the one hand, Big Data is extremely valuable for raising productivity in business and enabling breakthroughs in scientific disciplines, which gives us many opportunities to make great advances in many fields. There is no doubt that future competition in business productivity and technology will converge on Big Data analysis. On the other hand, Big Data also brings many challenges, such as difficulties in data capture, data storage, data analysis, and data visualization. In this paper we review Big Data, its data classification techniques, and the ways it can be mined using different mining techniques.


Expanding Data Mining Power with System Dynamics

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch166 ◽

2008 ◽

pp. 2688-2696

Author(s):

Edilberto Casado

Keyword(s):

Data Mining ◽

Decision Making ◽

Pattern Recognition ◽

System Dynamics ◽

Knowledge Discovery ◽

Strategic Decision ◽

Knowledge Discovery In Databases ◽

Strategic Decision Making ◽

The World ◽

Mathematical Techniques

Business intelligence (BI) is a key topic in business today, since it focuses on strategic decision making and on the search for value in business activities by empowering a “forward-thinking” view of the world. From this perspective, one of the most valuable concepts within BI is “knowledge discovery in databases” or “data mining,” defined as “the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques” (SPSS, 1997).


Expanding Data Mining Power with System Dynamics

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch204 ◽

2005 ◽

pp. 1155-1161

Author(s):

Edilberto Casado

Keyword(s):

Data Mining ◽

Decision Making ◽

Pattern Recognition ◽

System Dynamics ◽

Knowledge Discovery ◽

Strategic Decision ◽

Knowledge Discovery In Databases ◽

Strategic Decision Making ◽

The World ◽

Mathematical Techniques


LDRD 99-ERI-010 Final Report: Sapphire: Scalable Pattern Recognition for Large-Scale Scientific Data Mining

10.2172/15003138 ◽

2002 ◽

Author(s):

C Kamath

Keyword(s):

Data Mining ◽

Pattern Recognition ◽

Large Scale ◽

Scientific Data ◽

Final Report ◽

Scientific Data Mining


Design an efficient disease monitoring system for paddy leaves based on big data mining

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol23iss65pp86-99 ◽

2020 ◽

Vol 23 (65) ◽

pp. 86-99

Author(s):

Suresh K ◽

Karthik S ◽

Hanumanthappa M

Keyword(s):

Data Mining ◽

Feature Selection ◽

Big Data ◽

Monitoring System ◽

Image Acquisition ◽

Plant Diseases ◽

Support Vector ◽

SVM Classifier ◽

Disease Monitoring ◽

Big Data Mining

With advances in Information and Communication Technology (ICT), numerous electronic devices (such as smart sensors) and software applications can contribute notably to the challenges of monitoring plants. In existing work, the segmentation and classification accuracy of the Disease Monitoring System (DMS) is low, so the system does not monitor plant diseases properly. To overcome these drawbacks, this paper proposes an efficient monitoring system for paddy leaves based on big data mining. The proposed model comprises five phases: 1) image acquisition, 2) segmentation, 3) feature extraction, 4) feature selection, and 5) classification validation. First, a paddy leaf image taken from the dataset is used as the input. The image acquisition phase then performs three steps: i) converting the RGB image to a grey-scale image, ii) normalization for high intensity, and iii) preprocessing with an alpha-trimmed mean filter (ATMF), a hybrid of the mean and median filters that removes noise. Next, the resulting image is segmented using the Fuzzy C-Means (FCM) clustering algorithm, which isolates the diseased portion of the paddy leaves. In the next phase, features are extracted, and the resulting features are selected using the Multi-Verse Optimization (MVO) algorithm. After feature selection, the chosen features are classified using ANFIS (Adaptive Neuro-Fuzzy Inference System). Experimental results are compared with the earlier SVM (Support Vector Machine) classifier and other existing methods in terms of precision, recall, F-measure, sensitivity, accuracy, and specificity. The proposed system reaches an accuracy of 97.28%, whereas the existing techniques reach only 91.2% for the SVM classifier, 85.3% for KNN, and 88.78% for ANN. Hence, the proposed DMS detects and classifies diseases more accurately than the other methods.
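The alpha-trimmed mean filter mentioned in the preprocessing step can be illustrated with a short Python sketch. It is an illustration of the general technique, not the authors' implementation: the 3x3 window, the edge replication at the image border, and the parameter name `d` (the total number of sorted values discarded, half from each end) are all assumptions. With `d=0` the filter reduces to a mean filter, and with `d=8` on a 3x3 window it reduces to a median filter, which is the sense in which it is a hybrid of the two.

```python
def alpha_trimmed_mean(window, d):
    """Average the window after dropping the d/2 smallest and d/2 largest values."""
    if d % 2 != 0 or not 0 <= d < len(window):
        raise ValueError("d must be even and smaller than the window size")
    ordered = sorted(window)
    half = d // 2
    kept = ordered[half:len(ordered) - half]
    return sum(kept) / len(kept)


def atmf(image, d=2):
    """Apply a 3x3 alpha-trimmed mean filter to a 2D list of grey levels."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the 3x3 neighbourhood, replicating edge pixels at the border.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = alpha_trimmed_mean(window, d)
    return out


# Hypothetical grey-scale patch with a single impulse-noise pixel in the centre.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = atmf(noisy, d=2)
```

In this patch every 3x3 neighbourhood contains the outlier exactly once, so trimming one value from each end removes it everywhere, whereas a plain mean (`d=0`) would smear it across the neighbours.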


Clustering method for spread pattern analysis of corona-virus (COVID-19) infection in Iran

Journal of Applied Science, Engineering, Technology, and Education ◽

10.35877/454ri.asci31109 ◽

2020 ◽

Vol 3 (1) ◽

pp. 1-6 ◽

Cited By ~ 2

Author(s):

Mehdi Azarafza ◽

Mohammad Azarafza ◽

Haluk Akgün

Keyword(s):

Data Mining ◽

Pattern Recognition ◽

Pattern Analysis ◽

Spatiotemporal Distribution ◽

Clustering Method ◽

Corona Virus ◽

The World ◽

Spread Pattern

COVID-19 broke out in China and has infected more than 131,652 people and caused 7,300 deaths in Iran. Unfortunately, the numbers of infections and deaths are still increasing rapidly, which has put the world on the edge of a catastrophic abyss. The application of data mining to pattern recognition of infection is mainly used for preparing spread maps, which is considered in this work for spatiotemporal distribution assessment and spread pattern analysis of coronavirus (COVID-19) infection in Iran.


BIG DATA TELECOMMUNICATION COMPANY TOOLS TO ENHANCE DECISION-MAKING EFFICIENCY IN COMPLEX ECONOMIC SYSTEMS

Market Infrastructure ◽

10.32843/infrastruct55-32 ◽

2021 ◽

Author(s):

Nataliia Geseleva ◽

Anastasiia Yaroslavtseva

Keyword(s):

Data Mining ◽

Big Data ◽

Customer Service ◽

Mobile Communication ◽

Telecommunications Industry ◽

World Market ◽

Global Trends ◽

The World ◽

Very Large Datasets ◽

Big Data Technologies

The paper examines the telecommunications industry, its development, and its impact on economic growth in various countries, including Ukraine. It characterizes mobile communication as the most actively progressing segment of the telecommunications industry, both worldwide and in Ukraine. The current state of the Ukrainian mobile communication market and its importance for the national economy are examined, along with the changes of recent years that follow global trends in the field of communications. Development trends that encourage mobile operators to develop their own platforms and introduce new products and services are considered. Examples of current developments and services of operators, such as virtual mobile automatic telephone exchange, Big Data Scoring, Vodafone Analytics and others, are given. The article pays special attention to Big Data processing and analysis technologies. Big data is defined as very large datasets that can be analyzed computationally to reveal patterns, trends, and associations, especially in connection with human behavior and interactions. A big data revolution has arrived with the growth of the Internet, wireless networks, smartphones, social media and other technology. These features of Big Data make it possible to use data mining. Data mining is a process used by companies to turn raw data into useful information. By using software to look for patterns in large batches of data, businesses can learn more about their customers in order to develop more effective marketing strategies, increase sales and decrease costs. Data mining depends on effective data collection, warehousing, and computer processing. Data mining processes are used to build machine learning models that power applications including search engine technology and website recommendation programs. The paper also describes how Big Data affects the retail industry, namely by helping to optimize merchandising tactics, personalize customer service, increase advertising effectiveness, target offline shoppers (remarketing) and expand cross-selling. In the field of telecommunications, Big Data likewise helps providers to automate and optimize the provision of their services. Thus, the introduction of Big Data technologies will allow Ukraine to become a more competitive country on the world market.


Clustering method for spread pattern analysis of corona-virus (COVID-19) infection in Iran

10.1101/2020.05.22.20109942 ◽

2020 ◽

Author(s):

Mehdi Azarafza ◽

Mohammad Azarafza ◽

Haluk Akgün

Keyword(s):

Data Mining ◽

Pattern Recognition ◽

Pattern Analysis ◽

Spatiotemporal Distribution ◽

Clustering Method ◽

Corona Virus ◽

The World ◽

Spread Pattern

COVID-19 broke out in China and has infected more than 131,652 people and caused 7,300 deaths in Iran. Unfortunately, the numbers of infections and deaths are still increasing rapidly, which has put the world on the edge of a catastrophic abyss. The application of data mining to pattern recognition of infection is mainly used for preparing spread maps, which is considered in this work for spatiotemporal distribution assessment and spread pattern analysis of coronavirus (COVID-19) infection in Iran.


Improving the Accuracy of Feature Selection in Big Data Mining Using Accelerated Flower Pollination (AFP) Algorithm

Journal of Medical Systems ◽

10.1007/s10916-019-1200-1 ◽

2019 ◽

Vol 43 (4) ◽

Cited By ~ 2

Author(s):

K. Venkatasalam ◽

P. Rajendran ◽

M. Thangavel

Keyword(s):

Data Mining ◽

Feature Selection ◽

Big Data ◽

Flower Pollination ◽

Big Data Mining


Global Big Data confirm Remdesivir to be a Recommended Antiviral Drug to Fight COVID-19

Journal of Southwest Jiaotong University ◽

10.35741/issn.0258-2724.55.4.43 ◽

2020 ◽

Vol 55 (4) ◽

Author(s):

Maslichah Mafruchati

Keyword(s):

Data Mining ◽

Clinical Trials ◽

Big Data ◽

Information Search ◽

Antiviral Drug ◽

Google Trends ◽

The Internet ◽

Relevance Score ◽

The World ◽

The USA

COVID-19 is the latest deadly virus to haunt the world. The virus is so contagious that a precise drug is needed for treating patients who contract it. The purpose of this study is to observe trends and relevant points of information about Remdesivir in global big data. This article describes a new method of collecting data, namely data mining from Google Trends. The subject under consideration is how Remdesivir is used in the USA, Russia, and India, as they are currently the countries with the most COVID-19-positive cases. This method enabled us to discover how many topics on the internet relate to Remdesivir, as well as how strong the relevance of this topic was in those three countries. The results show that the USA had the highest relevance score in the information search (8.86 points), but its information trends were quite static. Russia had fluctuating trends but the lowest relevance score. India had a dynamic trend recently and a higher relevance score than Russia. It can be concluded that medical authorities are the cause behind information trends for Remdesivir in these three countries, which have approved clinical trials for the use of Remdesivir as a COVID-19 treatment.
