scholarly journals INCREASING QUALITY OF CLASSIFYING OBJECTS USING NEW METRICS OF CLUSTERING

Author(s):  
Iscandar Maratovich Azhmukhamedov ◽  
Raisa Yurevna Demina

The article touches upon one of the main problems of machine learning - clustering objects. It has been widely used in various subject areas: marketing, sociology, psychology, etc. Clusterization algorithms, as a rule, are based on a metric that reflects the distance between objects. However, in some cases it is not practical to use the distance between objects. In certain situations, it is possible to say that one object is similar to the other, the latter being not similar to the former. The original picture and its copy may serve as an example. For such cases, a measure of object similarity is proposed in the work, which shows how many features of one object are contained in another one. A similarity matrix is built on this measure, the analysis of which allows revealing clusters of mutually similar objects. When testing the proposed clustering method, the Rand index (the proportion of correctly connected or unrelated objects) made 0.93. There has been proposed an algorithm that allows to form a set of objects absolutely different from each other. A set of objects formed in this way can later become a learning set for classifiers and increase their fidelity in recognition.

2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
...  

Abstract This paper provides the state of the art of data science in economics. Through a novel taxonomy of applications and methods advances in data science are investigated. The data science advances are investigated in three individual classes of deep learning models, ensemble models, and hybrid models. Application domains include stock market, marketing, E-commerce, corporate banking, and cryptocurrency. Prisma method, a systematic literature review methodology is used to ensure the quality of the survey. The findings revealed that the trends are on advancement of hybrid models as more than 51% of the reviewed articles applied hybrid model. On the other hand, it is found that based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. While it is expected the trends go toward the advancements of deep learning models.


2020 ◽  
Author(s):  
Tomohiro Harada ◽  
Misaki Kaidan ◽  
Ruck Thawonmas

Abstract This paper investigates the integration of a surrogate-assisted multi-objective evolutionary algorithm (MOEA) and a parallel computation scheme to reduce the computing time until obtaining the optimal solutions in evolutionary algorithms (EAs). A surrogate-assisted MOEA solves multi-objective optimization problems while estimating the evaluation of solutions with a surrogate function. A surrogate function is produced by a machine learning model. This paper uses an extreme learning surrogate-assisted MOEA/D (ELMOEA/D), which utilizes one of the well-known MOEA algorithms, MOEA/D, and a machine learning technique, extreme learning machine (ELM). A parallelization of MOEA, on the other hand, evaluates solutions in parallel on multiple computing nodes to accelerate the optimization process. We consider a synchronous and an asynchronous parallel MOEA as a master-slave parallelization scheme for ELMOEA/D. We carry out an experiment with multi-objective optimization problems to compare the synchronous parallel ELMOEA/D with the asynchronous parallel ELMOEA/D. In the experiment, we simulate two settings of the evaluation time of solutions. One determines the evaluation time of solutions by the normal distribution with different variances. On the other hand, another evaluation time correlates to the objective function value. We compare the quality of solutions obtained by the parallel ELMOEA/D variants within a particular computing time. The experimental results show that the parallelization of ELMOEA/D significantly reduces the computational time. In addition, the integration of ELMOEA/D with the asynchronous parallelization scheme obtains higher quality of solutions quicker than the synchronous parallel ELMOEA/D.


2019 ◽  
Vol 2019 ◽  
pp. 1-16 ◽  
Author(s):  
Marcin Luckner

Machine learning techniques are a standard approach in spam detection. Their quality depends on the quality of the learning set, and when the set is out of date, the quality of classification falls rapidly. The most popular public web spam dataset that can be used to train a spam detector—WEBSPAM-UK2007—is over ten years old. Therefore, there is a place for a lifelong machine learning system that can replace the detectors based on a static learning set. In this paper, we propose a novel web spam recognition system. The system automatically rebuilds the learning set to avoid classification based on outdated data. Using a built-in automatic selection of the active classifier the system very quickly attains productive accuracy despite a limited learning set. Moreover, the system automatically rebuilds the learning set using external data from spam traps and popular web services. A test on real data from Quora, Reddit, and Stack Overflow proved the high recognition quality. Both the obtained average accuracy and the F-measure were 0.98 and 0.96 for semiautomatic and full–automatic mode, respectively.


Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
...  

This paper provides the state of the art of data science in economics. Through a novel taxonomy of applications and methods advances in data science are investigated. The data science advances are investigated in three individual classes of deep learning models, ensemble models, and hybrid models. Application domains include stock market, marketing, E-commerce, corporate banking, and cryptocurrency. Prisma method, a systematic literature review methodology is used to ensure the quality of the survey. The findings revealed that the trends are on advancement of hybrid models as more than 51% of the reviewed articles applied hybrid model. On the other hand, it is found that based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. While it is expected the trends go toward the advancements of deep learning models.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Anqi Bi ◽  
Wenhao Ying ◽  
Lu Zhao

The diagnosis and treatment of epilepsy is a significant direction for both machine learning and brain science. This paper newly proposes a fast enhanced exemplar-based clustering (FEEC) method for incomplete EEG signal. The algorithm first compresses the potential exemplar list and reduces the pairwise similarity matrix. By processing the most complete data in the first stage, FEEC then extends the few incomplete data into the exemplar list. A new compressed similarity matrix will be constructed and the scale of this matrix is greatly reduced. Finally, FEEC optimizes the new target function by the enhanced α-expansion move method. On the other hand, due to the pairwise relationship, FEEC also improves the generalization of this algorithm. In contrast to other exemplar-based models, the performance of the proposed clustering algorithm is comprehensively verified by the experiments on two datasets.


2020 ◽  
Vol 9 (1) ◽  
pp. 1279-1282

With the advancement of technology and proliferation of computers in the country, the amount of Afaan Oromo language news documents produced increasingly which becomes a difficult task for news agencies to organize such huge collection of documents items manually. To solve this problem, researches is conducted using unsupervised machine learning python tools for Afaan Oromo news document clustering with low cost and best quality of clustering solution. In this research work focusing on k-means clustering analysis which produced better results as compared to the other cluster analysis both in terms of time requirement and the quality of the clusters produced


Author(s):  
K. T. Tokuyasu

During the past investigations of immunoferritin localization of intracellular antigens in ultrathin frozen sections, we found that the degree of negative staining required to delineate u1trastructural details was often too dense for the recognition of ferritin particles. The quality of positive staining of ultrathin frozen sections, on the other hand, has generally been far inferior to that attainable in conventional plastic embedded sections, particularly in the definition of membranes. As we discussed before, a main cause of this difficulty seemed to be the vulnerability of frozen sections to the damaging effects of air-water surface tension at the time of drying of the sections.Indeed, we found that the quality of positive staining is greatly improved when positively stained frozen sections are protected against the effects of surface tension by embedding them in thin layers of mechanically stable materials at the time of drying (unpublished).


CCIT Journal ◽  
2014 ◽  
Vol 7 (3) ◽  
pp. 335-354
Author(s):  
Untung Rahardja ◽  
Muhamad Yusup ◽  
Ana Nurmaliana

The accuracy and reliability is the quality of the information. The more accurate and reliable, the more information it’s good quality. Similarly, a survey, the better the survey, the more accurate the information provided. Implementation of student satisfaction measurement to the process of teaching and learning activities on the quality of the implementation of important lectures in order to get feedback on the assessed variables and for future repair. Likewise in Higher Education Prog has undertaken the process of measuring student satisfaction through a distributed questioner finally disemester each class lecture. However, the deployment process questioner is identified there are 7 (seven) problems. However, the problem can be resolved by the 3 (three) ways of solving problems one of which is a system of iLearning Survey (Isur), that is by providing an online survey to students that can be accessed anywhere and anytime. In the implementation shown a prototype of Isur itself. It can be concluded that the contribution Isur system can maximize the decision taken by the Higher Education Prog. By using this Isur system with questions and evaluation forms are submitted and given to the students and the other colleges. To assess the extent to which the campus has grown and how faculty performance in teaching students class, and can be used as a media Isur valid information for an assessment of activities throughout college.


Author(s):  
Feidu Akmel ◽  
Ermiyas Birihanu ◽  
Bahir Siraj

Software systems are any software product or applications that support business domains such as Manufacturing,Aviation, Health care, insurance and so on.Software quality is a means of measuring how software is designed and how well the software conforms to that design. Some of the variables that we are looking for software quality are Correctness, Product quality, Scalability, Completeness and Absence of bugs, However the quality standard that was used from one organization is different from other for this reason it is better to apply the software metrics to measure the quality of software. Attributes that we gathered from source code through software metrics can be an input for software defect predictor. Software defect are an error that are introduced by software developer and stakeholders. Finally, in this study we discovered the application of machine learning on software defect that we gathered from the previous research works.


Sign in / Sign up

Export Citation Format

Share Document