unsupervised data mining
Recently Published Documents


TOTAL DOCUMENTS

45
(FIVE YEARS 20)

H-INDEX

7
(FIVE YEARS 2)

2022 ◽  
Vol 5 (1) ◽  
Author(s):  
Daria Zvyagintseva ◽  
Helgi Sigurdsson ◽  
Valerii K. Kozin ◽  
Ivan Iorsh ◽  
Ivan A. Shelykh ◽  
...  

Abstract. Polaritonic lattices offer a unique testbed for studying nonlinear driven-dissipative physics. They show qualitative changes of their steady state as a function of system parameters, which resemble non-equilibrium phase transitions. Unlike their equilibrium counterparts, these transitions cannot be characterised by conventional statistical physics methods. Here, we study a lattice of square-arranged polariton condensates with nearest-neighbour coupling, and simulate the polarisation (pseudospin) dynamics of the polariton lattice, observing regions with distinct steady-state polarisation patterns. We classify these patterns using machine learning methods and determine the boundaries separating different regions. First, we use unsupervised data mining techniques to sketch the boundaries of phase transitions. We then apply learning by confusion, a neural network-based method for learning labels in a dataset, and extract the polaritonic phase diagram. Our work takes a step towards AI-enabled studies of polaritonic systems.
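The idea behind learning by confusion can be illustrated with a small sketch. It is not the authors' implementation: the neural network is swapped for a nearest-centroid classifier, the data are synthetic (a single feature that jumps at a hypothetical critical parameter of 0.5), and accuracy is measured on the training data itself. All names (`confusion_curve`, `params`, `features`, `trials`) are assumptions made for illustration. For each trial boundary the samples are relabelled, a classifier is fitted, and its accuracy is recorded; the interior trial boundary with the highest accuracy is taken as the estimated transition.

```python
def confusion_curve(params, features, trial_boundaries):
    """Accuracy of a nearest-centroid classifier for each trial boundary."""
    accuracies = []
    for b in trial_boundaries:
        # Relabel every sample according to the trial boundary.
        labels = [p < b for p in params]
        if all(labels) or not any(labels):
            # Only one class present: any classifier is trivially correct.
            accuracies.append(1.0)
            continue
        n_true = sum(labels)
        mean_true = sum(f for f, l in zip(features, labels) if l) / n_true
        mean_false = sum(f for f, l in zip(features, labels) if not l) / (len(labels) - n_true)
        # Predict "True" when the feature is closer to the True-class centroid.
        predicted = [abs(f - mean_true) < abs(f - mean_false) for f in features]
        correct = sum(p == l for p, l in zip(predicted, labels))
        accuracies.append(correct / len(labels))
    return accuracies

# Synthetic data (assumption): 41 parameter values, feature jumps at 0.5.
params = [i / 40 for i in range(41)]
features = [(0.0 if p < 0.5 else 1.0) + 0.02 * ((i * 7) % 5 - 2)
            for i, p in enumerate(params)]
trials = [j / 40 for j in range(42)]  # last trial lies beyond the data range

acc = confusion_curve(params, features, trials)
# Ignore the trivial end points when looking for the best boundary.
interior = [(a, b) for a, b in zip(acc, trials) if 0.0 < b <= 1.0]
best_boundary = max(interior)[1]
print(best_boundary)
```

The accuracy is trivially perfect at both ends of the trial range and peaks again at the true boundary, which gives the characteristic W-shaped curve.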


Author(s):  
Tingzhen Liu ◽  
Tong Zhou ◽  
Yuxin Shi ◽  
Siyuan Liu ◽  
Jin Gao

The herd effect is a common phenomenon in society. Detecting it is important for many tasks based on social network analysis, such as recommendation. However, research on social networks and natural language processing seldom addresses this issue. In this paper, we propose an unsupervised data mining method to detect herding in social networks. Taking shopping reviews as an example, our algorithm identifies reviews that are influenced by earlier reviews and detects a herd effect chain. Taken together, the cross effects of all views form the herd effect graph. Through this graph structure, the algorithm can be used in a wide range of social network analysis methods, providing useful new features for many tasks.


2021 ◽  
Vol 2084 (1) ◽  
pp. 012004
Author(s):  
Wan Nurul Dalilah Wan Pauzi ◽  
Haliza Hasan ◽  
Zamalia Mahmud

Abstract It is the students’ dream to secure a job right after graduation. However, there are factors that hinder their employability. This study aims to predict Malaysian graduates’ employment status based on employability factors and to profile the graduates’ satisfaction towards their curricular activities and information and communications technology (ICT) skills. A total of 375,507 student records were obtained based on tracer studies conducted by the Malaysian Ministry of Higher Education between 2015 and 2018. Due to the large amount of data with various categories, supervised and unsupervised data mining techniques were used to unmask the underlying variables and reveal hidden information about graduates’ employability for better tracing the employment status of graduates. Various types of consolidation techniques were also used to reduce the number of levels for categorical inputs in the dataset, namely, classifiers without consolidation, with manual consolidation, and with tree consolidation. Three types of data mining variable selections were used to improve the performance of the classifiers in predicting employment status. The results show that logistic regression (LR) without variable selection is the best classifier for data without consolidation, while LR using variable selection with LR stepwise is the best classifier for data with manual and tree consolidations. In profiling the satisfaction of graduates, K-Means Clustering was used, which revealed seven clusters. The most prominent cluster consisted of graduates who were highly satisfied with their ICT skills but less satisfied with their curricular activities. These data mining techniques were able to trace graduates’ employment status and identify the success factors of graduates’ employability.


2021 ◽  
pp. 1-15
Author(s):  
Luis I. Lopera González ◽  
Adrian Derungs ◽  
Oliver Amft

2020 ◽  
Vol 10 (24) ◽  
pp. 9089
Author(s):  
Davor Stjelja ◽  
Juha Jokisalo ◽  
Risto Kosonen

Climate change and technological development are pushing buildings to become more sophisticated. The installation of modern building automation systems, smart meters, and IoT devices is increasing the amount of available building operational data. Such a building is commonly called a smart building, but producing large amounts of raw data does not automatically yield intelligence that offers new insights into the building’s operation. Smart meters are mainly used only for tracking energy or water consumption in the building. Building occupancy, on the other hand, is usually not monitored at all, even though it is one of the main factors influencing consumption and indoor climate parameters. This paper brings the true smart building closer to practice by using machine learning methods with sub-metered electricity and water consumption to predict building occupancy. In the first approach, the number of occupants on an office floor was predicted using a supervised data mining method, Random Forest. The model performed best when all available predictors were used, while among individual predictors, the sub-metered electricity used for office equipment performed best. Since the supervised approach requires continuous long-term collection of ground truth reference data (between one and three months, according to this study), an unsupervised data mining method, k-means clustering, was tested in the second approach. With the unsupervised method, this study was able to predict the daily level of occupancy as zero, medium, or high on a case study office floor using the equipment electricity consumption.
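The unsupervised step can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes a single feature per day (equipment electricity use in kWh), the readings are invented, and the function name `kmeans_1d` and its deterministic initialisation are choices made here. Clusters are named zero, medium, and high by ranking their centroids.

```python
def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalar data; returns (centroids, labels)."""
    srt = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Hypothetical daily equipment electricity consumption in kWh.
daily_kwh = [2.1, 2.0, 2.3, 14.8, 15.5, 16.1, 30.2, 31.0, 29.5, 2.2, 15.0, 30.8]

centroids, labels = kmeans_1d(daily_kwh, k=3)
# Name the clusters by centroid rank: lowest consumption means zero occupancy.
order = sorted(range(3), key=lambda c: centroids[c])
names = {order[0]: "zero", order[1]: "medium", order[2]: "high"}
levels = [names[lab] for lab in labels]
print(levels)
```

No ground truth is needed to form the clusters; only the mapping from cluster to occupancy level relies on the assumption that more equipment electricity use means more occupants.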


TEM Journal ◽  
2020 ◽  
pp. 1614-1618
Author(s):  
Adya Hermawati ◽  
Sri Jumini ◽  
Mardiah Astuti ◽  
Fajri Ismail ◽  
Robbi Rahim

The purpose of this study was to analyze the k-medoids method for cluster mapping of the student-to-teacher ratio in Indonesia by region, specifically at the elementary school level. The data are secondary, obtained from the Ministry of Education and Culture and processed by the Central Statistics Agency (abbreviated as BPS) in BPS Catalog 4301008, the Portrait of Indonesian Education. The analysis was carried out in Rapid Miner software using the Davies Bouldin Index (DBI) and Performance (Classification) parameters. Using three cluster labels, namely the high cluster (K1), normal cluster (K2) and poor cluster (K3), it was found that 3 provinces were in the high cluster, 9 provinces were in the normal cluster and 22 provinces were in the poor cluster. Testing the cluster results (k = 3) with the DBI parameter gave a value of 0.587. This shows that the clusters formed are optimal (the smaller the DBI, the better). The test with the Performance (Classification) parameter gave a classification error of 2.50%. The results of the research can be used as information for setting the ratio of students to teachers, because a higher ratio means that teacher supervision of and attention to students is reduced, so that the quality of teaching tends to be lower.
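For readers unfamiliar with the Davies Bouldin Index, the sketch below computes it for scalar data. It assumes cluster means as centers and absolute difference as the distance, which may differ from what Rapid Miner does internally; the ratios are invented, and `davies_bouldin` is a name chosen here.

```python
def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of non-empty clusters of scalars."""
    centers = [sum(c) / len(c) for c in clusters]
    # Scatter: mean distance of a cluster's members to its center.
    scatter = [sum(abs(v - m) for v in c) / len(c) for c, m in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max((scatter[i] + scatter[j]) / abs(centers[i] - centers[j])
                     for j in range(k) if j != i)
    return total / k

# Hypothetical student-to-teacher ratios grouped into three clusters.
tight = [[10, 12], [20, 22], [40, 44]]
loose = [[10, 16], [20, 26], [40, 48]]

dbi_tight = davies_bouldin(tight)
dbi_loose = davies_bouldin(loose)
print(dbi_tight, dbi_loose)
```

Compact, well-separated clusters give a smaller index, which is why a lower DBI is read as a better clustering.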


2020 ◽  
Author(s):  
Armando Toda ◽  
Filipe Dwan Pereira ◽  
Ana Carolina Tomé Klock ◽  
Luiz Rodrigues ◽  
Paula Palomino ◽  
...  

Gamification design in educational environments is not trivial, and many variables need to be considered to achieve positive outcomes. Often, educators and designers do not know when students’ intentions regarding the use of gamified environments might influence their experience. Based on this premise, this paper describes an exploratory study on users’ intention to use gamification, focusing on its influence in the field of education. We conducted a survey study with participants (N=1.692) and analysed their answers using unsupervised data mining techniques. As a result, we obtained empirical evidence showing that demographic and contextual variables influence (positively and negatively) people’s intention to use gamification. This evidence can help designers and educators better understand whether and when they should or should not gamify a learning environment.


2020 ◽  
Vol 187 ◽  
pp. 106431
Author(s):  
Betsy Sandoval ◽  
Emilio Barocio ◽  
Petr Korba ◽  
Felix Rafael Segundo Sevilla

Author(s):  
J. G. Rejas ◽  
C. Pothier ◽  
C. Rigotti ◽  
N. Méger ◽  
I. Vásquez ◽  
...  

Abstract. The aim of this work is to develop a geospatial methodology for analysing the time evolution of the Turrialba volcano using different automatic imaging techniques compared to expert-based remote sensing techniques. Change detection of hydrothermal alteration materials has been computed from time series of multisensor data acquired in the visible (VIS) and short wave infrared (SWIR) spectral ranges. For this purpose we used multispectral and hyperspectral scenes of the Sentinel 2, ALI and Hyperion sensors, respectively, on four dates from 2013 and 2018. This work adopts a multi-source approach, applied to the analysis of the correlations between hydrothermal materials and spectral anomalies in the Turrialba volcano complex, located in the Central Volcanic Range (Costa Rica). An expert-based technique called Crosta’s technique for detecting hydrothermal materials has been applied. We have chosen four variables for generating a different Principal Component Analysis (PCA) for groups of channels, two highly reflective and two highly absorptive for each mineral. We have tested another technique to detect hydrothermal materials based on a discrete spectral profile analysis and an unsupervised data mining approach. In addition, we have applied an automatic technique called anomaly detection to compare with the hydrothermal materials results. Results are presented as a comparison of two different strategies whose main future interest lies in the automated identification of patterns of hydrothermally altered materials with no prior knowledge of, or only poor information about, the area, which has relevant implications for image-based prospecting.

