An unsupervised machine-learning-based classification of aerosol microphysical properties over 10 years at Cabo Verde

2021
Author(s):
Xianda Gong
Heike Wex
Thomas Müller
Silvia Henning
Jens Voigtländer
...  

Abstract. The Cape Verde Atmospheric Observatory (CVAO), which is influenced by both marine and desert dust air masses, was used for long-term measurements of different properties of the atmospheric aerosol from 2008 to 2017. These properties include particle number size distributions (PNSD), light-absorbing carbon (LAC) and concentrations of cloud condensation nuclei (CCN) together with their hygroscopicity. Here we summarize the results obtained for these properties and use an unsupervised machine learning algorithm to classify aerosol types. Five types of aerosols were classified: marine, freshly formed, mixture, moderate dust and heavy dust. Air masses during marine periods come from the Atlantic Ocean, and those during dust periods come from the Sahara. Heavy dust was more frequent during wintertime, whereas clean marine periods were more frequent during springtime. During dust periods, CCN number concentrations at a supersaturation of 0.30 % were roughly 2.5 times higher than during marine periods, but the hygroscopicity (κ) of particles in the size range from ∼30 to ∼175 nm was comparable during marine and dust periods. The long-term data presented here, together with the aerosol classification, can serve as a basis for improving our understanding of annual cycles of the atmospheric aerosol in the eastern tropical Atlantic and of aerosol-cloud interactions, and for driving, evaluating and constraining atmospheric model simulations.

PLoS ONE
2021
Vol 16 (11)
pp. e0260194
Author(s):  
Young Hyun Kim
Kug Jin Jeon
Chena Lee
Yoon Joo Choi
Hoi-In Jung
...  

Objectives: Anatomical structure classification is a necessary task in the medical field, but the inevitable variability of interpretation among experts makes reliable classification difficult. This study aims to introduce cluster analysis, an unsupervised machine learning method, for the classification of three-dimensional (3D) mandibular canal (MC) courses, and to visualize standard MC courses derived from cluster analysis in the Korean population. Materials and methods: A total of 429 cone-beam computed tomography images were used. Four sites in the mandible were selected for the measurement of the MC course, and four parameters (two vertical and two horizontal) were measured per site. Cluster analysis was carried out in the following steps: parameter measurement, parameter normalization, cluster tendency evaluation, determination of the optimal number of clusters, and k-means cluster analysis. Results: The 3D MC courses were classified by cluster analysis into three types with statistically significant mean differences. Cluster 1 showed a smooth line running towards the lingual side in the axial view and a steep slope in the sagittal view. Cluster 2 ran in an almost straight line closest to the lingual and inferior border of the mandible. Cluster 3 showed a pathway that bent buccally in the axial view and an increasing slope in the sagittal view in the posterior area. Cluster 2 was the most frequent type (42.1%), with males (57.1%) more frequent than females (42.9%). Cluster 3 comprised a similar ratio of male and female cases and accounted for 31.9% of the total distribution. Cluster 1 was the least frequent type (26.0%). The distributions of the right and left sides did not show a statistically significant difference. Conclusion: The MC courses were automatically classified into three types through cluster analysis. Cluster analysis enables the unbiased classification of anatomical structures by reducing observer variability and can present representative standard information for each classified group.
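
The normalization and k-means steps named in this abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' pipeline: the two-column `rows` data, the farthest-point seeding and the function names `zscore` and `kmeans` are all hypothetical, and the cluster tendency and optimal-k steps are omitted.

```python
import math


def zscore(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Seed the next centroid at the point farthest from all chosen ones.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical measurements: one vertical and one horizontal parameter per case.
rows = [
    [1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
    [5.0, 30.0], [5.2, 31.0], [4.9, 29.0],
    [9.0, 10.5], [9.1, 9.0], [8.8, 11.0],
]
normalized = zscore(rows)
labels, centroids = kmeans(normalized, k=3)
print(labels)
```

Normalizing first keeps a parameter with a large numeric range from dominating the Euclidean distance, which is presumably why the abstract lists normalization before clustering.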


2021
Vol 11 (1)
Author(s):  
Gabriel A. Colozza-Gama ◽  
Fabiano Callegari ◽  
Nikola Bešič ◽  
Ana C. de J. Paviza ◽  
Janete M. Cerutti

Abstract. Somatic mutations in cancer driver genes can help diagnosis, prognosis and treatment decisions. Formalin-fixed paraffin-embedded (FFPE) specimens are the main source of DNA for somatic mutation detection. To overcome the constraints of DNA isolated from FFPE, we compared pyrosequencing and ddPCR analysis for absolute quantification of the BRAF V600E mutation in DNA extracted from FFPE specimens and compared the results to the qualitative detection information obtained by Sanger sequencing. Sanger sequencing was able to detect the BRAF V600E mutation only when it was present in more than 15% of total alleles. Although the sensitivity of ddPCR is higher than that observed for Sanger sequencing, it was less consistent than pyrosequencing, likely due to droplet classification bias of FFPE-derived DNA. To address the droplet allocation bias in ddPCR analysis, we compared different algorithms for automated droplet classification and then correlated these findings with those obtained from pyrosequencing. By examining the addition of non-classifiable droplets (rain) in ddPCR, it was possible to obtain better qualitative classification of droplets and better quantitative classification compared to no rain droplets, when considering pyrosequencing results. Notably, only the machine learning k-NN algorithm was able to automatically classify the samples, surpassing manual classification based on no-template controls, which shows promise in clinical practice.
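
To make the k-NN idea concrete, the following minimal sketch labels a droplet by majority vote among its nearest labelled neighbours. It is an assumption-laden illustration rather than the authors' method: the two-channel amplitude values, the three class names, the choice k = 3 and the function name `knn_predict` are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training droplets."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (channel 1, channel 2) fluorescence amplitudes with known labels.
train = [
    ((1000, 1200), "negative"), ((1100, 1000), "negative"), ((900, 1100), "negative"),
    ((8000, 1100), "mutant"), ((8200, 1300), "mutant"), ((7900, 1000), "mutant"),
    ((1000, 6000), "wild-type"), ((1200, 6200), "wild-type"), ((900, 5900), "wild-type"),
]

# A "rain" droplet lies between clusters; k-NN assigns it to the closest group.
rain = (5500, 1500)
print(knn_predict(train, rain))  # prints "mutant"
```

Because the vote depends only on local neighbours, a droplet with intermediate amplitude still receives a label instead of being discarded, which is one plausible reason such a classifier copes with rain.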


2021
Vol 11 (3)
pp. 92
Author(s):  
Mehdi Berriri ◽  
Sofiane Djema ◽  
Gaëtan Rey ◽  
Christel Dartigues-Pallez

Today, many students enroll in higher education courses that do not suit them and end up failing. The purpose of this study is to give counselors better knowledge so that they can offer future students courses corresponding to their profile. The second objective is to allow the teaching staff to propose training courses adapted to students by anticipating their possible difficulties. This is possible thanks to a machine learning algorithm called Random Forest, which classifies the students according to their results. We had to process data, generate models using our algorithm, and cross the results obtained to have a better final prediction. We tested our method on different use cases, from two classes to five classes. These sets of classes represent different intervals of the average grade, which ranges from 0 to 20. An accuracy of 75% was achieved with a set of five classes and up to 85% for sets of two and three classes.


2021
Vol 11 (1)
Author(s):  
Ajay Kumar Maddirala ◽  
Kalyana C Veluvolu

Abstract. In recent years, the use of portable electroencephalogram (EEG) devices has become popular for both clinical and non-clinical applications. In order to provide more comfort to the subject and measure the EEG signals for several hours, these devices usually have few EEG channels or even a single EEG channel. However, the electrooculogram (EOG) signal, also known as the eye-blink artifact, which is produced by involuntary movement of the eyelids, always contaminates the EEG signals. Very few techniques are available to remove these artifacts from single-channel EEG, and most of these techniques modify the uncontaminated regions of the EEG signal. In this paper, we developed a new framework that combines an unsupervised machine learning algorithm (k-means) and the singular spectrum analysis (SSA) technique to remove the eye-blink artifact without modifying the actual EEG signal. The novelty of the work lies in the extraction of the eye-blink artifact based on the time-domain features of the EEG signal and the unsupervised machine learning algorithm. The extracted eye-blink artifact is further processed by the SSA method and finally subtracted from the contaminated single-channel EEG signal to obtain the corrected EEG signal. Results with synthetic and real EEG signals demonstrate the superiority of the proposed method over the existing methods. Moreover, the frequency-based measures [the power spectrum ratio (Γ) and the mean absolute error (MAE)] also show that the proposed method does not modify the uncontaminated regions of the EEG signal while removing the eye-blink artifact.


2021
Vol 11 (11)
pp. 5230
Author(s):  
Isabel Santiago ◽  
Jorge Luis Esquivel-Martin ◽  
David Trillo-Montero ◽  
Rafael Jesús Real-Calvo ◽  
Víctor Pallarés-López

In this work, daily irradiance profiles registered in a photovoltaic installation located in the south of Spain were automatically classified over a period of nine years, with a sampling interval of 5 min, and the operation of the elements of the installation on each type of day was subsequently analysed. The classification was based on the total daily irradiance values and the fluctuations of this parameter throughout the day. The irradiance profiles were grouped into nine different categories using unsupervised machine learning algorithms for clustering, implemented in Python. It was found that the behaviour of the modules and the inverter of the installation was influenced by the type of day obtained, such that the inverter worked with a better average efficiency on days with higher irradiance and lower fluctuations. However, the modules worked with better average efficiency on days with irradiance fluctuations than on clear sky days. This behaviour of the modules may be due to the presence, on days with passing clouds, of the phenomenon known as cloud enhancement, in which reflections of radiation on the edges of the clouds can make irradiance values higher at certain moments than those that occur on clear sky days without passing clouds. The better module efficiency is attributed to the higher energy generated during these irradiance peaks and to the lower temperatures that the module reaches due to the shaded areas created by the clouds, resulting in a reduction in its temperature losses.


Water
2021
Vol 13 (9)
pp. 1217
Author(s):  
Nicolò Bellin ◽  
Erica Racchetti ◽  
Catia Maurone ◽  
Marco Bartoli ◽  
Valeria Rossi

Machine Learning (ML) is an increasingly accessible discipline in computer science that develops dynamic algorithms capable of data-driven decisions and whose use in ecology is growing. Fuzzy sets are suitable descriptors of ecological communities as compared to other standard algorithms and allow the description of decisions that include elements of uncertainty and vagueness. However, fuzzy sets are scarcely applied in ecology. In this work, an unsupervised machine learning algorithm, fuzzy c-means, and association rule mining were applied to assess the factors influencing the assemblage composition and distribution patterns of 12 zooplankton taxa in 24 shallow ponds in northern Italy. The fuzzy c-means algorithm was implemented to classify the ponds in terms of the taxa they support, and to identify the influence of chemical and physical environmental features on the assemblage patterns. Data retrieved during 2014 and 2015 were compared, taking into account that 2014 late spring and summer air temperatures were much lower than historical records, whereas 2015 mean monthly air temperatures were much warmer than historical averages. In both years, fuzzy c-means shows a strong clustering of ponds into two groups, contrasting sites characterized by different physico-chemical and biological features. Climatic anomalies, affecting the temperature regime, together with the main water supply to shallow ponds (e.g., surface runoff vs. groundwater) represent disturbance factors producing large interannual differences in the chemistry, biology and short-term dynamics of small aquatic ecosystems. Unsupervised machine learning algorithms and fuzzy sets may help in catching such apparently erratic differences.
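
Unlike k-means, fuzzy c-means gives every site a degree of membership in each cluster, which is how it expresses uncertainty. The sketch below implements the standard update rules with fuzzifier m = 2 in plain Python. It is illustrative only: the two-feature `ponds` data, the initial centers and the function names are hypothetical and are not taken from the study.

```python
def membership_row(point, centers, m):
    """Memberships of one point in every cluster; the values sum to 1."""
    # Clamp distances so a point sitting exactly on a center does not divide by zero.
    dists = [max(sum((a - b) ** 2 for a, b in zip(point, c)) ** 0.5, 1e-12)
             for c in centers]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [list(c) for c in init_centers]
    dims = len(points[0])
    for _ in range(iters):
        memberships = [membership_row(p, centers, m) for p in points]
        for i in range(len(centers)):
            # Each center is the mean of all points weighted by membership**m.
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            centers[i] = [sum(w * p[d] for w, p in zip(weights, points)) / total
                          for d in range(dims)]
    # Recompute memberships so they match the final centers.
    return [membership_row(p, centers, m) for p in points], centers


# Hypothetical scaled pond features; the last pond sits between the two groups.
ponds = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (2.5, 2.5),
]
memberships, centers = fuzzy_c_means(ponds, init_centers=[ponds[0], ponds[3]])
for pond, row in zip(ponds, memberships):
    print(pond, [round(u, 2) for u in row])
```

The intermediate pond ends up with memberships near one half in both clusters, the kind of graded answer that a hard partition cannot give.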


2020
Vol 9 (1)
pp. 1700-1704

Classification of a target from a mixture of multiple target information is quite challenging. In this paper we have used a supervised machine learning algorithm, namely linear regression, to classify the received data, which is a mixture of target return with noise and clutter. The target state is estimated from the classified data using a Kalman filter. A linear Kalman filter with a constant velocity model is used in this paper. Minimum Mean Square Error (MMSE) analysis is used to measure the performance of the estimated track at various Signal to Noise Ratio (SNR) levels. The results show that the error is high at low SNR and low at high SNR.
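
A linear Kalman filter with a constant velocity model can be written out by hand for the one-dimensional case, as in the sketch below. The state is position and velocity, and only position is measured. This is a generic textbook formulation under assumed settings, not the paper's configuration: the noise values `q` and `r`, the diagonal process noise, the initial covariance and the function name `kalman_cv` are hypothetical.

```python
def kalman_cv(measurements, dt=1.0, q=1e-3, r=1.0):
    """Track 1-D position with a linear Kalman filter and constant-velocity model."""
    x, v = measurements[0], 0.0        # state estimate: position and velocity
    p11, p12, p22 = r, 0.0, 100.0      # symmetric covariance [[p11, p12], [p12, p22]]
    estimates = []
    for z in measurements[1:]:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
        # and Q = diag(q, q) (a simplifying assumption).
        x = x + dt * v
        p11 = p11 + 2.0 * dt * p12 + dt * dt * p22 + q
        p12 = p12 + dt * p22
        p22 = p22 + q
        # Update with measurement matrix H = [1, 0] (position only).
        s = p11 + r                    # innovation variance
        k1, k2 = p11 / s, p12 / s      # Kalman gain
        y = z - x                      # innovation
        x, v = x + k1 * y, v + k2 * y
        p11, p12, p22 = (1.0 - k1) * p11, (1.0 - k1) * p12, p22 - k2 * p12
        estimates.append((x, v))
    return estimates


# Hypothetical noise-free track moving at 2 units per time step.
track = [2.0 * t for t in range(20)]
estimates = kalman_cv(track)
final_position, final_velocity = estimates[-1]
print(round(final_position, 2), round(final_velocity, 2))
```

Replacing `track` with noisy measurements and averaging the squared difference between estimated and true positions gives the kind of mean square error comparison across SNR levels that the abstract describes.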


2021
Vol 2021
pp. 1-13
Author(s):  
Hanlin Liu ◽  
Linqiang Yang ◽  
Linchao Li

A variety of climate factors influence the precision of long-term Global Navigation Satellite System (GNSS) monitoring data. To precisely analyze the effect of different climate factors on long-term GNSS monitoring records, this study combines the extended seven-parameter Helmert transformation and a machine learning algorithm named Extreme Gradient Boosting (XGBoost) to establish a hybrid model. We established a local-scale reference frame called the stable Puerto Rico and Virgin Islands reference frame of 2019 (PRVI19) using ten continuously operating long-term GNSS sites located in the rigid portion of the Puerto Rico and Virgin Islands (PRVI) microplate. The stability of PRVI19 is approximately 0.4 mm/year and 0.5 mm/year in the horizontal and vertical directions, respectively. The stable reference frame PRVI19 can avoid the risk of bias due to long-term plate motions when studying localized ground deformation. Furthermore, we applied the XGBoost algorithm to the postprocessed long-term GNSS records and daily climate data to train the model. We quantitatively evaluated the importance of various daily climate factors on the GNSS time series. The results show that wind is the most influential factor, with a unitless index of 0.013. Notably, we used the model with climate and GNSS records to predict the GNSS-derived displacements. The results show that the predicted displacements have a slightly lower root mean square error than the results fitted with the spline method (prediction: 0.22 versus fitted: 0.31). This indicates that the proposed model, which takes the climate records into account, yields suitable predictions for long-term GNSS monitoring.

