Hierarchical Unsupervised Partitioning of Large Size Data and Its Application to Hyperspectral Images

2021 ◽  
Vol 13 (23) ◽  
pp. 4874
Author(s):  
Jihan Alameddine ◽  
Kacem Chehdi ◽  
Claude Cariou

In this paper, we propose a truly unsupervised method for partitioning large-size images when the number of classes, training samples, and other a priori information are unknown. Partitioning an image without any prior knowledge is a great challenge. This novel adaptive and hierarchical classification method is based on affinity propagation, with all criteria and parameters calculated adaptively from the image to be partitioned. It reliably and objectively discovers the classes of an image without user intervention and therefore satisfies all the objectives of an unsupervised method. The hierarchical partitioning adopted allows the user to analyze and interpret the data very finely. The optimal partition, which maximizes an objective criterion, provides the number of classes and the exemplar of each class. The efficiency of the proposed method is demonstrated through experimental results on hyperspectral images. The results show its superiority over the most widely used unsupervised and semi-supervised methods. The method can be used in several application domains to partition large-size images or data. It allows the user to consider all or part of the obtained classes and makes it possible to select samples objectively during a learning process.
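
The abstract names affinity propagation as the basis of the method but does not give the adaptive criteria or the hierarchical scheme. As a rough illustration of plain affinity propagation only, the sketch below exchanges responsibility and availability messages between points and returns one exemplar index per point. The function name `affinity_propagation`, the negative squared Euclidean similarity, the median preference, and the fixed `damping` and `iterations` values are all assumptions made for this example, not the authors' choices.

```python
import statistics


def affinity_propagation(points, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar it is assigned to.

    points: list of equal-length numeric tuples (needs at least 3 points).
    """
    n = len(points)
    # Similarity: negative squared Euclidean distance (assumed for this sketch).
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Preference (self-similarity): median off-diagonal similarity (assumption).
    pref = statistics.median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref

    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                rival = second if k == best else scores[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback so that every point still receives a label
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Toy data standing in for pixel spectra: two well-separated groups.
pixels = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1), (5.1, 5.0), (5.05, 5.05)]
print(affinity_propagation(pixels))
```

The number of distinct labels is the number of classes found, which depends strongly on the preference value; that dependence is presumably why the paper computes its parameters from the image rather than fixing them as done here.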

2014 ◽  
Vol 12 (5) ◽  
pp. 594-603 ◽  
Author(s):  
Yaroslava Pushkarova ◽  
Yuriy Kholin

Abstract Artificial neural networks have proven to be a powerful tool for solving classification problems. Some difficulties still need to be overcome for their successful application to chemical data. The use of supervised neural networks implies the initial distribution of patterns between the pre-determined classes, while attribution of objects to the classes may be uncertain. Unsupervised neural networks are free from this problem, but do not always reveal the real structure of data. Classification algorithms which do not require a priori information about the distribution of patterns between the pre-determined classes and provide meaningful results are of special interest. This paper presents an approach based on the combination of Kohonen and probabilistic networks which enables the determination of the number of classes and the reliable classification of objects. This is illustrated for a set of 76 solvents based on nine characteristics. The resulting classification is chemically interpretable. The approach also proved applicable in a different field, namely in examining the solubility of C60 fullerene. Solvents belonging to the same group demonstrate similar abilities to dissolve C60. This makes it possible to estimate the solubility of fullerenes in solvents for which there are no experimental data.


2018 ◽  
Author(s):  
Lulu Chen ◽  
Niya Wang ◽  
Robert Clarke ◽  
Zhen Zhang ◽  
Yue Wang

Abstract Intratumor heterogeneity, as both a major confounding factor and an underexploited information source, is widely implicated as a key driver of drug resistance. While a handful of reports have demonstrated the potential of supervised methods to deconvolute intratumor heterogeneity, these approaches require a priori information on the marker genes or composition of known subpopulations. To address the critical problem of the absence of validated marker genes for many (including novel) subpopulations, we developed convex analysis of mixtures (CAM), a fully unsupervised deconvolution method, for identifying marker genes and subpopulations directly from original mixed molecular expressions.


2006 ◽  
Vol 45 (3) ◽  
pp. 416-433 ◽  
Author(s):  
Mircea Grecu ◽  
William S. Olson

Abstract Precipitation estimation from satellite passive microwave radiometer observations is a problem that does not have a unique solution that is insensitive to errors in the input data. Traditionally, to make this problem well posed, a priori information derived from physical models or independent, high-quality observations is incorporated into the solution. In the present study, a database of precipitation profiles and associated brightness temperatures is constructed to serve as a priori information in a passive microwave radiometer algorithm. The precipitation profiles are derived from a Tropical Rainfall Measuring Mission (TRMM) combined radar–radiometer algorithm, and the brightness temperatures are TRMM Microwave Imager (TMI) observed. Because the observed brightness temperatures are consistent with those derived from a radiative transfer model embedded in the combined algorithm, the precipitation–brightness temperature database is considered to be physically consistent. The database examined here is derived from the analysis of a month-long record of TRMM data that yields more than a million profiles of precipitation and associated brightness temperatures. These profiles are clustered into a tractable number of classes based on the local sea surface temperature, a radiometer-based estimate of the echo-top height (the height beyond which the reflectivity drops below 17 dBZ), and brightness temperature principal components. For each class, the mean precipitation profile, brightness temperature principal components, and probability of occurrence are determined. The precipitation–brightness temperature database supports a radiometer-only algorithm that incorporates a Bayesian estimation methodology. 
In the Bayesian framework, precipitation estimates are weighted averages of the mean precipitation values corresponding to the classes in the database, with the weights being determined according to the similarity between the observed brightness temperature principal components and the brightness temperature principal components of the classes. Because the classes are stratified by the sea surface temperature and the echo-top-height estimator, the number of classes that are considered for retrieval is significantly smaller than the total number of classes, making the algorithm computationally efficient. The radiometer-only algorithm is applied to TMI observations, and precipitation estimates are compared with combined TRMM precipitation radar (PR)–TMI reference estimates. The TMI-only algorithm, supported by the empirically derived database, produces estimates that are more consistent with the reference values than the precipitation estimates from the version-6 TRMM facility TMI algorithm. Cloud-resolving model simulations are used to assign a latent heating profile to each precipitation profile in the empirically derived database, making it possible to estimate latent heating using the radiometer-only algorithm. Although the evaluation of latent heating estimates in this study is preliminary, because realistic conditional probability distribution functions are attached to latent heating structures in the algorithm's database, a generally positive impact on latent heating estimation from passive microwave observations is expected.
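
The retrieval step described in this abstract, a weighted average of class-mean precipitation values over a stratified subset of classes, can be sketched as follows. The abstract does not state the similarity function, so the isotropic Gaussian weight with spread `sigma`, the multiplication by each class's probability of occurrence, the dictionary keys, and the integer `sst_bin` and `echo_top_bin` stratification labels are all assumptions made for illustration.

```python
import math


def bayesian_precip_estimate(observed_pcs, sst_bin, echo_top_bin, classes, sigma=1.0):
    """Weighted average of class-mean precipitation for one observation.

    classes: list of dicts with hypothetical keys 'sst_bin', 'echo_top_bin',
    'mean_pcs', 'mean_precip' and 'prob' (probability of occurrence).
    """
    # Stratification: only classes in the matching SST / echo-top bin are searched.
    candidates = [c for c in classes
                  if c["sst_bin"] == sst_bin and c["echo_top_bin"] == echo_top_bin]
    weights = []
    for c in candidates:
        # Squared distance between observed and class-mean principal components.
        d2 = sum((o - m) ** 2 for o, m in zip(observed_pcs, c["mean_pcs"]))
        # Assumed weight: class probability times a Gaussian similarity.
        weights.append(c["prob"] * math.exp(-0.5 * d2 / sigma ** 2))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no candidate class is close enough to the observation")
    return sum(w * c["mean_precip"] for w, c in zip(weights, candidates)) / total


database = [
    {"sst_bin": 0, "echo_top_bin": 1, "mean_pcs": (0.0, 0.0), "mean_precip": 2.0, "prob": 0.5},
    {"sst_bin": 0, "echo_top_bin": 1, "mean_pcs": (10.0, 10.0), "mean_precip": 8.0, "prob": 0.5},
    {"sst_bin": 1, "echo_top_bin": 1, "mean_pcs": (0.0, 0.0), "mean_precip": 50.0, "prob": 1.0},
]
print(bayesian_precip_estimate((5.0, 5.0), 0, 1, database))
```

Because the third class sits in a different stratum, it never enters the sum; this filtering is what keeps the number of classes per retrieval small.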


2010 ◽  
Vol 143-144 ◽  
pp. 648-652
Author(s):  
Xu Dong Zhu ◽  
Zhi Jing Liu

We present a novel online unsupervised anomaly detection method for human activities. The proposed approach is based on one-class support vector machine (OCSVM) clustering, where the novelty detection capabilities of the SVM are used to identify anomalous activities. Particular attention is given to activity classification in the absence of a priori information on the distribution of outliers. Activities are represented by variable-length event sequences, but the most commonly used kernels are defined on fixed-dimension spaces. To solve this problem, we develop a novel sequence-similarity kernel, the n-grams kernel. Our kernel is conceptually simple, efficient to compute, and performs well in comparison with state-of-the-art methods for anomaly detection. Moreover, most SVM algorithms require a large amount of memory to store the kernel matrix, or repeated access to the training samples, which makes them infeasible for online anomaly detection. In this paper, we develop simple and computationally efficient online learning algorithms for anomaly detection.
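
The abstract does not define the n-grams kernel, so the following is only one plausible reading: count the contiguous n-grams of each event sequence and take the cosine-normalized dot product of the two count vectors. The names `ngram_counts` and `ngram_kernel`, the default `n=2`, and the normalization are assumptions. What the sketch does show is how such a kernel compares sequences of different lengths without mapping them to a fixed-dimension space first.

```python
import math
from collections import Counter


def ngram_counts(sequence, n):
    """Count the contiguous n-grams of an event sequence."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))


def ngram_kernel(seq_a, seq_b, n=2):
    """Cosine-normalized dot product of n-gram count vectors (assumed form)."""
    counts_a, counts_b = ngram_counts(seq_a, n), ngram_counts(seq_b, n)
    dot = sum(count * counts_b[gram] for gram, count in counts_a.items())
    norm = math.sqrt(sum(v * v for v in counts_a.values())
                     * sum(v * v for v in counts_b.values()))
    # Sequences shorter than n have no n-grams; treat them as dissimilar to everything.
    return dot / norm if norm else 0.0


walk_a = ["walk", "stop", "walk", "run"]
walk_b = ["walk", "stop", "walk"]
loiter = ["sit", "stand", "sit", "stand", "sit"]
print(ngram_kernel(walk_a, walk_b), ngram_kernel(walk_a, loiter))
```

A kernel matrix built this way could in principle be passed to any SVM implementation that accepts precomputed kernels; the online learning scheme mentioned in the abstract is not reproduced here.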


Author(s):  
Maria A. Milkova

Information now accumulates so rapidly that the usual concept of iterative search requires revision. In a world oversaturated with information, covering and analyzing a problem comprehensively places high demands on search methods. An innovative approach to search should flexibly take into account the large amount of already accumulated knowledge and a priori requirements for the results. The results, in turn, should immediately provide a roadmap of the field under study, with as much detail as possible. Search based on topic modeling, so-called topic search, meets all these requirements and thereby streamlines work with information, increases the efficiency of knowledge production, and helps avoid cognitive biases in the perception of information, which is important at both the micro and macro levels. To demonstrate topic search, the article considers the task of analyzing an import substitution program using patent data. The program includes plans for 22 industries and contains more than 1,500 products and technologies proposed for import substitution. Patent search based on topic modeling makes it possible to search directly by blocks of a priori information (the terms of the industrial plans for import substitution) and to obtain a selection of relevant documents for each industry. This approach not only provides a comprehensive picture of the effectiveness of the program as a whole, but also yields more detailed, visual information about which groups of products and technologies have been patented.


Photonics ◽  
2021 ◽  
Vol 8 (6) ◽  
pp. 177
Author(s):  
Iliya Gritsenko ◽  
Michael Kovalev ◽  
George Krasin ◽  
Matvey Konoplyov ◽  
Nikita Stsepuro

Recently, the transport-of-intensity equation has proven to be an effective phase imaging method for microscopy that requires neither high-resolution optical systems nor a priori information about the object. In this paper, we propose a mathematical model that adapts the transport-of-intensity equation to wavefront sensing of a given light wave. We analyze how the longitudinal displacement z and the step between intensity distribution measurements affect the error in determining the wavefront radius of curvature of a spherical wave. The proposed method is compared with the traditional Shack–Hartmann method and with a method based on computer-generated Fourier holograms. Numerical simulation showed that the proposed method can measure a wavefront radius of curvature of 40 mm with an accuracy of ~200 μm.
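
For orientation, the standard paraxial form of the transport-of-intensity equation and its consequence for an ideal spherical wave are written out below. This is the textbook relation, not the model proposed in the paper, which the abstract does not give. The symbols $\varphi$ (phase), $k = 2\pi/\lambda$ (wavenumber), $\Delta z$ (step between intensity measurements) and $R$ (radius of curvature, taken positive for a diverging wave) are notation assumed here, and the intensity is assumed to be locally uniform in the transverse plane.

```latex
% Transport-of-intensity equation (paraxial approximation)
-k \,\frac{\partial I(x,y,z)}{\partial z}
  = \nabla_\perp \cdot \bigl( I(x,y,z)\, \nabla_\perp \varphi(x,y,z) \bigr),
\qquad k = \frac{2\pi}{\lambda}

% Axial derivative estimated from two intensity measurements separated by 2\Delta z
\frac{\partial I}{\partial z} \approx
  \frac{I(x,y,z+\Delta z) - I(x,y,z-\Delta z)}{2\,\Delta z}

% Spherical wave, paraxial phase and locally uniform intensity
\varphi(x,y) = \frac{k\,(x^2+y^2)}{2R}
\;\Rightarrow\;
\nabla_\perp^2 \varphi = \frac{2k}{R}
\;\Rightarrow\;
\frac{\partial I}{\partial z} = -\frac{2I}{R}
\;\Rightarrow\;
R \approx -\frac{2I}{\partial I / \partial z}
```

The last relation matches the inverse-square falloff of a diverging spherical wave ($I \propto 1/z^2$ gives $\partial I/\partial z = -2I/z$), and it makes clear why the finite-difference step $\Delta z$ feeds directly into the error of the estimated radius.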

