Optimization algorithm for omic data subspace clustering

2021 ◽  
Author(s):  
Madalina Ciortan ◽  
Matthieu Defrance

Subspace clustering identifies multiple feature subspaces embedded in a dataset together with the underlying sample clusters. When applied to omic data, subspace clustering is a challenging task, as additional problems have to be addressed: the curse of dimensionality, the imperfect data quality and cluster separation, the presence of multiple subspaces representative of divergent views of the dataset, and the lack of consensus on the best clustering method. First, we propose a computational method, discover, to perform subspace clustering on tabular high-dimensional data by maximizing the internal clustering score (i.e. cluster compactness) of feature subspaces. Our algorithm can be used in both unsupervised and semi-supervised settings. Secondly, by applying our method to a large set of omic datasets (i.e. microarray, bulk RNA-seq, scRNA-seq), we show that the subspace corresponding to the provided ground truth annotations is rarely the most compact one, contrary to what methods that maximize the internal quality of clusters assume. Our results highlight the difficulty of fully validating subspace clusters (justified by the lack of feature annotations). Tested on identifying the ground-truth subspace, our method compared favorably with competing techniques on all datasets. Finally, we propose a suite of techniques to interpret the clustering results biologically in the absence of annotations. We demonstrate that subspace clustering can provide biologically meaningful sample-wise and feature-wise information, typically missed by traditional methods.
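
The idea of scoring a feature subspace by the internal quality of its clusters can be illustrated with a small sketch. The abstract does not specify the score or the search strategy, so everything here is an assumption made for illustration: the mean silhouette coefficient stands in for the internal clustering score, an exhaustive search over feature subsets stands in for the optimizer, and the names `silhouette`, `subspace_score`, `data` and `labels` are hypothetical.

```python
from itertools import combinations
from math import dist


def silhouette(points, labels):
    """Mean silhouette coefficient of a labelled point set (needs >= 2 clusters)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from point i to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = mean_dist[own]                                       # within-cluster distance
        b = min(d for c, d in mean_dist.items() if c != own)     # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def subspace_score(data, labels, features):
    """Internal score of the clustering when only `features` are kept."""
    projected = [[row[f] for f in features] for row in data]
    return silhouette(projected, labels)


# Toy data (assumed): features 0 and 1 separate the two clusters, feature 2 is noise.
data = [
    [0.0, 0.1, 5.0],
    [0.1, 0.0, -4.0],
    [1.0, 0.9, 4.5],
    [0.9, 1.0, -5.0],
]
labels = [0, 0, 1, 1]

# Exhaustive search over all non-empty feature subsets (feasible only for tiny inputs).
candidates = [s for k in range(1, 4) for s in combinations(range(3), k)]
best = max(candidates, key=lambda s: subspace_score(data, labels, s))
```

On real omic data the number of feature subsets is far too large to enumerate, which is why a dedicated optimization algorithm is needed; the sketch only shows what is being maximized.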

2018 ◽  
Author(s):  
Xiuwei Zhang ◽  
Chenling Xu ◽  
Nir Yosef

The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in-silico platforms for evaluation and validation. Simulated datasets which resemble the properties of real datasets can aid in method development and prioritization as well as in questions in experimental design by providing an objective ground truth. Here, we present SymSim, a simulator software that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation that is indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. Unlike other simulators, the parameters that govern the simulation process directly represent meaningful properties such as mRNA capture rate, the number of PCR cycles, sequencing depth, or the use of unique molecular identifiers. We demonstrate how SymSim can be used for benchmarking methods for clustering and differential expression and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population and how this number deviates from the theoretical lower bound as the quality of the data decreases. SymSim is publicly available as an R package and allows users to simulate datasets with desired properties or matched with experimental data.


Author(s):  
A. V. Ponomarev

Introduction: Large-scale human-computer systems involving people of various skills and motivation into the information processing process are currently used in a wide spectrum of applications. An acute problem in such systems is assessing the expected quality of each contributor; for example, in order to penalize incompetent or inaccurate ones and to promote diligent ones. Purpose: To develop a method of assessing the expected contributor’s quality in community tagging systems. This method should only use generally unreliable and incomplete information provided by contributors (with ground truth tags unknown). Results: A mathematical model is proposed for community image tagging (including the model of a contributor), along with a method of assessing the expected contributor’s quality. The method is based on comparing tag sets provided by different contributors for the same images, being a modification of pairwise comparison method with preference relation replaced by a special domination characteristic. Expected contributors’ quality is evaluated as a positive eigenvector of a pairwise domination characteristic matrix. Community tagging simulation has confirmed that the proposed method allows you to adequately estimate the expected quality of community tagging system contributors (provided that the contributors' behavior fits the proposed model). Practical relevance: The obtained results can be used in the development of systems based on coordinated efforts of community (primarily, community tagging systems).
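
The step of reading contributor quality off a positive eigenvector can be sketched with power iteration. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not define the domination characteristic, so the matrix `domination` below is a hypothetical, strictly positive example, and `perron_vector` is an illustrative helper name.

```python
def perron_vector(matrix, iterations=200, tol=1e-12):
    """Approximate the positive (Perron) eigenvector of a square matrix with
    strictly positive entries by power iteration, normalised to sum to 1."""
    n = len(matrix)
    v = [1.0 / n] * n  # start from the uniform vector
    for _ in range(iterations):
        # One matrix-vector product, then renormalise so the entries sum to 1.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            return w
        v = w
    return v


# Hypothetical pairwise domination matrix for three contributors:
# entry [i][j] > 1 means contributor i dominates contributor j.
domination = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
scores = perron_vector(domination)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

For a matrix with strictly positive entries, the Perron-Frobenius theorem guarantees that such a positive eigenvector exists and is unique up to scaling, which is what makes it usable as a quality score.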


HortScience ◽  
1998 ◽  
Vol 33 (3) ◽  
pp. 544c-544
Author(s):  
A. Hakim ◽  
A. Purvis ◽  
E. Pehu ◽  
I. Voipio ◽  
E. Kaukovirta

Both external and internal quality of fruits such as tomatoes can be evaluated by different methods, but almost all of the methods are destructive. For this reason, there is a need to reassess some of the alternative techniques. Nondestructive quality evaluation is an attractive alternative. The principles of different nondestructive quality evaluation techniques, such as optical, physical, and fluorescence techniques, applied to tomato fruit are explained. Successful applications of these techniques to the evaluation of different quality attributes are illustrated. The advantages of nondestructive quality evaluation techniques are that they are very fast, easy, labor- and time-saving, and inexpensive. These techniques could also be useful to evaluate the quality of other vegetables.


2021 ◽  
Vol 11 (12) ◽  
pp. 5690
Author(s):  
Mamdouh Alenezi

The evolution of software is necessary for the success of software systems. Studying and understanding the evolution of software is a focal topic of study in software engineering. One of the primary concepts of software evolution is that the internal quality of a software system declines as it evolves. In this paper, we examine how the internal quality of object-oriented open-source software systems evolves by applying a software metric approach. More specifically, we analyze how software systems evolve over versions regarding size and the relationship between size and different internal quality metrics. The results and observations of this research include: (i) there is a significant difference between different systems concerning the LOC variable; (ii) there is a significant correlation between all pairwise comparisons of internal quality metrics; and (iii) the effect of complexity and inheritance on the LOC was positive and significant, while the effect of coupling and cohesion was not significant.


2021 ◽  
Vol 11 (8) ◽  
pp. 3562
Author(s):  
Yong Jin Lee ◽  
Sang Yong Park ◽  
Dae Yeon Kim ◽  
Jae Yoon Kim

Preharvest sprouting (PHS) is a key global issue in production and end-use quality of cereals, particularly in regions where the rainfall season overlaps the harvest. To investigate transcriptomic changes in genes affected by PHS induction and ABA treatment, RNA-seq analysis was performed in two wheat cultivars that differ in PHS tolerance. A total of 123 unigenes related to hormone metabolism and signaling for abscisic acid (ABA), gibberellic acid (GA), indole-3-acetic acid (IAA), and cytokinin were identified, and 1862 differentially expressed genes (DEGs) were identified and divided into 8 groups by transcriptomic analysis. DEG analysis showed that the majority of genes were categorized in sugar-related processes, which interact with ABA signaling in the PHS-tolerant cultivar under PHS induction. Thus, genes related to ABA are key regulators of dormancy and germination. Our results give insight into global changes in expression of plant hormone-related genes in response to PHS.


2021 ◽  
Vol 175 ◽  
pp. 111497
Author(s):  
Weijie Lan ◽  
Benoit Jaillais ◽  
Catherine M.G.C. Renard ◽  
Alexandre Leca ◽  
Songchao Chen ◽  
...  

Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 773 ◽  
Author(s):  
Carmelo Militello ◽  
Leonardo Rundo ◽  
Luigi Minafra ◽  
Francesco Paolo Cammarata ◽  
Marco Calvaruso ◽  
...  

A clonogenic assay is a biological technique for calculating the Surviving Fraction (SF), which quantifies the anti-proliferative effect of treatments on cell cultures: this evaluation is often performed via manual counting of cell colony-forming units. Unfortunately, this procedure is error-prone and strongly affected by operator dependence. Besides, conventional assessment does not deal with the colony size, which is generally correlated with the delivered radiation dose or administered cytotoxic agent. Relying upon the direct proportional relationship between the Area Covered by Colony (ACC) and the colony count and size, along with the growth rate, we propose MF2C3, a novel computational method leveraging spatial Fuzzy C-Means clustering on multiple local features (i.e., entropy and standard deviation extracted from the input color images acquired by a general-purpose flat-bed scanner) for ACC-based SF quantification, by considering only the covering percentage. To evaluate the accuracy of the proposed fully automatic approach, we compared the SFs obtained by MF2C3 against the conventional counting procedure on four different cell lines. The achieved results revealed a high correlation with the ground-truth measurements based on colony counting, and MF2C3 outperformed our previously validated method using local thresholding on L*u*v* color well images. In conclusion, the proposed multi-feature approach, which inherently leverages the concept of symmetry in the pixel local distributions, might be reliably used in biological studies.


2019 ◽  
pp. 289-294
Author(s):  
S.H.E.J. Gabriels ◽  
B. Brouwer ◽  
H. de Villiers ◽  
E. Westra ◽  
E.J. Woltering

2012 ◽  
Vol 43 (6) ◽  
pp. 445-452 ◽  
Author(s):  
ARTHIT PUANGSOMBUT ◽  
SIWALAK PATHAVEERAT ◽  
ANUPUN TERDWONGWORAKUL ◽  
KAEWKARN PUANGSOMBUT

2021 ◽  
Vol 22 (15) ◽  
pp. 8246
Author(s):  
Michal Rindos ◽  
Lucie Kucerova ◽  
Lenka Rouhova ◽  
Hana Sehadova ◽  
Michal Sery ◽  
...  

Many lepidopteran larvae produce silk feeding shelters and cocoons to protect themselves and the developing pupa. As caterpillars evolved, the quality of the silk, shape of the cocoon, and techniques in forming and leaving the cocoon underwent a number of changes. The silk of Pseudoips prasinana has previously been studied using X-ray analysis and classified in the same category as that of Bombyx mori, suggesting that silks of both species have similar properties despite their considerable phylogenetic distance. In the present study, we examined P. prasinana silk using ‘omics’ technology, including silk gland RNA sequencing (RNA-seq) and a mass spectrometry-based proteomic analysis of cocoon proteins. We found that although the central repetitive amino acid sequences encoding crystalline domains of fibroin heavy chain molecules are almost identical in both species, the resulting fibers exhibit quite different mechanical properties. Our results suggest that these differences are most probably due to the higher content of fibrohexamerin and fibrohexamerin-like molecules in P. prasinana silk. Furthermore, we show that whilst P. prasinana cocoons are predominantly made of silk similar to that of other Lepidoptera, they also contain a second, minor silk type, which is present only at the escape valve.

