On Solving Large Data Matrix Problems in Fuzzy AHP

2022 ◽  
pp. 116488
Author(s):  
Milind Jaiwant Sakhardande ◽  
Rajesh Suresh Prabhu
2019 ◽  
Vol 191 (1) ◽  
pp. 1-17 ◽  
Author(s):  
Matt H Buys ◽  
Richard C Winkworth ◽  
Peter J de Lange ◽  
Peter G Wilson ◽  
Nora Mitchell ◽  
...  

Leptospermum scoparium (Myrtaceae) is a morphologically highly variable species found in mainland Australia, Tasmania and New Zealand. For example, in New Zealand up to six morphologically distinct varieties of this species have been described, although only two (var. scoparium and var. incanum) are now formally recognized. In the present study we provide a first examination of genetic diversity in this culturally and commercially important species with the aim of gaining insights into its origins and evolution. We used anchored hybrid enrichment to acquire sequence data from 485 orthologous low-copy nuclear loci for 27 New Zealand and three Australian accessions of L. scoparium and representatives of several other Leptospermum spp. The final concatenated data matrix contained 421 687 nucleotide positions of which 55 102 were potentially informative. Despite the relatively large data set, our analyses suggest that a combination of low and incompatible data signal limits the resolution of relationships among New Zealand populations of L. scoparium. Nevertheless, our analyses are consistent with genetic diversity being geographically structured, with three groups of L. scoparium recovered. We discuss the evolutionary and taxonomic implications of our findings.


2003 ◽  
Vol 57 (8) ◽  
pp. 996-1006 ◽  
Author(s):  
Slobodan Šašić ◽  
Yukihiro Ozaki

In this paper we report two new developments in two-dimensional (2D) correlation spectroscopy: one is the combination of the moving window concept with 2D spectroscopy to facilitate the analysis of complex data sets, and the other is the definition of the noise level in synchronous/asynchronous maps. A graphical criterion for the latter is also proposed. The combination of the moving window concept with correlation spectra allows one to split a large data matrix into smaller and simpler subsets and to analyze them instead of computing the overall correlation. A three-component system that mimics a consecutive chemical reaction is used as a model to illustrate the two ideas. Both types of correlation matrices, variable–variable and sample–sample, are analyzed, and very good agreement between the two is found. The proposed innovations help one to gauge the complexity of the data to be analyzed by 2D spectroscopy and thus to avoid the risk of over-interpretation, which is liable to occur whenever insufficient caution is taken about the number of coexisting species in the system.
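The moving-window idea can be illustrated with a minimal sketch. It assumes the standard synchronous correlation of mean-centred spectra (each row is one spectrum, each column one spectral variable) and slides a fixed-size window over the rows. The function names and the toy data are hypothetical, and this is not the authors' implementation; the asynchronous map and the noise-level criterion are left out.

```python
def synchronous_map(spectra):
    """Synchronous (variable-variable) correlation of mean-centred spectra.

    `spectra` is a list of rows; each row is one spectrum.
    """
    m = len(spectra)
    n = len(spectra[0])
    # Mean spectrum over the rows of this block.
    means = [sum(row[j] for row in spectra) / m for j in range(n)]
    # Dynamic spectra: deviation of each row from the mean spectrum.
    dyn = [[row[j] - means[j] for j in range(n)] for row in spectra]
    # Phi[i][j] = sum over rows of dyn_i * dyn_j, divided by (m - 1).
    return [[sum(d[i] * d[j] for d in dyn) / (m - 1) for j in range(n)]
            for i in range(n)]


def moving_window_maps(spectra, window):
    """One synchronous map per window of `window` consecutive spectra."""
    return [synchronous_map(spectra[start:start + window])
            for start in range(len(spectra) - window + 1)]


# Toy data (hypothetical): 4 spectra, 2 spectral variables.
toy = [[1, 2], [2, 4], [3, 6], [4, 5]]
maps = moving_window_maps(toy, window=3)
```

Each entry of `maps` is a small correlation matrix for one subset of rows, so changes between neighbouring windows can be inspected instead of a single overall map.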


2021 ◽  
Author(s):  
Joris Vanhoutven ◽  
Bart Cuypers ◽  
Pieter Meysman ◽  
Jef Hooyberghs ◽  
Kris Laukens ◽  
...  

In high-throughput omics disciplines like transcriptomics, researchers face a need to assess the quality of an experiment prior to an in-depth statistical analysis. To efficiently analyze such voluminous collections of data, researchers need triage methods that are both quick and easy to use. Such a normalization method for relative quantitation, CONSTANd, was recently introduced for isobarically labeled mass spectra in proteomics. It transforms the data matrix of abundances through an iterative, convergent process enforcing three constraints: (I) identical column sums; (II) each row sum is fixed (across matrices) and (III) identical to all other row sums. In this study, we investigate whether CONSTANd is suitable for count data from massively parallel sequencing, by qualitatively comparing its results to those of DESeq2. Further, we propose an adjustment of the method so that it may be applied to identically balanced but differently sized experiments for joint analysis. We find that CONSTANd can process large data sets with about 2 million count records in less than a second whilst removing unwanted systematic bias and thus quickly uncovering the underlying biological structure when combined with a PCA plot or hierarchical clustering. Moreover, it allows joint analysis of data sets obtained from different batches, with different protocols and from different labs but without exploiting information from the experimental setup other than the delineation of samples into identically processed sets (IPSs). CONSTANd’s simplicity and applicability to proteomics as well as transcriptomics data make it an interesting candidate for integration in multi-omics workflows.
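The iterative process described above resembles alternating row and column rescaling (often called raking or iterative proportional fitting). The sketch below shows that generic technique under stated assumptions: all entries are strictly positive, every row is driven to the same sum, and every column to the same sum. The function name, the target sums and the example matrix are hypothetical; this is not the published CONSTANd code.

```python
def raking_normalize(matrix, tol=1e-9, max_iter=1000):
    """Alternately rescale rows and columns until sums are uniform.

    ASSUMPTIONS (not from the source): entries are strictly positive,
    target row sum = number of columns, target column sum = number of rows.
    """
    rows = [[float(v) for v in r] for r in matrix]
    m, n = len(rows), len(rows[0])
    for _ in range(max_iter):
        # Step 1: scale every row so that it sums to n (row mean 1).
        rows = [[v * n / sum(r) for v in r] for r in rows]
        # Step 2: scale every column so that it sums to m (column mean 1).
        col_sums = [sum(r[j] for r in rows) for j in range(n)]
        rows = [[v * m / col_sums[j] for j, v in enumerate(r)] for r in rows]
        # Columns are exact after step 2; stop once rows are close enough.
        if max(abs(sum(r) - n) for r in rows) < tol:
            break
    return rows


# Hypothetical abundance matrix: 4 features (rows) x 3 samples (columns).
example = [[1, 2, 3], [4, 5, 6], [7, 8, 10], [2, 1, 1]]
normalized = raking_normalize(example)
```

Because each pass only needs row and column sums, the cost per iteration grows linearly with the number of matrix entries, which is consistent with the speed the abstract reports for large count tables.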


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1947
Author(s):  
Yan Wang ◽  
Shan Gao ◽  
Hongyan Chu ◽  
Xuefei Wang

In view of the practical application requirements for the rapid expansion of electric taxis (ETs) and the reasonable planning of charging stations, this paper presents a method for mining the latent semantic correlation of large-scale ET trajectory data and for planning charging stations at optimal cost. Firstly, the vector space modeling method of ET trajectory data is studied, and the semantic similarity of the trajectory data matrix is evaluated. Secondly, the hidden characteristics of the mass trajectory data are extracted by matrix decomposition. Then, the latent semantic correlation characteristics of the trajectory data are mined. Finally, fast clustering of ETs is realized by the spectral clustering method. On this basis, with the objective of minimizing the annual construction and maintenance costs of charging stations, the optimal planning scheme of charging stations for ETs is given. The spectral clustering of latent semantic correlations in ET trajectory big data can be combined with the operation and maintenance costs of the charging stations, while the convenience of charging for ET users is also considered. This provides decision support information for the reasonable planning of charging stations.


2010 ◽  
Vol 39 (2) ◽  
pp. 140-160 ◽  
Author(s):  
Carme Julià ◽  
Angel D. Sappa ◽  
Felipe Lumbreras ◽  
Joan Serrat ◽  
Antonio López

2016 ◽  
Vol 3 (3) ◽  
pp. 150674 ◽  
Author(s):  
Roland B. Sookias

For the first time, a phylogenetic analysis including all putative euparkeriid taxa is conducted, using a large data matrix analysed with maximum parsimony and Bayesian analysis. Using parsimony, the putative euparkeriid Dorosuchus neoetus from Russia is the sister taxon to Archosauria + Phytosauria. Euparkeria capensis is placed one node further from the crown, and forms a euparkeriid clade with the Chinese taxa Halazhaisuchus qiaoensis and ‘Turfanosuchus shageduensis’ and the Polish taxon Osmolskina czatkowicensis. Using Bayesian methods, Osmolskina and Halazhaisuchus are sister taxa within Euparkeriidae, in turn sister to ‘Turfanosuchus shageduensis’ and then Euparkeria capensis. Dorosuchus is placed in a polytomy with Euparkeriidae and Archosauria + Phytosauria. Although conclusions remain tentative owing to low node support and incompleteness, a broad phylogenetic position close to the base of Archosauria is confirmed for all putative euparkeriids, and the ancestor of Archosauria + Phytosauria is optimized as similar to euparkeriids in its morphology. Ecomorphological characters and traits are optimized onto the maximum parsimony strict consensus phylogeny presented using squared change parsimony. This optimization indicates that the ancestral archosaur was probably similar in many respects to euparkeriids, being relatively small, terrestrial, carnivorous and showing relatively cursorial limb morphology; this Bauplan may have underlain the exceptional radiation and success of crown Archosauria.


2021 ◽  
Author(s):  
Marcos A. Antezana

When a data matrix DM has many independent variables IVs, it is not computationally tractable to assess the association of every distinct IV subset with the dependent variable DV of the DM, because the number of subsets explodes combinatorially as IVs increase. But model selection and correcting for multiple tests is complex even with few IVs. DMs in genomics will soon summarize millions of markers (mutations) and genomes. Searching exhaustively in such DMs for mutations that alone or synergistically with others are associated with a trait is computationally tractable only for 1- and 2-mutation effects. This is also why population geneticists study mainly 2-marker combinations. I present a computationally tractable, fully parallelizable Participation in Association Score (PAS) that in a DM with markers detects one by one every column that is strongly associated in any way with others. PAS does not examine column subsets and its computational cost grows linearly with the number of columns, remaining reasonable even when DMs have millions of columns. PAS P values are readily obtained by permutation and accurately Sidak-corrected for multiple tests, bypassing model selection. The P values of a column’s PASs and dvPASs for different orders of association are i.i.d. and easily turned into a single P value. PAS exploits how associations of markers in the rows of a DM cause associations of matches in the pairwise comparisons of the rows. For every such comparison with a match at a tested column, PAS computes the matches at other columns by modifying the comparison’s total matches (scored once per DM), yielding a distribution of conditional matches that reacts diagnostically to the associations of the tested column.
Equally computationally tractable is dvPAS, which flags DV-associated IVs by also probing the matches at the DV. Simulations show that i) PAS and dvPAS generate uniform-(0,1)-distributed type I error in null DMs and ii) detect randomly encountered binary and trinary models of significant n-column association and n-IV association to a binary DV, respectively, with power in the order of magnitude of exhaustive evaluation’s and false positives that are uniform-(0,1)-distributed or straightforwardly tuned to be so. Power to detect 2-way associations that extend over 100+ columns is non-parametrically ultimate but that to detect pure n-column associations and pure n-IV DV associations sinks exponentially with increasing n. Important for geneticists, dvPAS power increases about twofold in trinary vs. binary DMs and by orders of magnitude with markers linked like mutations in chromosomes, especially in trinary DMs where furthermore dvPAS fine-maps with highest resolution.
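The combinatorial explosion that motivates the abstract can be made concrete with a short calculation. The sketch below counts how many distinct k-column subsets an exhaustive search would have to visit; the figure of one million columns is an illustrative assumption in line with the "millions of markers" mentioned above, and the function name is hypothetical.

```python
import math


def subsets_to_test(n_columns, k):
    """Number of distinct k-column subsets among n_columns columns."""
    return math.comb(n_columns, k)


# ASSUMPTION for illustration: a data matrix with one million marker columns.
n_markers = 10**6

single = subsets_to_test(n_markers, 1)   # one test per column
pairs = subsets_to_test(n_markers, 2)    # every 2-marker combination
triples = subsets_to_test(n_markers, 3)  # every 3-marker combination

for k, count in ((1, single), (2, pairs), (3, triples)):
    print(f"k={k}: {count:.3e} subsets")
```

Going from pairs to triples multiplies the workload by roughly a third of the number of columns, whereas a per-column score such as the one described needs only one pass per column.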


2016 ◽  
Vol 283 (1828) ◽  
pp. 20160214 ◽  
Author(s):  
Min Wang ◽  
Graeme T. Lloyd

The Early Cretaceous is a critical interval in the early history of birds. Exceptional fossils indicate that important evolutionary novelties such as a pygostyle and a keeled sternum had already arisen in Early Cretaceous taxa, bridging much of the morphological gap between Archaeopteryx and crown birds. However, detailed features of basal bird evolution remain obscure because of both the small sample of fossil taxa previously considered and a lack of quantitative studies assessing rates of morphological evolution. Here we apply a recently available phylogenetic method and associated sensitivity tests to a large data matrix of morphological characters to quantify rates of morphological evolution in Early Cretaceous birds. Our results reveal that although rates were highly heterogeneous between different Early Cretaceous avian lineages, consistent patterns of significantly high or low rates were harder to pinpoint. Nevertheless, evidence for accelerated evolutionary rates is strongest at the point when Ornithuromorpha (the clade comprises all extant birds and descendants from their most recent common ancestors) split from Enantiornithes (a diverse clade that went extinct at the end-Cretaceous), consistent with the hypothesis that this key split opened up new niches and ultimately led to greater diversity for these two dominant clades of Mesozoic birds.


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
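The quoted data volume can be checked with simple arithmetic. The sketch below uses the dimensions given in the abstract and assumes 2 bytes per channel; that storage width is not stated in the source and is chosen here only because it reproduces the quoted 25 Mbytes. All names are hypothetical.

```python
def spectrum_image_bytes(width, height, spectra_per_pixel, channels,
                         bytes_per_channel):
    """Raw storage needed for a spectrum-image, in bytes."""
    return width * height * spectra_per_pixel * channels * bytes_per_channel


# Dimensions quoted in the abstract: 80x80 pixels, two 1024-channel spectra.
# ASSUMPTION (not in the source): each channel is stored as 2 bytes.
total_bytes = spectrum_image_bytes(80, 80, 2, 1024, bytes_per_channel=2)
total_mbytes = total_bytes / 2**20

# Energy span covered by 1024 channels at 20 channels per eV.
span_ev = 1024 / 20

print(f"{total_bytes} bytes = {total_mbytes} Mbytes, span = {span_ev} eV")
```

Under this assumption the size works out to exactly 25 Mbytes (counting 2^20 bytes per Mbyte), and the 1024 channels cover slightly more than the 50 eV between 39 and 89 eV.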

