Reexamination of Rhopalosiphum (Hemiptera: Aphididae) using linear discriminant analysis to determine the validity of synonymized species, with some new synonymies and distribution data

2020 ◽  
Vol 8 ◽  
Author(s):  
Michael Skvarla ◽  
Matthew Kramer ◽  
Christopher Owen ◽  
Gary Miller

Although 17 species of Rhopalosiphum (Hemiptera: Aphididae) are currently recognized, 85 taxonomic names have been proposed historically. Some species are morphologically similar, especially alate individuals, and most synonymies were proposed in catalogues without evidence. This has led to both confusion and difficulty in making accurate species-level identifications. To address these issues, we developed a new approach to resolve synonymies based on linear discriminant analysis (LDA) and suggest that this approach may be useful for reassessing previously proposed synonymies in other taxonomic groups. We compared 34 valid and synonymized species using 49 measurements and 20 ratios from 1,030 individual aphids. LDA was repeatedly applied to subsets of the data after removing clearly separated groups found in a previous iteration. We found that our characters and technique worked well to distinguish among apterae. However, the technique separated well only those alatae with some distinctive traits, while those alatae that were morphologically similar were not well separated using LDA. Based on our morphological investigation, we transfer R. arundinariae (Tissot, 1933) to Melanaphis, supported by details of the wing venation and other morphological traits, and propose Melanaphis takahashii Skvarla and Miller as a replacement name for M. arundinariae (Takahashi, 1937); we also synonymize R. momo (Shinji, 1922) with R. nymphaeae (Linnaeus, 1761). Our analyses confirmed many of the proposed synonymies, which will help to stabilize the nomenclature and species concepts within Rhopalosiphum.
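
The repeated application of LDA described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how "clearly separated" was judged, so the sketch assumes a group counts as separated when equal-prior LDA reclassifies all of its members correctly and assigns no other specimens to it. The function names `lda_resubstitution` and `peel_separated_groups` are hypothetical, and each group is assumed to contain at least two specimens.

```python
import numpy as np

def lda_resubstitution(X, y):
    """Reclassify the training rows with equal-prior LDA (pooled covariance)."""
    groups, idx = np.unique(y, return_inverse=True)
    means = np.array([X[y == g].mean(axis=0) for g in groups])
    centered = X - means[idx]
    # Pooled within-group covariance; assumes more rows than groups.
    pooled = centered.T @ centered / (len(X) - len(groups))
    prec = np.linalg.pinv(pooled)
    # Linear discriminant score of every row for every group.
    scores = X @ prec @ means.T - 0.5 * np.sum(means @ prec * means, axis=1)
    return groups[np.argmax(scores, axis=1)]

def peel_separated_groups(X, y, max_rounds=10):
    """Repeat LDA, removing groups that are cleanly separated in each round.

    Assumed criterion (not from the abstract): a group is 'clean' when all of
    its members are assigned to it and nothing else is assigned to it.
    Returns the groups peeled per round and the groups left unresolved.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    peeled = []
    for _ in range(max_rounds):
        groups = np.unique(y)
        if len(groups) < 2:
            break
        pred = lda_resubstitution(X, y)
        clean = [g for g in groups.tolist()
                 if np.all(pred[y == g] == g) and np.all(y[pred == g] == g)]
        if not clean:
            break
        peeled.append(clean)
        keep = ~np.isin(y, clean)
        X, y = X[keep], y[keep]
    return peeled, np.unique(y).tolist()

# Toy data: group A is far from B and C, which overlap each other.
X_demo = [[0], [2], [4], [1], [3], [5], [100], [102], [104]]
y_demo = ["B", "B", "B", "C", "C", "C", "A", "A", "A"]
rounds_demo, unresolved_demo = peel_separated_groups(X_demo, y_demo)
```

In the toy data, group A is removed in the first round, while B and C overlap and remain unresolved, which mirrors the abstract's observation that morphologically similar groups are not well separated.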

Biometrika ◽  
2021 ◽  
Author(s):  
Juhyun Park ◽  
Jeongyoun Ahn ◽  
Yongho Jeon

Functional linear discriminant analysis offers a simple yet efficient method for classification, with the possibility of achieving a perfect classification. Several methods are proposed in the literature that mostly address the dimensionality of the problem. On the other hand, there is a growing interest in interpretability of the analysis, which favors a simple and sparse solution. In this work, we propose a new approach that incorporates a type of sparsity that identifies nonzero sub-domains in the functional setting, offering a solution that is easier to interpret without compromising performance. With the need to embed additional constraints in the solution, we reformulate the functional linear discriminant analysis as a regularization problem with an appropriate penalty. Inspired by the success of ℓ1-type regularization at inducing zero coefficients for scalar variables, we develop a new regularization method for functional linear discriminant analysis that incorporates an L1-type penalty, ∫ |f|, to induce zero regions. We demonstrate that our formulation has a well-defined solution that contains zero regions, achieving a functional sparsity in the sense of domain selection. In addition, the misclassification probability of the regularized solution is shown to converge to the Bayes error if the data are Gaussian. Our method does not presume that the underlying function has zero regions in the domain, but produces a sparse estimator that consistently estimates the true function whether or not the latter is sparse. Numerical comparisons with existing methods demonstrate this property in finite samples with both simulated and real data examples.
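
The scalar mechanism this abstract refers to, an ℓ1 penalty producing exact zeros, can be illustrated with soft-thresholding. The sketch below is not the authors' estimator. It only shows that the minimizer of 0.5*(b - z)^2 + lam*|b| is exactly zero whenever |z| <= lam, and it applies that rule pointwise to a function sampled on a grid as a loose analogue of a zero region. The names `soft_threshold` and `sparsify_on_grid` and the sample values are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimizer over b of 0.5 * (b - z)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small inputs are set exactly to zero

def sparsify_on_grid(values, lam):
    """Apply soft-thresholding to each sampled value of a function on a grid."""
    return [soft_threshold(v, lam) for v in values]

# Illustrative samples of a coefficient function; small values become zero.
grid_values = [0.05, -0.1, 0.8, 1.2, 0.3, -0.02]
shrunk = sparsify_on_grid(grid_values, lam=0.25)
```

Entries whose magnitude is below the threshold become exactly zero, while larger entries are shrunk toward zero by the threshold amount.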


2013 ◽  
Vol 23 (2) ◽  
pp. 463-471 ◽  
Author(s):  
Tomasz Górecki ◽  
Maciej Łuczak

The Linear Discriminant Analysis (LDA) technique is an important and well-developed area of classification, and to date many linear (and also nonlinear) discrimination methods have been put forward. A complication in applying LDA to real data occurs when the number of features exceeds the number of observations. In this case, the covariance estimates do not have full rank, and thus cannot be inverted. There are a number of ways to deal with this problem. In this paper, we propose improving LDA in this area, and we present a new approach which uses a generalization of the Moore-Penrose pseudoinverse to remove this weakness. Our new approach, in addition to managing the problem of inverting the covariance matrix, significantly improves the quality of classification, even on data sets where the covariance matrix can be inverted. Experimental results on various data sets demonstrate that our improvements to LDA are efficient and that our approach outperforms LDA.
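
The rank problem described in this abstract can be made concrete with a small sketch. It uses the ordinary Moore-Penrose pseudoinverse from `numpy.linalg.pinv` in a two-class Fisher discriminant, not the generalized pseudoinverse the authors propose, whose details the abstract does not give. The function names `fit_pinv_lda` and `predict_pinv_lda` and the toy data are hypothetical.

```python
import numpy as np

def fit_pinv_lda(X, y):
    """Two-class Fisher LDA with a pseudoinverse of the pooled covariance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    means = [X[y == c].mean(axis=0) for c in classes]
    # Pooled within-class covariance; singular when features outnumber rows.
    scatter = sum((X[y == c] - m).T @ (X[y == c] - m)
                  for c, m in zip(classes, means))
    pooled = scatter / (len(X) - 2)
    # The ordinary inverse may not exist, so use the pseudoinverse instead.
    w = np.linalg.pinv(pooled) @ (means[1] - means[0])
    threshold = w @ (means[0] + means[1]) / 2
    return classes, w, threshold

def predict_pinv_lda(model, X):
    """Assign each row to the class on its side of the midpoint threshold."""
    classes, w, threshold = model
    scores = np.asarray(X, dtype=float) @ w
    return np.where(scores > threshold, classes[1], classes[0])

# Five features but only four observations: the covariance has rank one.
X_toy = [[0, 0, 0, 0, 0], [1, 0, 0, 0, 0], [5, 1, 0, 0, 0], [6, 1, 0, 0, 0]]
y_toy = [0, 0, 1, 1]
model = fit_pinv_lda(X_toy, y_toy)
labels = predict_pinv_lda(model, X_toy)
```

Note that the plain pseudoinverse ignores directions in which the within-class covariance is zero, so any class difference lying only in those directions is discarded (here the second feature gets zero weight). This is one known weakness of the simple approach.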


2020 ◽  
Vol 16 (8) ◽  
pp. 1079-1087
Author(s):  
Jorgelina Z. Heredia ◽  
Carlos A. Moldes ◽  
Raúl A. Gil ◽  
José M. Camiña

Background: The elemental composition of maize grains depends on the characteristics of the soil, land and environment where the crop grows. These effects are important for evaluating the availability of nutrients with complex dynamics, such as the concentration of macro- and micronutrients in soils, which can vary with topography. Little information is available on how the topographic characteristics (upland and lowland) of the land where a crop is grown influence the mineral composition of crop products, in the present case, maize seeds. Moreover, the study of the topographic effect on crops using multivariate analysis tools has not been reported. Objective: This paper assesses the effect of topographic conditions on plants by analyzing the mineral profiles of maize seeds obtained under two land conditions: uplands and lowlands. Materials and Methods: The mineral profile was studied by microwave plasma atomic emission spectrometry. Samples were collected from lowlands and uplands of cultivable lands in the north-east of La Pampa province, Argentina. Results: Differentiation of maize seeds collected from both topographical areas was achieved by principal components analysis (PCA), cluster analysis (CA) and linear discriminant analysis (LDA). A PCA model based on the mineral profile differentiated seeds from uplands and lowlands through the influence of the Cr and Mg variables. A significant accumulation of Cr and Mg in seeds from lowlands was observed. Cluster analysis confirmed this grouping, and linear discriminant analysis correctly classified both groups of crops, showing the effect of topography on the elemental profile. Conclusions: Multi-elemental analysis combined with chemometric tools proved useful for assessing the effect of topographic characteristics on crops.


2020 ◽  
Vol 15 ◽  
Author(s):  
Mohanad Mohammed ◽  
Henry Mwambi ◽  
Bernard Omolo

Background: Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). We developed a hybrid (DNA mutation and RNA expression) signature and assessed its predictive properties for the mutation status and survival of CRC patients. Methods: Publicly available microarray and RNASeq data from 54 matched formalin-fixed paraffin-embedded (FFPE) samples from the Affymetrix GeneChip and RNASeq platforms were used to obtain differentially expressed genes between mutant and wild-type samples. We applied the support-vector machine, artificial neural network, random forest, k-nearest neighbor, naïve Bayes, negative binomial linear discriminant analysis, and Poisson linear discriminant analysis algorithms for classification. A Cox proportional hazards model was used for survival analysis. Results: Compared to the genelist from each of the individual platforms, the hybrid genelist had the highest accuracy, sensitivity, specificity, and AUC for mutation status across all the classifiers, and was prognostic for survival in patients with CRC. The NBLDA method was the best performer on the RNASeq data, while the SVM method was the most suitable classifier for CRC across the two data types. Nine genes were found to be predictive of survival. Conclusion: This signature could be useful in clinical practice, especially for colorectal cancer diagnosis and therapy. Future studies should determine the effectiveness of integration in cancer survival analysis and its application to unbalanced data, where the classes are of different sizes, as well as to data with multiple classes.

