Sparse principal component regression for generalized linear models

2018
Vol 124
pp. 180-196
Author(s):
Shuichi Kawano
Hironori Fujisawa
Toyoyuki Takada
Toshihiko Shiroishi
Author(s):  
Hervé Cardot
Pascal Sarda

This article presents a selected bibliography on functional linear regression (FLR) and highlights the key contributions from both applied and theoretical points of view. It first defines FLR in the case of a scalar response and shows how the model can be extended to the case of a functional response. It then considers two kinds of estimation procedures for the slope parameter: projection-based estimators, in which regularization is performed through dimension reduction (for example, functional principal component regression), and penalized least squares estimators, which are defined as solutions of a penalized least squares minimization problem. The article proceeds by discussing the main asymptotic properties, separating results on mean square prediction error from results on L2 estimation error. It also describes some related models, including generalized functional linear models and FLR on quantiles, and concludes with a complementary bibliography and some open problems.
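
For orientation, the scalar-response model and the functional principal component regression estimator mentioned in this abstract are commonly written as below. The notation is assumed here for illustration and is not taken from the article: the domain [0, 1], the empirical eigenpairs (λ̂_j, v̂_j) of the covariance operator of X, and the truncation level K.

```latex
% Scalar-response functional linear model (notation assumed for illustration)
Y = \alpha + \int_0^1 \beta(t)\, X(t)\, dt + \varepsilon

% Functional principal component regression estimator of the slope,
% truncated at K components of the empirical covariance operator
\hat{\beta}_K(t) = \sum_{j=1}^{K} \frac{\hat{g}_j}{\hat{\lambda}_j}\, \hat{v}_j(t),
\qquad
\hat{g}_j = \frac{1}{n} \sum_{i=1}^{n} \langle X_i - \bar{X}, \hat{v}_j \rangle\, (Y_i - \bar{Y})
```

Choosing K plays the role of the regularization parameter: a small K gives a smoother but more biased estimate of the slope function.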


PeerJ
2016
Vol 4
pp. e1978
Author(s):
Gabriela Fontanarrosa
Virginia Abdala

Grasping is one of a few adaptive mechanisms that, in conjunction with clinging, hooking, arm swinging, adhering, and flying, allowed for incursion into the arboreal eco-space. Little research has been done that addresses grasping as an enhanced manual ability in non-mammalian tetrapods, with the exception of studies comparing the anatomy of muscle and tendon structure. Previous studies showed that grasping abilities allow exploitation of narrow-branch habitats and that this adaptation has clear osteological consequences. The objective of this work is to ascertain the existence of morphometric descriptors in the hand skeleton of lizards related to grasping functionality. A morphological matrix was constructed using 51 morphometric variables in 278 specimens, from 24 genera and 13 families of Squamata. To reduce the dimensions of the dataset and to organize the original variables into a simpler system, three PCAs (Principal Component Analyses) were performed using the subsets of (1) carpal variables, (2) metacarpal variables, and (3) phalanges variables. The variables that demonstrated the most significant contributions to the construction of the PCA synthetic variables were then used in subsequent analyses. To explore which morphological variables better explain the variations in the functional setting, we ran Generalized Linear Models for the three different sets. This method allows us to model the morphology that enables a particular functional trait. Grasping was considered the only response variable, taking the value of 0 or 1, while the original variables retained by the PCAs were considered predictor variables. Our analyses yielded six variables associated with grasping abilities: two belong to the carpal bones, two belong to the metacarpals and two belong to the phalanges. Grasping in lizards can be performed with hands exhibiting at least two different independently originated combinations of bones. The first is a combination of a highly elongated centrale bone, reduced palmar sesamoid, divergence angles above 90°, and slender metacarpal V and phalanges, as exhibited by Anolis sp. and Tropidurus sp. The second includes an elongated centrale bone, lack of a palmar sesamoid, divergence angles above 90°, and narrow metacarpal V and phalanges, as exhibited by geckos. Our data suggest that the morphological distinction between graspers and non-graspers demonstrates the existence of ranges along the morphological continuum within which a new ability is generated. Our results support the hypothesis of the nested origin of grasping abilities within arboreality. Thus, the manifestation of grasping abilities as a response to locomotive selective pressure in the context of narrow-branch eco-spaces could also enable other grasping-dependent biological roles, such as prey handling.
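
The modelling step in this abstract, a generalized linear model with a 0/1 response, can be sketched as follows. This is a minimal illustration, not the authors' analysis: the abstract does not state the link function, so the logit link is an assumption, and the function name `fit_logistic_glm` and the toy data are hypothetical.

```python
import numpy as np


def fit_logistic_glm(X, y, n_iter=25, tol=1e-10):
    """Fit a binomial GLM with a logit link (an assumed link) by Newton's method.

    X: (n, p) predictors, e.g. the variables retained after a PCA.
    y: (n,) response coded 0 (non-grasper) or 1 (grasper).
    Returns coefficients with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([np.ones(X.shape[0]), X])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(design @ beta)))  # fitted probabilities
        w = mu * (1.0 - mu)                          # binomial variance weights
        # Newton step: solve (X' W X) delta = X' (y - mu)
        delta = np.linalg.solve(design.T @ (design * w[:, None]),
                                design.T @ (y - mu))
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:
            break
    return beta


# Toy data: one hypothetical binary descriptor, not from the study.
x_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_toy = np.array([0, 0, 0, 1, 0, 1, 1, 1])
print(fit_logistic_glm(x_toy, y_toy))
```

Newton's method fails to converge when the classes are perfectly separable, so real data of this kind usually needs a check for separation before fitting.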


Author(s):  
Constantin Ahlmann-Eltze
Wolfgang Huber

Motivation: The Gamma-Poisson distribution is a theoretically and empirically motivated model for the sampling variability of single cell RNA-sequencing counts (Grün et al., 2014; Svensson, 2020; Silverman et al., 2018; Hafemeister and Satija, 2019) and an essential building block for analysis approaches including differential expression analysis (Robinson et al., 2010; McCarthy et al., 2012; Anders and Huber, 2010; Love et al., 2014), principal component analysis (Townes et al., 2019) and factor analysis (Risso et al., 2018). Existing implementations for inferring its parameters from data often struggle with the size of single cell datasets, which can comprise millions of cells; at the same time, they do not take full advantage of the fact that zero and other small numbers are frequent in the data. These limitations have hampered uptake of the model, leaving room for statistically inferior approaches such as logarithm(-like) transformation. Results: We present a new R package for fitting the Gamma-Poisson distribution to data with the characteristics of modern single cell datasets more quickly and more accurately than existing methods. The software can work with data on disk without having to load them into RAM simultaneously. Availability: The package glmGamPoi is available from Bioconductor for Windows, macOS, and Linux, and source code is available on github.com/const-ae/glmGamPoi under a GPL-3 license.
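
To make the model concrete, the sketch below evaluates the Gamma-Poisson (negative binomial) log-probability and computes a crude method-of-moments overdispersion estimate. It is not the glmGamPoi algorithm, which the abstract does not describe. The parameterization by mean `mu` and overdispersion `theta`, with variance `mu + theta * mu**2`, and both function names are assumptions made for illustration.

```python
import math


def gamma_poisson_logpmf(y, mu, theta):
    """Log-probability of count y for mean mu > 0 and overdispersion theta >= 0.

    Assumed parameterization: variance = mu + theta * mu**2.
    theta == 0 is treated as the Poisson limit.
    """
    if theta <= 0:
        return y * math.log(mu) - mu - math.lgamma(y + 1)
    r = 1.0 / theta  # shape ("size") of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def moment_overdispersion(counts):
    """Method-of-moments estimate of theta from the sample mean and variance.

    A simple stand-in for maximum likelihood, clipped at zero.
    """
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        return 0.0
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return max((var - mean) / mean ** 2, 0.0)


counts = [0, 0, 1, 0, 3, 0, 7, 0, 0, 2]  # made-up counts with many zeros
theta_hat = moment_overdispersion(counts)
mu_hat = sum(counts) / len(counts)
print(theta_hat, sum(gamma_poisson_logpmf(c, mu_hat, theta_hat) for c in counts))
```

Because zeros dominate such data, the log-probability of a zero count depends only on `mu` and `theta` and could be computed once per gene and reused, which is the kind of structure the abstract alludes to.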


Author(s):  
Anderson G. Costa
Eudócio R. O. da Silva
Murilo M. de Barros
Jonatthan A. Fagundes

ABSTRACT The quality and price of coffee drinks can be affected by contamination with impurities during roasting and grinding. Methods that enable quality control of marketed products are important to meet the standards required by consumers and the industry. The purpose of this study was to estimate the percentage of impurities contained in coffee using textural and colorimetric descriptors obtained from digital images. Arabica coffee beans (Coffea arabica L.) at 100% purity were subjected to roasting and grinding processes, and the initially pure ground coffee was gradually contaminated with impurities. Digital images were collected from coffee samples with 0, 10, 30, 50, and 70% impurities. From the images, textural descriptors of the histograms (mean, standard deviation, entropy, uniformity, and third moment) and colorimetric descriptors (RGB color space and HSI color space) were obtained. The principal component regression (PCR) method was applied to the data group of textural and colorimetric descriptors for the development of linear models to estimate coffee impurities. The selected models for the textural descriptors data group and the colorimetric descriptors data group were composed of two and three principal components, respectively. The model from the colorimetric descriptors showed a greater capacity to estimate the percentage of impurities in coffee when compared to the model from the textural descriptors.
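
The principal component regression step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the predictors are centered but not scaled (studies often standardize descriptors first), the function names `fit_pcr` and `predict_pcr` are hypothetical, and the data are synthetic.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Principal component regression; returns (intercept, coefficients).

    The coefficients are expressed on the original descriptors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T          # (p, k) loadings
    scores = Xc @ V                  # (n, k) principal component scores
    # Ordinary least squares of the centered response on the scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                 # map back to the original descriptors
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def predict_pcr(intercept, beta, X):
    """Predict the response for new rows of descriptors."""
    return intercept + np.asarray(X, dtype=float) @ beta


# Synthetic descriptors and response, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
y_demo = 1.0 + X_demo @ np.array([0.5, -1.0, 0.0, 2.0, 0.3]) + rng.normal(scale=0.1, size=30)
b0, b = fit_pcr(X_demo, y_demo, n_components=3)
print(predict_pcr(b0, b, X_demo[:2]))
```

With as many components as descriptors the fit coincides with ordinary least squares; using fewer components, such as the two or three selected in the study, trades some bias for stability when descriptors are correlated.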


Author(s):  
Constantin Ahlmann-Eltze
Wolfgang Huber

Motivation: The Gamma-Poisson distribution is a theoretically and empirically motivated model for the sampling variability of single cell RNA-sequencing counts (Grün et al., 2014; Townes et al., 2019; Svensson, 2020; Silverman et al., 2018; Hafemeister and Satija, 2019) and an essential building block for analysis approaches including differential expression analysis (Robinson et al., 2010; McCarthy et al., 2012; Anders and Huber, 2010; Love et al., 2014), principal component analysis (Townes et al., 2019) and factor analysis (Risso et al., 2018). Existing implementations for inferring its parameters from data often struggle with the size of single cell datasets, which typically comprise thousands or millions of cells; at the same time, they do not take full advantage of the fact that zero and other small numbers are frequent in the data. These limitations have hampered uptake of the model, leaving room for statistically inferior approaches such as logarithm(-like) transformation. Results: We present a new R package for fitting the Gamma-Poisson distribution to data with the characteristics of modern single cell datasets more quickly and more accurately than existing methods. The software can work with data on disk without having to load them into RAM simultaneously. Availability: The package glmGamPoi is available from Bioconductor (since release 3.11) for Windows, macOS, and Linux, and source code is available on GitHub under a GPL-3 license. The scripts to reproduce the results of this paper are also available on GitHub.


2019
Vol 8 (1)
Author(s):
Khairunnisa Khairunnisa
Rizka Pitri
Victor P Butar-Butar
Agus M Soleh

This research used CFSRv2 data as the output of a general circulation model. CFSRv2 contains highly correlated variables, so this research used principal component regression (PCR) and partial least squares (PLS) to address the multicollinearity in the CFSRv2 data. The research aims to determine which of PCR and PLS is the better model for estimating rainfall at Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station by comparing RMSEP and correlation values. The domain sizes used were 3×3, 4×4, 5×5, 6×6, 7×7, 8×8, 9×9, and 11×11, located between (-4°) N - (-9°) S and 105° E - 110° E, with a grid size of 0.5×0.5. In this research the PLS model performed better than the PCR model for statistical downscaling, because it obtained lower RMSEP values and higher correlation values. The best domain and RMSEP value for Bandung geophysical station, Bogor climatology station, Citeko meteorological station, and Jatiwangi meteorological station are 9 × 9 with 100.06, 6 × 6 with 194.3, 8 × 8 with 117.6, and 6 × 6 with 108.2, respectively.
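
The two comparison criteria named above can be computed as in the sketch below. The abstract does not give its formulas, so dividing the squared error by the number of observations (some definitions use degrees of freedom instead) and using the Pearson correlation are assumptions; the function names and rainfall values are hypothetical.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a hold-out set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_correlation(x, y):
    """Pearson correlation between observed and predicted values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up monthly rainfall values, for illustration only.
observed = [120.0, 310.0, 95.0, 240.0]
predicted = [150.0, 280.0, 110.0, 200.0]
print(rmsep(observed, predicted), pearson_correlation(observed, predicted))
```

A model is preferred when its RMSEP is lower and its correlation is higher on the same hold-out data, which is the comparison the abstract reports for PLS against PCR.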


2020
Vol 02
Author(s):
RM Garcia
WF Vieira-Junior
JD Theobaldo
NIP Pini
GM Ambrosano
...  

Objective: To evaluate color and roughness of bovine enamel exposed to dentifrices, dental bleaching with 35% hydrogen peroxide (HP), and erosion/staining by red wine. Methods: Bovine enamel blocks were exposed to: artificial saliva (control), Oral-B Pro-Health (stannous fluoride with sodium fluoride, SF), Sensodyne Repair & Protect (bioactive glass, BG), Colgate Pro-Relief (arginine and calcium carbonate, AR), or Chitodent (chitosan, CHI). After toothpaste exposure, half (n=12) of the samples were bleached (35% HP), and the other half were not (n=12). Color (CIE L*a*b*, ΔE) and surface roughness (Ra) were evaluated, and scanning electron microscopy was performed. Color and roughness were assessed at baseline, post-dentifrice and/or -dental bleaching, and after red wine. The ΔE data were subjected to analysis of variance (ANOVA) and the Ra data to repeated-measures ANOVA, followed by Tukey's test. The L*, a*, and b* values were analyzed by generalized linear models (α=0.05). Results: The HP promoted an increase in Ra values; however, SF, BG, and AR prevented this alteration. After red wine, all groups apart from SF (unbleached) showed increases in Ra values; SF and AR promoted decreases in L* values; AR demonstrated higher ΔE values, differing from the control; and CHI decreased the L* variation in the unbleached group. Conclusion: Dentifrices did not interfere with bleaching efficacy of 35% HP. However, dentifrices acted as a preventive agent against surface alteration from dental bleaching (BG, SF, and AR) or red wine (SF). Dentifrices can decrease (CHI) or increase (AR and SF) staining by red wine.


2020
Vol 9 (16)
pp. 1105-1115
Author(s):
Shuqing Wu
Xin Cui
Shaoyu Zhang
Wenqi Tian
Jiazhen Liu
...  

Aim: This real-world data study investigated the economic burden and associated factors of readmissions for cerebrospinal fluid leakage (CSFL) post-cranial, transsphenoidal, or spinal index surgeries. Methods: Costs of CSFL readmissions and index hospitalizations during 2014–2018 were collected. Readmission cost was measured as absolute cost and as percentage of index hospitalization cost. Factors associated with readmission cost were explored using generalized linear models. Results: Readmission cost averaged US$2407–6106, 35–94% of index hospitalization cost. Pharmacy costs were the leading contributor. Generalized linear models showed transsphenoidal index surgery and surgical treatment for CSFL were associated with higher readmission costs. Conclusion: CSFL readmissions are a significant economic burden in China. Factors associated with higher readmission cost should be monitored.

