When does gap filling of trait data confound taxonomic and functional analyses? 

Author(s):  
Julia Joswig ◽  
Jens Kattge ◽  
Guido Kraemer ◽  
Miguel Mahecha ◽  
Nadja Rüger ◽  
...  

<p>Data on plant traits are increasingly used to understand relationships between biodiversity and ecosystem processes. Large trait databases are sparse because they are compiled from many smaller and usually more local databases. This sparsity severely limits the potential for both multivariate and global data analyses, and so "gap-filling" (imputation) approaches are commonly used to predict missing trait data prior to analysis. Data imputation can result in large biases and circularity; yet, no best practice has evolved for the appropriate use of gap-filled data. Here, we use the TRY database, the largest global database of plant traits, in combination with the commonly used gap-filling algorithm, Bayesian Hierarchical Probabilistic Matrix Factorization (BHPMF), to address opportunities and problems introduced by gap-filling. BHPMF is the gap-filling method of choice for both TRY and the large, widely used sPLOT database. It predicts missing trait data using the taxonomic hierarchy and observed patterns of trait variance and trait-trait correlations. We use three metrics: root mean square error (RMSE) estimates, the coefficient of variation to assess univariate deviation, and silhouette indices to assess multivariate deviation and clustering strength. We show that gap-filling results in deviation of these metrics calculated for groupings at lower taxonomic levels (intra-specific and intra-genera), but less so at higher taxonomic levels (family) and for functional groups. Trait-trait correlations are preserved at all levels. The strength of deviations depends both on the percentage of gaps and on data characteristics, e.g. intra-taxa variability. Gap-filling with dataset-external trait data generally ameliorates prediction error, but the deviations of intra-taxonomic variation measures depend on the content of the added data. 
We conclude that BHPMF gap-filling introduces little bias if specifically used for analyses of traits within functional groups, including growth forms and plant functional types (PFTs), as well as trait-trait correlations. However, we generally discourage the use of gap-filled data for analyses of taxonomic groupings at or below the family level. In summary, our study supports decisions on when and how to integrate BHPMF gap-filled trait data in future studies. We conclude with selected best practices when using sparse databases.</p>
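
The two univariate metrics named in the abstract above can be illustrated with a minimal sketch. It is not BHPMF and not the authors' evaluation pipeline: the trait values are invented toy numbers, the function names are hypothetical, and the coefficient of variation is assumed here to be the population standard deviation divided by the mean.

```python
import math
from statistics import mean, pstdev


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (an assumed definition)."""
    return pstdev(values) / mean(values)


# Hypothetical trait values for four individuals of one species.
observed = [10.0, 12.0, 14.0, 16.0]
# The same individuals after two entries were masked and imputed (toy numbers);
# the imputed values sit closer to the species mean than the originals did.
gap_filled = [10.0, 13.0, 13.0, 16.0]

error = rmse(observed, gap_filled)
cv_observed = coefficient_of_variation(observed)
cv_filled = coefficient_of_variation(gap_filled)

print(f"RMSE: {error:.3f}")
print(f"CV observed: {cv_observed:.3f}, CV gap-filled: {cv_filled:.3f}")
```

In this toy case the imputed values are pulled toward the species mean, so the intra-specific coefficient of variation shrinks, which is the kind of deviation at low taxonomic levels that the abstract describes.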

Author(s):  
STEVE D. JONES ◽  
CORINNE LE QUÉRÉ ◽  
CHRISTIAN RÖDENBECK ◽  
ANDREW C. MANNING ◽  
ARE OLSEN

2017 ◽  
Author(s):  
Minseok Kang ◽  
Joon Kim ◽  
Bindu Malla Thakuri ◽  
Junghwa Chun ◽  
Chunho Cho

Abstract. The continuous measurement of H2O and CO2 fluxes using the eddy covariance (EC) technique is still challenging for forests in complex terrain for two reasons: large amounts of wet canopy evaporation (EWC), which occur during and following rain events when EC systems rarely work correctly, and the horizontal advection of CO2 generated at night. We propose new techniques for gap-filling and partitioning of the H2O and CO2 fluxes: (1) a model-stats hybrid method (MSH) and (2) a modified moving point test method (MPTm). The former enables the recovery of the EWC that is missed by the traditional gap-filling method and the partitioning of evapotranspiration (ET) into transpiration and (wet canopy) evaporation. The latter determines the friction velocity (u*) threshold with an iterative approach that uses moving windows for both time and u*, thereby allowing not only the nighttime CO2 flux correction and partitioning but also an assessment of the significance of CO2 drainage. We tested and validated these new methods using datasets from two flux towers located in forests in hilly and complex terrain. The MSH reasonably recovered the missing EWC of 16 to 41 mm year−1 and separated it from the ET (14 to 23 % of the annual ET). The MPTm produced carbon budgets consistent with those from previous research and diameter increment, while offering improved applicability. Additionally, we illustrate certain advantages of the proposed techniques, which enable us to better understand how ET responds to environmental changes and how the water cycle is connected to the carbon cycle in a forest ecosystem.
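
The abstract does not give the details of the MPTm, so the following is only a sketch of the general idea behind a u* threshold: find the friction velocity above which nighttime CO2 flux stops increasing with turbulence. It is a simplified plateau search over u* classes, not the MPTm itself, with no moving windows in time; the function name, the class count, the 0.95 fraction, and the data are all assumptions made for illustration.

```python
from statistics import mean


def ustar_threshold(ustar, night_flux, n_classes=5, fraction=0.95):
    """Return the mean u* of the lowest class whose mean nighttime flux reaches
    `fraction` of the mean flux in all higher u* classes, or None if no class does.

    Observations left over when len(ustar) is not divisible by n_classes
    (the highest u* values) are ignored in this simplified version.
    """
    pairs = sorted(zip(ustar, night_flux))  # sort observations by u*
    size = len(pairs) // n_classes
    classes = [pairs[k * size:(k + 1) * size] for k in range(n_classes)]
    for k in range(n_classes - 1):
        class_flux = mean(f for _, f in classes[k])
        higher_flux = mean(f for c in classes[k + 1:] for _, f in c)
        if class_flux >= fraction * higher_flux:
            return mean(u for u, _ in classes[k])
    return None


# Toy nighttime data: flux rises with u* and then levels off near 4.0.
ustar = [0.05, 0.08, 0.12, 0.15, 0.22, 0.25, 0.32, 0.35, 0.45, 0.50]
flux = [1.0, 1.2, 2.0, 2.2, 3.9, 4.1, 4.0, 4.0, 4.1, 3.9]

threshold = ustar_threshold(ustar, flux)
print(f"Estimated u* threshold: {threshold:.3f} m/s")
```

Nighttime fluxes measured below the returned threshold would typically be discarded and gap-filled; how the MPTm iterates this over time windows is not specified in the abstract and is left out here.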


2021 ◽  
Vol 12 ◽  
Author(s):  
Yihua Xiao ◽  
Shirong Liu ◽  
Manyun Zhang ◽  
Fuchun Tong ◽  
Zhihong Xu ◽  
...  

Urbanization alters atmospheric, soil, and hydrological factors and substantially affects a range of morphological and physiological plant traits. Correspondingly, plants might adopt different strategies to adapt to the promotion or pressure that urbanization brings. Understanding how plant traits respond to urbanization will reveal the capacity of plants to adapt and help optimize the choice of plant species for urban greening. In this study, four functional groups (herbs, shrubs, subcanopies, and canopies; eight plant species in total) located in urban, suburban, and rural areas were selected, and eight replicate plants were sampled for each species at each site. Their physiological and photosynthetic properties and heavy metal concentrations were quantified to reveal plant adaptive strategies to urbanization. The herb and shrub species had significantly higher starch and soluble sugar contents in urban than in suburban areas. Urbanization decreased the maximum photosynthetic rates and total chlorophyll contents of the canopy species (Engelhardtia roxburghiana and Schima superba). The herb (Lophatherum gracile and Alpinia chinensis) and shrub (Ardisia quinquegona and Psychotria rubra) species in urban areas had significantly lower nitrogen (N) allocated to the cell wall and lower leaf δ15N values but higher heavy metal concentrations than those in suburban areas. The canopy and subcanopy (Diospyros morrisiana and Cratoxylum cochinchinense) species adapt to urbanization by reducing resource acquisition but improving defense capacity, while the herb and shrub species adapt by improving resource acquisition. Our results indicate that functional group affects how plant adaptive strategies respond to urbanization.


2019 ◽  
Vol 145 (3) ◽  
pp. EL236-EL242
Author(s):  
Bae-Hyung Kim ◽  
Viksit Kumar ◽  
Azra Alizad ◽  
Mostafa Fatemi

2019 ◽  
Author(s):  
Luke Gregor ◽  
Alice D. Lebehot ◽  
Schalk Kok ◽  
Pedro M. Scheel Monteiro

Abstract. Over the last decade, advanced statistical inference and machine learning have been used to fill the gaps in sparse surface ocean CO2 measurements (Rödenbeck et al. 2015). The estimates from these methods have been used to constrain seasonal, interannual and decadal variability in sea-air CO2 fluxes and the drivers of these changes (Landschützer et al. 2015, 2016, Gregor et al. 2018). However, it is also becoming clear that these methods are converging towards a common bias and RMSE boundary: the wall, which suggests that pCO2 estimates are now limited by both data gaps and scale-sensitive observations. Here, we analyse this problem by introducing a new gap-filling method, an ensemble of six machine learning models (CSIR-ML6 version 2019a), where each model is constructed with a two-step clustering-regression approach. The ensemble is then statistically compared to well-established methods. The ensemble, CSIR-ML6, has an RMSE of 17.16 µatm and a bias of 0.89 µatm when compared to a test dataset kept separate from training procedures. However, when validating our estimates with independent datasets, we find that our method improves only incrementally on other gap-filling methods. We investigate the differences between the methods to understand the extent of the limitations of gap-filling estimates of pCO2. We show that disagreement between methods in the South Atlantic, southeastern Pacific and parts of the Southern Ocean is too large to interpret the interannual variability with confidence. We conclude that improvements in surface ocean pCO2 estimates will likely be incremental with the optimisation of gap-filling methods by (1) the inclusion of additional clustering and regression variables (e.g. eddy kinetic energy) and (2) an increase in the sampling resolution. Larger improvements will only be realised with an increase in CO2 observational coverage, particularly in today's poorly sampled areas.
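
How an ensemble estimate is scored against a held-out test dataset can be sketched as follows. This is not the CSIR-ML6 code: combining members by a simple per-sample average is an assumption (the abstract does not say how the six models are combined), the function names are hypothetical, and the pCO2 values are toy numbers in µatm.

```python
import math
from statistics import mean


def ensemble_mean(member_predictions):
    """Average the predictions of all ensemble members, sample by sample."""
    return [mean(sample) for sample in zip(*member_predictions)]


def bias(predicted, observed):
    """Mean signed difference (predicted minus observed)."""
    return mean(p - o for p, o in zip(predicted, observed))


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    return math.sqrt(mean((p - o) ** 2 for p, o in zip(predicted, observed)))


# Toy predictions from three hypothetical ensemble members (µatm).
members = [
    [360.0, 372.0, 381.0],
    [364.0, 368.0, 385.0],
    [362.0, 370.0, 386.0],
]
# Toy observations that were kept out of training.
held_out = [361.0, 371.0, 383.0]

estimate = ensemble_mean(members)
print("Ensemble estimate:", estimate)
print(f"Bias: {bias(estimate, held_out):.2f} µatm")
print(f"RMSE: {rmse(estimate, held_out):.2f} µatm")
```

Bias and RMSE answer different questions: the bias shows whether the estimate is systematically high or low, while the RMSE also penalises scatter, which is why the abstract reports both.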


2001 ◽  
Vol 324 (4) ◽  
pp. 1159-1168 ◽  
Author(s):  
D. Fierry Fraillon ◽  
T. Appourchaux

2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Xiaosong Zhao ◽  
Yao Huang

Missing data are an inevitable problem when measuring CO2, water, and energy fluxes between the biosphere and atmosphere with eddy covariance systems. To find the optimum gap-filling method for short vegetation, we review three methods, mean diurnal variation (MDV), look-up tables (LUT), and nonlinear regression (NLR), for estimating missing values of net ecosystem CO2 exchange (NEE) in eddy covariance time series and evaluate their performance for different artificial gap scenarios based on benchmark datasets from marsh and cropland sites in China. The cumulative errors of the three methods showed no consistent bias trends and ranged between −30 and +30 mgCO2 m−2 from May to October at the three sites. To minimize the cumulative bias, combined gap-filling methods were selected for short vegetation: the NLR or LUT method was applied between the period of rapid plant growth in spring and the end of the growing period, and the MDV method was used for the other stages. The sum relative error (SRE) of the optimum method ranged between −2 and +4% for the four gap levels at the three sites, except for 55% gaps at the soybean site, and the optimum method also clearly reduced the standard deviation of the error.
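
Of the three methods reviewed above, mean diurnal variation is the simplest to illustrate: a missing value is replaced by the mean of valid values measured at the same time of day on adjacent days. The sketch below assumes a regular time series with a fixed number of steps per day and a symmetric window of neighbouring days; the function name, the window length, and the NEE values are illustrative choices, not taken from the paper.

```python
from statistics import mean


def fill_gaps_mdv(flux, steps_per_day, window_days=1):
    """Fill None entries with the mean of valid values observed at the same
    time of day within +/- window_days of the gap.

    Only original observations are used as neighbours, never values that
    were themselves gap-filled. Gaps without any valid neighbour stay None.
    """
    filled = list(flux)
    for i, value in enumerate(flux):
        if value is not None:
            continue
        neighbours = []
        for d in range(-window_days, window_days + 1):
            j = i + d * steps_per_day  # same time of day, d days away
            if d != 0 and 0 <= j < len(flux) and flux[j] is not None:
                neighbours.append(flux[j])
        if neighbours:
            filled[i] = mean(neighbours)
    return filled


# Toy NEE series: three days with four time steps per day, two gaps.
nee = [
    1.0, -2.0, -4.0, 0.5,
    1.2, None, -5.0, 0.7,
    0.8, -3.0, None, 0.9,
]

filled = fill_gaps_mdv(nee, steps_per_day=4)
print(filled)
```

The first gap is filled from the days before and after it, while the second gap, on the last day, can only draw on the preceding day. Because MDV ignores meteorological drivers, it is plausible that the abstract reserves it for periods outside the main growth phase, where NLR or LUT is preferred.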


Gene ◽  
1991 ◽  
Vol 109 (1) ◽  
pp. 81-87 ◽  
Author(s):  
Takeshi Iwasaki ◽  
Katsuhiko Shirahige ◽  
Hiroshi Yoshikawa ◽  
Naotake Ogasawara
