independent observations
Recently Published Documents


TOTAL DOCUMENTS

368
(FIVE YEARS 92)

H-INDEX

38
(FIVE YEARS 5)

2021 ◽  
Vol 9 (Suppl 1) ◽  
pp. e001290
Author(s):  
Jenine K Harris

Family medicine has traditionally prioritised patient care over research. However, recent recommendations to strengthen family medicine include calls to focus more on research, including improving the research methods used in the field. Binary logistic regression is one method frequently used in family medicine research to classify, explain or predict the values of some characteristic, behaviour or outcome. The binary logistic regression model relies on assumptions including independent observations, no perfect multicollinearity and linearity. The model produces odds ratios (ORs), which suggest increased, decreased or no change in odds of being in one category of the outcome with an increase in the value of the predictor. Model significance quantifies whether the model is better than the baseline value (ie, the percentage of people with the outcome) at explaining or predicting whether the observed cases in the data set have the outcome. One model fit measure is the count-R², which is the percentage of observations where the model correctly predicted the outcome variable value. Related to the count-R² are model sensitivity (the percentage of those with the outcome who were correctly predicted to have the outcome) and specificity (the percentage of those without the outcome who were correctly predicted to not have the outcome). Complete model reporting for binary logistic regression includes descriptive statistics, a statement on whether assumptions were checked and met, ORs and confidence intervals (CIs) for each predictor, overall model significance and overall model fit.
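The fit measures described in this abstract can be illustrated with a minimal sketch. It assumes outcomes and predictions are already coded as 0/1; the function names `classification_summary` and `odds_ratio` are hypothetical and are not taken from the article.

```python
import math


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)


def classification_summary(y_true, y_pred):
    """Return count-R2, sensitivity and specificity for 0/1 outcomes.

    Hypothetical helper for illustration; y_pred holds predicted classes,
    not predicted probabilities.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        # share of all observations predicted correctly
        "count_r2": (tp + tn) / len(pairs),
        # share of those with the outcome predicted to have it
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # share of those without the outcome predicted not to have it
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


summary = classification_summary([1, 1, 1, 0, 0, 0, 0, 1],
                                 [1, 0, 1, 0, 0, 1, 0, 1])
```

An odds ratio above 1 suggests increased odds of the outcome as the predictor increases, below 1 suggests decreased odds, and exactly 1 suggests no change.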


Author(s):  
Julian Koch ◽  
Mehmet Cüneyd Demirel ◽  
Simon Stisen

Spatial pattern-oriented evaluations of distributed hydrological models have contributed towards an improved realism of hydrological simulations. This advancement was supported by the broad range of readily available satellite-based datasets of key hydrological variables, such as evapotranspiration (ET). At larger scales, spatial patterns of ET are often characterized by an underlying climate gradient, and with this study, we argue that gradient-dominated patterns may hamper the potential of spatial pattern-oriented evaluation frameworks. We hypothesize that the climate control of spatial patterns of ET overshadows the effect model parameters have on the simulated variability. To address this limitation, we propose a climate normalization strategy. This is demonstrated for the Senegal River basin as a modeling case study, where the dominant north-south precipitation gradient is the main driver of the observed hydrological variability. Two multi-objective calibration experiments investigate the effect of climate normalization. Both calibrations utilize observed discharge (Q) in combination with remote sensing ET data, where one is based on the original ET pattern and the other utilizes the normalized ET pattern. We identify parameter sets that balance the tradeoffs between the two independent observations and find that the calibration using the normalized ET pattern does not compromise the spatial pattern performance of the original pattern. The reverse, however, is not necessarily the case, since the calibration using the original ET pattern showed a poorer performance for the normalized pattern. Both calibrations reached comparable performance of Q. With this study, we identified a general shortcoming of spatial pattern-oriented model evaluations using ET in basins dominated by a climate gradient, but we argue that this also applies to other variables such as soil moisture or land surface temperature.


2021 ◽  
Vol 921 (2) ◽  
pp. 176
Author(s):  
Dana S. Balser ◽  
Trey V. Wenger ◽  
L. D. Anderson ◽  
W. P. Armentrout ◽  
T. M. Bania ◽  
...  

We investigate the kinematic properties of Galactic H ii regions using radio recombination line (RRL) emission detected by the Australia Telescope Compact Array at 4–10 GHz and the Jansky Very Large Array at 8–10 GHz. Our H ii region sample consists of 425 independent observations of 374 nebulae that are relatively well isolated from other, potentially confusing sources and have a single RRL component with a high signal-to-noise ratio. We perform Gaussian fits to the RRL emission in position-position-velocity data cubes and discover velocity gradients in 178 (42%) of the nebulae with magnitudes between 5 and 200 m s⁻¹ arcsec⁻¹. About 15% of the sources also have an RRL width spatial distribution that peaks toward the center of the nebula. The velocity gradient position angles appear to be random on the sky with no favored orientation with respect to the Galactic plane. We craft H ii region simulations that include bipolar outflows or solid body rotational motions to explain the observed velocity gradients. The simulations favor solid body rotation since, unlike the bipolar outflow kinematic models, they are able to produce both the large (>40 m s⁻¹ arcsec⁻¹) velocity gradients and the RRL width structure that we observe in some sources. The bipolar outflow model, however, cannot be ruled out as a possible explanation for the observed velocity gradients for many sources in our sample. We nevertheless suggest that most H ii region complexes are rotating and may have inherited angular momentum from their parent molecular clouds.


2021 ◽  
Author(s):  
Thiago Peixoto Leal ◽  
Vinicius C Furlan ◽  
Mateus Henrique Gouveia ◽  
Julia Maria Saraiva Duarte ◽  
Pablo AS Fonseca ◽  
...  

Genetic and omics analyses frequently require independent observations, which are not guaranteed in real datasets. When relatedness cannot be accounted for, solutions involve removing related individuals (or observations) and, consequently, reducing the available data. We developed NAToRA, a network-based relatedness-pruning method that minimizes dataset reduction while removing unwanted relationships in a dataset. It uses the node degree centrality metric to identify highly connected nodes (or individuals) and implements heuristics that approximate the minimal reduction of a dataset, which allows its application to large datasets. NAToRA outperformed two popular methodologies (implemented in the software PLINK and KING) by showing the best combination of effective relatedness-pruning: it removed all relatives while keeping the largest possible number of individuals in all datasets tested, with similar or lesser reduction in genetic diversity. NAToRA is freely available, both as a standalone tool that can be easily incorporated as part of a pipeline, and as a graphical web tool that allows visualization of the relatedness networks. NAToRA also accepts a variety of relationship metrics as input, which facilitates its use. We also present the genealogy simulator software used for different tests performed in the manuscript.
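The idea of pruning by node degree can be sketched with a simple greedy heuristic. This is an illustration only and is not NAToRA's actual algorithm, which the abstract does not specify. The function name `prune_related` is hypothetical, and the list of related pairs is assumed to have been thresholded from a relationship metric beforehand.

```python
def prune_related(individuals, related_pairs):
    """Greedy sketch: repeatedly drop the individual with the most relatives.

    Returns (kept, removed). This is a generic heuristic and does not
    guarantee the minimal number of removals.
    """
    # Build an undirected relatedness network as adjacency sets.
    neighbours = {ind: set() for ind in individuals}
    for a, b in related_pairs:
        if a == b:
            continue  # ignore self-relationships
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    removed = []
    while neighbours:
        # Highest-degree node; ties are broken by the smallest identifier.
        candidate = max(sorted(neighbours), key=lambda n: len(neighbours[n]))
        if not neighbours[candidate]:
            break  # no related pairs remain
        for other in neighbours.pop(candidate):
            neighbours[other].discard(candidate)
        removed.append(candidate)
    return sorted(neighbours), removed


# "A" is related to three others, so removing "A" alone resolves every pair.
kept, removed = prune_related(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("A", "C"), ("A", "D")],
)
```

Removing the most connected individual first tends to keep more samples than dropping one member of each related pair at random, which is the intuition behind degree-based pruning.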


Electronics ◽  
2021 ◽  
Vol 10 (20) ◽  
pp. 2469
Author(s):  
Te Zeng ◽  
Francis C. M. Lau

We present a novel reinforcement learning architecture that learns a structured representation for use in symbolic melody harmonization. Probabilistic models are predominant in melody harmonization tasks, most of which only treat melody notes as independent observations and do not take note of substructures in the melodic sequence. To fill this gap, we add substructure discovery as a crucial step in automatic chord generation. The proposed method consists of a structured representation module that generates hierarchical structures for the symbolic melodies, a policy module that learns to break a melody into segments (whose boundaries concur with chord changes) and phrases (the subunits in segments), and a harmonization module that generates chord sequences for each segment. We formulate the structure discovery process as a sequential decision problem with a policy gradient RL method selecting the boundary of each segment or phrase to obtain an optimized structure. We conduct experiments on our preprocessed HookTheory Lead Sheet Dataset, which has 17,979 melody/chord pairs. The results demonstrate that our proposed method can learn task-specific representations and, thus, yield competitive results compared with state-of-the-art baselines.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Magda Guglielmo ◽  
Fiona H. M. Tang ◽  
Chiara Pasut ◽  
Federico Maggi

We introduce here SOIL-WATERGRIDS, a new dataset of dynamic changes in soil moisture and depth of water table over 45 years from 1970 to 2014 globally resolved at 0.25 × 0.25 degree resolution (about 30 × 30 km at the equator) along a 56 m deep soil profile. SOIL-WATERGRIDS estimates were obtained using the BRTSim model instructed with globally gridded soil physical and hydraulic properties, land cover and use characteristics, and hydrometeorological variables to account for precipitation, ecosystem-specific evapotranspiration, snowmelt, surface runoff, and irrigation. We validate our estimates against independent observations and re-analyses of the soil moisture, water table depth, wetland occurrence, and runoff. SOIL-WATERGRIDS brings into a single product the monthly mean water saturation at three depths in the root zone and the depth of the highest and lowest water tables throughout the reference period, their long-term monthly averages, and data quality. SOIL-WATERGRIDS can therefore be used to analyse trends in water availability for agricultural abstraction, assess the water balance under historical weather patterns, and identify water stress in sensitive managed and unmanaged ecosystems.


2021 ◽  
Author(s):  
Guorong Zhong ◽  
Xuegang Li ◽  
Jinming Song ◽  
Baoxiao Qu ◽  
Fan Wang ◽  
...  

Various machine learning methods have been applied to the global mapping of surface ocean partial pressure of CO2 (pCO2) to reduce the uncertainty in global ocean CO2 sink estimates caused by undersampling of pCO2. In previous studies, the predictors of pCO2 were usually selected empirically based on theoretical drivers of surface ocean pCO2, and the same combination of predictors was applied in all areas except where coverage was lacking. However, differences between the drivers of surface ocean pCO2 in different regions were not considered. In this work, we combined a stepwise regression algorithm and a feed-forward neural network (FFNN) to select predictors of pCO2 based on mean absolute error in each of the 11 biogeochemical provinces defined by the self-organizing map (SOM) method. Based on the predictors selected, a monthly global 1° × 1° surface ocean pCO2 product from January 1992 to August 2019 was constructed. Validation of different combinations of predictors was carried out using the SOCAT dataset version 2020 and independent observations from time series stations. Predictions of pCO2 based on region-specific predictors selected by the stepwise FFNN algorithm were more precise than those based on predictors from previous studies. Applying an FFNN size-improving algorithm in each province decreased the mean absolute error (MAE) of the global estimate to 11.32 μatm and the root mean square error (RMSE) to 17.99 μatm. The script file of the stepwise FFNN algorithm and the pCO2 product are distributed through the Institute of Oceanology of the Chinese Academy of Sciences Marine Science Data Center (IOCAS; http://dx.doi.org/10.12157/iocas.2021.0022, Zhong et al., 2021).


Author(s):  
Alexey Yu. Kharin

The paper considers an important mathematical problem of computer data analysis: statistical sequential testing of simple hypotheses on the parameters of probability distributions of observed binary data. The problem is solved for two models of observation: independent observations and homogeneous Markov chains. Explicit expressions for the sequential test statistics are derived that are transparent to interpret and convenient for computer implementation. An approach is developed to calculate the performance characteristics, namely the error probabilities and the mathematical expectations of the random number of observations required to guarantee the requested accuracy of the decision rules. Asymptotic expansions for these performance characteristics are constructed under "contamination" of the probability distributions of the observed data.
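For the independent-observations case, the classical reference point is Wald's sequential probability ratio test (SPRT). The sketch below is that textbook test applied to Bernoulli data and is not the paper's own statistic, which the abstract does not give. The function name `sprt_bernoulli` and the default error levels are assumptions for illustration.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 against H1: p = p1 on independent 0/1 data.

    Returns (decision, n), where n is the number of observations used.
    Thresholds use Wald's approximations for the target error levels.
    """
    if not (0 < p0 < 1 and 0 < p1 < 1) or p0 == p1:
        raise ValueError("p0 and p1 must be distinct values in (0, 1)")
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    step_one = math.log(p1 / p0)               # increment when x == 1
    step_zero = math.log((1 - p1) / (1 - p0))  # increment when x == 0

    log_likelihood_ratio = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        log_likelihood_ratio += step_one if x else step_zero
        if log_likelihood_ratio >= upper:
            return "accept H1", n
        if log_likelihood_ratio <= lower:
            return "accept H0", n
    return "continue", n  # data exhausted before a boundary was reached


decision, used = sprt_bernoulli([1, 1, 1, 1], p0=0.2, p1=0.8)
```

The number of observations used is random, which is why the expected sample size appears alongside the error probabilities as a performance characteristic.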


Author(s):  
Randal D. Koster ◽  
Anthony M. DeAngelis ◽  
Siegfried D. Schubert ◽  
Andrea M. Molod

Soil moisture (W) helps control evapotranspiration (ET), and ET variations can in turn have a distinct impact on 2-m air temperature (T2M), given that increases in evaporative cooling encourage reduced temperatures. Soil moisture is accordingly linked to T2M, and realistic soil moisture initialization has, in previous studies, been shown to improve the skill of subseasonal T2M forecasts. The relationship between soil moisture and evapotranspiration, however, is distinctly nonlinear, with ET tending to increase with soil moisture in drier conditions and to be insensitive to soil moisture variations in wetter conditions. Here, through an extensive analysis of subseasonal forecasts produced with a state-of-the-art seasonal forecast system, this nonlinearity is shown to imprint itself on T2M forecast error in the conterminous United States in two unique ways: (i) the T2M forecast bias (relative to independent observations) induced by a negative precipitation bias tends to be larger for dry initializations, and (ii) on average, the unbiased root-mean-square error (ubRMSE) tends to be larger for dry initializations. Such findings can aid in the identification of forecasts of opportunity; taken a step further, they suggest a pathway for improving bias correction and uncertainty estimation in subseasonal T2M forecasts by conditioning each on initial soil moisture state.
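The two error measures named in this abstract, forecast bias and ubRMSE, can be computed as in the sketch below. It uses the standard definitions (ubRMSE is the RMSE after the mean error is removed); the function name `bias_and_ubrmse` and the sample values are hypothetical and are not taken from the study.

```python
import math


def bias_and_ubrmse(forecast, observed):
    """Return (bias, rmse, ubrmse) for paired forecast and observed values."""
    if len(forecast) != len(observed) or len(forecast) == 0:
        raise ValueError("inputs must be non-empty and equal in length")
    errors = [f - o for f, o in zip(forecast, observed)]
    n = len(errors)
    bias = sum(errors) / n                              # mean error
    rmse = math.sqrt(sum(e * e for e in errors) / n)    # includes the bias
    # Subtracting the bias leaves only the random part of the error.
    ubrmse = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    return bias, rmse, ubrmse


# Made-up numbers purely to exercise the function.
bias, rmse, ubrmse = bias_and_ubrmse([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The three quantities satisfy rmse² = bias² + ubrmse², so a forecast can have a large bias and a small ubRMSE at the same time, or the reverse.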


Author(s):  
Giorgio Gotti ◽  
Seán G. Roberts ◽  
Marco Fasan ◽  
Cole B. J. Robertson

This paper investigates whether a consideration of linguistic history is important when studying the relationship between economic and linguistic behaviors. Several recent economic studies have suggested that differences between languages can affect the way people think and behave (linguistic relativity or Sapir–Whorf hypothesis). For example, the way a language obliges one to talk about the future might influence intertemporal decisions, such as a company’s earnings management. However, languages have historical relations that lead to shared features, so they do not constitute independent observations. This can inflate correlations between variables if not dealt with appropriately (Galton’s problem). We discuss this problem and provide an overview of the latest methods to control for linguistic history. We then provide an empirical demonstration of how Galton’s problem can bias results in an investigation of whether a company’s earnings management behavior is predicted by structural features of its employees’ language. We find a strong relationship when not controlling for linguistic history, but the relationship disappears when controls are applied. In contrast, economic predictors of earnings management remain robust. Overall, our results suggest that careful consideration of linguistic history is important for distinguishing true causes from spurious correlations in economic behaviors.

