The Green Toad Example: A Comparison of Pattern Recognition Software

2020 ◽  
Author(s):  
Stephan Burgstaller ◽  
Günter Gollmann ◽  
Lukas Landler

Abstract. Background: Individual identification of animals is important for assessing the size and status of populations. Photo-based approaches, where animals are recognized by naturally occurring and visually identifiable features, such as color patterns, are cost-effective methods for this purpose. We compared five available programs for their power to semi-automatically identify dorsal patterns of the European green toad (Bufotes viridis). Method: We created a data set of 200 pictures of known identity, two pictures for each individual, and analyzed the percentage of correctly identified animals for each program. Furthermore, we employed a generalized linear mixed model to identify important factors contributing to correct identifications. We used these results to estimate the population size of our hypothetical population. Conclusions: The freely available HotSpotter application performed by far the best for our green toad example, identifying close to 100% of the photos correctly. The animals' sex had a highly significant influence on detection probability, presumably because of sex-specific differences in pattern contrast. Population estimates were close to the expected 100 for HotSpotter, but the other applications strongly overestimated population size. Given the clarity of our results, we strongly recommend the HotSpotter software, which is a highly efficient tool for individual pattern recognition.
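Why missed photo matches inflate a population estimate can be illustrated with a two-sample capture-recapture calculation. The abstract does not state which estimator the authors used, so the Lincoln-Petersen formula below is an assumption chosen for illustration, as are the function name and the match counts; only the 100 individuals with two photos each come from the abstract.

```python
def lincoln_petersen(n_first: int, n_second: int, n_matched: int) -> float:
    """Two-sample Lincoln-Petersen estimate: N = n1 * n2 / m."""
    if n_matched <= 0:
        raise ValueError("at least one match (recapture) is required")
    return n_first * n_second / n_matched


# 100 toads, each photographed twice, so every second photo is a true recapture.
# The match counts are hypothetical, not results reported in the paper.
for matched in (100, 80, 50):
    estimate = lincoln_petersen(100, 100, matched)
    print(f"{matched} matches found -> estimated population {estimate:.0f}")
```

Under this assumed estimator, every recapture the software fails to recognize shrinks the denominator, so the estimate can only drift above the true value of 100.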

Author(s):  
Guri Feten ◽  
Trygve Almøy ◽  
Are H. Aastveit

Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six imputation methods. A comparative study of the methods was performed on data from mice and data from the bacterium Enterococcus faecalis, and a linear mixed model was used to test for differences between the methods. The study showed that the predictive capability of each method depends on the data; hence the ideal choice of method and number of components differs for each data set. For data with correlation structure, methods based on K-nearest neighbours seemed to be best, while for data without correlation structure, using the average of the gene was to be preferred.


Author(s):  
Robin Pla ◽  
Arthur Leroy ◽  
Yannis Raineteau ◽  
Philippe Hellard

Purpose: To quantify the impact of successive competitions on swimming performance in world-class swimmers. Methods: An entire data set of all events swum during a new competition named the International Swimming League was collected. A Bayesian linear mixed model was proposed to evaluate whether a progression could be observed during the International Swimming League’s successive competitions and to quantify this effect according to event, age, and gender. Results: An overall progression of 0.0005 (0.0001 to 0.0010) m/s/d was observed. The daily mean progression (ie, faster performance) was twice as high for men as for women (0.0008 [0.00 to 0.0014] vs 0.0003 [−0.0003 to 0.0009] m·s⁻¹). A tendency toward higher progression for middle distances (200 and 400 m) and for swimmers of a higher caliber (above 850 FINA [Fédération Internationale de Natation] points) was also observed. Swimmers between 23 and 26 years of age seemed to improve their swimming speed more in comparison with the other swimmers. Conclusions: This new league format, which involves several competitions in a row, seems to allow for an enhancement in swimming performance. Coaches and their support staff can now adapt their periodization plan in order to promote competition participation.


2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 655-655
Author(s):  
Eunhee Cho ◽  
Jinhee Shin ◽  
Bada Kang ◽  
Sujin Kim ◽  
Sinwoo Hwang ◽  
...  

Abstract: Sleep disturbance is a common and significant symptom experienced by older adults with dementia. Early detection and timely treatment of sleep disturbance are critical to prevent adverse consequences including decreased quality of life for persons with dementia and increased caregiver burden. While direct observations and sleep diaries are often unreliable, actigraphy is a cost-effective method for measuring sleep problems in older adults with dementia and provides reliable and rich sleep data. Therefore, this study aimed to examine sleep disturbance objectively measured by actigraphy and its risk factors in community-dwelling older adults with dementia in Korea. This is a prospective study consisting of a two-wave dataset. The model was fitted using Wave 1 data (n=151) and then validated using Wave 2 data (n=59). Independent variables were demographics, cognitive and physical function, depressive symptoms, physical activity level, neuropsychiatric symptoms measured by the Neuropsychiatric Inventory (NPI), and clinical factors including dementia type, sedative use, and comorbidities. Sleep disturbance was defined as less than six nighttime sleep hours and sleep efficiency less than 75%. Using Youden’s Index, the sample was dichotomized into a sleep disturbance group (n=83) and a sound sleep group (n=68). The results of the generalized linear mixed model showed that the risk factors for sleep disturbance included vascular dementia, age, step count, and having three neuropsychiatric symptoms (i.e., delusions, depression, and disinhibition). Individuals with dementia at risk for sleep disturbance should be identified to prioritize early prevention strategies and individualized interventions. In particular, management of delusions, depression, and disinhibition is critical in preventing disturbed sleep.
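Youden's index is defined as J = sensitivity + specificity − 1, and a cutoff is commonly chosen to maximize it. The abstract does not describe how the index was applied, so the following is only a minimal sketch of that general idea; the function names, the sleep-hour values, and the labels are hypothetical and are not data from the study.

```python
def youden_j(scores, labels, cutoff):
    """J = sensitivity + specificity - 1 for the rule 'score < cutoff means positive'.

    Assumes at least one positive and one negative label is present.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s < cutoff)
    fn = sum(1 for s, y in pairs if y and s >= cutoff)
    tn = sum(1 for s, y in pairs if not y and s >= cutoff)
    fp = sum(1 for s, y in pairs if not y and s < cutoff)
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cutoff(scores, labels):
    """Return the observed score that maximizes Youden's J when used as cutoff."""
    return max(sorted(set(scores)), key=lambda c: youden_j(scores, labels, c))


# Hypothetical nightly sleep hours and disturbance labels (True = disturbed).
hours = [4.5, 5.0, 5.5, 6.5, 7.0, 7.5]
disturbed = [True, True, True, False, False, False]
cutoff = best_cutoff(hours, disturbed)
print(cutoff, youden_j(hours, disturbed, cutoff))
```

In this toy data the two groups separate perfectly, so J reaches its maximum of 1; with real actigraphy data the groups overlap and J stays below 1.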


2020 ◽  
Author(s):  
Johannes Laimighofer ◽  
Gregor Laaha

Standardized drought indices such as SPI are frequently used around the world to assess drought severity across a continent or a larger region covering different meteorological regimes. But how standard are the standardized indices? In this paper we quantify the uncertainty of SPI and SPEI based on an Austrian data set to shed light on the main sources of uncertainty in the study area. Here we analyze the uncertainty contributions with a linear mixed model that employs a restricted maximum likelihood estimator in order to produce unbiased variance and covariance components. Five factors are considered that either defy the control of the analyst (record length, observation period) or need to be subjectively decided during the steps of the calculation (choice of the distribution, parameter estimation method, and GOF test of the fitted distribution). The results show that, overall, the choice of the distribution and the observational window are the most important sources of uncertainty. We quantify the relative uncertainty contributions in greater detail in order to give guidance on how to make estimates as accurate as possible for a given data set. Finally, we analyze the total uncertainty of SPI and SPEI to shed light on our main question: whether the indices are skillful enough to provide a quantification of atmospheric drought that is standardized enough to allow the intended comparisons across various data situations and meteorological regimes.


Genetics ◽  
1997 ◽  
Vol 146 (1) ◽  
pp. 409-416 ◽  
Author(s):  
T H E Meuwissen ◽  
M E Goddard

A method was derived to estimate effects of quantitative trait loci (QTL) using incomplete genotype information in large outbreeding populations with complex pedigrees. The method accounts for background genes by estimating polygenic effects. The basic equations used are very similar to the usual linear mixed model equations for polygenic models, and segregation analysis was used to estimate the probabilities of the QTL genotypes for each animal. Method R was used to estimate the polygenic heritability simultaneously with the QTL effects. Also, initial allele frequencies were estimated. The method was tested in a simulated data set of 10,000 animals evenly distributed over 10 generations, where 0, 400 or 10,000 animals were genotyped for a candidate gene. In the absence of selection, the bias of the QTL estimates was <2%. Selection biased the estimate of the Aa genotype slightly, when zero animals were genotyped. Estimates of the polygenic heritability were 0.251 and 0.257, in absence and presence of selection, respectively, while the simulated value was 0.25. Although not tested in this study, marker information could be accommodated by adjusting the transmission probabilities of the genotypes from parent to offspring according to the marker information. This renders a QTL mapping study in large multi-generation pedigrees possible.


2021 ◽  
Author(s):  
Baohong Guo

Abstract: Genomic prediction has been recognized as a promising new technique in animal and plant breeding. The linear mixed model is a widely used statistical technique, but it may not be desirable for large training sets and large numbers of molecular markers because it is computationally intensive. Deep learning is a subfield of machine learning that can be used for complex predictions on a large scale. Multi-task deep learning (MT-DL) incorporates related tasks (labels or traits) into one learning process to enable the learning model to perform better than single-task deep learning (ST-DL). I applied MT-DL to genotype-by-environment genomic predictions to predict the performance of breeding lines in multiple environments. Using cross-validation, I compared MT-DL with the linear mixed model-based Bayesian genotype × environment method (BGGE) and with separate genomic predictions on single environments using the widely used rrBLUP, ridge regression, and ST-DL. Compared with rrBLUP, MT-DL and nonlinear BGGE showed moderate increases in prediction accuracy of 9.4% and 7.6%, respectively; ST-DL showed a small increase of 5.4%; ridge regression had similar prediction accuracy; and linear BGGE showed a small decrease of 2.0%. I also found that all methods, including rrBLUP, showed overfitting, likely because yield genomic predictions are complex and the data set used in this study is small. rrBLUP, ridge regression, ST-DL, and MT-DL showed similar overfitting: the difference between training and test set prediction accuracies was between 0.344 and 0.387. The linear and nonlinear BGGE methods seem to have much worse overfitting than the other methods: the differences between training and test set prediction accuracies were 0.429 and 0.472, respectively. I also discuss the potential applications of ST-DL and MT-DL in genomic predictions of hybrid crops such as maize.


2014 ◽  
Vol 96 ◽  
Author(s):  
JOAQUIM CASELLAS ◽  
DANIEL GIANOLA ◽  
JUAN F. MEDRANO

Summary: The continuous uploading of polygenic additive mutational variability has been reported in several studies in laboratory species with an inbred genetic background. These studies have focused on the direct contribution of new mutations without considering the possibility of epistatic effects derived from the interaction of new mutations with pre-existing polymorphisms. In this work we focused on this topic and analysed the statistical and biological relevance of the epistatic variance for 9-week body weight in two populations of inbred mice. We developed a new linear mixed model parameterization in which founder-related additive genetic variability, additive mutational variability and the interaction terms between both sources of variation were accounted for under a Bayesian design and without requiring the inversion of a matrix of epistatic genetic covariances. The analyses focused on a six-generation data set from C57BL/6J mice (n = 3736) and a five-generation data set from C57BL/6Jhg/hg mice (n = 2843). The deviance information criterion (DIC) clearly favoured the model accounting for epistatic variability, with reductions larger than 50 DIC units in both populations. Modal estimates for founder-related, mutational and epistatic heritabilities were 0·068, 0·011 and 0·095 in C57BL/6J and 0·060, 0·010 and 0·113 in C57BL/6Jhg/hg, ruling out any doubt about the biological relevance of epistasis originating from new mutations in mice. These results contribute new insights into the relevance of epistasis in the genetic architecture of mammals and point to an additional source of genetic heterogeneity for inbred strains of laboratory mice.


2018 ◽  
Vol 75 (6) ◽  
pp. 2172-2181
Author(s):  
Guillaume Forget ◽  
Jean-Luc Baglinière ◽  
Frédéric Marchand ◽  
Arnaud Richard ◽  
Marie Nevoux

Abstract: Maintaining connectivity in aquatic ecosystems is important to ensure adequate ecological functioning. A large dam removal project in the Sélune River (Normandy, France) would reconnect 827 km² of catchment area to the sea. Only the downstream section of the Sélune is currently available to diadromous fish, which migrate between freshwater and the marine environment. In particular, managers focus on the future potential abundance of Atlantic salmon, Salmo salar, for conservation and fishery purposes. As in-stream channel habitat drives the carrying capacity of juvenile salmon, salmon abundance is usually inferred from intensive and linear habitat surveys. However, this approach is neither cost-effective for large-scale surveys nor feasible for riverbed sections that are poorly accessible to traditional measurement methods, e.g. dam lakes. We used well-defined relationships between gradient, hydrology and channel habitat structure to construct a simple model to estimate potential suitable habitat for juvenile salmon. Using fine-scale habitat data from nearby rivers, we parameterized a linear mixed model to estimate the area of suitable habitat based on simple physical descriptors of river characteristics. We compared our predictions to fine-scale habitat surveys on the upper Sélune. Using only slope and width, our model was able to explain 80% of the variance in suitable habitat. Estimates indicated that dam removal on the Sélune River would generate a threefold increase in suitable habitat for juveniles. This could increase the mean number of adult salmon returning to the river by 1420.9 (s.e. = 1015.5). More generally, this model provides an alternative and cost-effective tool to help better manage salmon populations in rivers impacted by dams.


2018 ◽  
Vol 30 (4) ◽  
pp. 2153-2174 ◽  
Author(s):  
Shan Lin ◽  
Shuai Yang ◽  
Minghui Ma ◽  
Jian Huang

Purpose: In recent years, hotels in China have been interested in leveraging social media platforms to facilitate interactions with and among consumers. Such brand engagement efforts on social media networks are believed to promote brands through co-creation of consumer experiences and values. This study was conducted in the context of Chinese hotels. The paper aims to identify two forms of brand engagement via social media platforms (consumer-initiated engagement and firm-initiated engagement) and to examine their effects on hotels’ display advertising effectiveness. Design/methodology/approach: This study collected a comprehensive data set. First, the authors collected display advertisement data from two hotel chains in China. Second, the authors gathered the two hotels’ engagement data from Weibo. A generalized linear mixed model was used in data analysis. Findings: The findings of the study indicate that both forms of brand engagement on social media network sites positively influence display advertising effectiveness. Moreover, for a strong brand, consumer-initiated engagement is more influential in increasing display advertising effectiveness; however, for a weak brand, firm-initiated engagement gains more clicks and conversions from advertisements. Practical implications: As hotels in China continue to leverage online media platforms to reach, engage with and co-create value with potential and existing consumers, this study provides managers with insight as to how they can achieve higher advertising effectiveness by engaging with consumers on a consistent basis on social media. Originality/value: This study mainly contributes to recent increasing research on engagement and value co-creation by providing a lens through which to assess the relationship between brand engagement via social media networks and online display advertising effectiveness.


2018 ◽  
Vol 6 (2) ◽  
pp. 204-231
Author(s):  
DANIEL K. SEWELL

Abstract: While logistic regression models are easily accessible to researchers, when applied to network data there are unrealistic assumptions made about the dependence structure of the data. For temporal networks measured in discrete time, recent work has made good advances (Almquist & Butts, 2014), but there is still the assumption that the dyads are conditionally independent given the edge histories. This assumption can be quite strong and is sometimes difficult to justify. If time steps are rather large, one would typically expect not only the existence of temporal dependencies among the dyads across observed time points but also the existence of simultaneous dependencies affecting how the dyads of the network co-evolve. We propose a general observation-driven model for dynamic networks that overcomes this problem by modeling both the mean and the covariance structures as functions of the edge histories using a flexible autoregressive approach. This approach can be shown to fit into a generalized linear mixed model framework. We propose a visualization method that provides evidence concerning the existence of simultaneous dependence. We describe a simulation study to determine the method's performance in the presence and absence of simultaneous dependence, and we analyze both a proximity network from conference attendees and a world trade network. We also use this last data set to illustrate how simultaneous dependencies become more prominent as the time intervals become coarser.

