Training and Validating a Machine Learning Model for the Sensor-Based Monitoring of Lying Behavior in Dairy Cows on Pasture and in the Barn

Animals ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 2660
Author(s):  
Lara Schmeling ◽  
Golnaz Elmamooz ◽  
Phan Thai Hoang ◽  
Anastasiia Kozar ◽  
Daniela Nicklas ◽  
...  

Monitoring systems assist farmers in monitoring the health of dairy cows by using machine learning models to predict behavioral patterns (e.g., lying) and their changes. However, the available systems were developed either for indoor use or for pasture and fail to predict behavior in other locations. Therefore, the goal of our study was to train and evaluate a model for the prediction of lying both on pasture and in the barn. On three farms, 7–11 dairy cows each were equipped with a prototype of the monitoring system containing an accelerometer, a magnetometer and a gyroscope. Video observations on the pasture and in the barn provided ground truth data. We used 34.5 h of data from pasture for training and 480.5 h from both locations for evaluation. In the comparison, a random forest with an orientation-independent feature set and non-overlapping 5 s windows achieved the highest accuracy. Sensitivity, specificity and accuracy were 95.6%, 80.5% and 87.4%, respectively. Accuracy on the pasture (93.2%) exceeded accuracy in the barn (81.4%). Ruminating while standing was the behavior most often confused with lying. Of the individual lying bouts, 95.6% and 93.4% were identified on the pasture and in the barn, respectively. Adding a model for standing-up and lying-down events could improve the prediction of lying in the barn.
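The three figures reported above follow from a per-window confusion table. As a minimal sketch, assuming each 5 s window carries one boolean ground-truth label and one boolean prediction (True = lying), they could be computed as follows; the function name and the toy labels are illustrative and not taken from the study.

```python
def binary_metrics(truth, predicted):
    """Sensitivity, specificity and accuracy for boolean window labels.

    Assumption: True means "lying" and both classes occur in `truth`.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = tn = fp = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1          # lying window predicted as lying
        elif t:
            fn += 1          # lying window missed
        elif p:
            fp += 1          # non-lying window predicted as lying
        else:
            tn += 1          # non-lying window predicted as non-lying
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in truth")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


# Toy data only: eight hypothetical windows, not values from the study.
truth = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, True, False, False, True]
print(binary_metrics(truth, predicted))
```

In this reading, a false positive is a window such as ruminating while standing that the model labels as lying, and each one lowers specificity.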

2021 ◽  
Vol 14 (6) ◽  
pp. 997-1005
Author(s):  
Sandeep Tata ◽  
Navneet Potti ◽  
James B. Wendt ◽  
Lauro Beltrão Costa ◽  
Marc Najork ◽  
...  

Extracting structured information from templatic documents is an important problem with the potential to automate many real-world business workflows such as payment, procurement, and payroll. The core challenge is that such documents can be laid out in virtually infinitely different ways. A good solution to this problem is one that generalizes well not only to known templates such as invoices from a known vendor, but also to unseen ones. We developed a system called Glean to tackle this problem. Given a target schema for a document type and some labeled documents of that type, Glean uses machine learning to automatically extract structured information from other documents of that type. In this paper, we describe the overall architecture of Glean, and discuss three key data management challenges: 1) managing the quality of ground truth data, 2) generating training data for the machine learning model using labeled documents, and 3) building tools that help a developer rapidly build and improve a model for a given document type. Through empirical studies on a real-world dataset, we show that these data management techniques allow us to train a model that is over 5 F1 points better than the exact same model architecture without the techniques we describe. We argue that for such information-extraction problems, designing abstractions that carefully manage the training data is at least as important as choosing a good model architecture.


2020 ◽  
Vol 12 (21) ◽  
pp. 3475
Author(s):  
Miae Kim ◽  
Jan Cermak ◽  
Hendrik Andersen ◽  
Julia Fuchs ◽  
Roland Stirnberg

Clouds are one of the major uncertainties of the climate system. The study of cloud processes requires information on cloud physical properties, in particular liquid water path (LWP). This parameter is commonly retrieved from satellite data using look-up table approaches. However, existing LWP retrievals come with uncertainties related to assumptions inherent in physical retrievals. Here, we present a new retrieval technique for cloud LWP based on a statistical machine learning model. The approach utilizes spectral information from geostationary satellite channels of Meteosat Spinning-Enhanced Visible and Infrared Imager (SEVIRI), as well as satellite viewing geometry. As ground truth, data from CloudNet stations were used to train the model. We found that LWP predicted by the machine-learning model agrees substantially better with CloudNet observations than a current physics-based product, the Climate Monitoring Satellite Application Facility (CM SAF) CLoud property dAtAset using SEVIRI, edition 2 (CLAAS-2), highlighting the potential of such approaches for future retrieval developments.


Author(s):  
Tomáš Černý ◽  
Milan Večeřa ◽  
Daniel Falta ◽  
Gustav Chládek

The aim of this study was to evaluate the seasonal behavior and milk yield of Czech Fleckvieh dairy cows. Monitoring covered one section (housed in one quarter of the barn) with 103 free cubicle beds and an average of 95 lactating Czech Fleckvieh dairy cows. In each season (spring, summer, autumn, winter), temperature (°C), relative humidity (%) and the temperature-humidity index (THI) were monitored. Behavioral signs were also observed (a total of 4,940 observations): dairy cows were either lying down (3,432 observations) or standing up (1,508 observations). Standing cows were recorded either in the cubicle (585 observations) or outside of the cubicle (923 observations); lying cows were lying either on the left side (1,924 observations) or on the right side (1,508 observations). A significant seasonal influence (p < 0.05) was found on the number of dairy cows standing up (a maximum of 410 observations in the spring, a minimum of 342 observations in the summer) and on the number of cows lying on the left side (a maximum of 519 observations in the autumn and a minimum of 444 observations in the spring) and on the right side (a maximum of 415 observations in the winter, a minimum of 320 observations in the autumn). The seasonal influence was not significant (p > 0.05) for the remaining behavioral signs. With regard to milk yield, a significant seasonal influence was found. The highest milk yield was reached in spring (29.27 kg of milk) and the lowest in autumn (24.58 kg of milk). No significant differences in milk yield were detected between behavioral signs (p > 0.05). The largest difference in milk yield, 1.39 kg, was found in winter between dairy cows lying on the left side (28.35 kg) and dairy cows standing in a cubicle (26.96 kg), but even this difference was not statistically significant (p > 0.05).


Author(s):  
Mariana Carvalho de Menezes ◽  
Vanderlei Pascoal de Matos ◽  
Maria de Fátima de Pina ◽  
Bruna Vieira de Lima Costa ◽  
Larissa Loures Mendes ◽  
...  

To overcome the challenge of obtaining accurate data on community food retail, we developed an innovative tool to automatically capture food retail data from Google Earth (GE). The proposed method is relevant to non-commercial use or scholarly purposes. We aimed to test the validity of web sources data for the assessment of community food retail environment by comparison to ground-truth observations (gold standard). A secondary aim was to test whether validity differs by type of food outlet and socioeconomic status (SES). The study area included a sample of 300 census tracts stratified by SES in two of the largest cities in Brazil, Rio de Janeiro and Belo Horizonte. The GE web service was used to develop a tool for automatic acquisition of food retail data through the generation of a regular grid of points. To test its validity, this data was compared with the ground-truth data. Compared to the 856 outlets identified in 285 census tracts by the ground-truth method, the GE interface identified 731 outlets. In both cities, the GE interface scored moderate to excellent compared to the ground-truth data across all of the validity measures: sensitivity, specificity, positive predictive value, negative predictive value and accuracy (ranging from 66.3 to 100%). The validity did not differ by SES strata. Supermarkets, convenience stores and restaurants yielded better results than other store types. To our knowledge, this research is the first to investigate using GE as a tool to capture community food retail data. Our results suggest that the GE interface could be used to measure the community food environment. Validity was satisfactory for different SES areas and types of outlets.


Author(s):  
◽  
S. S. Ray

Crop classification and recognition is a very important application of remote sensing. In the last few years, machine learning classification techniques have been emerging for crop classification. Google Earth Engine (GEE) is a platform for exploring multiple satellite data sets with different advanced classification techniques without even downloading the satellite data. The main objective of this study is to explore the ability of different machine learning classification techniques, such as Random Forest (RF), Classification And Regression Trees (CART) and Support Vector Machine (SVM), for crop classification. High-resolution optical data from Sentinel-2 MSI (10 m) were used for crop classification in the Indian Agricultural Research Institute (IARI) farm for the Rabi season 2016 for major crops. Around 100 crop fields (~400 hectares) in IARI were analysed. Smartphone-based ground truth data were collected. The best cloud-free image of Sentinel-2 MSI data (5 Feb 2016) was selected for classification by automatic filtering on the percentage cloud cover property in GEE. Polygons based on the ground truth data were used as the feature space for the training data sets for crop classification with the machine learning techniques. After classification, accuracy was assessed through the generation of the confusion matrix (producer and user accuracy), kappa coefficient and F value. In this study it was found that, using GEE through the cloud platform, accessing, filtering and pre-processing of satellite data could be done very efficiently. In terms of overall classification accuracy and kappa coefficient, Random Forest (93.3%, 0.9178) and CART (73.4%, 0.6755) classifiers performed better than SVM (74.3%, 0.6867) classifier. For validation, data from the Field Operation Service Unit (FOSU) division of IARI were used, and encouraging results were obtained.
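Overall accuracy and the kappa coefficient, the two figures quoted for each classifier, can both be read off a confusion matrix. The sketch below assumes a square matrix of counts with reference classes in rows and predicted classes in columns; the function name and the 2 × 2 example are hypothetical and do not reproduce the study's data.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Assumption: matrix[i][j] counts samples of reference class i
    that were assigned to class j.
    """
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n   # overall accuracy
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class matrix, for illustration only.
accuracy, kappa = accuracy_and_kappa([[20, 5], [10, 15]])
print(accuracy, kappa)
```

Kappa discounts the agreement that the class proportions alone would produce, which is why it is usually lower than the overall accuracy.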


2018 ◽  
Author(s):  
Christian Damgaard

In order to fit population ecological models, e.g. plant competition models, to new drone-aided image data, we need to develop statistical models that may take the new type of measurement uncertainty when applying machine-learning algorithms into account and quantify its importance for statistical inferences and ecological predictions. Here, it is proposed to quantify the uncertainty and bias of image predicted plant taxonomy and abundance in a hierarchical statistical model that is linked to ground-truth data obtained by the pin-point method. It is critical that the error rate in the species identification process is minimized when the image data are fitted to the population ecological models, and several avenues for reaching this objective are discussed. The outlined method to statistically model known sources of uncertainty when applying machine-learning algorithms may be relevant for other applied scientific disciplines.


2020 ◽  
Vol 12 (3) ◽  
pp. 355 ◽  
Author(s):  
Nam Thang Ha ◽  
Merilyn Manley-Harris ◽  
Tien Dat Pham ◽  
Ian Hawes

Seagrass has been acknowledged as a productive blue carbon ecosystem that is in significant decline across much of the world. A first step toward conservation is the mapping and monitoring of extant seagrass meadows. Several methods are currently in use, but mapping the resource from satellite images using machine learning is not widely applied, despite its successful use in various comparable applications. This research aimed to develop a novel approach for seagrass monitoring using state-of-the-art machine learning with data from Sentinel–2 imagery. We used Tauranga Harbor, New Zealand as a validation site for which extensive ground truth data are available to compare ensemble machine learning methods involving random forests (RF), rotation forests (RoF), and canonical correlation forests (CCF) with the more traditional maximum likelihood classifier (MLC) technique. Using a group of validation metrics including F1, precision, recall, accuracy, and the McNemar test, our results indicated that machine learning techniques outperformed the MLC with RoF as the best performer (F1 scores ranging from 0.75–0.91 for sparse and dense seagrass meadows, respectively). Our study is the first comparison of various ensemble-based methods for seagrass mapping of which we are aware, and promises to be an effective approach to enhance the accuracy of seagrass monitoring.
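The McNemar test named among the validation metrics compares two classifiers on the same validation samples using only the samples where exactly one of them is correct. The following is a minimal sketch of the continuity-corrected form of the statistic, which is an assumption since the abstract does not say which variant was used; the function names and inputs are illustrative.

```python
import math


def discordant_counts(truth, pred_a, pred_b):
    """Count samples where exactly one of two classifiers is correct."""
    b = sum(1 for t, a, o in zip(truth, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(truth, pred_a, pred_b) if a != t and o == t)
    return b, c


def mcnemar(b, c):
    """Continuity-corrected McNemar statistic and its p-value (1 d.o.f.)."""
    if b + c == 0:
        raise ValueError("no discordant samples; the test is undefined")
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: A is right where B is wrong 10 times, the reverse 2 times.
print(mcnemar(10, 2))
```

A small p-value suggests that the two classifiers' error rates on the shared samples differ by more than chance would explain.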


2020 ◽  
Author(s):  
Lennart Schmidt ◽  
Hannes Mollenhauer ◽  
Corinna Rebmann ◽  
David Schäfer ◽  
Antje Claussnitzer ◽  
...  

With more and more data being gathered from environmental sensor networks, the importance of automated quality-control (QC) routines to provide usable data in near-real time is becoming increasingly apparent. Machine-learning (ML) algorithms exhibit a high potential in this respect as they are able to exploit the spatio-temporal relation of multiple sensors to identify anomalies while allowing for non-linear functional relations in the data. In this study, we evaluate the potential of ML for automated QC on two spatio-temporal datasets at different spatial scales: One is a dataset of atmospheric variables at 53 stations across Northern Germany. The second dataset contains time series of soil moisture and temperature at 40 sensors at a small-scale measurement plot.

Furthermore, we investigate strategies to tackle three challenges that are commonly present when applying ML for QC: 1) As sensors might drop out, the ML models have to be designed to be robust against missing values in the input data. We address this by comparing different data imputation methods, coupled with a binary representation of whether a value is missing or not. 2) Quality flags that mark erroneous data points to serve as ground truth for model training might not be available. And 3) There is no guarantee that the system under study is stationary, which might render the outputs of a trained model useless in the future. To address 2) and 3), we frame the problem both as a supervised and unsupervised learning problem. Here, the use of unsupervised ML models can be beneficial as they do not require ground truth data and can thus be retrained more easily should the system be subject to significant changes. In this presentation, we discuss the performance, advantages and drawbacks of the proposed strategies to tackle the aforementioned challenges. Thus, we provide a starting point for researchers in the largely untouched field of ML application for automated quality control of environmental sensor data.
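The first strategy, imputation coupled with a binary missingness representation, can be illustrated with a small sketch. It assumes column-mean imputation, which is only one of the imputation methods such a comparison might include and is not confirmed by the abstract; the function name and the readings are hypothetical.

```python
def impute_with_indicator(rows):
    """Fill missing sensor values and append a 0/1 missingness flag per column.

    Assumptions: missing values are encoded as None, and the column mean of
    the observed values is used as the fill value (0.0 if nothing was observed).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    result = []
    for row in rows:
        values = [means[j] if v is None else v for j, v in enumerate(row)]
        flags = [1 if v is None else 0 for v in row]   # 1 marks an imputed value
        result.append(values + flags)
    return result


# Two hypothetical time steps from two sensors; the second sensor drops out once.
print(impute_with_indicator([[1.0, None], [3.0, 4.0]]))
# -> [[1.0, 4.0, 0, 1], [3.0, 4.0, 0, 0]]
```

Keeping the flags as extra input features lets a downstream model distinguish a measured value from a filled-in one.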


2005 ◽  
Vol 81 (3) ◽  
pp. 423-430 ◽  
Author(s):  
L. Gygax ◽  
H. Schulze Westerath ◽  
J. Kuhlicke ◽  
B. Wechsler ◽  
C. Mayer

Finishing bulls need increasingly large cubicles throughout their growth, and optimal cubicle dimensions may differ from those used for dairy cows. The space requirements of finishing bulls were investigated by observing standing-up and lying-down behaviour, lying duration and number of lying bouts, as well as the cleanliness of cubicles and animals before and after increasing cubicle size at four different points in time. Lying area in the cubicles measured 120 × 70 cm at the start and 185 × 110 cm at the end of the finishing period (approx. at 160 and 550 kg, respectively). Twenty animals kept in four groups were observed at weights of approximately 220, 330, 380 and 500 kg before and after cubicle dimensions were increased. The proportion of standing-up events with more than one head lunge decreased with enlargement of the cubicles (P = 0·01). As cubicle size increased, bulls hit the partition rails less on standing up, except at 220 kg weight where the pattern was inverted (interaction: P = 0·001). Partitions were also hit less on lying down as cubicle size increased, except at 220 kg weight with an inverse pattern (interaction: P = 0·01). The number of exploratory head sweeps before lying down did not change with cubicle enlargement (P > 0·5). Bulls slipped more often with cubicle enlargement, except at 380 kg where the difference was inverted (interaction: P = 0·03). They never fell and never turned around in the cubicles. In general, both animals and cubicles were very clean. On average, lying duration decreased (P < 0·01) while the number of lying bouts tended to increase (P = 0·052) with enlargement of the cubicles but the absolute differences were small. Consequently at each point in time, the smaller cubicles still seemed to provide sufficient lying space for the bulls. If the impacts with the partitions were minor and did not represent a serious welfare concern, as suggested by qualitative observations, the cubicle dimensions used could be considered suitable for housing the type of finishing bulls used in this study.

