Machine learning for high-throughput field phenotyping and image processing provides insight into the association of above and below-ground traits in cassava (Manihot esculenta Crantz)

2020 ◽  
Author(s):  
Michael Gomez Selvaraj ◽  
Manuel Valderrama ◽  
Diego Guzman ◽  
Milton Valencia ◽  
Henry Ruiz ◽  
...  

Abstract Background: Rapid, non-destructive measurement to predict cassava root yield over the full growing season, across large numbers of germplasm and multiple environments, is a huge challenge in cassava breeding programs. Rather than waiting until harvest, multispectral imagery from unmanned aerial vehicles (UAVs) can measure canopy metrics and vegetation indices (VIs) at different time points of the growth cycle. Processing such time-series aerial imagery within an appropriate analytical framework is essential for the automatic extraction of phenotypic features from the image data. Many studies have demonstrated the usefulness of advanced remote sensing technologies coupled with machine learning (ML) approaches for accurate prediction of valuable crop traits. Until now, cassava has received little to no attention in aerial image-based phenotyping and ML model testing. Results: To accelerate image processing, an automated image-analysis framework called CIAT Pheno-i was developed to extract plot-level vegetation indices and canopy metrics. Multiple linear regression models were constructed at key growth stages of cassava, using ground-truth data and vegetation indices obtained from a multispectral sensor. Subsequently, the spectral indices/features were combined to develop models and predict cassava root yield using different machine learning techniques. Our results showed that (1) the CIAT Pheno-i image analysis framework was easier and more rapid than manual methods. (2) Correlation analysis of four phenological stages of cassava revealed that elongation (EL) and late bulking (LBK) were the most useful stages for estimating above-ground biomass (AGB), below-ground biomass (BGB) and canopy height (CH). (3) Multi-temporal analysis revealed that cumulative image feature information from the EL + early bulking (EBK) stages showed a higher significant correlation (r = 0.77) between the Green Normalized Difference Vegetation Index (GNDVI) and BGB than individual time points did. Canopy height measured on the ground correlated well with UAV-based measurements (CHuav) (r = 0.92) at the LBK stage. Among the image features, normalized difference red edge index (NDRE) data were consistently highly correlated (r = 0.65 to 0.84) with AGB at the LBK stage. (4) Among the four ML algorithms used in this study, k-Nearest Neighbours (kNN), Random Forest (RF) and Support Vector Machine (SVM) showed the best performance for root yield prediction, with the highest accuracies of R² = 0.67, 0.66 and 0.64, respectively. Conclusion: The UAV platform, time-series image acquisition, automated image analysis framework (CIAT Pheno-i) and key vegetation indices (VIs) for estimating phenotypic traits and root yield described in this work have great potential for use as a selection tool in modern cassava breeding programs around the world to accelerate germplasm and varietal selection. The image analysis software (CIAT Pheno-i) developed in this study can be widely applied to other crops to extract phenotypic information rapidly.
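As an illustration of the indices named in the abstract above, the following is a minimal sketch, not the CIAT Pheno-i implementation. It uses the standard definitions GNDVI = (NIR - Green) / (NIR + Green) and NDRE = (NIR - RedEdge) / (NIR + RedEdge), plus a plain Pearson correlation. The function names, the plot records and all reflectance and biomass values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def normalized_difference(a, b):
    """Return (a - b) / (a + b); NaN when the two bands sum to zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return normalized_difference(nir, green)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge index."""
    return normalized_difference(nir, red_edge)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical plot-level mean reflectances and biomass (illustrative values only).
plots = [
    {"nir": 0.42, "green": 0.10, "red_edge": 0.24, "biomass": 1.8},
    {"nir": 0.50, "green": 0.09, "red_edge": 0.22, "biomass": 2.6},
    {"nir": 0.55, "green": 0.08, "red_edge": 0.21, "biomass": 3.1},
]
gndvi_values = [gndvi(p["nir"], p["green"]) for p in plots]
biomass = [p["biomass"] for p in plots]
print(pearson_r(gndvi_values, biomass))
```

In practice the per-plot reflectances would come from the mean of all pixels inside each plot boundary; that extraction step is outside this sketch.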


2020 ◽  
Vol 7 ◽  
pp. 1-26 ◽  
Author(s):  
Silas Nyboe Ørting ◽  
Andrew Doyle ◽  
Arno Van Hilten ◽  
Matthias Hirth ◽  
Oana Inel ◽  
...  

Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to "big data". However, within the realm of medical image analysis, advances have been curtailed, in part, by the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with producing large amounts of high-quality metadata. Recently, there has been growing interest in the application of crowdsourcing for this purpose, a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to provide guidance to researchers considering crowdsourcing methodologies in their own medical imaging analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing useful guidance to researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain.


Author(s):  
Yasin Bakış ◽  
Xiaojun Wang ◽  
Hank Bart

Over 1 billion biodiversity collection specimens, ranging from fungi to fish to fossils, are housed in more than 1,600 natural history collections across the United States. The digitization of these specimens has risen significantly within the last few decades and is likely to increase further as digitized data gains importance. Numerous experiments with automated image analysis have demonstrated the practicality and usefulness of analyzing digitized biodiversity images with computational techniques such as neural networks and image processing. However, most computational techniques for analyzing images of biodiversity collection specimens require well-curated data. One of the challenges in curating multimedia data of biodiversity collection specimens is the quality of the multimedia objects, in our case two-dimensional images. To tackle the image quality problem, multimedia needs to be captured in a specific format and presented with appropriate descriptors. In this study we present an analysis of two image repositories, each consisting of 2D images of fish specimens from several institutions: the Integrated Digitized Biocollections (iDigBio) and the Great Lakes Invasives Network (GLIN). Approximately 70 thousand images from the GLIN repository and 450 thousand images from the iDigBio repository were processed, and their suitability was assessed for use in neural network-based species identification and trait extraction applications. Our findings showed that images from the GLIN dataset were more successful for image processing and machine learning purposes. Almost 40% of the species are represented by fewer than 10 images, while only 20% have more than 100 images per species. We identified and captured 20 metadata descriptors that define the quality and usability of an image. According to the captured metadata, 70% of the GLIN dataset images were found to be useful for further analysis based on the overall image quality score. Quality issues with the remaining images included curved specimens; non-fish objects in the images, such as tags, labels and rocks, that obstructed the view of the specimen; color, focus and brightness issues; and folded, overlapping or missing parts. We used both the web interface and the API (Application Programming Interface) for downloading images from iDigBio. We searched for all fish genera, families and classes in three separate searches with the images-only option selected, then combined all of the search results and removed duplicates. Our search of the iDigBio database for fish taxa returned approximately 450 thousand records with images. Using the multimedia metadata downloaded with the search results, we narrowed this down to 90 thousand fish images, excluding non-fish images, fossil samples, X-ray and CT (computed tomography) scans and several others. Only 44% of these 90 thousand images were found to be suitable for further analysis. In this study, we discovered some of the limitations of biodiversity image datasets and built an infrastructure for assessing the quality of biodiversity images for neural network analysis. Our experience with the fish images gathered from two different image repositories has enabled us to describe image quality metadata features. With the help of these metadata descriptors, one can create a dataset of a desired image quality for analysis. Likewise, the availability of the metadata descriptors will help advance our understanding of quality issues, while helping data technicians, curators and other digitization staff be more aware of multimedia.
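The "combine the search results, remove duplicates, then look at images per species" steps described above can be sketched as follows. This is an illustrative outline only and does not call the iDigBio API; the record fields `image_id` and `species`, the function names and the sample records are all hypothetical.

```python
from collections import Counter


def merge_unique(*result_sets):
    """Combine several search results, keeping the first record seen per image id."""
    seen = {}
    for records in result_sets:
        for record in records:
            seen.setdefault(record["image_id"], record)
    return list(seen.values())


def species_coverage(records, low=10, high=100):
    """Return the share of species with fewer than `low` and with more than `high` images."""
    counts = Counter(record["species"] for record in records)
    n_species = len(counts)
    few = sum(1 for c in counts.values() if c < low)
    many = sum(1 for c in counts.values() if c > high)
    return few / n_species, many / n_species


# Hypothetical results of two overlapping searches (illustrative records only).
genus_search = [
    {"image_id": 1, "species": "A"},
    {"image_id": 2, "species": "A"},
]
family_search = [
    {"image_id": 2, "species": "A"},
    {"image_id": 3, "species": "B"},
]
merged = merge_unique(genus_search, family_search)
print(len(merged), species_coverage(merged, low=2, high=100))
```

The default thresholds mirror the 10 and 100 images-per-species cut-offs quoted in the abstract; the example call lowers `low` only because the sample data is tiny.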


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Polina Lemenkova

The paper presents the use of a Landsat TM image processed with the ArcGIS Spatial Analyst tool for environmental mapping of southwestern Iceland, in the region of Reykjavik. Iceland is one of the most distinctive Arctic regions, with unique flora and landscapes. Its environment consists of vulnerable highland ecosystems where vegetation is affected by climatic, human and geologic factors: overgrazing, volcanism and annual temperature change. Mapping land cover types in Iceland therefore contributes to nature conservation, sustainable development and environmental monitoring. The paper starts with a review of current trends in remote sensing, the importance of Landsat TM imagery for environmental mapping in general and in Iceland in particular, and the requirements of GIS specifically for satellite image analysis. This is followed by an extended methodological workflow, supported by illustrative screenshots and a technical description of data processing in ArcGIS. The data used in this research include a Landsat TM image that was obtained via GloVis and processed in ArcGIS. The methodology is a workflow of several technical steps of raster data processing in ArcGIS: 1) coordinate projecting, 2) panchromatic sharpening, 3) inspection of raster statistics, 4) spectral bands combination, 5) calculations, 6) unsupervised classification, 7) mapping. The classification was done by a clustering technique using the ISO Cluster algorithm and Maximum Likelihood Classification. Finally, the paper presents the results of the ISO Cluster application for Landsat TM image processing and concludes with remarks on the prospects for environmental mapping based on Landsat TM image processing in ArcGIS. The classification divides the landscape into eight distinct land cover classes: 1) bare soils; 2) shrubs and smaller trees in the river valleys, urban areas including green spaces; 3) water areas; 4) forests including the Reykjanesfólkvangur National reserve; 5) ice-covered areas, glaciers and cloudy regions; 6) ravine valleys with sparse vegetation: rowan, alder, heathland, wetland; 7) rocks; 8) mixed areas. The final remarks discuss the development of machine learning methods and opportunities for their technical application in GIS-based analysis and Earth Observation data processing in ArcGIS, including image analysis and classification, mapping and visualization, and environmental applications for decision making in forestry and sustainable development.
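The unsupervised classification step can be illustrated with a generic clustering sketch. The code below is plain k-means on per-pixel band values, used here as a simplified stand-in for the idea of grouping pixels by spectral similarity; it is not the ArcGIS ISO Cluster tool or the paper's workflow. The function names, the deterministic initialisation and the sample pixel values are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two band vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(pixels, k, iterations=20):
    """Cluster band vectors into k groups; returns (labels, centres).

    Assumption: the first k pixels seed the cluster centres, which keeps
    the sketch deterministic but is not how production tools initialise.
    """
    centres = [list(p) for p in pixels[:k]]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centres[c]))
            for p in pixels
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centres[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels, centres


# Hypothetical two-band pixel values (illustrative only): dark vs. bright surfaces.
pixels = [(0.05, 0.04), (0.40, 0.10), (0.06, 0.05), (0.42, 0.12)]
labels, centres = kmeans(pixels, k=2)
print(labels)
```

Each resulting cluster would still need to be interpreted and named by an analyst (for example as water or bare soil), which is the manual part of any unsupervised classification.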


Author(s):  
Ali Guendouz ◽  
Hocine Bendada ◽  
Ramadan Benniou

Background: Chlorophyll is the most important pigment in plants; it absorbs light and transfers the energy to drive the photochemical reactions of photosynthesis. Numerical image processing techniques have been widely used in the analysis of leaf characteristics. Methods: Methods based on RGB (Red, Green and Blue) image analysis may emerge as a new, low-cost way of estimating chlorophyll content. In this work, we use eight RGB vegetation indices as alternatives for chlorophyll content estimation. Result: Student's t-test showed that all the RGB indices tested are suitable for estimating the chlorophyll content in barley genotypes. In addition, correlation analysis combined with root mean squared error (RMSE) values shows that the most suitable RGB indices are those with high correlation coefficients and the lowest RMSE values. Data collected from the leaves of barley genotypes indicated that digital image processing can be a useful, rapid and non-destructive method for assessing chlorophyll content. Among the RGB indices tested in this study, 100-(2R-B) and RGRI (R/G) are the most promising for estimating the chlorophyll content in barley genotypes.
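The two indices singled out above, and the RMSE used to compare indices, can be written out as a short sketch. The expressions follow the abstract literally (RGRI = R/G and 100 - (2R - B)); how the channel values are scaled before use is not stated there, so the inputs below are hypothetical, as are the function names.

```python
from math import sqrt


def rgri(r, g):
    """Red-Green Ratio Index, R / G."""
    return r / g


def index_100_2r_b(r, b):
    """The 100 - (2R - B) index, taken literally from the abstract."""
    return 100 - (2 * r - b)


def rmse(predicted, observed):
    """Root mean squared error between predictions and observations."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical mean leaf channel values (illustrative only).
print(rgri(120, 160))
print(index_100_2r_b(60, 40))
print(rmse([1, 2, 3], [1, 2, 5]))
```

An index would then be judged by fitting it against measured chlorophyll content and preferring the combination of high correlation and low RMSE, as the abstract describes.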

