A practical study of CITES wood species identification by untargeted DART/QTOF, GC/QTOF and LC/QTOF together with machine learning processes and statistical analysis

2021 ◽  
pp. 100089
Author(s):  
Pamela Brunswick ◽  
Daniel Cuthbertson ◽  
Jeffrey Yan ◽  
Candice C. Chua ◽  
Isabelle Duchesne ◽  
...  
2022 ◽  
Author(s):  
Núbia Rosa Da Silva ◽  
Victor Deklerck ◽  
Jan Baetens ◽  
Jan Van den Bulcke ◽  
Maaike De Ridder ◽  
...  

Abstract Background: The identification of tropical African wood species based on microscopic imagery is a challenging problem due to the heterogeneous composition of wood combined with the vast number of candidate species. Image classification methods that rely on machine learning can facilitate this identification, provided that sufficient training material is available. Although the three main anatomical sections all contain information that is relevant for species identification, current methods rely only on the transversal section. Additionally, commonly used procedures for evaluating the performance of these methods neglect the fact that multiple images often originate from the same tree, leading to an overly optimistic estimate of the performance. Results: We introduce a new image dataset containing microscopic images of the three main anatomical sections of 77 Congolese wood species. A dedicated multiview image classification method is developed and obtains an accuracy (computed using the naive but common approach) of 95%, outperforming the singleview methods by a large margin. An in-depth analysis shows that naive accuracy estimates can lead to a dramatic overestimation of the accuracy, of up to 60%. Conclusions: Additional images from the non-transversal sections can boost the performance of machine-learning-based wood species identification methods. Additionally, care should be taken when evaluating the performance of machine-learning-based wood species identification methods to avoid an overestimation of the performance.
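The evaluation pitfall described in this abstract (images from the same tree ending up in both the training and the test set) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `tree_ids` labels, and the 8 trees with 5 images each are hypothetical and chosen only for illustration. The sketch contrasts a naive per-image split with a split that keeps all images of one tree on the same side.

```python
import random


def naive_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle individual images and cut; images of one tree may land on both sides."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def group_split(group_ids, test_fraction=0.3, seed=0):
    """Shuffle whole groups (trees) and cut; every tree is entirely train or entirely test."""
    groups = sorted(set(group_ids))
    random.Random(seed).shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train_idx = [i for i, g in enumerate(group_ids) if g not in test_groups]
    test_idx = [i for i, g in enumerate(group_ids) if g in test_groups]
    return train_idx, test_idx


def shared_groups(group_ids, train_idx, test_idx):
    """Return the groups (trees) that contribute images to both train and test."""
    train_groups = {group_ids[i] for i in train_idx}
    test_groups = {group_ids[i] for i in test_idx}
    return train_groups & test_groups


# Hypothetical data: 8 trees with 5 images each (40 images in total).
tree_ids = [f"tree_{t}" for t in range(8) for _ in range(5)]

naive_train, naive_test = naive_split(len(tree_ids))
grouped_train, grouped_test = group_split(tree_ids)

leaky = shared_groups(tree_ids, naive_train, naive_test)
clean = shared_groups(tree_ids, grouped_train, grouped_test)
print("trees on both sides (naive split):  ", len(leaky))
print("trees on both sides (grouped split):", len(clean))
```

With the naive split, the 12 test images cannot consist of whole trees only (12 is not a multiple of 5), so at least one tree is always present on both sides. The grouped split never shares a tree, which is the property an unbiased accuracy estimate needs here.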


2021 ◽  
Vol 11 (7) ◽  
pp. 3184
Author(s):  
Ismael Garrido-Muñoz  ◽  
Arturo Montejo-Ráez  ◽  
Fernando Martínez-Santiago  ◽  
L. Alfonso Ureña-López 

Deep neural networks are the dominant approach in many areas of machine learning, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms in self-supervised learning processes (also known as “pre-training”), versatile, high-performing models are released continuously for every new network design. These networks, somehow, learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies and biases contained in that collection. As pre-trained models have proven very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.


Geosciences ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 243
Author(s):  
Hernandez-Martinez Francisco G. ◽  
Al-Tabbaa Abir ◽  
Medina-Cetina Zenon ◽  
Yousefpour Negin

This paper presents the experimental database and corresponding statistical analysis (Part I), which serves as a basis to perform the corresponding parametric analysis and machine learning modelling (Part II) of a comprehensive study on organic soil strength and stiffness, stabilized via the wet soil mixing method. The experimental database includes unconfined compression tests performed under laboratory-controlled conditions to investigate the impact of soil type, the soil’s organic content, the soil’s initial natural water content, binder type, binder quantity, grout to soil ratio, water to binder ratio, curing time, temperature, curing relative humidity and carbon dioxide content on the stabilized organic specimens’ stiffness and strength. A descriptive statistical analysis complements the description of the experimental database, along with a qualitative study on the stabilization hydration process via scanning electron microscopy images. Results confirmed findings on the use of Portland cement alone and a mix of Portland cement with ground granulated blast furnace slag as suitable binders for soil stabilization. Findings on mixes including lime and magnesium oxide cements demonstrated minimal stabilization. Specimen size affected stiffness, but not strength, for mixes of peat and Portland cement. The experimental database, along with all produced data analyses, is available at the Texas Data Repository, as indicated in the Data Availability Statement below, to allow for data reproducibility and to promote the use of competing artificial intelligence and machine learning modelling techniques such as the ones presented in Part II of this paper.


ChemCatChem ◽  
2019 ◽  
Vol 11 (18) ◽  
pp. 4443-4443
Author(s):  
Keisuke Suzuki ◽  
Takashi Toyao ◽  
Zen Maeno ◽  
Satoru Takakusagi ◽  
Ken‐ichi Shimizu ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Aaron N. Shugar ◽  
B. Lee Drake ◽  
Greg Kelley

Abstract An innovative approach for the rapid identification of wood species is presented. By combining X-ray fluorescence spectrometry with convolutional neural network machine learning, 48 different wood specimens were clearly differentiated and identified with a 99% accuracy. Wood species identification is imperative to assess illegally logged and transported lumber. Alternative options for identification can be time consuming and require some level of sampling. This non-invasive technique offers a viable, cost-effective alternative to rapidly and accurately identify timber in efforts to support environmental protection laws and regulations.


2020 ◽  
Vol 134 (1) ◽  
pp. 15-25
Author(s):  
Sabri Soussi ◽  
Gary S. Collins ◽  
Peter Jüni ◽  
Alexandre Mebazaa ◽  
Etienne Gayat ◽  
...  

SUMMARY Interest in developing and using novel biomarkers in critical care and perioperative medicine is increasing. Biomarker studies are often presented with flaws in the statistical analysis that preclude them from providing a scientifically valid and clinically relevant message for clinicians. To improve scientific rigor, the proper application and reporting of traditional and emerging statistical methods (e.g., machine learning) in biomarker studies is required. This Readers’ Toolbox article aims to be a starting point for nonexpert readers and investigators to understand traditional and emerging research methods to assess biomarkers in critical care and perioperative medicine.


Data analytics has grown in a machine learning context. Whatever the purpose for which data is used or exploited, whether customer segmentation or marketing targeting, it must first be processed and represented as feature vectors. Many algorithms, such as clustering, regression, and classification, need to be presented and explained in order to facilitate processing and statistical analysis. The previous chapters showed the importance of big data analysis (the Why?); as with every major innovation, the biggest confusion lies in its exact scope (What?) and its implementation (How?). In this chapter, we look at the different analytics algorithms and techniques that can be used to exploit large amounts of data.


Animals ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 1687
Author(s):  
Giovanni P. Burrai ◽  
Andrea Gabrieli ◽  
Valentina Moccia ◽  
Valentina Zappulli ◽  
Ilaria Porcellato ◽  
...  

Canine mammary tumors (CMTs) represent a serious issue in worldwide veterinary practice and several risk factors are variably implicated in the biology of CMTs. The present study examines the relationship between risk factors and histological diagnosis of a large CMT dataset from three academic institutions by classical statistical analysis and supervised machine learning methods. Epidemiological, clinical, and histopathological data of 1866 CMTs were included. Dogs with malignant tumors were significantly older than dogs with benign tumors (9.6 versus 8.7 years, p < 0.001). Malignant tumors were significantly larger than benign counterparts (2.69 versus 1.7 cm, p < 0.001). Interestingly, 18% of malignant tumors were smaller than 1 cm in diameter, providing compelling evidence that the size of the tumor should be reconsidered during the assessment of the TNM-WHO clinical staging. The application of the logistic regression and the machine learning model identified the age and the tumor’s size as the best predictors with an overall diagnostic accuracy of 0.63, suggesting that these risk factors are sufficient but not exhaustive indicators of the malignancy of CMTs. This multicenter study increases the general knowledge of the main epidemiological-clinical risk factors involved in the onset of CMTs and paves the way for further investigations of these factors in association with CMTs and in the application of machine learning technology.

