Apple Disease Recognition Based on Small-scale Data Sets

HighlightsThe proposed data enhancement method can be used for small-scale data sets with rich sample image features.The accuracy of the new model reaches 98.5%, which is better than the traditional CNN method.Abstract: GoogLeNet offers far better performance in identifying apple disease compared to traditional methods. However, the complexity of GoogLeNet is relatively high. For small volumes of data, GoogLeNet does not achieve the same performance as it does with large-scale data. We propose a new apple disease identification model using GoogLeNet’s inception module. The model adopts a variety of methods to optimize its generalization ability. First, geometric transformation and image modification of data enhancement methods (including rotation, scaling, noise interference, random elimination, color space enhancement) and random probability and appropriate combination of strategies are used to amplify the data set. Second, we employ a deep convolution generative adversarial network (DCGAN) to enhance the richness of generated images by increasing the diversity of the noise distribution of the generator. Finally, we optimize the GoogLeNet model structure to reduce model complexity and model parameters, making it more suitable for identifying apple tree diseases. The experimental results show that our approach quickly detects and classifies apple diseases including rust, spotted leaf disease, and anthrax. It outperforms the original GoogLeNet in recognition accuracy and model size, with identification accuracy reaching 98.5%, making it a feasible method for apple disease classification. Keywords: Apple disease identification, Data enhancement, DCGAN, GoogLeNet.

Download Full-text

Bayesian Inference of Species Trees using Diffusion Models

Systematic Biology ◽

10.1093/sysbio/syaa051 ◽

2020 ◽

Vol 70 (1) ◽

pp. 145-161 ◽

Cited By ~ 1

Author(s):

Marnus Stoltz ◽

Boris Baeumer ◽

Remco Bouckaert ◽

Colin Fox ◽

Gordon Hiscott ◽

...

Keyword(s):

Bayesian Inference ◽

Numerical Algorithms ◽

Diffusion Models ◽

Model Parameters ◽

Data Sets ◽

Species Trees ◽

Computationally Efficient ◽

Data Set ◽

Snp Data ◽

Binary Markers

Abstract We describe a new and computationally efficient Bayesian methodology for inferring species trees and demographics from unlinked binary markers. Likelihood calculations are carried out using diffusion models of allele frequency dynamics combined with novel numerical algorithms. The diffusion approach allows for analysis of data sets containing hundreds or thousands of individuals. The method, which we call Snapper, has been implemented as part of the BEAST2 package. We conducted simulation experiments to assess numerical error, computational requirements, and accuracy recovering known model parameters. A reanalysis of soybean SNP data demonstrates that the models implemented in Snapp and Snapper can be difficult to distinguish in practice, a characteristic which we tested with further simulations. We demonstrate the scale of analysis possible using a SNP data set sampled from 399 fresh water turtles in 41 populations. [Bayesian inference; diffusion models; multi-species coalescent; SNP data; species trees; spectral methods.]

Download Full-text

Simple Convolutional-Based Models: Are They Learning the Task or the Data?

Neural Computation ◽

10.1162/neco_a_01446 ◽

2021 ◽

pp. 1-17

Author(s):

Luis Sa-Couto ◽

Andreas Wichert

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Training Data ◽

Model Complexity ◽

Data Sets ◽

Simple Task ◽

Data Set ◽

Knowing That ◽

Handwritten Digit ◽

End To End

Abstract Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and removes the focus from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability, a hidden overfitting. More concretely, a supervised, backpropagation based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both models trained and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants like the What-Where model go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. Knowing that, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets of it: MNIST and ETL-1. To try to make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with expectation.

Download Full-text

Four planes of social being: more connections

10.1332/policypress/9781447354550.003.0006 ◽

2021 ◽

pp. 127-144

Author(s):

Priscilla Alderson

Keyword(s):

Social Relations ◽

Interdisciplinary Research ◽

Policy Making ◽

Interpersonal Relations ◽

Oil Spills ◽

Small Scale ◽

Data Sets ◽

Mortality And Morbidity ◽

Critical Realist ◽

Scale Data

Adverse mortality and morbidity effects of the huge oil spills in Bayelsa State, Niger Delta, illustrate the value of critical realism’s four planes of social being for organising complex findings and for combining large- and small-scale data sets. These planes cover every aspect of being human: bodies in relation to nature; interpersonal relations; larger social relations and structures; and inner human being in the mental-social-embodied personality. Chapter 5 also considers critical realist approaches to managing data-analysis: laminated systems analysis; interdisciplinary research and policy-making; critical realist theories about interdisciplinarity; overcoming barriers to interdisciplinarity, and interdisciplinary commitments. The detailed examples are about improving the physical health of people with a diagnosis of serious mental illness, and feminist-informed counselling after sexual assault.

Download Full-text

Using target enrichment sequencing to study the higher-level phylogeny of the largest lichen-forming fungi family: Parmeliaceae (Ascomycota)

IMA Fungus ◽

10.1186/s43008-020-00051-x ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Felix Grewe ◽

Claudio Ametrano ◽

Todd J. Widhelm ◽

Steven Leavitt ◽

Isabel Distefano ◽

...

Keyword(s):

Data Sets ◽

Target Enrichment ◽

Data Set ◽

Reduced Genome ◽

Genome Data ◽

Worldwide Distribution ◽

Phylogenetic Studies ◽

Genome Scale ◽

Scale Data ◽

Promising Avenue

AbstractParmeliaceae is the largest family of lichen-forming fungi with a worldwide distribution. We used a target enrichment data set and a qualitative selection method for 250 out of 350 genes to infer the phylogeny of the major clades in this family including 81 taxa, with both subfamilies and all seven major clades previously recognized in the subfamily Parmelioideae. The reduced genome-scale data set was analyzed using concatenated-based Bayesian inference and two different Maximum Likelihood analyses, and a coalescent-based species tree method. The resulting topology was strongly supported with the majority of nodes being fully supported in all three concatenated-based analyses. The two subfamilies and each of the seven major clades in Parmelioideae were strongly supported as monophyletic. In addition, most backbone relationships in the topology were recovered with high nodal support. The genus Parmotrema was found to be polyphyletic and consequently, it is suggested to accept the genus Crespoa to accommodate the species previously placed in Parmotrema subgen. Crespoa. This study demonstrates the power of reduced genome-scale data sets to resolve phylogenetic relationships with high support. Due to lower costs, target enrichment methods provide a promising avenue for phylogenetic studies including larger taxonomic/specimen sampling than whole genome data would allow.

Download Full-text

Seasonal and intra-diurnal variability of small-scale gravity waves in OH airglow at two Alpine stations

Atmospheric Measurement Techniques ◽

10.5194/amt-12-457-2019 ◽

2019 ◽

Vol 12 (1) ◽

pp. 457-469 ◽

Cited By ~ 2

Author(s):

Patrick Hannawald ◽

Carsten Schmidt ◽

René Sedlak ◽

Sabine Wüst ◽

Michael Bittner

Keyword(s):

Gravity Waves ◽

High Frequency ◽

Phase Speed ◽

Propagation Direction ◽

Field Of View ◽

Small Scale ◽

Data Sets ◽

Data Set ◽

Diurnal Variability ◽

Fully Automatic

Abstract. Between December 2013 and August 2017 the instrument FAIM (Fast Airglow IMager) observed the OH airglow emission at two Alpine stations. A year of measurements was performed at Oberpfaffenhofen, Germany (48.09∘ N, 11.28∘ E) and 2 years at Sonnblick, Austria (47.05∘ N, 12.96∘ E). Both stations are part of the network for the detection of mesospheric change (NDMC). The temporal resolution is two frames per second and the field-of-view is 55 km × 60 km and 75 km × 90 km at the OH layer altitude of 87 km with a spatial resolution of 200 and 280 m per pixel, respectively. This resulted in two dense data sets allowing precise derivation of horizontal gravity wave parameters. The analysis is based on a two-dimensional fast Fourier transform with fully automatic peak extraction. By combining the information of consecutive images, time-dependent parameters such as the horizontal phase speed are extracted. The instrument is mainly sensitive to high-frequency small- and medium-scale gravity waves. A clear seasonal dependency concerning the meridional propagation direction is found for these waves in summer in the direction to the summer pole. The zonal direction of propagation is eastwards in summer and westwards in winter. Investigations of the data set revealed an intra-diurnal variability, which may be related to tides. The observed horizontal phase speed and the number of wave events per observation hour are higher in summer than in winter.

Download Full-text

Developing a ‘Semi-Systematic’ Approach to Using Large-Scale Data-Sets for Small-Scale Interventions: The ‘Baby Matterz’ Initiative as a Case Study

The Urban Review ◽

10.1007/s11256-009-0144-z ◽

2010 ◽

Vol 43 (2) ◽

pp. 235-254

Author(s):

Mark O’Brien

Keyword(s):

Large Scale ◽

Systematic Approach ◽

Small Scale ◽

Data Sets ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Download Full-text

Four-dimensional mesospheric and lower thermospheric wind ﬁelds using Gaussian process regression on multistatic specular meteor radar observations

10.5194/amt-2021-40 ◽

2021 ◽

Author(s):

Ryan Volz ◽

Jorge L. Chau ◽

Philip J. Erickson ◽

Juha P. Vierinen ◽

J. Miguel Urco ◽

...

Keyword(s):

Gaussian Process ◽

Wind Velocity ◽

Wind Field ◽

Gaussian Process Regression ◽

Small Scale ◽

Model Parameters ◽

Meteor Radar ◽

Wind Fields ◽

Data Set ◽

Uncertainty Estimates

Abstract. Mesoscale dynamics in the mesosphere and lower thermosphere (MLT) region have been difficult to study from either ground- or satellite-based observations. For understanding of atmospheric coupling processes, important spatial scales at these altitudes range between tens to hundreds of kilometers in the horizontal plane. To date, this scale size is challenging observationally, and so structures are usually parameterized in global circulation models. The advent of multistatic specular meteor radar networks allows exploration of MLT mesocale dynamics on these scales using an increased number of detections and a diversity of viewing angles inherent to multistatic networks. In this work, we introduce a four dimensional wind field inversion method that makes use of Gaussian process regression (GPR), a non-parametric and Bayesian approach. The method takes measured projected wind velocities and prior distributions of the wind velocity as a function of space and time, specified by the user or estimated from the data, and produces posterior distributions for the wind velocity. Computation of the predictive posterior distribution is performed on sampled points of interest and is not necessarily regularly sampled. The main benefits of the GPR method include this non-gridded sampling, the built-in statistical uncertainty estimates, and the ability to horizontally-resolve winds on relatively small scales. The performance of the GPR implementation has been evaluated on Monte Carlo simulations with known distributions using the same spatial and temporal sampling as one day of real meteor measurements. Based on the simulation results we find that the GPR implementation is robust, providing wind fields that are statistically unbiased and with statistical variances that depend on the geometry and are proportional to the prior velocity variances. A conservative and fast approach can be straightforwardly implemented by employing overestimated prior variances and distances, while a more robust but computationally intensive approach can be implemented by employing training and fitting of model parameters. The latter GPR approach has been applied to a 24-hour data set and shown to compare well to previously used homogeneous and gradient methods. Small scale features have reasonably low statistical uncertainties, implying geophysical wind field horizontal structures as low as 20–50 km. We suggest that this GPR approach forms a suitable method for MLT regional and weather studies.

Download Full-text

Early results from the Whisper instrument on Cluster: an overview

Annales Geophysicae ◽

10.5194/angeo-19-1241-2001 ◽

2001 ◽

Vol 19 (10/12) ◽

pp. 1241-1258 ◽

Cited By ~ 116

Author(s):

P. M. E. Décréau ◽

P. Fergeau ◽

V. Krasnoselskikh ◽

E. Le Guirriec ◽

M. Lévêque ◽

...

Keyword(s):

Electron Density ◽

Small Scale ◽

Data Sets ◽

Total Density ◽

Total Electron Density ◽

Data Set ◽

Total Electron ◽

Density Data ◽

Early Results ◽

Natural Emissions

Abstract. The Whisper instrument yields two data sets: (i) the electron density determined via the relaxation sounder, and (ii) the spectrum of natural plasma emissions in the frequency band 2–80 kHz. Both data sets allow for the three-dimensional exploration of the magnetosphere by the Cluster mission. The total electron density can be derived unambiguously by the sounder in most magnetospheric regions, provided it is in the range of 0.25 to 80 cm-3 . The natural emissions already observed by earlier spacecraft are fairly well measured by the Whisper instrument, thanks to the digital technology which largely overcomes the limited telemetry allocation. The natural emissions are usually related to the plasma frequency, as identified by the sounder, and the combination of an active sounding operation and a passive survey operation provides a time resolution for the total density determination of 2.2 s in normal telemetry mode and 0.3 s in burst mode telemetry, respectively. Recorded on board the four spacecraft, the Whisper density data set forms a reference for other techniques measuring the electron population. We give examples of Whisper density data used to derive the vector gradient, and estimate the drift velocity of density structures. Wave observations are also of crucial interest for studying small-scale structures, as demonstrated in an example in the fore-shock region. Early results from the Whisper instrument are very encouraging, and demonstrate that the four-point Cluster measurements indeed bring a unique and completely novel view of the regions explored.Key words. Space plasma physics (instruments and techniques; discontinuities, general or miscellaneous)

Download Full-text

The Adjusted Log-logistic Generalized Exponential Distribution with Application to Lifetime Data

International Journal of Statistics and Probability ◽

10.5539/ijsp.v6n4p1 ◽

2017 ◽

Vol 5 (4) ◽

pp. 1

Author(s):

I. E. Okorie ◽

A. C. Akpanta ◽

J. Ohakwe ◽

D. C. Chikezie ◽

C. U. Onyemachi ◽

...

Keyword(s):

Exponential Distribution ◽

Likelihood Estimation ◽

Model Parameters ◽

Data Sets ◽

Lifetime Data ◽

Generalized Exponential Distribution ◽

Data Set ◽

Method Of Maximum Likelihood ◽

The One ◽

Special Case

This paper introduces a new generator of probability distribution-the adjusted log-logistic generalized (ALLoG) distribution and a new extension of the standard one parameter exponential distribution called the adjusted log-logistic generalized exponential (ALLoGExp) distribution. The ALLoGExp distribution is a special case of the ALLoG distribution and we have provided some of its statistical and reliability properties. Notably, the failure rate could be monotonically decreasing, increasing or upside-down bathtub shaped depending on the value of the parameters $\delta$ and $\theta$. The method of maximum likelihood estimation was proposed to estimate the model parameters. The importance and flexibility of he ALLoGExp distribution was demonstrated with a real and uncensored lifetime data set and its fit was compared with five other exponential related distributions. The results obtained from the model fittings shows that the ALLoGExp distribution provides a reasonably better fit than the one based on the other fitted distributions. The ALLoGExp distribution is therefore ecommended for effective modelling of lifetime data sets.

Download Full-text

Bianchi type I Universe: An extension of ΛCDM model

International Journal of Geometric Methods in Modern Physics ◽

10.1142/s0219887821500699 ◽

2021 ◽

pp. 2150069

Author(s):

Rajendra Prasad ◽

Lalit Kumar Gupta ◽

A. Beesham ◽

G. K. Goswami ◽

Anil Kumar Yadav

Keyword(s):

Bianchi Type ◽

Hubble Constant ◽

Model Parameters ◽

Constant Density ◽

Data Sets ◽

Type I ◽

Data Set ◽

Bianchi Type I ◽

Age Of The Universe ◽

The Universe

In this paper, we investigate a Bianchi type I exact Universe by taking into account the cosmological constant as the source of energy at the present epoch. We have performed a [Formula: see text] test to obtain the best fit values of the model parameters of the Universe in the derived model. We have used two types of data sets, viz., (i) 31 values of the Hubble parameter and (ii) the 1048 Pantheon data set of various supernovae distance moduli and apparent magnitudes. From both the data sets, we have estimated the current values of the Hubble constant, density parameters [Formula: see text] and [Formula: see text]. The dynamics of the deceleration parameter shows that the Universe was in a decelerating phase for redshift [Formula: see text]. At a transition redshift [Formula: see text], the present Universe entered an accelerating phase of expansion. The current age of the Universe is obtained as [Formula: see text] Gyrs. This is in good agreement with the value of [Formula: see text] calculated from the Plank collaboration results and WMAP observations.

Download Full-text