Worldwide co-occurrence analysis of 17 species of the genus Brachypodium using data mining

PeerJ ◽  
2019 ◽  
Vol 6 ◽  
pp. e6193 ◽  
Author(s):  
Simon Orozco-Arias ◽  
Ana María Núñez-Rincón ◽  
Reinel Tabares-Soto ◽  
Diana López-Álvarez

The co-occurrence of plant species is a fundamental aspect of plant ecology that contributes to understanding ecological processes, including the establishment of ecological communities and its applications in biological conservation. Apriori algorithms can be used to measure the co-occurrence of species in a spatial distribution given by coordinates. We used occurrence records for 17 species of the genus Brachypodium, downloaded from the Global Biodiversity Information Facility data repository or obtained from bibliographical sources, to test an algorithm based on the spatial point process technique used by Silva et al. (2016), generating association rules for co-occurrence analysis. Brachypodium spp. have emerged as an effective model for monocot species; they grow in different environments, latitudes, and elevations, and so represent a wide range of biotic and abiotic conditions that may be associated with adaptive natural genetic variation. We created seven datasets of two, three, four, six, seven, 15, and 17 species in order to test the algorithm at four different distances (1, 5, 10, and 20 km). Several measurements (support, confidence, lift, Chi-square, and p-value) were used to evaluate the quality of the results generated by the algorithm. No negative association rules were found in the datasets, while 95 positive co-occurrence rules were found for the datasets with six, seven, 15, and 17 species. Using 20 km in the dataset with 17 species, we found 16 positive co-occurrences involving five species, suggesting that these species coexist. These findings are corroborated by the results obtained in the dataset with 15 species, in which two broadly distributed species present in the previous dataset were removed and seven positive co-occurrences were obtained. We found that B. sylvaticum has co-occurrence relations with several species, such as B. pinnatum, B. rupestre, B. retusum, and B. phoenicoides, due to its wide distribution in Europe, Asia, and North Africa.
We demonstrate the utility of the implemented algorithm for the co-occurrence analysis of 17 species of the genus Brachypodium; its results agree with the distributions observed in nature. Data mining has been applied in the biological sciences, where a large amount of complex and noisy data has been generated in recent years at an unprecedented scale. In particular, ecological data analysis represents an opportunity to explore and comprehend biological systems with data mining and bioinformatics tools.
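The rule-quality measures named in this abstract (support, confidence, lift) can be illustrated with a minimal sketch. It assumes that each occurrence record defines one "transaction" containing every species recorded within a given radius of it. This is an illustrative simplification, not the authors' implementation or the procedure of Silva et al. (2016); the function names `haversine_km`, `neighbourhoods`, and `rule_metrics` and the toy coordinates are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def neighbourhoods(records, radius_km):
    """Build one transaction per record: the set of species within radius_km of it.

    records is a list of (species, latitude, longitude) tuples (assumed layout).
    """
    transactions = []
    for _, lat, lon in records:
        nearby = {s for s, la, lo in records
                  if haversine_km(lat, lon, la, lo) <= radius_km}
        transactions.append(nearby)
    return transactions

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    a = sum(1 for t in transactions if antecedent in t)
    b = sum(1 for t in transactions if consequent in t)
    ab = sum(1 for t in transactions if antecedent in t and consequent in t)
    support = ab / n if n else 0.0
    confidence = ab / a if a else 0.0
    lift = confidence / (b / n) if b else 0.0
    return support, confidence, lift

# Toy records (made-up coordinates, not data from the study)
records = [("A", 0.0, 0.0), ("B", 0.0, 0.01), ("A", 10.0, 10.0), ("C", 10.0, 10.01)]
transactions = neighbourhoods(records, radius_km=5)
print(rule_metrics(transactions, "B", "A"))
```

A lift above 1 would indicate that two species appear together more often than expected if they were independent, and a lift below 1 that they appear together less often. The Chi-square test and p-value mentioned in the abstract are not covered by this sketch.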

2018 ◽  
Author(s):  
Molly F Jenkins ◽  
Ethan P White ◽  
Allen H Hurlbert

Ecological communities are composed of a combination of core species that maintain local viable populations and transient species that occur infrequently due to dispersal from surrounding regions. Preliminary work indicates that while core and transient species are both commonly observed in community surveys of a wide range of taxonomic groups, their relative prevalence varies substantially from one community to another depending upon the spatial scale at which the community was characterized and its environmental context. We used a geographically extensive dataset of 968 bird community time series to quantitatively describe how the proportion of core species in a community varies with spatial scale and environmental heterogeneity. We found that the proportion of core species in an assemblage increased with spatial scale in a positive decelerating fashion, with a concomitant decrease in the proportion of transient species. Variation in the shape of this scaling relationship between sites was related to regional environmental heterogeneity, with lower proportions of core species at a given scale associated with high environmental heterogeneity. This influence of scale and environmental heterogeneity on the proportion of core species may help resolve discrepancies between studies of biotic interactions, resource availability, and mass effects conducted at different scales, because the importance of these and other ecological processes is expected to differ substantially between core and transient species.
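One way to make "proportion of core species" concrete is to classify species by temporal occupancy, the fraction of survey years in which each species was recorded. The sketch below assumes a core threshold of two-thirds of years; that threshold, the function name `proportion_core`, and the toy data are assumptions for illustration and are not stated in the abstract.

```python
def proportion_core(presence_by_year, core_threshold=2 / 3):
    """Fraction of observed species whose temporal occupancy exceeds core_threshold.

    presence_by_year is a list of sets, one per survey year, holding the
    species recorded at a site in that year (assumed layout).
    """
    years = len(presence_by_year)
    species = set().union(*presence_by_year)
    if years == 0 or not species:
        return 0.0
    # Temporal occupancy: share of years in which each species was present
    occupancy = {sp: sum(sp in year for year in presence_by_year) / years
                 for sp in species}
    core = [sp for sp, occ in occupancy.items() if occ > core_threshold]
    return len(core) / len(species)

# Toy time series: "a" is present every year, "b" and "c" in one year each
site = [{"a", "b"}, {"a"}, {"a", "c"}]
print(proportion_core(site))
```

Pooling several neighbouring sites into one set per year before calling the function gives the same measure at a coarser spatial scale, which is the kind of comparison the abstract describes.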


2020 ◽  
Author(s):  
Edlin J. Guerra-Castro ◽  
Juan Carlos Cajas ◽  
Nuno Simões ◽  
Juan J Cruz-Motta ◽  
Maite Mascaró

SSP (simulation-based sampling protocol) is an R package that uses simulation of ecological data and dissimilarity-based multivariate standard error (MultSE) as an estimator of precision to evaluate the adequacy of different sampling efforts for studies that will test hypotheses using permutational multivariate analysis of variance. The procedure consists of simulating several extensive data matrices that mimic some of the relevant ecological features of the community of interest, using a pilot data set. For each simulated data matrix, several sampling efforts are repeatedly executed and MultSE is calculated. The mean value and the 0.025 and 0.975 quantiles of MultSE for each sampling effort across all simulated data are then estimated and standardized relative to the lowest sampling effort. The optimal sampling effort is identified as the one beyond which a further increase in sampling effort does not improve the precision by more than a threshold value (e.g. 2.5%). The performance of SSP was validated using real data, and in all examples the simulated data mimicked the real data well, making it possible to evaluate the relationship between MultSE and n beyond the sample size of the pilot studies. SSP can be used to estimate sample size in a wide range of situations, ranging from simple (e.g. single site) to more complex (e.g. several sites for different habitats) experimental designs. The latter constitutes an important advantage, since it offers new possibilities for complex sampling designs, as has been advised for multi-scale studies in ecology.
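The precision estimator at the centre of this procedure can be sketched briefly. The code below is in Python rather than R and does not use or reproduce the SSP package. It assumes Bray-Curtis dissimilarity and the usual definition of MultSE as sqrt(V / n), where V is the sum of squared pairwise dissimilarities divided by n(n - 1). It also resamples a single pilot matrix directly instead of simulating new data as SSP does; `bray_curtis`, `mult_se`, and `multse_by_effort` are hypothetical names.

```python
import math
import random
import statistics

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        return 0.0  # two empty samples are treated as identical (assumption)
    return sum(abs(a - b) for a, b in zip(x, y)) / total

def mult_se(samples):
    """Dissimilarity-based multivariate standard error of a set of samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("MultSE needs at least two samples")
    # Sum of squared pairwise dissimilarities, divided by n
    ss = sum(bray_curtis(samples[i], samples[j]) ** 2
             for i in range(n) for j in range(i + 1, n)) / n
    v = ss / (n - 1)          # multivariate pseudo-variance
    return math.sqrt(v / n)   # multivariate standard error

def multse_by_effort(samples, efforts, repeats=100, seed=0):
    """Mean MultSE for each sampling effort, from repeated random subsamples."""
    rng = random.Random(seed)
    return {n: statistics.mean(mult_se(rng.sample(samples, n))
                               for _ in range(repeats))
            for n in efforts}

# Toy pilot matrix: rows are samples, columns are species abundances
pilot = [[5, 0, 2], [3, 1, 0], [0, 4, 4], [6, 0, 1], [1, 3, 5], [4, 1, 1]]
print(multse_by_effort(pilot, efforts=[2, 3, 4, 5]))
```

Plotting the returned means against sampling effort shows where the gain in precision levels off, which is the quantity the abstract's threshold rule is applied to.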


Author(s):  
Gaurav Vaidya ◽  
Hilmar Lapp ◽  
Nico Cellinese

Most biological data and knowledge are directly or indirectly linked to biological taxa via taxon names. Using taxon names is one of the most fundamental and ubiquitous ways in which a wide range of biological data are integrated, aggregated, and indexed, from genomic and microbial diversity to macro-ecological data. To this day, the names used, as well as most methods and resources developed for this purpose, are drawn from Linnaean nomenclature. This leads to numerous problems when applied to data-intensive science that depends on computation to take full advantage of the vast, and rapidly increasing, amount of available digital biodiversity data. The theoretical and practical complexities of reconciling taxon names and concepts have plagued the systematics community for decades. Now more than ever, Linnaean names based in Linnaean taxonomy, by far the most prevalent means of linking data to taxa, are unfit for the age of computation-driven data science because of fundamental theoretical and practical shortfalls that cannot be cured. We propose an alternate approach based on the use of phylogenetic clade definitions, which is a well-developed method for unambiguously defining the semantics of a clade concept in terms of shared evolutionary ancestry (de Queiroz and Gauthier 1990, de Queiroz and Gauthier 1994). These semantics allow locating the defined clade on any phylogeny, or showing that a clade is inconsistent with the topology of a given phylogeny and hence cannot be present on it at all. We have built a workflow for creating phylogenetic clade definitions in terms of shared ancestor and excluded lineage properties, and for locating these definitions on any input phylogeny. Once these definitions have been located, we can use the list of species found within that clade on that phylogeny to aggregate occurrence data from the Global Biodiversity Information Facility (GBIF).
Thus, our approach uses clade definitions with machine-understandable semantics to programmatically and reproducibly aggregate biodiversity data by higher-level taxonomic concepts. This approach has several advantages over the use of taxonomic hierarchies: Unlike taxa, the semantics of clade definitions can be expressed in unambiguous, machine-understandable and reproducible terms and language. The resolution of a given clade definition will depend on the phylogeny being used. Thus, if the phylogeny of groups of interest is updated in light of new evolutionary knowledge, the clade definition can be applied to the new phylogeny to obtain an updated list of clade members consistent with the updated evolutionary knowledge. Machine reproducibility of analyses is possible simply by archiving the machine-readable representations of the clade definition and the phylogeny being used. Clade definitions can be created by biologists as needed or can be reused from those published in peer-reviewed journals. In addition, nearly 300 peer-reviewed clade definitions were recently published as part of the Phylonym volume of the PhyloCode (de Queiroz et al. 2020) and are now available on the Regnum website.
As part of the Phyloreferencing Project, we are digitizing this collection as a machine-readable ontology, where each clade is represented as a class defined by logical conjunctions for class membership, corresponding to a set of necessary and sufficient conditions of shared or divergent evolutionary ancestry. We call these classes phyloreferences, and have created a fully automated workflow for digitizing the Regnum database content into an OWL ontology (W3C OWL Working Group 2012) that we call the Clade Ontology. This ontology includes reference phylogenies and additional metadata about the verbatim clade definitions. Once complete, the Clade Ontology will include all clade definitions from Regnum, both those included in Phylonym after passing peer review and those contributed by the community, whether or not under the PhyloCode nomenclature. As an openly available community resource, this will allow researchers to use these definitions to aggregate biodiversity data for comparative biology with grouping semantics that are transparent, machine-processable, and reproducible. In our presentation, we will demonstrate the use of phyloreferences to locate clades on the Open Tree of Life synthetic tree (Hinchliff et al. 2015), to retrieve lists of species in each clade, and to use them to find and aggregate occurrence records in GBIF. We will also describe the workflow we are currently using to build and test the Clade Ontology, and describe our plans for publishing this resource. Finally, we will discuss the advantages and disadvantages of this approach as compared to taxonomic checklists.
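The idea of locating a clade definition on a phylogeny can be sketched in a few lines. The example assumes a tree written as nested tuples with species names at the leaves, and a definition given as a set of included (internal) specifiers plus an optional set of excluded (external) specifiers. It is only an illustration of the logic: the project itself expresses phyloreferences as OWL classes, and the names `leaves` and `locate_clade` and the toy tree are hypothetical.

```python
def leaves(node):
    """Set of leaf names under a node (a leaf is a string, an internal node a tuple)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))

def locate_clade(tree, internal, external=()):
    """Members of the smallest clade containing all internal specifiers.

    Returns None when a specifier is missing from the tree, or when that
    smallest clade also contains an excluded (external) specifier, i.e. the
    definition is inconsistent with this topology.
    """
    internal, external = set(internal), set(external)
    if not internal <= leaves(tree):
        return None
    node = tree
    # Descend while a single child still contains every internal specifier
    while not isinstance(node, str):
        child = next((c for c in node if internal <= leaves(c)), None)
        if child is None:
            break
        node = child
    members = leaves(node)
    return None if members & external else members

# Toy phylogeny: ((A, B), C) is sister to (D, E)
tree = ((("A", "B"), "C"), ("D", "E"))
print(sorted(locate_clade(tree, {"A", "C"})))   # ['A', 'B', 'C']
print(locate_clade(tree, {"A", "C"}, {"B"}))    # None
```

The returned species list is what would then be used to query an occurrence database. Applying the same definition to a revised tree yields an updated list without changing the definition, which is the reproducibility property the abstract emphasises.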


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e6019 ◽  
Author(s):  
Molly F. Jenkins ◽  
Ethan P. White ◽  
Allen H. Hurlbert

Ecological communities are composed of a combination of core species that maintain local viable populations and transient species that occur infrequently due to dispersal from surrounding regions. Preliminary work indicates that while core and transient species are both commonly observed in community surveys of a wide range of taxonomic groups, their relative prevalence varies substantially from one community to another depending upon the spatial scale at which the community was characterized and its environmental context. We used a geographically extensive dataset of 968 bird community time series to quantitatively describe how the proportion of core species in a community varies with spatial scale and environmental heterogeneity. We found that the proportion of core species in an assemblage increased with spatial scale in a positive decelerating fashion, with a concomitant decrease in the proportion of transient species. Variation in the shape of this scaling relationship between sites was related to regional environmental heterogeneity, with lower proportions of core species at a given scale associated with high environmental heterogeneity. Understanding this influence of scale and environmental heterogeneity on the proportion of core species may help resolve discrepancies between studies of biotic interactions, resource availability, and mass effects conducted at different scales, because the importance of these and other ecological processes is expected to differ substantially between core and transient species.


Author(s):  
Ricardo Timarán Pereira ◽  
Lisbeth Rosero Legarda ◽  
Yehimy Cabrera Cabrera

This paper presents the first results of a research project that aimed to detect patterns in the eruptive events of the Galeras volcano using data mining techniques, based on data stored at the Observatorio Vulcanológico y Sismológico de Pasto (OVSP, Colombia) and following the CRISP-DM methodology. A data repository containing information on the eruptive events of the Galeras volcano recorded from 1989 to 2013 was built, cleaned, and transformed. Patterns associated with these events were then detected in this repository using the association task of data mining. The knowledge generated will be integrated with existing knowledge to help the OVSP and government disaster-prevention agencies make effective decisions about implementing prevention plans for a possible eruption of the Galeras volcano.
Keywords: patterns of eruptive events, Galeras volcano, data mining.


2014 ◽  
Vol 1 (1) ◽  
pp. 339-342
Author(s):  
Mirela Danubianu ◽  
Dragos Mircea Danubianu

Speech therapy can be viewed as a business in the logopaedic area that aims to offer services for correcting language. Proper treatment of speech impairments improves the efficiency of therapy; to achieve it, a therapist must continuously learn how to adjust their therapy methods to the patient's characteristics. The use of Information and Communication Technology in this area has made it possible to collect a large amount of data on various aspects of treatment. These data can be used in a data mining process to find useful and usable patterns and models that help therapists improve their specialized training. Clustering, classification, or association rules can provide unexpected information that helps to complete the therapist's knowledge and to adapt the therapy to the patient's needs.


Forests ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 273
Author(s):  
Samuel Royer-Tardif ◽  
Jürgen Bauhus ◽  
Frédérik Doyon ◽  
Philippe Nolet ◽  
Nelson Thiffault ◽  
...  

Climate change is threatening our ability to manage forest ecosystems sustainably. Despite strong consensus on the need for a broad portfolio of options to face this challenge, diversified management options have yet to be widely implemented. Inspired by functional zoning, a concept aimed at optimizing biodiversity conservation and wood production in multiple-use forest landscapes, we present a portfolio of management options that intersects management objectives with forest vulnerability to better address the wide range of goals inherent to forest management under climate change. Using this approach, we illustrate how different adaptation options could be implemented when faced with impacts related to climate change and its uncertainty. These options range from establishing ecological reserves in climatic refuges, where self-organizing ecological processes can result in resilient forests, to intensive plantation silviculture that could ensure a stable wood supply in an uncertain future. While adaptation measures in forests that are less vulnerable correspond to the traditional functional zoning management objectives, forests with higher vulnerability might be candidates for transformative measures as they may be more susceptible to abrupt changes in structure and composition. To illustrate how this portfolio of management options could be applied, we present a theoretical case study for the eastern boreal forest of Canada. Even if these options are supported by solid evidence, their implementation across the landscape may present some challenges and will require good communication among stakeholders and with the public.

