Genomic Approaches to Plant-Pathogen Epidemiology and Diagnostics

2021, Vol 59 (1)
Author(s): Alexandra J. Weisberg, Niklaus J. Grünwald, Elizabeth A. Savory, Melodie L. Putnam, Jeff H. Chang

Diseases have a significant cost to agriculture. Findings from analyses of whole-genome sequences show great promise for informing strategies to mitigate risks from diseases caused by phytopathogens. Genomic approaches can be used to dramatically shorten response times to outbreaks and inform disease management in novel ways. However, the use of these approaches requires expertise in working with big, complex data sets and an understanding of their pitfalls and limitations to infer well-supported conclusions. We suggest using an evolutionary framework to guide the use of genomic approaches in epidemiology and diagnostics of plant pathogens. We also describe steps that are necessary for realizing these as standard approaches in disease surveillance. Expected final online publication date for the Annual Review of Phytopathology, Volume 59 is August 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

Author(s): Sebastian Engelke, Jevgenijs Ivanovs

Extreme value statistics provides accurate estimates for the small occurrence probabilities of rare events. While theory and statistical tools for univariate extremes are well developed, methods for high-dimensional and complex data sets are still scarce. Appropriate notions of sparsity and connections to other fields such as machine learning, graphical models, and high-dimensional statistics have only recently been established. This article reviews the new domain of research concerned with the detection and modeling of sparse patterns in rare events. We first describe the different forms of extremal dependence that can arise between the largest observations of a multivariate random vector. We then discuss the current research topics, including clustering, principal component analysis, and graphical modeling for extremes. Identification of groups of variables that can be concomitantly extreme is also addressed. The methods are illustrated with an application to flood risk assessment. Expected final online publication date for the Annual Review of Statistics, Volume 8 is March 8, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
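The peaks-over-threshold idea behind such tail estimates can be sketched in a few lines. The following is a minimal illustration (not from the article): it fits a generalized Pareto distribution to threshold exceedances by the method of moments and extrapolates a small exceedance probability; the sample data and threshold are arbitrary choices for demonstration.

```python
import math
import random

def gpd_tail_prob(data, u, level):
    """Estimate P(X > level) via peaks-over-threshold: fit a generalized
    Pareto distribution to the exceedances over u by the method of moments,
    then extrapolate beyond the range of the data."""
    exceed = [x - u for x in data if x > u]
    zeta = len(exceed) / len(data)                # empirical P(X > u)
    m = sum(exceed) / len(exceed)                 # mean excess
    v = sum((e - m) ** 2 for e in exceed) / (len(exceed) - 1)
    xi = 0.5 * (1 - m * m / v)                    # GPD shape (moments)
    sigma = m * (1 - xi)                          # GPD scale (moments)
    y = level - u
    if abs(xi) < 1e-9:                            # exponential limit xi -> 0
        return zeta * math.exp(-y / sigma)
    return zeta * max(0.0, 1 + xi * y / sigma) ** (-1 / xi)

random.seed(1)
sample = [random.expovariate(1.0) for _ in range(10000)]
# True tail probability here is exp(-8), roughly 3.4e-4.
p = gpd_tail_prob(sample, u=2.0, level=8.0)
```

Note that the estimate extrapolates to a level (8.0) well beyond typical observations, which is exactly the regime where empirical frequencies alone become unreliable.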


Author(s): Robert F. Keefer

management, and development planning. Two examples: GIS could allow emergency planners to quickly calculate emergency response times during natural disasters, or GIS could be used to find wetlands that need protection from pollution. A Geographic Information System (GIS) is an organized collection of computer hardware, software, geographic data, and personnel designed to capture, manipulate, analyze, and display all forms of geographically referenced information (Allender, 1998). A simpler definition would be: a computer system capable of holding and using data describing places on the earth’s surface, for the purpose of spatial analysis. It is also “intelligent graphics” that aids in the analysis and depiction of complex data sets. Components of GIS include ARC/INFO, GIS software by ESRI (Environmental Systems Research Institute): ARC provides the graphical features of points, lines/arcs, and polygons, while INFO is the relational database component, whose tables of attribute data tie to the graphical components.
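The spatial queries such a system answers ultimately rest on simple geometric tests over the point, line, and polygon features. As a minimal illustration (plain Python, not part of ARC/INFO or any ESRI API), here is the classic ray-casting test for whether a point lies inside a polygon, e.g. a wetland boundary:

```python
def point_in_polygon(x, y, poly):
    """Ray casting: count how many polygon edges a horizontal ray from
    (x, y) crosses; an odd count means the point is inside."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                           # crossing is to the right
                inside = not inside
    return inside

# A hypothetical square "wetland" polygon in map coordinates.
wetland = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

A real GIS applies the same test in bulk, with spatial indexes, across millions of features.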


2021, Vol 59 (1)
Author(s): Jie-Yin Chen, Steven J. Klosterman, Xiao-Ping Hu, Xiao-Feng Dai, Krishna V. Subbarao

The genomics era has ushered in exciting possibilities to examine the genetic bases that undergird the characteristic features of Verticillium dahliae and other plant pathogens. In this review, we provide historical perspectives on some of the salient biological characteristics of V. dahliae, including its morphology, microsclerotia formation, host range, disease symptoms, vascular niche, reproduction, and population structure. The kaleidoscopic population structure of this pathogen is summarized, including different races of the pathogen, defoliating and nondefoliating phenotypes, vegetative compatibility groupings, and clonal populations. Where possible, we place the characteristic differences in the context of comparative and functional genomics analyses that have offered insights into population divergence within V. dahliae and the related species. Current challenges are highlighted along with some suggested future population genomics studies that will contribute to advancing our understanding of the population divergence in V. dahliae. Expected final online publication date for the Annual Review of Phytopathology, Volume 59 is August 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s): Andrew M. Bush, Jonathan L. Payne

During the past 541 million years, marine animals underwent three intervals of diversification (early Cambrian, Ordovician, Cretaceous–Cenozoic) separated by nondirectional fluctuation, suggesting diversity-dependent dynamics with the equilibrium diversity shifting through time. Changes in factors such as shallow-marine habitat area and climate appear to have modulated the nondirectional fluctuations. Directional increases in diversity are best explained by evolutionary innovations in marine animals and primary producers coupled with stepwise increases in the availability of food and oxygen. Increasing intensity of biotic interactions such as predation and disturbance may have led to positive feedbacks on diversification as ecosystems became more complex. Important areas for further research include improving the geographic coverage and temporal resolution of paleontological data sets, as well as deepening our understanding of Earth system evolution and the physiological and ecological traits that modulated organismal responses to environmental change. Expected final online publication date for the Annual Review of Ecology, Evolution, and Systematics, Volume 52 is November 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s): Abou_el_ela Abdou Hussein

Day by day, advanced web technologies have led to tremendous growth in the volume of data generated. This mountain of huge, dispersed data sets leads to the phenomenon called big data: a collection of massive, heterogeneous, unstructured, and complex data sets. The big data life cycle can be represented as collecting (capturing), storing, distributing, manipulating, interpreting, analyzing, investigating, and visualizing big data. Traditional techniques such as relational database management systems (RDBMS) cannot handle big data because of their inherent limitations, so advances in computing architecture are required to handle both the data storage requisites and the heavy processing needed to analyze huge volumes and varieties of data economically. Many technologies manipulate big data; one of them is Hadoop. Hadoop can be understood as an open-source distributed data-processing framework that is one of the prominent and well-known solutions to the problem of handling big data. Apache Hadoop was based on the Google File System and the MapReduce programming paradigm. In this paper we survey big data characteristics, starting from the first three V's, which researchers have extended over time to more than fifty-six V's, and compare researchers' accounts to reach the best representation and precise clarification of all big data V's. We highlight the challenges that face big data processing and how to overcome them using Hadoop, and its use in processing big data sets as a solution for resolving various problems in a distributed cloud-based environment. This paper mainly focuses on different components of Hadoop, such as Hive, Pig, and HBase. We also give a thorough description of Hadoop's pros and cons, and improvements to address Hadoop's problems via a proposed cost-efficient scheduler algorithm for heterogeneous Hadoop systems.
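The MapReduce paradigm that Hadoop builds on can be sketched in plain Python. This toy word count is an illustration of the paradigm only, not Hadoop's actual API: the map phase emits (key, value) pairs, the shuffle phase groups values by key, and the reduce phase aggregates each group.

```python
from collections import defaultdict

def map_phase(docs):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in docs:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group all values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values (here, sum the counts)."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data", "big complex data sets"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
```

In Hadoop the same three phases run in parallel across a cluster, with the framework handling partitioning, fault tolerance, and data locality.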


2021, Vol 44 (1)
Author(s): Claire M. Gillan, Robb B. Rutledge

Improvements in understanding the neurobiological basis of mental illness have unfortunately not translated into major advances in treatment. At this point, it is clear that psychiatric disorders are exceedingly complex and that, in order to account for and leverage this complexity, we need to collect longitudinal datasets from much larger and more diverse samples than is practical using traditional methods. We discuss how smartphone-based research methods have the potential to dramatically advance our understanding of the neuroscience of mental health. This, we expect, will take the form of complementing lab-based hard neuroscience research with dense sampling of cognitive tests, clinical questionnaires, passive data from smartphone sensors, and experience-sampling data as people go about their daily lives. Theory- and data-driven approaches can help make sense of these rich data sets, and the combination of computational tools and the big data that smartphones make possible has great potential value for researchers wishing to understand how aspects of brain function give rise to, or emerge from, states of mental health and illness. Expected final online publication date for the Annual Review of Neuroscience, Volume 44 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s): Avinash Navlani, V. B. Gupta

In the last couple of decades, clustering has become a crucial research problem in the data mining community. Clustering refers to partitioning data objects, such as records and documents, into groups or clusters of similar characteristics. Clustering is unsupervised learning, and because of this unsupervised nature there is no unique solution for all problems. Complex data sets often require explanation through multiple clusterings, yet traditional clustering approaches generate a single clustering. A dataset can contain more than one pattern, and each pattern can be interesting from a different perspective. Alternative clustering intends to find all unlike groupings of the data set such that each grouping has high quality and is distinct from the others. This chapter gives an overall view of alternative clustering: its various approaches, related work, comparisons with easily confused related terms such as subspace, multi-view, and ensemble clustering, and its applications, issues, and challenges.
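The core idea that one dataset admits several valid, distinct groupings can be shown with a toy example (an illustration of the concept only, not an alternative-clustering algorithm from the chapter): grouping the same records by different attributes yields two clusterings that are each internally coherent yet disagree with each other.

```python
from collections import defaultdict

# Toy records: the same data set supports more than one natural grouping.
records = [
    {"name": "apple",  "color": "red",   "kind": "fruit"},
    {"name": "cherry", "color": "red",   "kind": "fruit"},
    {"name": "rose",   "color": "red",   "kind": "flower"},
    {"name": "lime",   "color": "green", "kind": "fruit"},
    {"name": "fern",   "color": "green", "kind": "plant"},
]

def cluster_by(records, attribute):
    """Group records into clusters sharing one attribute's value."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record[attribute]].append(record["name"])
    return dict(clusters)

by_color = cluster_by(records, "color")   # one high-quality grouping
by_kind = cluster_by(records, "kind")     # a distinct, equally valid grouping
```

Real alternative-clustering methods discover such complementary groupings automatically, typically by penalizing similarity to clusterings already found.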


Author(s): Phillip L. Manning, Peter L. Falkingham

Dinosaurs successfully conjure images of lost worlds and forgotten lives. Our understanding of these iconic, extinct animals now comes from many disciplines, not just the science of palaeontology. In recent years palaeontology has benefited from the application of new and existing techniques from physics, biology, chemistry, and engineering, but especially computational science. The application of computers in palaeontology is highlighted in this chapter as a key area of development in studying fossils. Advances in high performance computing (HPC) have greatly aided and abetted multiple disciplines and technologies that are now feeding palaeontological research, especially when dealing with large and complex data sets. We also give examples of how such multidisciplinary research can be used to communicate not only specific discoveries in palaeontology, but also the methods and ideas from interrelated disciplines, to wider audiences. Dinosaurs represent a useful vehicle that can help enable wider public engagement, communicating complex science in digestible chunks.


2010, pp. 1797-1803
Author(s): Lisa Friedland

In traditional data analysis, data points lie in a Cartesian space, and an analyst asks certain questions: (1) What distribution can I fit to the data? (2) Which points are outliers? (3) Are there distinct clusters or substructure? Today, data mining treats richer and richer types of data. Social networks encode information about people and their communities; relational data sets incorporate multiple types of entities and links; and temporal information describes the dynamics of these systems. With such semantically complex data sets, a greater variety of patterns can be described and views constructed of the data. This article describes a specific social structure that may be present in such data sources and presents a framework for detecting it. The goal is to identify tribes, or small groups of individuals that intentionally coordinate their behavior—individuals with enough in common that they are unlikely to be acting independently. While this task can only be conceived of in a domain of interacting entities, the solution techniques return to the traditional data analysis questions. In order to find hidden structure (3), we use an anomaly detection approach: develop a model to describe the data (1), then identify outliers (2).
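The final sentence's pipeline can be sketched concretely. The following is a deliberately simple stand-in (a one-dimensional Gaussian model, not the article's tribe-detection framework): fit a model to describe the data (1), then flag the points it describes poorly as outliers (2).

```python
from statistics import mean, stdev

def find_outliers(points, z_threshold=2.0):
    """(1) Fit a simple Gaussian model to the data;
    (2) flag points whose z-score exceeds the threshold."""
    mu, sigma = mean(points), stdev(points)
    return [x for x in points if abs(x - mu) / sigma > z_threshold]

# Mostly independent-looking behavior, plus one anomalous value.
data = [9, 10, 11, 10, 9, 11, 10, 100]
outliers = find_outliers(data)
```

In the tribe-detection setting the "model" describes expected independent behavior over interacting entities, and the "outliers" are the groups too coordinated to be acting independently; the fit-then-flag logic is the same.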


2022, pp. 67-76
Author(s): Dineshkumar Bhagwandas Vaghela

The term big data arose from the rapid generation of data in various organizations. In big data, "big" is the buzzword. Here the data are so large and complex that traditional database applications cannot process them (i.e., they are inadequate to deal with such volumes of data). Big data are usually described by the 5Vs (volume, velocity, variety, variability, veracity) and can be structured, semi-structured, or unstructured. Big data analytics is the process of uncovering hidden patterns and unknown correlations, and predicting future values, from large and complex data sets. This chapter covers the following topics in detail: the history of big data and business analytics, big data analytics technologies and tools, and big data analytics uses and challenges.
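As a minimal, hypothetical illustration of "predicting future values" from data (a textbook technique, not a tool covered in the chapter), here is ordinary least squares in plain Python, fitting a trend line and forecasting the next point:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# A perfectly linear toy history: y = 2x + 1.
xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept   # forecast the next value in the series
```

At big data scale the same estimator is computed distributively (e.g., the sums in the slope formula are natural map/reduce aggregations), which is exactly where platforms like Hadoop come in.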

