Supporting the Construction of Workflows for Biodiversity Problem-Solving Accessing Secure, Distributed Resources

2006 ◽  
Vol 14 (3-4) ◽  
pp. 195-208
Author(s):  
J.S. Pahwa ◽  
A.C. Jones ◽  
R.J. White ◽  
M. Burgess ◽  
W.A. Gray ◽  
...  

In the Biodiversity World (BDW) project we have created a flexible and extensible Web Services-based Grid environment for biodiversity researchers to solve problems in biodiversity and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources such as data sets and analytical tools are made available to be accessed and assembled by users into workflows to perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species using climate variables, in order to explain past and future climate-related changes in species distribution. Data sources and analytical tools required for such analysis of species distribution are widely dispersed, available on heterogeneous platforms, present data in different formats and lack inherent interoperability. The present BDW system brings all these disparate units together so that the user can combine tools with little thought as to their original availability, data formats and interoperability. The new prototype BDW system architecture not only brings together heterogeneous resources but also enables utilisation of computational resources and provides secure access to BDW resources via a federated security model. We describe features of the new BDW system and its security model which enable user authentication from a workflow application as part of workflow execution.

2012 ◽  
pp. 862-880
Author(s):  
Russ Miller ◽  
Charles Weeks

Grids represent an emerging technology that allows geographically- and organizationally-distributed resources (e.g., computer systems, data repositories, sensors, imaging systems, and so forth) to be linked in a fashion that is transparent to the user. The New York State Grid (NYS Grid) is an integrated computational and data grid that provides access to a wide variety of resources to users from around the world. NYS Grid can be accessed via a Web portal, where the users have access to their data sets and applications, but do not need to be made aware of the details of the data storage or computational devices that are specifically employed in solving their problems. Grid-enabled versions of the SnB and BnP programs, which implement the Shake-and-Bake method of molecular structure (SnB) and substructure (BnP) determination, respectively, have been deployed on NYS Grid. Further, through the Grid Portal, SnB has been run simultaneously on all computational resources on NYS Grid as well as on more than 1100 of the over 3000 processors available through the Open Science Grid.


2007 ◽  
Vol 40 (5) ◽  
pp. 938-944 ◽  
Author(s):  
Russ Miller ◽  
Naimesh Shah ◽  
Mark L. Green ◽  
William Furey ◽  
Charles M. Weeks

Computational and data grids represent an emerging technology that allows geographically and organizationally distributed resources (e.g. computing and storage resources) to be linked and accessed in a fashion that is transparent to the user, presenting an extension of the desktop for users whose computational, data and visualization needs extend beyond their local systems. The New York State Grid is an integrated computational and data grid that provides web-based access for users from around the world to computational, application and data storage resources. This grid is used in a ubiquitous fashion, where the users have virtual access to their data sets and applications, but do not need to be made aware of the details of the data storage or computational devices that are specifically employed. Two of the applications that users worldwide have access to on a variety of grids, including the New York State Grid, are the SnB and BnP programs, which implement the Shake-and-Bake method of molecular structure (SnB) and substructure (BnP) determination, respectively. In particular, through our grid portal (i.e. logging on to a web site), SnB has been run simultaneously on all computational resources on the New York State Grid as well as on more than 1100 of the over 3000 processors available through the Open Science Grid.




Author(s):  
Parag A Pathade ◽  
Vinod A Bairagi ◽  
Yogesh S. Ahire ◽  
Neela M Bhatia

‘Proteomics’ is the emerging technology leading to high-throughput identification and understanding of proteins. Proteomics is the protein equivalent of genomics and has captured the imagination of biomolecular scientists worldwide. Because the proteome reveals more accurately the dynamic state of a cell, tissue, or organism, much is expected from proteomics to indicate better disease markers for diagnosis and therapy monitoring. Proteomics is expected to play a major role in biomedical research, and it will have a significant impact on the development of diagnostics and therapeutics for cancer, heart ailments and infectious diseases in the future. Proteomics research leads to the identification of new protein markers for diagnostic purposes and novel molecular targets for drug discovery. Though the potential is great, many challenges and issues remain to be solved, such as gene expression, peptides, generation of low-abundance proteins, analytical tools, drug target discovery and cost. A systematic and efficient analysis of vast genomic and proteomic data sets is a major challenge for researchers today. Nevertheless, proteomics is the groundwork for constructing and extracting knowledge useful to biomedical research. This review article covers some opportunities and challenges offered by proteomics.


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 621
Author(s):  
Giuseppe Psaila ◽  
Paolo Fosci

Internet technology and mobile technology have enabled producing and diffusing massive data sets concerning almost every aspect of day-to-day life. Remarkable examples are social media and apps for volunteered information production, as well as Open Data portals on which public administrations publish authoritative and (often) geo-referenced data sets. In this context, JSON has become the most popular standard for representing and exchanging possibly geo-referenced data sets over the Internet. Analysts wishing to manage, integrate and cross-analyze such data sets need a framework that allows them to access possibly remote storage systems for JSON data sets, and to retrieve and query data sets by means of a unique query language (independent of the specific storage technology), exploiting possibly remote computational resources (such as cloud servers) while working comfortably on the PC in their office, largely unaware of the real location of resources. In this paper, we present the current state of the J-CO Framework, a platform-independent and analyst-oriented software framework to manipulate and cross-analyze possibly geo-tagged JSON data sets. The paper presents the general approach behind the J-CO Framework, illustrating the query language by means of a simple, yet non-trivial, example of geographical cross-analysis. The paper also presents the novel features introduced by the re-engineered version of the execution engine and the most recent components, i.e., the storage service for large single JSON documents and the user interface that allows analysts to comfortably share data sets and computational resources with other analysts possibly working in different parts of the world. Finally, the paper reports the results of an experimental campaign, which shows that the execution engine performs in a more than satisfactory way, proving that our framework can actually be used by analysts to process JSON data sets.
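The kind of geographical cross-analysis described in the abstract can be sketched in plain Python. The data sets, field names and join key below are illustrative assumptions for the sketch; they are not the J-CO query language or its actual data model.

```python
import json

# Two hypothetical geo-referenced JSON data sets, as an analyst might
# retrieve them from different (possibly remote) storage systems.
sensors = json.loads("""[
    {"id": "s1", "city": "Bergamo", "pm10": 41},
    {"id": "s2", "city": "Milano",  "pm10": 55}
]""")
cities = json.loads("""[
    {"name": "Bergamo", "population": 120000},
    {"name": "Milano",  "population": 1370000}
]""")

# Cross-analysis: enrich each sensor record with data from the other
# data set, joining on the city name.
by_name = {c["name"]: c for c in cities}
joined = [
    {**s, "population": by_name[s["city"]]["population"]}
    for s in sensors
    if s["city"] in by_name
]
print(joined[0])  # {'id': 's1', 'city': 'Bergamo', 'pm10': 41, 'population': 120000}
```

In the J-CO Framework this kind of join would be expressed once in its storage-independent query language rather than in host-language code; the sketch only shows the shape of the operation.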


2015 ◽  
Vol 46 (4) ◽  
pp. 159-166 ◽  
Author(s):  
J. Pěknicová ◽  
D. Petrus ◽  
K. Berchová-Bímová

The distribution of invasive plants depends on several environmental factors, e.g. on the distance from the vector of spreading, invaded community composition, land use, etc. Species distribution models, a research tool for predicting the spread of invasive plants, combine environmental factors, occurrence data, and a statistical approach. For the construction of the presented distribution model, the occurrence data on invasive plants (Solidago sp., Fallopia sp., Robinia pseudoacacia, and Heracleum mantegazzianum) and Natura 2000 habitat types from the Protected Landscape Area Kokořínsko have been intersected in ArcGIS and statistically analyzed. The data analysis focused on (1) verification of the accuracy of the Natura 2000 habitat map layer and its accordance with the habitats occupied by invasive species, and (2) identification of a suitable scale of intersection between the habitat and species distribution. Data suitability was evaluated for the construction of the model on a local scale. Based on the data, the invaded habitat types were described and the optimal scale grid was evaluated. The results show the suitability of Natura 2000 habitat types for modelling; however, more input data (e.g. on soil types and elevation) are needed.


1989 ◽  
Vol 67 (10) ◽  
pp. 2392-2397 ◽  
Author(s):  
B. G. E. de March

In the absence of distribution data for juvenile broad whitefish, Coregonus nasus, laboratory experiments were designed to elucidate the salinity ranges that the species will tolerate. Larval fish (12–18 mm) died within 120 h at salinities of 12.5‰ and higher at both 5 and 10 °C, though more slowly at 5 °C. Salinities of 12.5 and 15‰, but no higher, were tolerated for 120 h at 15 °C. Larvae fed readily at 15 °C but not at 5 or 10 °C. Slightly larger and more-developed larvae (15–19 mm) were tolerant of 12.5‰ but died within 120 h at 15‰ at the same three temperatures. These fish fed more readily than the younger ones. Larger fish (33–68 mm) were generally tolerant of 15–20‰ but not of higher salinities in 120-h tolerance tests. Larger field-collected fish (27–200 mm) reacted similarly but were more tolerant of salinities between 20 and 27‰ in 96-h tests. Analysis of both experiments with larger fish suggests that time to death was inversely related to size as well as to salinity. Coregonus nasus does not seem to be more tolerant of saline conditions than other freshwater or migratory fish species. Experimental results combined with limited information about the species' distribution suggest that man-made constructions on the arctic coast might seriously affect dispersal or annual migrations.


2019 ◽  
Author(s):  
C. Lambert ◽  
G. Dorémus ◽  
V. Ridoux

The main type of zonal conservation approach is the Marine Protected Area (MPA): a spatially defined and generally static entity aiming at the protection of some target populations through the implementation of a management plan. For highly mobile species the relevance of an MPA over time might be hampered by temporal variations in distributions or home ranges. In the present work, we used habitat-model-based predicted distributions of cetaceans and seabirds within the Bay of Biscay from 2004 to 2017 to characterise the aggregation and persistence of mobile species' distributional patterns and the relevance of the existing MPA network. We explored the relationship between population abundance and the spatial extent of distribution to assess the aggregation level of species distribution. We used the smallest spatial extent including 75% of the population present in the Bay of Biscay to define specific core areas of distribution, and calculated their persistence over the 14 studied years. We inspected the relevance of the MPA network with respect to aggregation and persistence. We found that aggregation and persistence are two independent features of marine megafauna distributions. Indeed, strong persistence was shown in both aggregated (bottlenose dolphins, auks) and loosely distributed species (northern gannets), while some species with aggregated distributions also showed limited year-to-year persistence in their patterns (black-legged kittiwakes). We thus demonstrated that both aggregation and persistence affect the amount of spatio-temporal distributional variability encompassed within static MPAs. Our results exemplify the need for a minimal temporal depth in species distribution data when designating new site boundaries for the conservation of mobile species.
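The "smallest spatial extent including 75% of the population" criterion can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the predicted distribution is a grid of cell densities, and greedily takes the densest cells until 75% of the total predicted abundance is covered.

```python
def core_cells(densities, fraction=0.75):
    """Smallest set of grid cells whose summed predicted abundance
    reaches `fraction` of the total (densest cells first)."""
    total = sum(densities.values())
    core, cum = set(), 0.0
    for cell, d in sorted(densities.items(), key=lambda kv: -kv[1]):
        if cum >= fraction * total:
            break
        core.add(cell)
        cum += d
    return core

# Toy predicted-abundance grid (hypothetical values).
grid = {"A": 50.0, "B": 30.0, "C": 15.0, "D": 5.0}
print(sorted(core_cells(grid)))  # ['A', 'B']
```

Year-to-year persistence could then be measured as the overlap (e.g. the Jaccard index) between the core-cell sets of successive years; the paper's exact persistence metric is not reproduced here.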


2020 ◽  
Vol 30 (3) ◽  
pp. 99-111
Author(s):  
D. A. Palguyev ◽  
A. N. Shentyabin

In the processing of dynamically changing data, for example radar data (RD), a crucial role is played by the representation of the various data sets containing information about the tracks and attributes of air objects. In practical implementations of the computational process, it previously seemed natural to process RD in data arrays by elementwise search. However, representing data arrays as matrices and using matrix algebra allow the calculations in tertiary processing to be organised optimally. Forming matrices and working with them requires significant computational resources, so the authors assume that a certain gain in calculation time may be achieved only when the arrays hold a large amount of data, at least several thousand messages. The article shows the sequences of the most frequently repeated operations of tertiary network processing, such as searching for and replacing an array element. The simulation results show that the processing efficiency (the relative reduction of processing time and saving of computing resources) achieved with matrices, in comparison with elementwise search and replacement, increases in proportion to the number of messages received by the information processing device. The most significant gain is observed when processing several thousand messages (array elements). Thus, the use of matrices and matrix algebra for processing arrays of dynamically changing data can reduce processing time and save computational resources. The proposed matrix method of organising calculations can also find a place in the modelling of complex information systems.
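The contrast between elementwise search-and-replace and matrix-based processing can be illustrated with NumPy. The message layout below (track id plus coordinates) is a hypothetical stand-in for the article's radar messages, not its actual data format.

```python
import numpy as np

# Hypothetical message array: one row per message, [track_id, x, y].
messages = np.array([
    [101, 10, 20],
    [102, 15, 25],
    [103, 30, 40],
])

# Elementwise search and replace: scan rows one by one.
for i in range(len(messages)):
    if messages[i, 0] == 102:
        messages[i, 1:] = (16, 26)

# Matrix (vectorised) equivalent: one boolean mask over the whole
# array selects the matching rows, and one assignment updates them.
mask = messages[:, 0] == 102
messages[mask, 1:] = (17, 27)
print(messages[1].tolist())  # [102, 17, 27]
```

The vectorised form does the same work in a fixed number of whole-array operations, which is where the gain on arrays of several thousand messages would come from; for a handful of rows the elementwise loop can be just as fast.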


Author(s):  
Aleksandre Gogaladze ◽  
Mikhail Son ◽  
Matteo Lattuada ◽  
Vitaliy Anistratenko ◽  
Vitaly Syomin ◽  
...  

Aim: The unique aquatic Pontocaspian (PC) biota of the Black Sea Basin (BSB) is in decline. Lack of detailed knowledge on the status and trends of species, populations and communities hampers a thorough risk assessment and precludes effective conservation. This paper aims to review PC biodiversity trends using endemic molluscs as a model group. We aim to assess changes in PC habitats, community structure and species distribution over the past century and to identify direct anthropogenic threats.
Location: Black Sea Basin (Bulgaria, Romania, Moldova, Ukraine and Russia).
Methods: Presence/absence data of target mollusc species was assembled from literature, reports and personal observations. PC biodiversity trends in the NW BSB coastal regions were established by comparing 20th and 21st century occurrences. Direct drivers of habitat and biodiversity change were identified and documented.
Results: A very strong decline of PC species and communities during the past century is driven by a) damming of rivers, b) habitat modifications negatively affecting salinity gradients, c) pollution and eutrophication, d) invasive alien species and e) climate change. Four out of 10 studied regions, namely, the Danube Delta – Razim Lake system, Dniester Liman, Dnieper-South Bug Estuary and Taganrog Bay-Don Delta contain the entire spectrum of ecological conditions to support PC communities and still host threatened endemic PC mollusc species. Distribution data is incomplete, but the scale of deterioration of PC species and communities is evident from the assembled data, as are major direct threats.
Main conclusions: PC biodiversity in the BSB is profoundly affected by human activities. Standardised observation and collection data as well as a precise definition of PC biota and habitats are necessary for targeted conservation actions. This study will help to set the research and policy agenda required to improve data collection to accommodate effective conservation of the unique PC biota.

