Converting Biomolecular Modelling Data Based on an XML Representation

2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Yudong Sun ◽  
Steve McKeever

Summary: Biomolecular modelling has provided computational simulation-based methods for investigating biological processes from the quantum chemical to the cellular level. Modelling such microscopic processes requires an atomic-level description of the biological system and proceeds in fine timesteps, so the simulations are extremely computationally demanding. To tackle this limitation, different biomolecular models have to be integrated in order to achieve high-performance simulations. Integrating diverse biomolecular models requires converting molecular data between the representations used by the different models. This conversion is often non-trivial, requires extensive human input, and is inevitably error-prone. In this paper we present an automated data conversion method for biomolecular simulations between molecular dynamics and quantum mechanics/molecular mechanics models. Our approach is built around an XML data representation called BioSimML (Biomolecular Simulation Markup Language). BioSimML provides a domain-specific data representation for biomolecular modelling which can efficiently support data interoperability between different biomolecular simulation models and data formats.
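The abstract does not reproduce the BioSimML schema. As a purely hypothetical illustration of how an XML layer can decouple a simulation's internal format from other models' input formats, the sketch below serialises and re-parses a minimal atom list with Python's standard library; the element and attribute names are invented and do not reflect the actual BioSimML schema.

```python
# Hypothetical sketch: element/attribute names are invented for illustration
# and do NOT reproduce the actual BioSimML schema.
import xml.etree.ElementTree as ET

def atoms_to_xml(atoms):
    """Serialise a list of (element, x, y, z) atom records into a minimal XML fragment."""
    root = ET.Element("molecule", name="water")
    for idx, (element, x, y, z) in enumerate(atoms, start=1):
        ET.SubElement(root, "atom", id=str(idx), element=element,
                      x=f"{x:.4f}", y=f"{y:.4f}", z=f"{z:.4f}")
    return ET.tostring(root, encoding="unicode")

def xml_to_atoms(xml_text):
    """Parse the fragment back into plain tuples, e.g. to feed another model's input format."""
    root = ET.fromstring(xml_text)
    return [(a.get("element"), float(a.get("x")), float(a.get("y")), float(a.get("z")))
            for a in root.iter("atom")]

if __name__ == "__main__":
    water = [("O", 0.0, 0.0, 0.0), ("H", 0.9572, 0.0, 0.0), ("H", -0.2400, 0.9266, 0.0)]
    xml_text = atoms_to_xml(water)
    print(xml_text)
    print(xml_to_atoms(xml_text))
```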

2016 ◽  
Vol 2016 ◽  
pp. 1-19 ◽  
Author(s):  
Alexander I. Kozynchenko ◽  
Sergey A. Kozynchenko

The paper focuses mainly on the methodological and programming aspects of developing a versatile desktop framework that provides a readily available basis for the high-performance simulation of dynamical models of different kinds and for diverse applications. It presents a basic structure for creating a dynamical simulation model in C++, built on the Win32 platform with an interactive multiwindow interface and using the lightweight Visual C++ Express as a free integrated development environment. The resulting simulation framework can be a more accessible alternative to solutions developed with commercial tools such as Borland C++ or Visual C++ Professional, not to mention domain-specific languages and more specialized ready-made software such as Matlab, Simulink, and Modelica. This approach seems justified for complex, object-oriented research models with nonstandard structure, relationships, algorithms, and solvers, as it allows highly flexible solutions to be developed. The essence of the framework is shown through a case study simulating the motion of charged particles in an electrostatic field. The simulation model provides the necessary visualization and control features, such as interactive input, real-time graphical and text output, and start, stop, and rate control.
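The framework described in the paper is C++/Win32; as a minimal, language-agnostic sketch of the kind of model and solver such a framework hosts, the Python fragment below integrates charged particles in a uniform electrostatic field with an explicit Euler step. All values are arbitrary and the structure is illustrative only.

```python
# Illustrative sketch only: the paper's framework is C++/Win32; this Python
# fragment just shows the model/solver structure such a framework would host.
# All physical values are arbitrary.
from dataclasses import dataclass, field

@dataclass
class ChargedParticle:
    q: float                                               # charge
    m: float                                               # mass
    r: list = field(default_factory=lambda: [0.0, 0.0])    # position (x, y)
    v: list = field(default_factory=lambda: [0.0, 0.0])    # velocity

def step(particles, E, dt):
    """Advance all particles one explicit-Euler step in a uniform field E."""
    for p in particles:
        ax, ay = p.q * E[0] / p.m, p.q * E[1] / p.m        # a = qE/m
        p.v[0] += ax * dt; p.v[1] += ay * dt
        p.r[0] += p.v[0] * dt; p.r[1] += p.v[1] * dt

if __name__ == "__main__":
    beam = [ChargedParticle(q=1.0, m=1.0, r=[0.0, i * 0.1]) for i in range(3)]
    for _ in range(100):                                   # 100 timesteps of 1e-3
        step(beam, E=(1.0, 0.0), dt=1e-3)
    print([p.r for p in beam])
```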


2020 ◽  
Author(s):  
Jamie Buck ◽  
Rena Subotnik ◽  
Frank Worrell ◽  
Paula Olszewski-Kubilius ◽  
Chi Wang

2021 ◽  
Author(s):  
Nicolas Le Guillarme ◽  
Wilfried Thuiller

1. Given the biodiversity crisis, we more than ever need to access information on multiple taxa (e.g. distribution, traits, diet) in the scientific literature to understand, map and predict all-inclusive biodiversity. Tools are needed to automatically extract useful information from the ever-growing corpus of ecological texts and feed this information to open data repositories. A prerequisite is the ability to recognise mentions of taxa in text, a special case of named entity recognition (NER). In recent years, deep learning-based NER systems have become ubiquitous, yielding state-of-the-art results in the general and biomedical domains. However, no such tool is available to ecologists wishing to extract information from the biodiversity literature. 2. We propose a new tool called TaxoNERD that provides two deep neural network (DNN) models to recognise taxon mentions in ecological documents. To achieve high performance, DNN-based NER models usually need to be trained on a large corpus of manually annotated text. Creating such a gold standard corpus (GSC) is a laborious and costly process, with the result that GSCs in the ecological domain tend to be too small to learn an accurate DNN model from scratch. To address this issue, we leverage existing DNN models pretrained on large biomedical corpora using transfer learning. The performance of our models is evaluated on four GSCs and compared to the most popular taxonomic NER tools. 3. Our experiments suggest that existing taxonomic NER tools are not suited to the extraction of ecological information from text, as they performed poorly on ecologically oriented corpora, either because they do not take into account the variability of taxon naming practices, or because they do not generalise well to the ecological domain. Conversely, a domain-specific DNN-based tool like TaxoNERD outperformed the other approaches on an ecological information extraction task. 4. Efforts are needed in order to raise ecological information extraction to the same level of performance as its biomedical counterpart. One promising direction is to leverage the huge corpus of unlabelled ecological texts to learn a language representation model that could benefit downstream tasks. These efforts could be highly beneficial to ecologists in the long term.
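The abstract does not show how such a taxonomic NER model is invoked. As a hedged illustration of how a spaCy-compatible NER model typically slots into an extraction pipeline, the snippet below uses the standard spaCy API; the model name is a placeholder, not necessarily TaxoNERD's actual package name.

```python
# Hedged sketch: assumes a spaCy-compatible taxonomic NER model is installed.
# The model name below is a placeholder, not necessarily TaxoNERD's actual package name.
import spacy

nlp = spacy.load("taxonomic_ner_model")   # placeholder model name

text = ("The diet of Vulpes vulpes in alpine habitats includes "
        "Microtus arvalis and various Coleoptera.")
doc = nlp(text)

for ent in doc.ents:
    # Each recognised taxon mention comes back as a spaCy entity span.
    print(ent.text, ent.label_, ent.start_char, ent.end_char)
```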


2014 ◽  
Vol 25 (1) ◽  
pp. 169-185
Author(s):  
Samuel Ángel Jaramillo Flórez ◽  
Yuli Fernanda Achipiz

Bioelectronics borrows optimized elements from biology in order to copy them and build technological mechanisms whose functions are based on those of living organisms. Telecommunications and biology present an analogy between optical receivers and insect eyes, whose forms are well suited to receiving signals from a transmitter and have been refined by nature over millions of years of environmental adaptation. Their sizes and shapes depend on the direction of the incoming waves and on the radiation pattern of these biological transmitters and receivers (the ommatidia of insect eyes), much like the emitters and photodetectors of optical communications. The growth of telecommunication services makes it necessary to optimize the bandwidth of transmission channels. Although optical transmission is considered ideal in terms of attenuation and distortion, giving it the best bandwidth-length relation, the demand for ever more transmission capacity forces these channels to be exploited efficiently. The high cost of deploying optical fibre networks at the transport level, together with other factors that prevent PONs from reaching the home and/or office, has driven the design and implementation of partially optical networks (FITL), including an alternative that uses infrared light. This work explores the basis of these new access networks and presents an optical transmission/reception system with a free-space optical channel, in which the transmitter laser is modulated and its beam expanded through a set of spherical lenses and optical fibres to different points of an indoor enclosure, producing multiple point images at positions that make it possible to determine and optimize the bandwidth of the system. Computational simulation results are shown and compared with experimental measurements, indicating that this is an original method for designing high-performance emitters and receivers for optical communications.


2020 ◽  
Author(s):  
Bethany Growns ◽  
Kristy Martire

Forensic feature-comparison examiners in select disciplines are more accurate than novices when comparing visual evidence samples. This paper examines a key cognitive mechanism that may contribute to this superior visual comparison performance: the ability to learn how often stimuli occur in the environment (distributional statistical learning). We examined the relationship between distributional learning and visual comparison performance, and the impact of training about the diagnosticity of distributional information in visual comparison tasks. We compared performance between novices given no training (uninformed novices; n = 32), accurate training (informed novices; n = 32) or inaccurate training (misinformed novices; n = 32) in Experiment 1; and between forensic examiners (n = 26), informed novices (n = 29) and uninformed novices (n = 27) in Experiment 2. Across both experiments, forensic examiners and novices performed significantly above chance in a visual comparison task where distributional learning was required for high performance. However, informed novices outperformed all participants and only their visual comparison performance was significantly associated with their distributional learning. It is likely that forensic examiners’ expertise is domain-specific and doesn’t generalise to novel visual comparison tasks. Nevertheless, diagnosticity training could be critical to the relationship between distributional learning and visual comparison performance.


Author(s):  
Pravin Jagtap ◽  
Rupesh Nasre ◽  
V. S. Sanapala ◽  
B. S. V. Patnaik

Smoothed Particle Hydrodynamics (SPH) is fast emerging as a practically useful computational simulation tool for a wide variety of engineering problems. SPH is also gaining popularity as the backbone for fast and realistic animations in graphics and video games. The Lagrangian and mesh-free nature of the method facilitates fast and accurate simulation of material deformation, interface capture, etc. Particle-based methods typically require particle search-and-locate algorithms to be implemented efficiently, as the continual creation of neighbor particle lists is a computationally expensive step. Hence, it is advantageous to implement SPH on modern multi-core platforms with the help of High-Performance Computing (HPC) tools. In this work, the computational performance of an SPH algorithm is assessed on a multi-core Central Processing Unit (CPU) as well as massively parallel General Purpose Graphical Processing Units (GP-GPU). Parallelizing SPH faces several challenges, such as scalability of the neighbor search process, force calculations, minimizing thread divergence, achieving coalesced memory access patterns, balancing workload, ensuring optimum use of computational resources, etc. While addressing some of these challenges, a detailed analysis of performance metrics such as speedup, global load efficiency, global store efficiency, warp execution efficiency, and occupancy is carried out. The OpenMP and Compute Unified Device Architecture[Formula: see text] parallel programming models have been used for parallel computing on Intel Xeon[Formula: see text] E5-[Formula: see text] multi-core CPU and NVIDIA Quadro M[Formula: see text] and NVIDIA Tesla p[Formula: see text] massively parallel GPU architectures. Standard benchmark problems from the Computational Fluid Dynamics (CFD) literature are chosen for the validation. The key concern of identifying a suitable architecture for mesh-free methods, which involve a heavy workload of neighbor search and evaluation of local force fields from neighbor interactions, is also addressed.
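The paper's implementations use OpenMP and CUDA in C++; as a simplified illustration of the cell-list neighbour search the abstract identifies as the expensive step, the Python/NumPy sketch below bins particles into cells of edge length h so each particle only tests the 27 surrounding cells rather than all other particles. It shows the algorithmic idea only, not a high-performance version.

```python
# Simplified illustration of the cell-list neighbour search that dominates SPH cost.
# The paper's implementations use OpenMP (CPU) and CUDA (GPU); this NumPy sketch
# only shows the algorithmic idea, not a high-performance version.
import numpy as np
from collections import defaultdict
from itertools import product

def build_cell_list(positions, h):
    """Hash each particle into a cubic cell of edge length h (the smoothing length)."""
    cells = defaultdict(list)
    for i, p in enumerate(positions):
        cells[tuple((p // h).astype(int))].append(i)
    return cells

def neighbours(i, positions, cells, h):
    """Return indices of particles within distance h of particle i (27-cell stencil)."""
    ci = tuple((positions[i] // h).astype(int))
    result = []
    for offset in product((-1, 0, 1), repeat=3):
        cell = tuple(c + o for c, o in zip(ci, offset))
        for j in cells.get(cell, ()):
            if j != i and np.linalg.norm(positions[i] - positions[j]) < h:
                result.append(j)
    return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.random((1000, 3))          # 1000 particles in a unit box
    h = 0.1
    cells = build_cell_list(pos, h)
    print(len(neighbours(0, pos, cells, h)), "neighbours of particle 0")
```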


Electronics ◽  
2019 ◽  
Vol 8 (12) ◽  
pp. 1501
Author(s):  
Juan Ruiz-Rosero ◽  
Gustavo Ramirez-Gonzalez ◽  
Rahul Khanna

There is a large number of tools for the simulation of traffic and routes in public transport systems. These use different simulation models (macroscopic, microscopic, and mesoscopic). Unfortunately, these simulation tools are limited when simulating a complete public transport system, including all its buses and routes (up to 270 for the London Underground). The processing times for these types of simulations grow unmanageably, since all the variables required to simulate the system behavior consistently and reliably must be included. In this paper, we present a new simulation model for public transport routes, called Masivo. It runs public transport stop operations concurrently as OpenCL work items on a multi-core high-performance platform. The performance results of Masivo show a speed-up factor of 10.2 compared with the simulator model running on one compute unit, and a speed-up factor of 278 compared with the validation simulator. The achieved real-time factor was 3050 times faster than the 10 h simulated duration, for a public transport system of 300 stops, 2400 buses, and 456,997 passengers.
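Masivo runs stop operations as OpenCL work items; the abstract gives no code, so the sketch below is only a hypothetical, sequential Python analogue of one simulation tick in which each bus boards and alights passengers at its current stop. The parallel version would execute the per-stop operations as one work item per stop; all class names and numbers here are invented for illustration.

```python
# Hypothetical, sequential analogue of one simulation tick. Masivo itself runs
# each stop's operations as an OpenCL work item on a multi-core platform;
# names and numbers here are invented for illustration.
from dataclasses import dataclass

@dataclass
class Stop:
    stop_id: int
    waiting: int = 0            # passengers waiting at the stop

@dataclass
class Bus:
    route: list                 # ordered list of stop ids
    capacity: int = 160
    onboard: int = 0
    position: int = 0           # index into the route

def tick(stops, buses):
    """Advance every bus one stop and process boarding/alighting at that stop."""
    for bus in buses:
        stop = stops[bus.route[bus.position]]
        alighting = bus.onboard // 4            # toy model: 25% alight
        bus.onboard -= alighting
        boarding = min(stop.waiting, bus.capacity - bus.onboard)
        stop.waiting -= boarding
        bus.onboard += boarding
        bus.position = (bus.position + 1) % len(bus.route)

if __name__ == "__main__":
    stops = {i: Stop(i, waiting=50) for i in range(300)}
    buses = [Bus(route=list(range(300)), position=i % 300) for i in range(2400)]
    for _ in range(60):                         # 60 ticks
        tick(stops, buses)
    print(sum(s.waiting for s in stops.values()), "passengers still waiting")
```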


2015 ◽  
Author(s):  
Pablo Pareja-Tobes ◽  
Raquel Tobes ◽  
Marina Manrique ◽  
Eduardo Pareja ◽  
Eduardo Pareja-Tobes

Background. Next Generation Sequencing and other high-throughput technologies have brought a revolution to the bioinformatics landscape, by offering vast amounts of data about previously inaccessible domains in a cheap and scalable way. However, fast, reproducible, and cost-effective data analysis at such scale remains elusive. A key need for achieving it is being able to access and query the vast amount of publicly available data, especially so in the case of knowledge-intensive, semantically rich data: incredibly valuable information about proteins and their functions, genes, pathways, and all sorts of biological knowledge encoded in ontologies remains scattered, semantically and physically fragmented. Methods and Results. Guided by this, we have designed and developed Bio4j. It aims to offer a platform for the integration of semantically rich biological data using typed graph models. We have modeled and integrated most publicly available data linked with proteins into a set of interdependent graphs. Data querying is possible through a data-model-aware Domain Specific Language implemented in Java, letting the user write typed graph traversals over the integrated data. A ready-to-use cloud-based data distribution, based on the Titan graph database engine, is provided; generic data import code can also be used for in-house deployment. Conclusion. Bio4j represents a unique resource for the current bioinformatician, providing at once a solution for several key problems: data integration; expressive, high-performance data access; and a cost-effective, scalable cloud deployment model.
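Bio4j's query layer is a typed graph DSL in Java; as a rough, hypothetical illustration of the shape of traversal it enables (e.g. protein to GO annotations), the Python sketch below walks a toy typed graph. The identifiers and edge labels are invented and this is not Bio4j's API.

```python
# Rough illustration only: Bio4j's actual DSL is a typed graph API in Java.
# This toy Python graph just shows the shape of a protein -> GO-term traversal;
# identifiers and edge labels are invented.
nodes = {
    "P12345": {"type": "Protein", "name": "Example kinase"},
    "GO:0004672": {"type": "GOTerm", "name": "protein kinase activity"},
    "GO:0006468": {"type": "GOTerm", "name": "protein phosphorylation"},
}
edges = [
    ("P12345", "annotated_with", "GO:0004672"),
    ("P12345", "annotated_with", "GO:0006468"),
]

def go_annotations(protein_id):
    """Traverse outgoing 'annotated_with' edges from a protein node."""
    return [nodes[dst]["name"]
            for src, label, dst in edges
            if src == protein_id and label == "annotated_with"]

print(go_annotations("P12345"))
```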

