KIST-NOMAD - a Repository to Manage Large Amounts of Computational Materials Science Data

2020 ◽  
Vol 58 (10) ◽  
pp. 728-739
Author(s):  
Samuel Boateng ◽  
Kwang Ryeol Lee ◽  
Deepika ◽  
Haneol Cho ◽  
Kyu Hwan Lee ◽  
...  

We introduce the Korea Institute of Science and Technology-Novel Materials Discovery (KIST-NOMAD) platform, a materials data repository. We describe its functionality and novel features from an academic viewpoint. It is a data repository designed for computational materials science, with a particular focus on managing and sharing the results of molecular dynamics simulations as well as quantum mechanical computations. It consists of three main components: a database, file storage, and a web-based front end. The database hosts material properties, which are extracted from the computational results. The front end has a graphical user interface and an open application programming interface, which allow researchers to interact with the system more easily. KIST-NOMAD’s panel displays the search results on a well-organized and research-oriented web page. All the open access data and files are available for downloading in comma-separated value format as well as zipped archives. This automated extraction function was developed by utilizing database parsers and JSON scripts. KIST-NOMAD also has an efficient option to download simulation and computation results on a large scale. All of the above functions are designed to satisfy academic and research demands and to make high-throughput screening available, while incorporating machine learning for computational materials engineering. Finally, we stress that the repository platform is user-driven and user-friendly. It is designed to follow the modern big-data architecture and re-use principles for scientific data, such as being findable, accessible, and interoperable.

Author(s):  
Aparna S. Varde ◽  
Shuhui Ma ◽  
Mohammed Maniruzzaman ◽  
David C. Brown ◽  
Elke A. Rundensteiner ◽  
...  

Scientific data is often analyzed in the context of domain-specific problems, for example, failure diagnostics, predictive analysis, and computational estimation. These problems can be solved using approaches such as mathematical models or heuristic methods. In this paper we compare a heuristic approach based on mining stored data with a mathematical approach based on applying state-of-the-art formulae to solve an estimation problem. The goal is to estimate results of scientific experiments given their input conditions. We present a comparative study based on sample space, time complexity, and data storage with respect to a real application in materials science. Performance evaluation with real materials science data is also presented, taking into account accuracy and efficiency. We find that both approaches have their pros and cons in computational estimation. Similar arguments can be applied to other scientific problems such as failure diagnostics and predictive analysis. In the estimation problem in this paper, heuristic methods outperform mathematical models.


Nature ◽  
2020 ◽  
Vol 585 (7825) ◽  
pp. 357-362 ◽  
Author(s):  
Charles R. Harris ◽  
K. Jarrod Millman ◽  
Stéfan J. van der Walt ◽  
Ralf Gommers ◽  
Pauli Virtanen ◽  
...  

Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [1] and in the first imaging of a black hole [2]. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
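
The abstract does not name the "fundamental array concepts" it refers to. As an illustration only, the sketch below shows four commonly cited ones (indexing, vectorization, broadcasting and reduction) on a small array; the sample values and variable names are hypothetical and not taken from the paper.

```python
import numpy as np

# Hypothetical sample data: three measurements for each of four samples.
readings = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
])

# Indexing: select a column, or a boolean-masked subset of rows, without a loop.
first_column = readings[:, 0]
large_rows = readings[readings[:, 0] > 5.0]

# Vectorization: the arithmetic applies element-wise to the whole array.
scaled = readings * 2.0

# Broadcasting: the length-3 vector of column means is applied to all 4 rows.
column_means = readings.mean(axis=0)
centered = readings - column_means

# Reduction: collapse one axis to obtain a per-sample total.
row_totals = readings.sum(axis=1)
```

Each operation replaces an explicit Python loop with a single array expression, which is the compactness the abstract describes.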


Polymers ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 926
Author(s):  
Matthew A. Bone ◽  
Terence Macquart ◽  
Ian Hamerton ◽  
Brendan J. Howlin

Materials science is beginning to adopt computational simulation to eliminate laboratory trial and error campaigns—much like the pharmaceutical industry of 40 years ago. To further computational materials discovery, new methodology must be developed that enables rapid and accurate testing on accessible computational hardware. To this end, the authors utilise a novel methodology concept of intermediate molecules as a starting point, for which they propose the term ‘symthon’ (The term ‘Symthon’ is being used as a simulation equivalent of the synthon, popularised by Dr Stuart Warren in ‘Organic Synthesis: The Disconnection Approach’, OUP: Oxford, 1983.) rather than conventional monomers. The use of symthons eliminates the initial monomer bonding phase, reducing the number of iterations required in the simulation, thereby reducing the runtime. A novel approach to molecular dynamics, with an NVT (Canonical) ensemble and variable unit cell geometry, was used to generate structures with differing physical and thermal properties. Additional script methods were designed and tested, which enabled a high degree of cure in all sampled structures. This simulation has been trialled on large-scale atomistic models of phenolic resins, based on a range of stoichiometric ratios of formaldehyde and phenol. Density and glass transition temperature values were produced, and found to be in good agreement with empirical data and other simulated values in the literature. The runtime of the simulation was a key consideration in script design; cured models can be produced in under 24 h on modest hardware. The use of symthons has been shown as a viable methodology to reduce simulation runtime whilst generating accurate models.


Author(s):  
Tony Hey ◽  
Keith Butler ◽  
Sam Jackson ◽  
Jeyarajan Thiyagalingam

This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such ‘Big Scientific Data’ comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility and the UK's Central Laser Facility. Increasingly, scientists are now required to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help find new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has now used the deep learning technology to develop their AlphaFold tool to make predictions for protein folding. Remarkably, it has been able to achieve some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at the RAL, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing some realistic machine learning benchmarks using Big Scientific Data coming from several different scientific domains. We conclude with some initial examples of our ‘scientific machine learning’ benchmark suite and of the research challenges these benchmarks will enable. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.


SPIN ◽  
2015 ◽  
Vol 05 (04) ◽  
pp. 1540007 ◽  
Author(s):  
C. P. Chui ◽  
Wenqing Liu ◽  
Yongbing Xu ◽  
Yan Zhou

Molecular dynamics (MD) is a technique of atomistic simulation which has facilitated scientific discovery of interactions among particles since its advent in the late 1950s. Its merit lies in incorporating statistical mechanics to allow for examination of varying atomic configurations at finite temperatures. Its contributions to materials science, from modeling pure metal properties to designing nanowires, are also remarkable. This review paper focuses on the progress of MD in understanding the behavior of iron in pure metal form, in alloys, and in composite nanomaterials. It also discusses the interatomic potentials and the integration algorithms used for simulating iron in the literature. Furthermore, it reveals the current progress of MD in simulating iron by exhibiting some results in the literature. Finally, the review paper briefly mentions the development of the hardware and software tools for such large-scale computations.


Author(s):  
R. R. Downs

The Group on Earth Observations (GEO) Data Management Principles (DMP) provide direction for managing geospatial data and related information products and services. Offering opportunities for enabling discovery, accessibility, usability, preservation, and curation, the GEO DMP challenge repositories, such as scientific archives and data centers, to improve practices that foster the use of Earth science data today and in the future. In addition, the Data Management Principles Implementation Guidelines (IG) offer many practical suggestions for implementing the DMP with examples that can inform the consideration of options for improving geospatial data management practices. Implementing such improvements offers value to the users of geospatial data by enabling data providers to support the use of the data products and services that they disseminate. Adopting these improvements also can assist repositories in their efforts to meet the requirements for attaining data repository certification, which offers value for repositories and their stakeholders. This article shows how repositories can improve data management practices for geospatial data by adopting the GEO DMP, with examples drawn from a scientific data center. Current and future opportunities for improving data management practices to attain data repository certification also are described along with practical approaches that repositories can adopt in the short term.


2019 ◽  
Author(s):  
Liqun Cao ◽  
Jinzhe Zeng ◽  
Mingyuan Xu ◽  
Chih-Hao Chin ◽  
Tong Zhu ◽  
...  

Combustion is an important class of reaction that affects people's daily lives and the development of aerospace. Exploring the reaction mechanism contributes to the understanding of combustion and the more efficient use of fuels. Ab initio quantum mechanical (QM) calculation is precise but limited by its computational time for large-scale systems. In order to carry out reactive molecular dynamics (MD) simulation for combustion accurately and quickly, we develop the MFCC-combustion method in this study, which calculates the interaction between atoms using a QM method at the level of MN15/6-31G(d). Each molecule in a system is treated as a fragment, and when the distance between any two atoms in different molecules is greater than 3.5 Å, a new fragment comprising the two molecules is produced in order to consider the two-body interaction. The deviations of MFCC-combustion from full system calculations are within a few kcal/mol, and the result clearly shows that the calculated energies of the different systems using MFCC-combustion are close to converging after the distance thresholds are larger than 3.5 Å for the two-body QM interactions. The methane combustion was studied with the MFCC-combustion method to explore the combustion mechanism of the methane-oxygen system.


2020 ◽  
Author(s):  
Jin Soo Lim ◽  
Jonathan Vandermause ◽  
Matthijs A. van Spronsen ◽  
Albert Musaelian ◽  
Christopher R. O’Connor ◽  
...  

Restructuring of interfaces plays a crucial role in materials science and heterogeneous catalysis. Bimetallic systems, in particular, often adopt very different composition and morphology at surfaces compared to the bulk. For the first time, we reveal a detailed atomistic picture of the long-timescale restructuring of Pd deposited on Ag, using microscopy, spectroscopy, and novel simulation methods. Encapsulation of Pd by Ag always precedes layer-by-layer dissolution of Pd, resulting in significant Ag migration out of the surface and extensive vacancy pits. These metastable structures are of vital catalytic importance, as Ag-encapsulated Pd remains much more accessible to reactants than bulk-dissolved Pd. The underlying mechanisms are uncovered by performing fast and large-scale machine-learning molecular dynamics, followed by our newly developed method for complete characterization of atomic surface restructuring events. Our approach is broadly applicable to other multimetallic systems of interest and enables the previously impractical mechanistic investigation of restructuring dynamics.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5260
Author(s):  
Yi-Bing Lin ◽  
Sheng-Lin Chou

Due to the fast evolution of sensor and Internet of Things (IoT) technologies, several large-scale smart city applications have been commercially developed in recent years. In these developments, acceptance of the contracted work is often disputed because the contract specification is unclear, which leads to lengthy discussion of gray areas. Such disputes often occur in the acceptance processes of smart buildings, mainly because most intelligent building systems are expensive and the operations of the sub-systems are very complex. This paper proposes SpecTalk, a platform that automatically generates the code to conform IoT applications to the Taiwan Association of Information and Communication Standards (TAICS) specifications. SpecTalk generates a program to accommodate the application programming interface of the IoT devices under test (DUTs). Then, the devices can be tested by SpecTalk following the TAICS data formats. We describe three types of tests: self-test, mutual-test, and visual-test. A self-test involves the sensors and the actuators of the same DUT. A mutual-test involves the sensors and the actuators of different DUTs. A visual-test uses a monitoring camera to investigate the actuators of multiple DUTs. We conducted these types of tests in commercially deployed applications of smart campus constructions. Our experiments showed that SpecTalk is feasible and can effectively conform IoT implementations to TAICS specifications. We also propose a simple analytic model to select the frequency of the control signals for the input patterns in a SpecTalk test. Our study indicates that it is appropriate to select the control signal frequency such that the inter-arrival time between two control signals is larger than 10 times the activation delay of the DUT.
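
The closing rule of thumb can be turned into a quick check. The sketch below is not the authors' analytic model; it assumes strictly periodic control signals, so that the inter-arrival time is simply the reciprocal of the frequency, and the function names and the `safety_factor` parameter are hypothetical.

```python
def max_control_frequency_hz(activation_delay_s: float,
                             safety_factor: float = 10.0) -> float:
    """Upper bound on the control signal frequency for a given DUT delay.

    Assumes periodic signals, so inter-arrival time = 1 / frequency.
    The default factor of 10 is the figure quoted in the abstract.
    """
    if activation_delay_s <= 0:
        raise ValueError("activation delay must be positive")
    min_interarrival_s = safety_factor * activation_delay_s
    return 1.0 / min_interarrival_s


def is_frequency_appropriate(frequency_hz: float,
                             activation_delay_s: float,
                             safety_factor: float = 10.0) -> bool:
    """True if the inter-arrival time exceeds safety_factor times the delay."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    interarrival_s = 1.0 / frequency_hz
    return interarrival_s > safety_factor * activation_delay_s
```

For a hypothetical actuator with an activation delay of 0.5 s, the bound is 0.2 Hz: a control signal every 10 s passes the check, while one every 2 s does not.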


Author(s):  
Ibrahim Awad ◽  
Leila Ladani

Due to their superior mechanical and electrical properties, multiwalled carbon nanotubes (MWCNTs) have the potential to be used in many nano-/micro-electronic applications, e.g., through-silicon vias (TSVs), interconnects, transistors, etc. In particular, use of MWCNT bundles inside annular cylinders of copper (Cu) as TSVs is proposed in this study. However, the significant difference in scale makes it difficult to evaluate the interfacial mechanical integrity. Cohesive zone models (CZM) are typically used at large scale to determine the mechanical adherence at the interface. However, at the molecular level, no routine technique is available. Molecular dynamics (MD) simulations are used to determine the stresses that are required to separate MWCNTs from a copper slab and to generate normal stress–displacement curves for CZM. Only van der Waals (vdW) interaction is considered for the MWCNT/Cu interface. A displacement-controlled loading was applied in a direction perpendicular to the MWCNT's axis in different cases with different numbers of walls and at different temperatures, and a CZM is obtained for each case. Furthermore, their effect on the key CZM parameters (normal cohesive strength (σmax) and the corresponding displacement (δn)) has been studied. As the number of walls of the MWCNT increased, σmax was found to decrease nonlinearly. The displacement at maximum stress, δn, showed a nonlinear decrease as well with an increasing number of walls. The effect of temperature on the stress–displacement curves was also studied. When temperature was increased beyond 1 K, no relationship was found between the maximum normal stress and temperature. Likewise, the displacement at maximum load did not show any dependence on temperature.
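
The two CZM parameters named above can be read directly off a sampled normal stress–displacement curve. The following is a minimal sketch of that step only, not the authors' post-processing; the function name is hypothetical and the curve is assumed to be given as two equal-length sequences in consistent units.

```python
from typing import Sequence, Tuple


def cohesive_parameters(displacement: Sequence[float],
                        stress: Sequence[float]) -> Tuple[float, float]:
    """Return (sigma_max, delta_n) from a sampled stress-displacement curve.

    sigma_max is the peak normal stress; delta_n is the displacement at
    which that peak occurs. If the peak value repeats, the first
    occurrence is used.
    """
    if len(displacement) != len(stress) or len(stress) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    peak_index = max(range(len(stress)), key=lambda i: stress[i])
    return stress[peak_index], displacement[peak_index]
```

For a made-up curve with displacements `[0.0, 0.1, 0.2, 0.3, 0.4]` and stresses `[0.0, 1.5, 2.4, 1.8, 0.6]`, the function returns `(2.4, 0.2)`. Repeating the call for each wall count or temperature would give the trends the abstract reports.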

