A Recommender for Choosing Data Systems based on Application Profiling and Benchmarking

2021 ◽  
Author(s):  
Elton Figueiredo de Souza Soares ◽  
Renan Souza ◽  
Raphael Melo Thiago ◽  
Marcelo de Oliveira Costa Machado ◽  
Leonardo Guerreiro Azevedo

In our data-driven society, there are hundreds of possible data systems in the market with a wide range of configuration parameters, making it very hard for enterprises and users to choose the most suitable data systems. There is a lack of representative empirical evidence to help users make an informed decision. Using benchmark results is a widely adopted practice, but just as there are many data systems, there are many benchmarks. This ongoing work presents the architecture and methods of a system that supports the recommendation of the most suitable data system for an application. We also illustrate how the recommendation would work in a fictitious scenario.

2021 ◽  
Author(s):  
Kerstin Lehnert ◽  
Daven Quinn ◽  
Basil Tikoff ◽  
Douglas Walker ◽  
Sarah Ramdeen ◽  
...  

Management of geochemical data needs to consider the sequence of phases in the lifecycle of these data from field to lab to publication to archive. It also needs to address the large variety of chemical properties measured; the wide range of materials that are analyzed; the different ways in which these materials may be prepared for analysis; the diversity of analytical techniques and instrumentation used to obtain analytical results; and the many ways used to calibrate and correct raw data, normalize them to standard reference materials, and otherwise treat them to obtain meaningful and comparable results. In order to extract knowledge from the data, they are then integrated and compared with other measurements, formatted for visualization, statistical analysis, or model generation, and finally cleaned and organized for publication and deposition in a data repository. Each phase in the geochemical data lifecycle has its specific workflows and metadata that need to be recorded to fully document the provenance of the data so that others can reproduce the results.

An increasing number of software tools are being developed to support the different phases of the geochemical data lifecycle. These include electronic field notebooks, digital lab books, and Jupyter notebooks for data analysis, as well as data submission forms and templates. These tools are mostly disconnected and often require manual transcription or copying and pasting of data and metadata from one tool to the other. In an ideal world, these tools would be connected so that field observations gathered in a digital field notebook, such as sample locations and sampling dates, can be seamlessly sent to an IGSN Allocating Agent to obtain, with a single click, a unique sample identifier with a QR code. The sample metadata would be readily accessible to the lab data management system, which allows researchers to capture information about the sample preparation and connects to the instrumentation to capture instrument settings and the raw data. The data would then be seamlessly accessed by data reduction software, visualized, and further compared to data from global databases that can be directly accessed. Ultimately, a few clicks would allow the user to format the data for publication and archiving.

Several data systems that support different stages in the lifecycle of samples and sample-based geochemical data have now come together to explore the development of standardized interfaces and APIs and consistent data and metadata schemas to link their systems into an efficient pipeline for geochemical data from the field to the archive. These systems include StraboSpot (www.strabospot.org; data system for digital collection, storage, and sharing of both field and lab data), SESAR (www.geosamples.org; sample registry and allocating agent for IGSN), EarthChem (www.earthchem.org; publisher and repository for geochemical data), Sparrow (sparrow-data.org; data system to organize analytical data and track project- and sample-level metadata), IsoBank (isobank.org; repository for stable isotope data), and MacroStrat (macrostrat.org; collaborative platform for geological data exploration and integration).


2021 ◽  
pp. 204141962199349
Author(s):  
Jordan J Pannell ◽  
George Panoutsos ◽  
Sam B Cooke ◽  
Dan J Pope ◽  
Sam E Rigby

Accurate quantification of the blast load arising from detonation of a high explosive has applications in transport security, infrastructure assessment and defence. In order to design efficient and safe protective systems in such aggressive environments, it is of critical importance to understand the magnitude and distribution of loading on a structural component located close to an explosive charge. In particular, peak specific impulse is the primary parameter that governs structural deformation under short-duration loading. Within this so-called extreme near-field region, existing semi-empirical methods are known to be inaccurate, and high-fidelity numerical schemes are generally hampered by a lack of available experimental validation data. As such, the blast protection community is not currently equipped with a satisfactory fast-running tool for load prediction in the near-field. In this article, a validated computational model is used to develop a suite of numerical near-field blast load distributions, which are shown to follow a similar normalised shape. This forms the basis of the data-driven predictive model developed herein: a Gaussian function is fit to the normalised loading distributions, and a power law is used to calculate the magnitude of the curve according to established scaling laws. The predictive method is rigorously assessed against the existing numerical dataset, and is validated against new test models and available experimental data. High levels of agreement are demonstrated throughout, with typical variations of <5% between experiment/model and prediction. The new approach presented in this article allows the analyst to rapidly compute the distribution of specific impulse across the loaded face of a wide range of target sizes and near-field scaled distances and provides a benchmark for data-driven modelling approaches to capture blast loading phenomena in more complex scenarios.
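
The two ingredients named in the abstract above, a Gaussian shape for the normalised distribution and a power law for its magnitude, can be sketched as follows. This is only an illustration under stated assumptions: the coordinate `x`, the width `sigma`, the power-law form `a * Z**b` in scaled distance `Z`, and all numerical values are hypothetical placeholders, not the fitted model or coefficients from the article.

```python
import math


def peak_specific_impulse(scaled_distance, a, b):
    """Power law for the curve magnitude: a * Z**b (a and b are placeholders)."""
    return a * scaled_distance ** b


def specific_impulse(x, scaled_distance, a, b, sigma):
    """Gaussian-shaped distribution over a normalised coordinate x,
    scaled by the power-law peak value."""
    peak = peak_specific_impulse(scaled_distance, a, b)
    return peak * math.exp(-(x ** 2) / (2.0 * sigma ** 2))


# Illustrative placeholder values only; they are not taken from the article.
A, B, SIGMA = 1.0, -1.5, 0.5
Z = 0.2

centre = specific_impulse(0.0, Z, A, B, SIGMA)  # value at the centre of the face
edge = specific_impulse(1.0, Z, A, B, SIGMA)    # value further along the face
```

Separating the shape from the magnitude means one normalised curve can be reused across scaled distances, with only the peak value recomputed.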


2021 ◽  
Vol 143 (3) ◽  
Author(s):  
Suhui Li ◽  
Huaxin Zhu ◽  
Min Zhu ◽  
Gang Zhao ◽  
Xiaofeng Wei

Conventional physics-based or experiment-based approaches for gas turbine combustion tuning are time consuming and cost intensive. Recent advances in data analytics provide an alternative method. In this paper, we present a cross-disciplinary study on the combustion tuning of an F-class gas turbine that combines machine learning with physics understanding. An artificial neural network (ANN) model is developed to predict the combustion performance (outputs), including NOx emissions, combustion dynamics, combustor vibrational acceleration, and turbine exhaust temperature. The inputs of the ANN model are identified by analyzing the key operating variables that impact the combustion performance, such as the pilot and premixed fuel flows and the inlet guide vane angle. The ANN model is trained on field data from an F-class gas turbine power plant. The trained model describes the combustion performance with acceptable accuracy over a wide range of operating conditions. In combination with a genetic algorithm, the model is applied to optimize the combustion performance of the gas turbine. Results demonstrate that the data-driven method offers a promising alternative for combustion tuning at low cost and with fast turnaround.


Author(s):  
Mohammed A. Alhossini ◽  
Collins G. Ntim ◽  
Alaa Mansour Zalata

This paper comprehensively reviews the current body of international accounting literature regarding advisory/monitoring committees and corporate outcomes. Specifically, it synthesizes, appraises, and extends current knowledge on the (a) theoretical (i.e., economic, accounting/corporate governance, sociological and socio-psychological) perspectives and (b) empirical evidence of the observable and less visible attributes at both the individual and committee levels and their link with a wide range of financial and non-financial corporate outcomes. Using the systematic literature review method, 304 articles from 59 journals in the fields of accounting and finance that were published between January 1992 and December 2018 are reviewed. The main findings are as follows. First, on the theoretical side, the reviewed studies predominantly either apply agency theory or apply no theory at all (descriptive studies), while integrated theoretical frameworks are lacking. Secondly, the existing empirical evidence focusses excessively on (a) monitoring instead of advisory committees and (b) observable rather than less visible committee attributes. Thirdly, a scarcity of cross-country studies is identified, along with methodological limitations relating to measurement inconsistencies, insufficiency of variables, and dominance of quantitative studies, among others. Finally, promising future research avenues are outlined.


2021 ◽  
Vol 17 (2) ◽  
pp. e1008635
Author(s):  
Gerrit Ansmann ◽  
Tobias Bollenbach

Many ecological studies employ general models that can feature an arbitrary number of populations. A critical requirement imposed on such models is clone consistency: if the individuals from two populations are indistinguishable, joining these populations into one shall not affect the outcome of the model. Otherwise, the model produces different outcomes for the same scenario. Using functional analysis, we comprehensively characterize all clone-consistent models: we prove that they are necessarily composed from basic building blocks, namely linear combinations of parameters and abundances. These strong constraints enable a straightforward validation of model consistency. Although clone consistency can always be achieved with sufficient assumptions, we argue that it is important to explicitly name and consider the assumptions made: they may not be justified, or they may limit the applicability of models and the generality of the results obtained with them. Moreover, our insights facilitate building new clone-consistent models, which we illustrate for a data-driven model of microbial communities. Finally, our insights point to new relevant forms of general models for theoretical ecology. Our framework thus provides a systematic way of comprehending ecological models, which can guide a wide range of studies.
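
The clone-consistency requirement described above can be checked numerically on a small example. The sketch below uses the generalised Lotka-Volterra model as a familiar stand-in, which is an assumption for illustration and not a model taken from the article; the function name `glv_rates` and all parameter values are hypothetical. It splits one population into two indistinguishable clones and compares the summed growth rate of the clones with that of the merged population.

```python
def glv_rates(abundances, growth, interactions):
    """Generalised Lotka-Volterra rates: dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(abundances)
    return [
        abundances[i]
        * (growth[i] + sum(interactions[i][j] * abundances[j] for j in range(n)))
        for i in range(n)
    ]


# Merged description: population A (abundance 3.0) and population B (abundance 2.0).
merged = glv_rates(
    [3.0, 2.0],
    [1.0, 0.5],
    [[-0.1, -0.2],
     [0.3, -0.4]],
)

# Split description: A is divided into two clones (1.0 and 2.0) that share
# A's growth rate and interaction coefficients; B is unchanged.
split = glv_rates(
    [1.0, 2.0, 2.0],
    [1.0, 1.0, 0.5],
    [[-0.1, -0.1, -0.2],
     [-0.1, -0.1, -0.2],
     [0.3, 0.3, -0.4]],
)

# Clone consistency: the clones together must change exactly as merged A does,
# and B must not notice the split.
clones_total_rate = split[0] + split[1]
```

The check passes here because each per-capita growth rate depends on the abundances only through a linear combination, so two clones with equal coefficients contribute the same term as their sum.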


2021 ◽  
Vol 108 (Supplement_7) ◽  
Author(s):  
Fatima Rahman ◽  
Alan Hales ◽  
Ryan Beegan ◽  
David Cable ◽  
David Rew

Background: Many surgeons work within multidisciplinary cancer teams. The Somerset Cancer Register (SCR) is a national reporting system for service performance which is in use in more than 100 NHS Trusts. However, the core system has not yet been optimised for MDT users or for the surfacing of clinical data for research and other uses. Methods: SCR replaced our legacy cancer reporting system in 2014. Working with the SCR developers, we integrated our cellular pathology and imaging records with the SCR MDT outputs. We subsequently developed SCR+ to optimise workflows for MDT coordinators and information presentation to clinical users. Results: Our HTML-enabled SCR+ software application displays all cancer patients by pathological type and year of presentation on dynamic histograms, for ease of visualisation and interaction. Every selected case is displayed in list order for each and every MDT meeting, with a fast hyperlink to our integral Lifelines EPR interface, to electronic pathology records back to 1990, and to our Breast Cancer Data System for relevant patients. Conclusions: The SCR+ module transforms the access and visualisation of cancer workload across our Trust for all authorised MDT users, with appropriate data security. The agile programming methodology allowed us to build a sustainable cancer data system with further development potential. The product substantially enhances user experience, data recall and productivity over legacy systems. Close cooperation between clinically proficient IT teams and clinicians as the end consumers of digital health data systems yields significant operational benefits at pace and with very modest costs.


Author(s):  
Patrick Gelß ◽  
Stefan Klus ◽  
Jens Eisert ◽  
Christof Schütte

A key task in the field of modeling and analyzing nonlinear dynamical systems is the recovery of unknown governing equations from measurement data only. There is a wide range of application areas for this important instance of system identification, ranging from industrial engineering and acoustic signal processing to stock market models. In order to find appropriate representations of underlying dynamical systems, various data-driven methods have been proposed by different communities. However, if the given data sets are high-dimensional, then these methods typically suffer from the curse of dimensionality. To significantly reduce the computational costs and storage consumption, we propose multidimensional approximation of nonlinear dynamical systems (MANDy), a method that combines data-driven methods with tensor network decompositions. The efficiency of the introduced approach is illustrated with the aid of several high-dimensional nonlinear dynamical systems.


2015 ◽  
Vol 282 (1818) ◽  
pp. 20152068 ◽  
Author(s):  
Veronika Bernhauerová ◽  
Luděk Berec ◽  
Daniel Maxin

Early male-killing (MK) bacteria are vertically transmitted reproductive parasites which kill male offspring that inherit them. Whereas their incidence is well documented, characteristics allowing originally non-MK bacteria to gradually evolve MK ability remain unclear. We show that horizontal transmission is a mechanism enabling vertically transmitted bacteria to evolve fully efficient MK under a wide range of host and parasite characteristics, especially when the efficacy of vertical transmission is high. We also show that an almost 100% vertically transmitted and 100% effective male-killer may evolve from a purely horizontally transmitted non-MK ancestor, and that a 100% efficient male-killer can form a stable coexistence only with a non-MK bacterial strain. Our findings are in line with the empirical evidence on current MK bacteria, explain their high efficacy in killing infected male embryos and their variability within and across insect taxa, and suggest that they may have evolved independently in phylogenetically distinct species.


Author(s):  
Madhumitha Ramachandran ◽  
Zahed Siddique

Rotary seals are found in many types of manufacturing equipment and machines used for various applications under a wide range of operating conditions. Rotary seal failure can be catastrophic and can lead to costly downtime and large expenses, so it is extremely important to assess the degradation of rotary seals to avoid fatal breakdown of machinery. Physics-based rotary seal prognostics require direct estimation of different physical parameters to assess the degradation of seals. Data-driven prognostics utilizing sensor technology and computational capabilities can, unlike the physics-based approach, aid in the indirect estimation of rotary seals' running condition. An important aspect of data-driven prognostics is to collect appropriate data in order to reduce the cost and time associated with data collection, storage and computation. Seals in machinery operate in harsh conditions; in the oil field especially, they are exposed to harsh environments and aggressive fluids, which gradually reduce the elastic modulus and hardness of the seals, resulting in lower friction torque and excessive leakage. Therefore, in this study we implement a data-driven prognostics approach which utilizes friction torque and leakage signals along with a Multilayer Perceptron as a classifier to compare the performance of the two metrics in classifying the running condition of rotary seals. Friction torque was found to perform better than leakage in differentiating the running condition of rotary seals throughout their service life. Although this approach was designed for seals in the oil and gas industry, it can be implemented in any manufacturing industry with similar applications.

