Innovative DevOps for Artificial Intelligence

2019 ◽  
Vol 19 (1) ◽  
pp. 58-63 ◽  
Author(s):  
R. Ciucu ◽  
F.C. Adochiei ◽  
Ioana-Raluca Adochiei ◽  
F. Argatu ◽  
G.C. Seriţan ◽  
...  

Abstract Developing artificial intelligence is a labor-intensive task that demands both storage and computational resources. In this paper, we present a state-of-the-art service-based infrastructure for deploying, managing, and serving computational models alongside their respective data sets and virtual environments. Our architecture uses a key-value store to hold specific graphs and data sets in memory for fast deployment and model training, thereby reducing the need for manual data reduction in the drafting and retraining stages. To develop the platform, we used clustering and orchestration to set up services and containers that allow deployment within seconds. In this article, we cover high-performance computing concepts such as swarming and GPU resource management for model implementation in production environments, with emphasis on standardized development to reduce integration tasks and on performance optimization.
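As a hedged illustration of the in-memory key-value idea described above, the sketch below caches serialized model graphs under versioned keys so cluster workers can fetch them within seconds. Redis is an assumed backend; the key scheme, function names, and pickle serialization are illustrative choices, not the paper's implementation.

```python
# Cache serialized computational graphs in a key-value store for fast
# deployment; Redis and the key scheme here are assumptions for illustration.
import pickle

import redis

store = redis.Redis(host="localhost", port=6379)

def publish_model(name: str, version: str, graph: object) -> None:
    """Serialize a computational graph and store it under a versioned key."""
    store.set(f"model:{name}:{version}", pickle.dumps(graph))

def fetch_model(name: str, version: str) -> object:
    """Load a graph straight from memory on any node in the cluster."""
    payload = store.get(f"model:{name}:{version}")
    if payload is None:
        raise KeyError(f"no such model: {name}:{version}")
    return pickle.loads(payload)
```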

2017 ◽  
Vol 20 (4) ◽  
pp. 1151-1159 ◽  
Author(s):  
Folker Meyer ◽  
Saurabh Bagchi ◽  
Somali Chaterji ◽  
Wolfgang Gerlach ◽  
Ananth Grama ◽  
...  

Abstract As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large-volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity, and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1–3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements of the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners. These versions will enable double barcoding, support stronger inferences through longer-read technologies, and increase throughput while maintaining sensitivity by using DIAMOND and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, both to facilitate development and to enable efficient high-performance implementation of the community's data analysis tasks.
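To make the specificity/sensitivity/run-time trade-off concrete, here is a minimal, hedged sketch of scoring one pipeline configuration against a gold-standard data set; `run_pipeline` and the identifier sets are hypothetical stand-ins, not MG-RAST interfaces.

```python
# Benchmark a pipeline change against a gold-standard data set, in the spirit
# of MG-RAST's evaluation of trade-offs. All names here are hypothetical.
import time

def benchmark(run_pipeline, reads, gold_positive: set):
    """Score one pipeline configuration; `reads` is (read_id, sequence) pairs."""
    start = time.perf_counter()
    predicted_positive = run_pipeline(reads)  # hypothetical: returns a set of read ids
    elapsed = time.perf_counter() - start

    all_ids = {read_id for read_id, _ in reads}
    tp = len(predicted_positive & gold_positive)
    fp = len(predicted_positive - gold_positive)
    fn = len(gold_positive - predicted_positive)
    tn = len(all_ids - predicted_positive - gold_positive)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity, elapsed
```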


2020 ◽  
Vol 496 (1) ◽  
pp. 629-637
Author(s):  
Ce Yu ◽  
Kun Li ◽  
Shanjiang Tang ◽  
Chao Sun ◽  
Bin Ma ◽  
...  

ABSTRACT Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time-domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series data sets. To support high-performance parallel processing of large-scale data sets, AstroCatR uses an extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses an overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or transformed into other formats as needed. At the same time, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational database management systems at matching massive catalogues.
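A minimal sketch of the zone-based matching idea described above: detections are bucketed into declination zones and matched against an in-memory reference table, checking neighbouring zones to handle the overlap. The zone height and match radius are illustrative values, not AstroCatR's actual parameters.

```python
# Zone-based cross-matching sketch: bucket reference objects by declination
# zone, then match each detection against its own zone and the neighbours.
import math
from collections import defaultdict

ZONE_HEIGHT_DEG = 0.1          # declination height of one sky zone (illustrative)
MATCH_RADIUS_DEG = 2.0 / 3600  # 2 arcsec match radius (illustrative)

def zone_of(dec: float) -> int:
    return int((dec + 90.0) / ZONE_HEIGHT_DEG)

def build_reference_table(objects):
    """Bucket reference objects (obj_id, ra, dec) by sky zone for fast lookup."""
    table = defaultdict(list)
    for obj_id, ra, dec in objects:
        table[zone_of(dec)].append((obj_id, ra, dec))
    return table

def match(ra: float, dec: float, table):
    """Return the id of the nearest reference object within the match radius."""
    best, best_dist = None, MATCH_RADIUS_DEG
    z = zone_of(dec)
    for zone in (z - 1, z, z + 1):  # overlapped indexing: also check neighbours
        for obj_id, r, d in table.get(zone, []):
            # small-angle approximation of angular separation in degrees
            dist = math.hypot((ra - r) * math.cos(math.radians(dec)), dec - d)
            if dist <= best_dist:
                best, best_dist = obj_id, dist
    return best
```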


2020 ◽  
Vol 34 (04) ◽  
pp. 4075-4082
Author(s):  
Yufei Han ◽  
Xiangliang Zhang

For federated learning systems deployed in the wild, data flaws on local agents are widely observed. On one hand, when a large fraction (e.g. over 60%) of the training data is corrupted by systematic sensor noise and environmental perturbations, the performance of federated model training can degrade significantly. On the other hand, it is prohibitively expensive for either clients or service providers to set up manual sanitary checks to verify the quality of data instances. In our study, we address this challenge by proposing a collaborative and privacy-preserving machine teaching method. Specifically, we use a few trusted instances provided by teachers as benign examples in the teaching process. Our collaborative teaching approach jointly seeks the optimal tuning of the distributed training set, such that the model learned from the tuned training set correctly predicts the labels of the trusted items. The proposed method couples the processes of teaching and learning and thus directly produces a robust prediction model despite extremely pervasive systematic data corruption. An experimental study on real benchmark data sets demonstrates the validity of our method.
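A hedged sketch of the coupled teach-and-learn loop follows: weights on a possibly corrupted training set are tuned until the learned model fits the small trusted set. The halving heuristic below is an illustration only; the paper's joint optimization is not reproduced here.

```python
# Illustrative teach-and-learn loop: tune instance weights on a corrupted
# training set so the learned model fits teacher-provided trusted items.
import numpy as np
from sklearn.linear_model import LogisticRegression

def collaborative_teach(X_train, y_train, X_trusted, y_trusted, rounds: int = 10):
    weights = np.ones(len(X_train))
    model = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        model.fit(X_train, y_train, sample_weight=weights)
        if model.score(X_trusted, y_trusted) == 1.0:
            break  # trusted items are all predicted correctly
        # Halve the influence of training items the current model rejects;
        # corrupted labels tend to accumulate low weight across rounds.
        disagree = model.predict(X_train) != y_train
        weights[disagree] *= 0.5
    return model, weights
```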


2018 ◽  
Vol 66 (4) ◽  

The restorative qualities of sleep are fundamental to the individual athlete's ability to recover and perform, and to optimally challenge and control the effects of exercise regimes in high-performance sport. Research consistently shows that a large percentage of the population fails to obtain the recommended 7–9 hours of sleep per night [17]. Moreover, recent research has found that athletes have a high prevalence of poor sleep quality [6]. Given its implications for the recovery process, sleep affects the quality of the athlete's training and the outcome of competitions. Although an increasing number of recovery aids (such as cold baths, anti-inflammatory agents, and high protein intake) are available, recent research shows the important and irreplaceable role of sleep: no recovery method can compensate for its lack. Every facet of an athlete's life has the capacity to either create or drain energy, and to contribute to the overall stress level and consequently the level of both recovery and performance. While traditional approaches to performance optimization focus solely on physical stressors, this overview will highlight the benefits and basic principles of sleep and its relation to recovery and performance, and provide input and reflection on what to consider when working with the development and maintenance of athletic performance.


2021 ◽  
Author(s):  
Murtadha Al-Habib ◽  
Yasser Al-Ghamdi

Abstract Extensive computing resources are required to leverage today's advanced geoscience workflows, which are used to explore and characterize giant petroleum resources. In these cases, high-performance workstations are often unable to adequately handle the scale of computing required. The workflows typically utilize complex and massive data sets, which require advanced computing resources to store, process, manage, and visualize the various forms of the data throughout their lifecycles. This work describes a large-scale end-to-end geoscience interpretation platform customized to run on a cluster-based remote visualization environment. A team of computing infrastructure and geoscience workflow experts was established to collaborate on the deployment, which was broken down into separate phases. Initially, an evaluation and analysis phase was conducted to analyze computing requirements and assess potential solutions. A testing environment was then designed, implemented, and benchmarked. The third phase used the test environment to determine the scale of infrastructure required for the production environment. Finally, the full-scale customized production environment was deployed for end users. During the testing phase, aspects such as connectivity, stability, interactivity, functionality, and performance were investigated using the largest available geoscience datasets. Multiple computing configurations were benchmarked until optimal performance was achieved, under applicable corporate information security guidelines. It was observed that the customized production environment was able to execute workflows that could not run on local user workstations. For example, while conducting connectivity, stability, and interactivity benchmarking, the test environment was operated for extended periods to ensure stability for workflows that require multiple days to run. To estimate the scale of the required production environment, categories of user portfolios were determined based on data type, scale, and workflow. Continuous monitoring of system resources and utilization enabled continuous improvements to the final solution. The utilization of a fit-for-purpose, customized remote visualization solution may reduce or ultimately eliminate the need to deploy high-end workstations to all end users. Rather, a shared, scalable, and reliable cluster-based solution can serve a much larger user community in a highly performant manner.
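As a hedged illustration of the continuous resource monitoring mentioned above, the sketch below samples CPU and memory utilization at a fixed interval; psutil is an assumed tool choice, and the interval and output format are arbitrary.

```python
# Periodically sample and log host CPU and memory utilization; an
# illustrative monitoring loop, not the deployment's actual tooling.
import time

import psutil

def monitor(interval_s: float = 5.0, samples: int = 12) -> None:
    """Print CPU and memory utilization once per interval."""
    for _ in range(samples):
        cpu = psutil.cpu_percent(interval=interval_s)  # blocks for interval_s
        mem = psutil.virtual_memory().percent
        print(f"{time.strftime('%H:%M:%S')}  cpu={cpu:5.1f}%  mem={mem:5.1f}%")

if __name__ == "__main__":
    monitor()
```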


2021 ◽  
Vol 11 (24) ◽  
pp. 11952
Author(s):  
Xu Zhou ◽  
Tao Wen ◽  
Zhiqiang Long

With the success of the commercial operation of the maglev train, the demand for real-time monitoring and high-performance control of the maglev train suspension system is also increasing. Therefore, a framework for performance monitoring and performance optimization of the maglev train suspension system is proposed in this article. The framework consists of four parts: plant, feedback controller, residual generator, and dynamic compensator. First, after the system model is established, a nominal controller is designed to ensure the stability of the system. Second, the observer-based residual generator is identified offline from input and output data, without requiring an accurate model of the system, which avoids interference from the unmodelled part. Third, control performance is monitored and evaluated in real time by analyzing the residual and executing the judgment logic. Fourth, when the control performance of the system degrades or is unsatisfactory, the residual-based dynamic compensator is updated online and iteratively to optimize the control performance. Finally, the proposed framework and theory are verified on a single-suspension experimental platform, and the results demonstrate their effectiveness.
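A minimal, hedged sketch of the residual-based judgment logic: when the energy of the recent residual sequence exceeds a threshold, performance is judged degraded and the compensator update is triggered. The window length, threshold, and function names are illustrative, not the paper's design.

```python
# Residual-based performance monitoring sketch: evaluate the recent residual
# energy and trigger an online compensator update when it is too large.
import numpy as np

THRESHOLD = 0.05  # acceptable mean residual energy (illustrative)
WINDOW = 200      # number of recent samples to evaluate (illustrative)

def performance_degraded(residuals: np.ndarray) -> bool:
    """Judgment logic: compare recent residual energy against the threshold."""
    energy = float(np.mean(residuals[-WINDOW:] ** 2))
    return energy > THRESHOLD

def monitor_step(residuals: np.ndarray, update_compensator) -> None:
    """Trigger the iterative online compensator update when performance degrades."""
    if performance_degraded(residuals):
        update_compensator(residuals)  # hypothetical online update hook
```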


2014 ◽  
Author(s):  
R Daniel Kortschak ◽  
David L Adelson

bíogo is a framework designed to ease the development and maintenance of computationally intensive bioinformatics applications. The library is written in the Go programming language, a garbage-collected, strictly typed, compiled language with built-in support for concurrent processing and performance comparable to C and Java. It provides a variety of data types and utility functions to facilitate the manipulation and analysis of large-scale genomic and other biological data. bíogo uses a concise and expressive syntax, lowering the barrier to entry for researchers who need to process large data sets with custom analyses while retaining computational safety and ease of code review. We believe bíogo provides an excellent environment for training and research in computational biology because of its combination of strict typing, simple and expressive syntax, and high performance.


Author(s):  
Abiola Azeez ◽  
Tosin Adeate

The idea of Afro-existentialism concerns how Africans make sense of living and of the meaning and meaninglessness attached to human existence. Different phenomena inform the way humans interpret existence, and one such phenomenon in the contemporary period, with great influence on Africans, is human involvement with non-human intelligence (AI) in its different eruptions. This paper focuses on second-wave AI, a period of improved simulation of natural intelligence whose singularity principle hypothesizes individualist motives. The paper asks: to what extent do Afro-existential norms accommodate second-wave AI? Partly in disagreement with the claim that AI is for everyone, we argue that second-wave artificial intelligence adapts weakly to Afro-existential practices, which are largely communal and emphasize shared experience. We justify this claim by arguing that the Western ethical patterns which inform the features of second-wave AI, such as statistical patterns, smart algorithms, specialized hardware, and big data sets, emerge from individualist notions. This paper argues that second-wave AI trends do not reflect African norms of existence being factored into the ordering of the algorithmic patterns that set up AI systems and programs. We infer that Afro-existential practice sits uneasily with the individualist principle underlying second-wave AI, and that a conversation around the development and application of a communal interpretation of AI is therefore important.


2011 ◽  
Vol 328-330 ◽  
pp. 933-938
Author(s):  
Ze Min Zhou ◽  
Chun Liang Zhang ◽  
Yue Hua Xiong

With the constant growth of embedded technology, embedded products are in use almost everywhere in our lives. This paper mainly introduces how to set up a Qt/Embedded cross-compiling environment on a Linux system platform, and describes the development and porting of an embedded fault diagnosis system using the Qt/Embedded development tools. The system achieves its functions, including fault display, signal analysis, diagnostic model training, and data storage and transmission, through graphical interface design and the addition of performance functions. Cross-compiling then produces an application binary targeting an ARM-based embedded processor. Finally, the application is ported to and run on the target platform, and its diagnostic effectiveness is tested.

