Carry-Propagation-Adder-Factored Gemmini Systolic Array for Machine Learning Acceleration

Electronics ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 652
Author(s):  
Kashif Inayat ◽  
Jaeyong Chung

Systolic arrays are the primary component of modern deep learning accelerators and are widely used in real-life applications such as self-driving cars. This paper presents a novel factored systolic array, in which the carry-propagation adder for accumulation and the rounding logic are extracted from each processing element, substantially reducing the area, power, and delay of the processing elements. The factoring is performed in a column-wise manner, and the cost of the factored logic, placed at each column output, is amortized across the processing elements in a column. We demonstrate the proposed factoring in an open-source systolic array, Gemmini. The factoring technique does not change the functionality of the base design and is transparent to applications. We show that the proposed technique leads to substantial reductions in area and delay, up to 45.3% and 23.7%, respectively, compared to the Gemmini baseline.
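As a rough illustration of the idea, here is a minimal behavioral sketch in Python (not the paper's RTL): each processing element keeps its accumulator in redundant carry-save form, so no carry chain is needed inside the PE, and a single carry-propagate addition is applied once at the column output. The names and the integer-level carry-save model are illustrative assumptions.

```python
def csa_add(s, c, x):
    """3:2 compress: fold x into a (sum, carry) pair without carry propagation."""
    new_s = s ^ c ^ x                             # bitwise sum
    new_c = ((s & c) | (s & x) | (c & x)) << 1    # carries, shifted left
    return new_s, new_c

def column_macc(weights, activations):
    """One systolic column: multiply-accumulate kept in carry-save form."""
    s, c = 0, 0
    for w, a in zip(weights, activations):
        s, c = csa_add(s, c, w * a)   # PE work: multiply + 3:2 compression only
    return s + c                      # factored CPA, applied once per column

# Sanity check against a plain dot product
ws, xs = [3, 5, 7], [2, 4, 6]
assert column_macc(ws, xs) == sum(w * x for w, x in zip(ws, xs))
```

The point of the factoring is visible in the structure: the expensive carry-propagating `s + c` happens once per column rather than once per processing element.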


2020 ◽  
Author(s):  
Velimir Ilić ◽  
Alessandro Bertolini ◽  
Fabio Bonsignorio ◽  
Dario Jozinović ◽  
Tomasz Bulik ◽  
...  

The analysis of low-frequency gravitational wave (GW) data is a crucial mission of GW science, and the performance of Earth-based GW detectors largely depends on the ability to combat low-frequency ambient seismic noise and other seismic influences. This task requires multidisciplinary research in the fields of seismic sensing, signal processing, robotics, machine learning, and mathematical modeling.

In practice, this kind of research is conducted by large teams of researchers with different expertise, so project management emerges as an important real-life challenge in projects for the acquisition, processing, and interpretation of seismic data from GW detector sites. A prominent example that successfully deals with this aspect is the COST Action G2Net (CA17137 - A network for Gravitational Waves, Geophysics and Machine Learning) and its seismic research group, which counts more than 30 members.

In this talk we will review the structure of the group, present its goals and recent activities, and present new methods for combating seismic influences at GW detector sites that will be developed and applied within this collaboration.

This publication is based upon work from CA17137 - A network for Gravitational Waves, Geophysics and Machine Learning, supported by COST (European Cooperation in Science and Technology).



Author(s):  
María Dolores Torres ◽  
Aurora Torres Soto ◽  
Carlos Alberto Ochoa Ortiz Zezzatti ◽  
Eunice E. Ponce de León Sentí ◽  
Elva Díaz Díaz ◽  
...  

This chapter presents the implementation of a genetic algorithm within a machine learning framework that addresses the problem of identifying the factors that affect the health of newborns in Mexico. Experimental results show a correct-clustering rate of 89% for unsupervised learning, and a real-life training matrix of 46 variables was reduced to only 25 variables, 54% of its original size. Moreover, execution time is about one and a half minutes. Each neonatal health risk factor found by the algorithm was validated by medical experts. The contribution to the medical field is invaluable, since the cost of monitoring these features is minimal and doing so can reduce neonatal mortality in our country.
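A hypothetical sketch of genetic-algorithm feature selection in this spirit: chromosomes are 46-bit masks over the variables, and fitness rewards clustering quality while lightly penalizing subset size. The fitness callable, population size, and operators are assumptions, not the chapter's exact configuration.

```python
import random

N_VARS, POP, GENS = 46, 30, 100

def fitness(mask, evaluate_clustering):
    k = sum(mask)
    if k == 0:
        return 0.0
    # evaluate_clustering: user-supplied clustering score in [0, 1] for this subset
    return evaluate_clustering(mask) - 0.005 * k   # small parsimony penalty

def evolve(evaluate_clustering):
    pop = [[random.randint(0, 1) for _ in range(N_VARS)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=lambda m: fitness(m, evaluate_clustering), reverse=True)
        survivors = pop[:POP // 2]                  # truncation selection
        children = []
        while len(survivors) + len(children) < POP:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_VARS)       # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(N_VARS)] ^= 1    # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: fitness(m, evaluate_clustering))

# Toy stand-in objective: prefer subsets covering the first 25 variables
best = evolve(lambda m: sum(m[:25]) / 25.0)
print(sum(best), "variables selected")
```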



Author(s):  
Anca Sailer ◽  
Ruchi Mahindru ◽  
Yang Song ◽  
Xing Wei

Problem determination and resolution (PDR) is at the core of Incident and Problem Management. PDR is the process of detecting an anomaly in a monitored system, identifying the nature of the anomaly so that it can be routed to the appropriate support team, determining the root cause of the anomaly, and fixing or eliminating the cause of the problem. The cost of PDR represents a substantial part of operational costs, and faster, more effective PDR can contribute to a substantial reduction in system administration costs. The methodologies described by the authors in this chapter relate to the automation of critical aspects of PDR, such as problem classification for targeted diagnosis and the structuring of solved problem tickets to offer systematized resolutions to support personnel.
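As a purely illustrative sketch of the problem-classification step (not the authors' system), the fragment below routes free-text tickets to support teams with a TF-IDF bag-of-words model and a linear classifier from scikit-learn; the tickets, team labels, and model choice are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: ticket text paired with the team that resolved it
tickets = ["db connection pool exhausted", "disk latency spiking on node 7",
           "login page returns HTTP 500", "backup job failed overnight"]
teams   = ["database", "storage", "web", "storage"]

router = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
router.fit(tickets, teams)
print(router.predict(["replica db connection timeouts"]))  # likely ['database'] on this toy data
```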



2021 ◽  
Vol 14 (3) ◽  
pp. 1-21
Author(s):  
Roy Abitbol ◽  
Ilan Shimshoni ◽  
Jonathan Ben-Dov

The task of assembling fragments in a puzzle-like manner into a composite picture plays a significant role in the field of archaeology, as it supports researchers in their attempts to reconstruct historic artifacts. In this article, we propose a method for matching and assembling pairs of ancient papyrus fragments containing mostly unknown scriptures. Papyrus paper is manufactured from papyrus plants and therefore exhibits the typical thread patterns of the plant's stems. The proposed algorithm is founded on the hypothesis that these thread patterns contain unique local attributes, such that nearby fragments show similar patterns reflecting the continuations of the threads. We posit that these patterns can be exploited using image processing and machine learning techniques to identify matching fragments. The algorithm and system we present support the quick, automated classification of matching pairs of papyrus fragments as well as the geometric alignment of the pairs against each other. The algorithm consists of a series of steps and is based on deep-learning and machine learning methods. The first step is to decompose the problem of matching fragments into the smaller problem of finding thread-continuation matches in local edge areas (squares) between pairs of fragments. This phase is solved using a convolutional neural network that ingests raw images of the edge areas and produces local matching scores. This stage yields very high recall but low precision. We therefore use these scores to decide whether entire fragment pairs match by establishing an elaborate voting mechanism, which we enhance with geometric alignment techniques from which we extract additional spatial information. Eventually, we feed all the data collected from these steps into a Random Forest classifier to produce a higher-order classifier capable of predicting whether a pair of fragments is a match. Our algorithm was trained on a batch of fragments excavated from the Dead Sea caves and dated to circa the first century BCE. The algorithm shows excellent results on a validation set of similar origin and condition. We then ran the algorithm against a real-life set of fragments for which we have no prior knowledge or labeling of matches. This test batch is considered extremely challenging due to its poor condition and the small size of its fragments; indeed, numerous researchers have sought matches within this batch with very little success. Our algorithm's performance on this batch was sub-optimal, returning a relatively large ratio of false positives. However, the algorithm was quite useful in eliminating 98% of the possible matches, thus reducing the amount of work needed for manual inspection. Indeed, experts who reviewed the results identified some positive matches as potentially true and referred them for further investigation.
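An illustrative sketch of the aggregation stage described above: local CNN matching scores for the edge squares of a fragment pair are condensed into a feature vector by a simple voting scheme, and a Random Forest then decides whether the pair matches. The threshold, features, and synthetic labels are assumptions, not values from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pair_features(local_scores, thresh=0.9):
    """Condense per-square CNN scores for one fragment pair into voting features."""
    s = np.asarray(local_scores)
    votes = (s > thresh).sum()           # count of high-scoring local matches
    return [votes, s.max(), s.mean(), (s > thresh).mean()]

# X: one row of voting features per candidate pair; y: placeholder match labels
rng = np.random.default_rng(0)
X = [pair_features(rng.random(20)) for _ in range(100)]
y = rng.integers(0, 2, 100)

clf = RandomForestClassifier(n_estimators=200).fit(X, y)
print(clf.predict([pair_features(rng.random(20))]))   # 1 = predicted match
```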



Polymers ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 353
Author(s):  
Kun-Cheng Ke ◽  
Ming-Shyan Huang

Conventional methods for assessing the quality of components mass-produced using injection molding are expensive and time-consuming, or involve imprecise statistical process control parameters. A suitable alternative is to employ machine learning to classify the quality of parts by using quality indices and quality grading. In this study, we used a multilayer perceptron (MLP) neural network along with a few quality indices to accurately predict whether the geometric shape of a finished product is "qualified" or "unqualified". These quality indices, which exhibit a strong correlation with part quality, were extracted from pressure curves and input into the MLP model for learning and prediction. By filtering outliers from the input data and converting the measured quality into quality grades used as output data, we increased the prediction accuracy of the MLP model and classified the quality of finished parts into various quality levels. The MLP model may misjudge datapoints in the "to-be-confirmed" area, which lies between the "qualified" and "unqualified" areas. We identified this "to-be-confirmed" area, and only products falling in it were evaluated further, which reduced the cost of quality control considerably. An integrated circuit tray was manufactured to experimentally demonstrate the feasibility of the proposed method.
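A minimal sketch of the three-band grading idea, with assumed probability thresholds and synthetic stand-in data: an MLP predicts part quality from pressure-curve quality indices, and parts whose predicted probability falls between the "qualified" and "unqualified" bands are flagged "to-be-confirmed" for further inspection.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.random((200, 4))                     # 4 quality indices per molding shot
y = (X.sum(axis=1) > 2.0).astype(int)        # placeholder quality labels

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(X, y)

def grade(indices, lo=0.3, hi=0.7):
    """Map predicted probability to a three-level quality grade."""
    p = mlp.predict_proba([indices])[0, 1]
    if p >= hi:
        return "qualified"
    if p <= lo:
        return "unqualified"
    return "to-be-confirmed"                 # only these need manual evaluation

print(grade([0.6, 0.5, 0.55, 0.4]))
```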



2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Zhikuan Zhao ◽  
Jack K. Fitzsimons ◽  
Patrick Rebentrost ◽  
Vedran Dunjko ◽  
Joseph F. Fitzsimons

Machine learning has recently emerged as a fruitful area for finding potential quantum computational advantage. Many of the quantum-enhanced machine learning algorithms critically hinge upon the ability to efficiently produce states proportional to high-dimensional data points stored in a quantum accessible memory. Even given query access to exponentially many entries stored in a database, the construction of which is considered a one-off overhead, it has been argued that the cost of preparing such amplitude-encoded states may offset any exponential quantum advantage. Here we prove using smoothed analysis that if the data analysis algorithm is robust against small entry-wise input perturbations, state preparation can always be achieved with a constant number of queries. This criterion is typically satisfied in realistic machine learning applications, where input data is subject to moderate noise. Our results are equally applicable to the recent seminal progress in quantum-inspired algorithms, where specially constructed databases suffice for polylogarithmic classical algorithms in the low-rank case. The consequence of our finding is that, for the purpose of practical machine learning, polylogarithmic processing time is possible under a general and flexible input model, with quantum algorithms or, in the low-rank case, quantum-inspired classical algorithms.
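For context, the amplitude-encoded state whose preparation cost is at issue is the textbook encoding below (standard notation, not taken from the paper):

```latex
% Amplitude encoding of a data vector x \in \mathbb{R}^N with N = 2^n:
\[
  |x\rangle \;=\; \frac{1}{\lVert x \rVert_2} \sum_{i=0}^{N-1} x_i \, |i\rangle .
\]
% The smoothed-analysis claim above: if the downstream algorithm tolerates
% small entry-wise perturbations x_i \mapsto x_i + \epsilon_i, then |x>
% can be prepared with O(1) queries to the quantum-accessible memory.
```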



Author(s):  
Amrik Singh ◽  
K.R. Ramkumar

Due to the advancement of medical sensor technologies, new vectors can be added to health insurance packages. Such medical sensors can help both the health and the insurance sector construct mathematical risk equation models with parameters that map real-life risk conditions. In this paper, the parameters are analyzed in terms of medical relevance as well as correlation. Treating this as an 'inverse problem', mathematical relationships between the risk indicators were derived and tested against ground truth. The pairwise correlation analysis yields a stable mathematical equation model that can be used for health risk analysis. The equation gives coefficient values from which a health insurance risk classification can be derived and quantified. The logistic regression equation model gives the highest accuracy (86.32%) among the Ridge, Bayesian, and ordinary least squares algorithms. A machine learning based risk analysis approach was then formulated, and a series of experiments shows that the K-Nearest Neighbors classifier achieves the highest accuracy, 93.21%, for risk classification.
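An illustrative sketch of the two-stage comparison reported above, on synthetic stand-in data: fit a logistic regression risk model for interpretable per-parameter coefficients, then a K-Nearest Neighbors classifier for the final risk classification. The feature set and weights are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
X = rng.random((300, 5))                 # e.g., sensor-derived health parameters
y = (X @ np.array([0.8, 0.5, 1.2, 0.1, 0.3]) > 1.4).astype(int)  # toy risk labels

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

logit = LogisticRegression().fit(Xtr, ytr)
print("risk coefficients:", logit.coef_[0])   # per-parameter weights for the equation model

knn = KNeighborsClassifier(n_neighbors=5).fit(Xtr, ytr)
print("KNN accuracy:", knn.score(Xte, yte))   # final risk classifier
```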



2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Miles L. Timpe ◽  
Maria Han Veiga ◽  
Mischa Knabenhans ◽  
Joachim Stadel ◽  
Stefano Marelli

In the late stages of terrestrial planet formation, pairwise collisions between planetary-sized bodies act as the fundamental agent of planet growth. These collisions can lead to either growth or disruption of the bodies involved and are largely responsible for shaping the final characteristics of the planets. Despite their critical role in planet formation, an accurate treatment of collisions has yet to be realized. While semi-analytic methods have been proposed, they remain limited to a narrow set of post-impact properties and have only achieved relatively low accuracies. However, the rise of machine learning and access to increased computing power have enabled novel data-driven approaches. In this work, we show that data-driven emulation techniques are capable of classifying and predicting the outcome of collisions with high accuracy and are generalizable to any quantifiable post-impact quantity. In particular, we focus on the dataset requirements, training pipeline, and classification and regression performance for four distinct data-driven techniques from machine learning (ensemble methods and neural networks) and uncertainty quantification (Gaussian processes and polynomial chaos expansion). We compare these methods to existing analytic and semi-analytic methods. Such data-driven emulators are poised to replace the methods currently used in N-body simulations, while avoiding the cost of direct simulation. This work is based on a new set of 14,856 SPH simulations of pairwise collisions between rotating, differentiated bodies at all possible mutual orientations.
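A hedged sketch of the emulation idea: train an ensemble regressor on precomputed collision outcomes so that an N-body code can query the surrogate instead of running an SPH simulation. The inputs and output here (mass ratio, impact velocity, impact angle mapping to a toy remnant-mass fraction) are stand-ins, not the paper's feature set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
X = rng.random((1000, 3))                         # [mass ratio, v / v_esc, impact angle]
y = np.clip(1.0 - 0.7 * X[:, 1] * (1 - X[:, 2]), 0, 1)  # toy outcome surface

# Fit once offline; query millions of times during an N-body integration
emulator = GradientBoostingRegressor().fit(X, y)
print(emulator.predict([[0.5, 1.2, 0.3]]))        # fast surrogate for one impact
```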



2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Hyungmin Cho

Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computation load and the number of parameters compared to conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have a low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts PE utilization for depthwise convolutions on a systolic array with minimal overhead. In addition, the PEs in systolic arrays can be efficiently used only if the data items (tensors) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange data items and manage data movement during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself, eliminating the need for an additional tensor-reshaping module. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models, while maintaining high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.
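A back-of-the-envelope sketch of the data-reuse gap named above: in a standard convolution each input activation feeds every output channel, while in a depthwise convolution it feeds only its own channel, so the MACs performed per fetched activation collapse, leaving most PEs of a channel-parallel systolic array idle. The shapes are illustrative.

```python
def macs_per_input_activation(c_out, k, depthwise):
    # each input value participates in k*k window positions, times either
    # all c_out filters (standard) or just its own channel's filter (depthwise)
    return k * k * (1 if depthwise else c_out)

print(macs_per_input_activation(c_out=128, k=3, depthwise=False))  # 1152
print(macs_per_input_activation(c_out=128, k=3, depthwise=True))   # 9
```

On a weight-stationary array whose columns map to output channels, this roughly 1/c_out reuse is what shows up as under-utilized PEs.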


