Coded machine learning: Joint informed replication and learning for linear regression

Author(s):  
Shahroze Kabir ◽  
Frederic Sala ◽  
Guy Van den Broeck ◽  
Lara Dolecek
Soil Systems ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 41
Author(s):  
Tulsi P. Kharel ◽  
Amanda J. Ashworth ◽  
Phillip R. Owens ◽  
Dirk Philipp ◽  
Andrew L. Thomas ◽  
...  

Silvopasture systems combine tree and livestock production to minimize market risk and enhance ecological services. Our objective was to explore and develop a method for identifying driving factors linked to productivity in a silvopastoral system using machine learning. A multi-variable approach was used to detect factors that affect system-level output (i.e., plant production (tree and forage), soil factors, and animal response based on grazing preference). Variables from a three-year (2017–2019) grazing study, including forage, tree, soil, and terrain attribute parameters, were analyzed. Hierarchical variable clustering and a random forest model selected 10 important variables for each of four major clusters. Stepwise multiple linear regression and a regression tree were used to predict cattle grazing hours per animal unit (h ha−1 AU−1) using 40 variables (10 per cluster) selected from 130 total variables. Overall, the variable ranking method selected more weighted variables for systems-level analysis. The regression tree performed better than stepwise linear regression for interpreting factor-level effects on animal grazing preference. Cattle were more likely to graze forage on soils with Cd levels <0.04 mg kg−1 (126% greater grazing hours per AU), soil Cr <0.098 mg kg−1 (108%), and a SAGA wetness index of <2.7 (57%). Cattle also preferred grazing (88%) native grasses compared to orchardgrass (Dactylis glomerata L.). The results show that water flow within the landscape position (wetness index) and the associated metal distribution may be used as indicators of animal grazing preference. Overall, soil nutrient distribution patterns drove grazing response, although animal grazing preference was also influenced by aboveground (forage and tree), soil, and landscape attributes. Machine learning approaches helped explain pasture use and overall drivers of grazing preference in a multifunctional system.
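The threshold-style findings above (for example, grazing hours differing on either side of a soil metal level) are the kind of output a regression tree produces. As a rough illustration only, the following Python sketch finds the single best split of one predictor by minimizing the summed squared error of the two resulting groups. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the one-variable split
    that minimizes the total squared error of the two groups."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold, mean(left), mean(right))
    return best[1:]


# Made-up toy data: one predictor and a response that drops above x = 4.
toy_x = [1, 2, 3, 5, 6, 7]
toy_y = [10, 11, 9, 4, 5, 3]
threshold, low_side_mean, high_side_mean = best_split(toy_x, toy_y)
```

A full regression tree repeats this search recursively over all predictors; the sketch shows only the first split.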


2021 ◽  
Vol 11 (9) ◽  
pp. 3866
Author(s):  
Jun-Ryeol Park ◽  
Hye-Jin Lee ◽  
Keun-Hyeok Yang ◽  
Jung-Keun Kook ◽  
Sanghee Kim

This study aims to predict the compressive strength of concrete using a machine-learning algorithm with linear regression analysis and to evaluate its accuracy. The open-source software library TensorFlow was used to develop the machine-learning algorithm. In the machine-learning algorithm, a total of seven variables were set: water, cement, fly ash, blast furnace slag, sand, coarse aggregate, and coarse aggregate size. A total of 4297 concrete mixtures with measured compressive strengths were employed to train and test the machine-learning algorithm. Of these, 70% were used for training, and 30% were utilized for verification. For verification, the research was conducted by classifying the mixtures into three cases: the case where the machine-learning algorithm was trained using all the data (Case-1), the case where the machine-learning algorithm was trained while maintaining the same number of training samples for each strength range (Case-2), and the case where the machine-learning algorithm was trained after creating a subcase for each strength range (Case-3). The results indicated that the error percentages of Case-1 and Case-2 did not differ significantly. The error percentage of Case-3 was far smaller than those of Case-1 and Case-2. Therefore, it was concluded that the range of the training data for concrete compressive strength is as important as the amount of training data for accurately predicting the concrete compressive strength using the machine-learning algorithm.
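To make the idea of splitting by strength range concrete, here is a minimal Python sketch that groups mixtures into strength bins and applies a 70/30 split inside each bin. This is an assumed, simplified stand-in rather than the authors' procedure: the bin edges, the `strength` field name, and the function name are all hypothetical, and the toy records carry only a made-up strength value.

```python
import random


def split_by_strength_range(mixtures, bin_edges, train_percent=70, seed=0):
    """Split records into train/test sets separately within each strength bin.

    bin_edges are hypothetical lower bounds (e.g. in MPa) that separate bins.
    """
    rng = random.Random(seed)
    bins = {}
    for mix in mixtures:
        # Bin index = number of edges the strength has reached or passed.
        index = sum(1 for edge in bin_edges if mix["strength"] >= edge)
        bins.setdefault(index, []).append(mix)

    train, test = [], []
    for index in sorted(bins):
        group = bins[index][:]
        rng.shuffle(group)
        cut = len(group) * train_percent // 100  # integer math avoids float rounding
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Made-up toy records: 30 mixtures with strengths 10, 12, ..., 68.
toy_mixtures = [{"strength": s} for s in range(10, 70, 2)]
train_set, test_set = split_by_strength_range(toy_mixtures, bin_edges=[20, 40, 60])
```

Fitting one model per bin on the corresponding part of `train_set` would be one way to approximate the per-range subcases described above.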


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Prasanna Date ◽  
Davis Arthur ◽  
Lauren Pusey-Nazzaro

Training machine learning models on classical computers is usually a time- and compute-intensive process. With Moore’s law nearing its inevitable end and an ever-increasing demand for large-scale data analysis using machine learning, we must leverage non-conventional computing paradigms like quantum computing to train machine learning models efficiently. Adiabatic quantum computers can approximately solve NP-hard problems, such as the quadratic unconstrained binary optimization (QUBO), faster than classical computers. Since many machine learning problems are also NP-hard, we believe adiabatic quantum computers might be instrumental in training machine learning models efficiently in the post-Moore’s law era. In order to solve problems on adiabatic quantum computers, they must be formulated as QUBO problems, which is very challenging. In this paper, we formulate the training problems of three machine learning models, namely linear regression, support vector machine (SVM), and balanced k-means clustering, as QUBO problems, making them amenable to training on adiabatic quantum computers. We also analyze the computational complexities of our formulations and compare them to corresponding state-of-the-art classical approaches. We show that the time and space complexities of our formulations are better (in case of SVM and balanced k-means clustering) or equivalent (in case of linear regression) to their classical counterparts.
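The abstract does not spell out the formulation, so the following Python sketch shows one common way a least-squares problem can be written as a QUBO, under a stated assumption: each weight is encoded as a sum of fixed "precision" values selected by binary variables. Expanding the squared error then gives a quadratic function of the bits. An exhaustive search stands in for the quantum annealer, and all names and the toy data are hypothetical.

```python
from itertools import product


def build_qubo(X, y, precision):
    """Build Q so that bits^T Q bits equals ||Xw - y||^2 - ||y||^2,
    where w[j] = sum_k precision[k] * bits[j * K + k]."""
    d = len(X[0])
    # Each column of XP is one feature scaled by one precision value.
    XP = [[row[j] * p for j in range(d) for p in precision] for row in X]
    n = len(XP[0])
    Q = [[sum(r[a] * r[b] for r in XP) for b in range(n)] for a in range(n)]
    for a in range(n):
        # Linear terms sit on the diagonal because bit * bit == bit.
        Q[a][a] += -2 * sum(r[a] * yi for r, yi in zip(XP, y))
    return Q


def energy(Q, bits):
    n = len(bits)
    return sum(Q[i][j] * bits[i] * bits[j] for i in range(n) for j in range(n))


def decode(bits, precision, d):
    K = len(precision)
    return [sum(precision[k] * bits[j * K + k] for k in range(K)) for j in range(d)]


# Toy data generated from y = -0.5 + 1.5 * x; the first column is the intercept.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [-0.5, 1.0, 2.5, 4.0]
precision = [-2, 1, 0.5]  # assumed encoding: weights in [-2, 1.5] in steps of 0.5

Q = build_qubo(X, y, precision)
# Brute-force search over all 2^6 bit strings (stand-in for an annealer).
best_bits = min(product((0, 1), repeat=len(Q)), key=lambda bits: energy(Q, bits))
weights = decode(best_bits, precision, d=2)
```

The brute-force step scales exponentially and is only practical for toy sizes; it is there to check that the minimum-energy bit string decodes to the least-squares weights.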


Author(s):  
Mert Gülçür ◽  
Ben Whiteside

This paper discusses micromanufacturing process quality proxies called “process fingerprints” in micro-injection moulding for establishing in-line quality assurance and machine learning models for Industry 4.0 applications. Process fingerprints that we present in this study are purely physical proxies of the product quality and need tangible rationale regarding their selection criteria such as sensitivity, cost-effectiveness, and robustness. Proposed methods and selection reasons for process fingerprints are also justified by analysing the temporally collected data with respect to the microreplication efficiency. Extracted process fingerprints were also used in a multiple linear regression scenario where they bring actionable insights for creating traceable and cost-effective supervised machine learning models in challenging micro-injection moulding environments. The multiple linear regression model demonstrated 84% accuracy in predicting the quality of the process, which is significant as far as the extreme process conditions and product features are concerned.


2021 ◽  
Vol 9 ◽  
Author(s):  
Fu-Sheng Chou ◽  
Laxmi V. Ghimire

Background: Pediatric myocarditis is a rare disease. The etiologies are multiple. Mortality associated with the disease is 5–8%. Prognostic factors were identified with the use of national hospitalization databases. Applying these identified risk factors for mortality prediction has not been reported. Methods: We used the Kids' Inpatient Database for this project. We manually curated fourteen variables as predictors of mortality based on the current knowledge of the disease, and compared performance of mortality prediction between linear regression models and a machine learning (ML) model. For ML, the random forest algorithm was chosen because of the categorical nature of the variables. Based on variable importance scores, a reduced model was also developed for comparison. Results: We identified 4,144 patients from the database for randomization into the primary (for model development) and testing (for external validation) datasets. We found that the conventional logistic regression model had low sensitivity (~50%) despite high specificity (>95%) or overall accuracy. On the other hand, the ML model struck a good balance between sensitivity (89.9%) and specificity (85.8%). The reduced ML model with the top five variables (mechanical ventilation, cardiac arrest, ECMO, acute kidney injury, ventricular fibrillation) was sufficient to approximate the prediction performance of the full model. Conclusions: The ML algorithm performs better than the linear regression model for mortality prediction in pediatric myocarditis in this retrospective dataset. Prospective studies are warranted to further validate the applicability of our model in clinical settings.
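Because the comparison above rests on sensitivity and specificity, a short Python sketch of how the two are computed from binary labels may help. The function name and the toy labels are hypothetical and unrelated to the study data; 1 denotes the positive class (here, death) and 0 the negative class.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives correctly cleared
    return sensitivity, specificity


# Made-up toy labels: 4 positive cases and 6 negative cases.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_true, toy_pred)
```

When positive cases are rare, a model can reach high specificity and accuracy while missing many positives, which is why both numbers are reported.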


Author(s):  
Ivanna Baturynska

Additive manufacturing (AM) is an attractive technology for the manufacturing industry due to flexibility in design and functionality, but inconsistency in quality is one of the major limitations that prevents the use of this technology for production of end-use parts. Prediction of mechanical properties can be one of the possible ways to improve the repeatability of the results. The part placement, part orientation, and STL model properties (number of mesh triangles, surface, and volume) are used to predict tensile modulus, nominal stress and elongation at break for polyamide 2200 (also known as PA12). An EOS P395 polymer powder bed fusion system was used to fabricate 217 specimens in two identical builds (434 specimens in total). Prediction is performed for XYZ, XZY, ZYX, and Angle orientations separately, and all orientations together. The different non-linear models based on machine learning methods have higher prediction accuracy compared with linear regression models. Linear regression models have prediction accuracy higher than 80% only for Tensile Modulus and Elongation at break in Angle orientation. Since orientation-based modeling has low prediction accuracy due to a small number of data points and lack of information about material properties, these models need to be improved in the future based on additional experimental work.


Author(s):  
Kalva Sindhu Priya

Abstract: In the present scenario, it is well known that almost every field is moving toward machine-based automation, from fundamental to master-level systems. Among these, Machine Learning (ML) is one of the important tools, closely related to Artificial Intelligence (AI), that uses known data or past experience to improve automatically or to estimate the behavior or status of given data through various algorithms. Modeling a system or data through Machine Learning is important and advantageous as it helps in the development of later and newer versions. Today, most information technology giants, such as Facebook, Uber, and Google Maps, have made Machine Learning a critical part of their ongoing operations to better serve users. In this paper, the various algorithms available in ML are described briefly and, of these, the Linear Regression algorithm is used to predict a new set of values by taking older data as reference. A detailed prediction model is then discussed by building code with the help of the Machine Learning and Deep Learning tools in MATLAB/SIMULINK. Keywords: Machine Learning (ML), Linear Regression algorithm, Curve fitting, Root Mean Squared Error
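As a rough illustration of predicting new values from older data with linear regression and scoring the fit by root mean squared error, here is a minimal sketch. It is written in Python rather than MATLAB and is not the paper's code; the function names and the toy data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up "older" data lying exactly on y = 2x + 1.
old_x = [1, 2, 3, 4]
old_y = [3, 5, 7, 9]
slope, intercept = fit_line(old_x, old_y)

fitted = [slope * x + intercept for x in old_x]
fit_error = rmse(old_y, fitted)
new_prediction = slope * 5 + intercept  # predicted value for an unseen x = 5
```

With noisy data the RMSE would be positive, and it is usually more informative when computed on held-out points than on the data used for fitting.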

