A Novel Fast Searching Algorithm Based on Least Square Regression

2021 ◽  
Vol 35 (1) ◽  
pp. 93-98
Author(s):  
Ratna Kumari Challa ◽  
Siva Prasad Chintha ◽  
B. Reddaiah ◽  
Kanusu Srinivasa Rao

Linear methods such as regression, principal component analysis and canonical correlation analysis are well understood in the machine learning community and are commonly used for predictive modelling and feature generation. All these approaches are typically intended to capture interesting subspaces in the original high-dimensional space. Because of their simple linear structure, these methods all have closed-form solutions, which makes estimation and theoretical analysis for small datasets very straightforward. However, in modern machine learning problems it is very common for a data set to have millions or trillions of samples and features. We deal with the problem of fast estimation from large volumes of data for ordinary least squares. Searching is a very important operation that is useful in many applications. When the data set is large, linear search takes time proportional to the size of the data set. Binary search finds an element in O(log n) time in the worst case, and interpolation search in O(log log n) time on average for uniformly distributed keys. In this paper, an effort is made to develop a novel fast searching algorithm based on the least square regression curve fitting method. The algorithm is implemented, and its execution results are analyzed and compared with the performance of binary search and interpolation search. The proposed fast searching algorithm exhibits better performance than the traditional methods.
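The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a sorted list of numeric keys, fits a least-squares line that maps a key to its index, and then corrects the predicted position with a small bracketed binary search. The names `fit_line` and `regression_search` are hypothetical.

```python
from bisect import bisect_left


def fit_line(values):
    """Least-squares fit of index ~ a * value + b over a non-empty sorted list."""
    n = len(values)
    mean_x = sum(values) / n
    mean_y = (n - 1) / 2  # mean of the indices 0..n-1
    sxx = sum((x - mean_x) ** 2 for x in values)
    if sxx == 0:  # all keys equal: the slope is undefined, use a flat line
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(values))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def regression_search(values, target, model):
    """Return an index of target in sorted `values`, or -1 if it is absent."""
    n = len(values)
    if n == 0:
        return -1
    a, b = model
    # Position predicted by the fitted line, clamped to the valid index range.
    guess = min(max(int(round(a * target + b)), 0), n - 1)
    lo = hi = guess
    # Widen the window to the left until values[lo] <= target (or lo hits 0).
    step = 1
    while lo > 0 and values[lo] > target:
        lo = max(lo - step, 0)
        step *= 2
    # Widen the window to the right until values[hi] >= target (or hi hits n-1).
    step = 1
    while hi < n - 1 and values[hi] < target:
        hi = min(hi + step, n - 1)
        step *= 2
    # Binary search restricted to the bracketed window.
    i = bisect_left(values, target, lo, hi + 1)
    return i if i < n and values[i] == target else -1


keys = [3, 7, 11, 15, 19, 23, 27]
model = fit_line(keys)
found = regression_search(keys, 19, model)
missing = regression_search(keys, 20, model)
```

When the keys are close to linearly spaced, the predicted position is near the true one and the window stays small. For skewed data the widening step keeps the result correct at the cost of more comparisons.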

2021 ◽  
Author(s):  
Wan Sieng Yeo

Abstract Bleaching with a hydrogen peroxide (H2O2) solution at alkaline pH and high temperature is the commonly used bleaching procedure in cotton fabric manufacture. The purpose of the bleaching process is to remove the natural colour from cotton to obtain a permanent white colour before dyeing or shape matching. Normally, the visual rating of whiteness on the cotton is measured by the whiteness index (WI). Less research has focused on chemical predictive modelling of the WI of cotton fabric than on its experimental study. Predictive analytics using predictive modelling can forecast outcomes, which can lead to better-informed cotton quality assurance and control decisions. To date, few studies have applied a least square support vector regression (LSSVR) based model in the textile domain. Hence, the present study aimed to develop an LSSVR-based model, namely multi-output LSSVR (MLSSVR), using bleaching process variables to predict the WI of cotton. The predictive accuracy of the MLSSVR model is measured by root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2), and its results are compared with other regression models including partial least square regression, a predictive fuzzy model, locally weighted partial least square regression and locally weighted kernel partial least square regression. The results indicate that the MLSSVR model performed better than the other models in predicting the WI, as it has 60–1209% lower values of RMSE and MAE and provided the highest R2 values, which are up to 0.9985.
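For readers unfamiliar with the three accuracy measures named above, here is a small sketch of their standard definitions. The function names and the sample numbers are illustrative assumptions, not whiteness index data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
scores = (rmse(measured, predicted), mae(measured, predicted), r2(measured, predicted))
```

Lower RMSE and MAE and an R2 closer to 1 indicate a better fit, which is the sense in which the abstract ranks the models.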


2018 ◽  
Vol 7 (3.12) ◽  
pp. 960
Author(s):  
Anila. M ◽  
G Pradeepini

The most commonly used prediction technique is Ordinary Least Squares Regression (OLS Regression). It has been applied in many fields such as statistics, finance, medicine, psychology and economics. Many people who use this technique, especially data scientists, know that they have not had enough training to apply it and that they should check why and when it can or cannot be applied. It is not an easy task to explain why least square regression [1] has faced so much criticism when trained and applied. In this paper, we first cover the fundamentals of linear regression and OLS regression along with the popularity of the LS method, then present our analysis of the difficulties and pitfalls that arise when the OLS method is applied, and finally describe some techniques for overcoming these problems.
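The abstract does not say which pitfalls the paper analyzes. As one commonly cited example, the sketch below shows how a single corrupted observation can change an OLS fit. The helper `ols_fit` and the data are assumptions made for illustration.

```python
def ols_fit(xs, ys):
    """Closed-form simple OLS for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [0, 1, 2, 3, 4]
clean = [1, 3, 5, 7, 9]          # exactly y = 2x + 1
corrupted = clean[:-1] + [29]    # last observation replaced by an outlier

slope_clean, intercept_clean = ols_fit(xs, clean)
slope_bad, intercept_bad = ols_fit(xs, corrupted)
```

Because OLS minimizes squared residuals, the one outlying point pulls the slope from 2 to 6 here. Checking residuals before trusting the fit is one way to catch this kind of problem.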


Author(s):  
Cung Lian Sang ◽  
Bastian Steinhagen ◽  
Jonas Dominik Homburg ◽  
Michael Adams ◽  
Marc Hesse ◽  
...  

In ultra-wideband (UWB)-based wireless ranging or distance measurement, differentiation between line-of-sight (LOS), non-line-of-sight (NLOS), and multi-path (MP) conditions is important for precise indoor localization. This is because the accuracy of the reported measured distance in UWB ranging systems is directly affected by the measurement conditions (LOS, NLOS or MP). However, the major contributions in the literature only address the binary classification between LOS and NLOS in UWB ranging systems. The MP condition is usually ignored. In fact, the MP condition also has a significant impact on the ranging errors of the UWB compared to the direct LOS measurement results, although the magnitudes of the errors in MP conditions are generally lower than in completely blocked NLOS scenarios. This paper addresses machine learning techniques for identification of the three classes (LOS, NLOS, and MP) in the UWB indoor localization system using an experimental data-set. The data-set was collected under different conditions in different scenarios in indoor environments. Using the collected real measurement data, we compare the performance of three machine learning (ML) classifiers, i.e., support vector machine (SVM), random forest (RF) based on an ensemble learning method, and multilayer perceptron (MLP) based on a deep artificial neural network. The results show that applying ML methods in UWB ranging systems is effective in identifying the three classes. Specifically, the overall accuracy reaches up to 91.9% in the best-case scenario and 72.9% in the worst-case scenario. The F1-score is 0.92 in the best-case and 0.69 in the worst-case scenario. For reproducible results and further exploration, we (will) provide the publicly accessible experimental research data discussed in this paper at PUB - Publications at Bielefeld University.
The evaluations of the three classifiers are conducted using the open-source Python machine learning library scikit-learn.


2021 ◽  
Author(s):  
Mengdi Song ◽  
Massyl Gheroufella ◽  
Paul Chartier

Abstract In subsea pipeline projects, the design of rigid spools and jumpers can be a challenging and time-consuming task. The selected spool layout for connecting the pipelines to the subsea structures, including the number of bends and leg lengths, must offer the flexibility to accommodate the pipeline thermal expansion, the pipe-lay target box and misalignments associated with the post-lay survey metrology and spool fabrication. The analysis results are considerably affected by the many uncertainties involved. Consequently, a very large number of calculations is required to assess the full combination of uncertainties and to capture the worst-case scenario. Rather than applying the deterministic solution, this paper uses machine learning prediction to significantly improve the efficiency of the design process. In addition, thanks to the fast predictive model using machine learning algorithms, uncertainty quantification and propagation analysis using a probabilistic statistical method becomes feasible in terms of CPU time and can be incorporated into the design process to evaluate the reliability of the outputs. The latter allows us to perform a systematic probabilistic design by considering a certain level of acceptance on the probability of failure, for example as per the DNVGL design code. The machine learning predictive modelling and the reliability analysis based upon the probability distribution of the uncertainties are introduced and explained in this paper. Some project examples are shown to highlight the method’s comprehensive nature and efficient characteristics.


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 482
Author(s):  
Miao Yu ◽  
Dimitrios Kollias ◽  
James Wingate ◽  
Niro Siriwardena ◽  
Stefanos Kollias

A novel machine learning approach is presented in this paper, based on extracting latent information and using it to assist decision making on ambulance attendance and conveyance to a hospital. The approach includes two steps: in the first, a forward model analyzes the clinical and, possibly, non-clinical factors (explanatory variables), predicting whether or not positive decisions (response variables) should be given to the ambulance call; in the second, a backward model analyzes the latent variables extracted from the forward model to infer the decision-making procedure. The forward model is implemented through a machine or deep learning technique, whilst the backward model is implemented through unsupervised learning. An experimental study is presented, which illustrates the obtained results by investigating emergency ambulance calls to people in nursing and residential care homes over a one-year period, using an anonymized data set provided by East Midlands Ambulance Service in the United Kingdom.


2021 ◽  
Vol 11 (2) ◽  
pp. 536
Author(s):  
Mohammed Amin Benbouras ◽  
Alexandru-Ionut Petrisor

Several attempts have been made to estimate the vital swelling index parameter, which is otherwise obtained from the expensive and time-consuming Oedometer test. However, they have focused only on neural networks, neglecting other advanced methods that could have increased the predictive capability of the models. In order to overcome this limitation, the current study aims to elaborate an alternative model for estimating the swelling index from geotechnical physical parameters. The reliability of the approach is tested through several advanced machine learning methods: Extreme Learning Machine, Deep Neural Network, Support Vector Regression, Random Forest, LASSO Regression, Partial Least Square Regression, Ridge Regression, Kernel Ridge, Stepwise Regression, Least Square Regression, and Genetic Programming. These methods have been applied to model samples consisting of 875 Oedometer tests. Firstly, principal component analysis, the Gamma test, and forward selection are used to reduce the number of input variables. Afterward, the advanced techniques are applied to model the proposed optimal inputs, and the accuracy of the models is evaluated through six statistical indicators using a K-fold cross-validation approach. The comparative study shows the efficiency of the FS-RF model (forward selection combined with Random Forest). This elaborated model provided the most appropriate prediction, closest to the experimental values compared with the other models and with formulae proposed by previous studies.
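The K-fold cross-validation step mentioned above can be sketched as follows. This is a generic illustration rather than the study's procedure: the abstract gives the sample count (875) but not the value of K, so `k=5` is an assumption, and the function name `k_fold_indices` is hypothetical. The split is contiguous and unshuffled for simplicity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# 875 samples as in the abstract; k=5 is an assumed value.
folds = list(k_fold_indices(875, 5))
fold_sizes = [len(test) for _, test in folds]
```

Each sample appears in exactly one test fold, so every model is scored on data it was not fitted on, and the indicators can then be averaged across the folds.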


2018 ◽  
Vol 1 (1) ◽  
pp. 52 ◽  
Author(s):  
Mohamed Tareq Hossain ◽  
Zubair Hassan ◽  
Sumaiya Shafiq ◽  
Abdul Basit

This study investigates the impact of Ease of Doing Business on Inward FDI across the globe over the period from 2011 to 2015. The study measures ease of doing business using starting a business, getting credit, registering property, paying taxes and enforcing contracts. The research used a sample of 177 of the 190 countries listed by the World Bank. A least square regression model, estimated with EViews software, was used to examine the causal relationship. The study found that the ease of doing business indicator ‘Enforcing Contracts’ has a positive significant impact on Inward FDI. By contrast, ‘Getting Credit’ and ‘Registering Property’ were found to have a negative significant impact on Inward FDI, while ‘Starting a Business’ and ‘Paying Taxes’ have no significant impact on Inward FDI in the studied timeframe. The findings of the study suggest that ease of doing business enables inward FDI through better contract enforcement, getting credit and registering property. The findings will help international managers and companies understand the importance of ease of doing business when investing in foreign countries through FDI.


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


Author(s):  
Jun Pei ◽  
Zheng Zheng ◽  
Hyunji Kim ◽  
Lin Song ◽  
Sarah Walworth ◽  
...  

An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function’s ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluate the relative importance of each atom pair using traditional means. With the introduction of machine learning methods, it has become possible to determine the relative importance of each atom pair present in a scoring function. In this work, we use the Random Forest (RF) method to refine a pair potential developed by our laboratory (GARF) [6] by identifying relevant atom pairs that optimize the performance of the potential on our given task. Our goal is to construct a machine learning (ML) model that can accurately differentiate the native ligand binding pose from candidate poses using a potential refined by RF optimization. We successfully constructed RF models on an unbalanced data set with the ‘comparison’ concept, and the resultant RF models were tested on CASF-2013 [5]. In a comparison of the performance of our RF models against 29 scoring functions, we found that our models outperformed the other scoring functions in predicting the native pose. In addition, we used two artificially designed potential models to address the importance of the GARF potential in the RF models: (1) a scrambled probability function set, which was obtained by mixing up atom pairs and probability functions in GARF, and (2) a uniform probability function set, which shares the same peak positions with GARF but has fixed peak heights. The accuracy comparison of RF models based on the scrambled, uniform, and original GARF potentials clearly showed that the peak positions in the GARF potential are important while the well depths are not.

