Prioritizing 2nd and 3rd order interactions via support vector ranking using sensitivity indices on static Wnt measurements - Part A [work in progress]

2016 ◽  
Author(s):  
Shriprakash Sinha

It is widely known that sensitivity analysis plays a major role in computing the strength of the influence of the involved factors in any phenomenon under investigation. When applied to expression profiles of various intra/extracellular factors that form an integral part of a signaling pathway, variance- and density-based analyses yield a range of sensitivity indices for individual factors as well as for various combinations of factors. These combinations denote the higher order interactions among the involved factors that might be of interest in the working mechanism of the pathway. For example, in a range of fourth order combinations among the various factors of the Wnt pathway, it would be easy to assess the influence of the destruction complex formed by the APC, AXIN, CSKI and GSK3 interaction. In this work, after estimating the individual effects of factors for a higher order combination, the individual indices are considered as discriminative features. A combination, then, is a multivariate feature set of higher order (>2). With an excessively large number of factors involved in the pathway, it is difficult to search for important combinations in a wide search space over different orders. Exploiting the analogy of prioritizing webpages using ranking algorithms, for a particular order, a full set of combinations of interactions can be prioritized based on these features using a powerful ranking algorithm via support vectors. The computational ranking sheds light on unexplored combinations that can further be investigated using hypothesis testing based on wet lab experiments. Here, the basic framework and results obtained on 2nd and 3rd order interactions on a toy example data set are presented. Subsequent manuscripts will examine higher order interactions in detail. Part B of this work deals with time series data.
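As a rough illustration of the ranking idea (not the paper's exact pipeline), the sketch below treats each combination's per-factor sensitivity indices as a feature vector and learns a linear scoring function from pairwise preferences, the standard RankSVM-style transformation. The data, the "true" utility and the training loop are all invented for the example.

```python
import numpy as np

# Hypothetical sketch of ranking factor combinations by their
# sensitivity-index features via the pairwise (RankSVM-style)
# transformation. All numbers are made up, not taken from the paper.

# Five hypothetical 2nd order combinations, each described by the
# sensitivity indices of its two member factors.
X = np.array([[0.9, 0.8],
              [0.8, 0.6],
              [0.6, 0.5],
              [0.4, 0.3],
              [0.2, 0.1]])
true_scores = X @ np.array([0.7, 0.3])   # assumed ground-truth relevance

# Pairwise transformation: difference vectors x_i - x_j, labeled +1
# whenever combination i should outrank combination j.
P, y = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if true_scores[i] > true_scores[j]:
            P.append(X[i] - X[j])
            y.append(1.0)
P, y = np.array(P), np.array(y)

# Linear SVM on the pairwise data via sub-gradient descent on hinge loss.
w, lam, lr = np.zeros(2), 0.01, 0.1
for _ in range(500):
    margins = y * (P @ w)
    viol = margins < 1
    grad = lam * w
    if viol.any():
        grad -= (P[viol] * y[viol, None]).mean(axis=0)
    w -= lr * grad

ranking = np.argsort(-(X @ w))           # predicted priority order
print(list(ranking))
```

Scoring new, untested combinations with the learned `w` is what yields the prioritized list a biologist could work down.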


2016 ◽  
Author(s):  
Shriprakash Sinha

It is widely known that sensitivity analysis plays a major role in computing the strength of the influence of the involved factors in any phenomenon under investigation. When applied to expression profiles of various intra/extracellular factors that form an integral part of a signaling pathway, variance- and density-based analyses yield a range of sensitivity indices for individual factors as well as for various combinations of factors. These combinations denote the higher order interactions among the involved factors that might be of interest in the working mechanism of the pathway. For example, there are 19 types of WNTs and 10 FZDs, so the number of their 2nd order combinations is high enough that it is not possible to know which one to test first (except for those for which wet lab validations have been confirmed). Moreover, the effect of these combinations varies over time as measurements of fold changes and deviations in fold changes vary. In this work, after estimating the individual effects of factors for a higher order combination, the individual indices are considered as discriminative features. A combination, then, is a multivariate feature set of higher order (>=2). With an excessively large number of factors involved in the pathway, it is difficult to search for important combinations in a wide search space over different orders. Exploiting the analogy of prioritizing webpages using ranking algorithms, for a particular order, a full set of combinations of interactions can be prioritized based on these features using a powerful ranking algorithm via support vectors. Recording the changing rankings of the combinations over time points and durations reveals how higher order interactions behave within the pathway and when and where an intervention might be necessary to influence the pathway. This could lead to the development of time-based therapeutic interventions.
Based on a small dataset in time, we were able to generate the rankings of the 2nd order combinations between WNTs and FZDs at different time snapshots and for different durations or time periods. Code has been made available on Google Drive at https://drive.google.com/folderview?id=0B7Kkv8wlhPU-V1Fkd1dMSTd5ak0&usp=sharing
Significance: The search and wet lab testing of unknown biological hypotheses in the form of combinations of various intra/extracellular factors that are involved in a signaling pathway cost a lot in terms of time, investment and energy. To reduce this cost of search in a vast combinatorial space, a pipeline has been developed that prioritises the list of combinations so that a biologist can narrow down their investigation. The pipeline uses kernel based sensitivity indices to capture the influence of the factors in a pathway and employs a powerful support vector ranking algorithm. The generic workflow and future improvements are bound to cut down the cost of many wet lab experiments and reveal unknown/untested biological hypotheses.
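To make the time dimension concrete, the toy sketch below compares the priority order of a handful of made-up WNT-FZD combinations at two snapshots using a simple Kendall-tau concordance. The combination names and orders are invented; this illustrates tracking rank changes over time, not the paper's code.

```python
# Hypothetical sketch: once combinations have been ranked at each time
# snapshot, comparing consecutive rankings shows which interactions gain
# or lose priority over time. Rankings below are fabricated.

def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two orderings of the same items."""
    items = list(rank_a)
    n = len(items)
    pos_a = {c: i for i, c in enumerate(rank_a)}
    pos_b = {c: i for i, c in enumerate(rank_b)}
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            a = pos_a[items[i]] - pos_a[items[j]]
            b = pos_b[items[i]] - pos_b[items[j]]
            if a * b > 0:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Made-up priority orders of four WNT-FZD combinations at two snapshots.
t1 = ["WNT3-FZD1", "WNT5A-FZD2", "WNT1-FZD5", "WNT2-FZD7"]
t2 = ["WNT5A-FZD2", "WNT3-FZD1", "WNT1-FZD5", "WNT2-FZD7"]

tau = kendall_tau(t1, t2)
print(tau)
```

A tau well below 1 between adjacent snapshots would flag a window in which the pathway's interaction priorities are shifting, i.e. a candidate time for intervention.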



Forecasting paddy production is considered a difficult problem in the real world due to the indeterministic behavior of nature. Specifically, rice production is forecast a year ahead for overall planning of the crop, utilization of agricultural resources and rice production management. The key challenge in forecasting rice production is to create a realistic model that is able to handle the critical time series data and forecast with minor error. Prognostication of future data is highly correlated with the time series data set; the more accurate the prediction, the greater the value of the forecast. This paper presents a new technique based on Higher Order Fuzzy Logical Relationships. Here, Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE) are used to estimate the errors of the predicted data. Historical data relating to rice production from 1981 to 2003 is used as secondary data, and the error of the predicted data is further reduced using different soft computing techniques.
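For reference, the two error measures named in the abstract can be written down directly; the production figures below are invented placeholders, not the paper's data.

```python
# Textbook definitions of MSE and MAPE, applied to made-up actual vs.
# forecast rice-production figures (hypothetical tonnes).

def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    # Mean absolute percentage error, in percent.
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

actual   = [100.0, 120.0, 110.0, 130.0]
forecast = [ 98.0, 123.0, 108.0, 126.0]

print(mse(actual, forecast))    # 8.25
print(round(mape(actual, forecast), 3))
```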



Author(s):  
Zuriani Mustaffa ◽  
Yuhanis Yusof ◽  
Siti Sakira Kamaruddin

This paper presents an enhanced Artificial Bee Colony (eABC) based on the Lévy Probability Distribution (LPD) and conventional mutation. The purposes of the enhancement are to enrich the searching behavior of the bees in the search space and to prevent premature convergence. Such an approach is used to improve the performance of the original ABC in optimizing the embedded hyper-parameters of Least Squares Support Vector Machines (LSSVM). A procedure is then put forward to serve as a prediction tool for the prediction task. To evaluate the efficiency of the proposed model, crude oil price data were employed as empirical data and a comparison against four approaches was conducted: standard ABC-LSSVM, Genetic Algorithm-LSSVM (GA-LSSVM), Cross Validation-LSSVM (CV-LSSVM), and a conventional Back Propagation Neural Network (BPNN). From the experiments conducted, the proposed eABC-LSSVM shows encouraging results in optimizing the parameters of interest, producing higher prediction accuracy for the employed time series data.
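A hedged sketch of the Lévy-flight ingredient: Mantegna's algorithm (a standard way to draw approximately Lévy-distributed step sizes, assumed here rather than taken from the paper) produces occasional long jumps with which a bee can perturb an LSSVM hyper-parameter candidate and escape premature convergence. Bounds and scales below are illustrative.

```python
import math
import random

# Mantegna's algorithm for Lévy-distributed steps: heavy-tailed jumps
# let the search occasionally leap far across hyper-parameter space.

def levy_step(beta=1.5, rng=random.Random(42)):
    """One approximately Lévy-distributed step (Mantegna's method)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def mutate(candidate, lo=1e-3, hi=1e3, scale=0.1):
    """Perturb a hypothetical (gamma, sigma^2) LSSVM candidate, clamped."""
    return [min(hi, max(lo, x + scale * levy_step())) for x in candidate]

new = mutate([10.0, 1.0])
print(new)
```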



Electronics ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 919
Author(s):  
Ruidong Wu ◽  
Bing Liu ◽  
Jiafeng Fu ◽  
Mingzhu Xu ◽  
Ping Fu ◽  
...  

Online training of Support Vector Regression (SVR) in the field of machine learning is a computationally complex algorithm. Due to the need for multiple iterative passes during training, SVR training is usually implemented on a computer, and the existing training methods cannot be directly implemented on a Field-Programmable Gate Array (FPGA), which restricts the application range. This paper reconstructs the training framework and implementation without precision loss to reduce the total latency required for the matrix update, reducing time consumption by 90%. A general ε-SVR training system with low latency is implemented on the Zynq platform. Taking the regression of samples in two-dimensional space as an example, the maximum acceleration ratio is 27.014× compared with a microcontroller platform and the energy consumption is 12.449% of the microcontroller's. In experiments on the University of California, Riverside (UCR) time series data set, the regression results are excellent: the minimum coefficient of determination is 0.996, and the running time is less than 30 ms, which can meet the requirements of different applications for real-time regression.
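The two quantities the abstract leans on can be written down compactly; these are textbook definitions, not the paper's FPGA code: the ε-insensitive loss that makes ε-SVR solutions sparse, and the coefficient of determination used to judge the regressions.

```python
# Textbook sketches of the epsilon-insensitive loss behind ε-SVR and of
# the coefficient of determination (R^2); not the paper's hardware code.

def epsilon_insensitive(residual, eps=0.1):
    """Zero loss inside the eps tube; linear outside (the ε-SVR loss)."""
    return max(0.0, abs(residual) - eps)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

print(epsilon_insensitive(0.05))   # residual inside the tube -> 0.0
print(epsilon_insensitive(0.30))   # residual outside the tube
print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.0]))
```

Points with zero loss contribute no support vectors, which is what keeps the incremental matrix updates on the FPGA small.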



Author(s):  
Hoang T. P. Thanh ◽  
Phayung Meesad

Predicting the behavior of stock markets is always an interesting topic, not only for financial investors but also for scholars and professionals from different fields, because successful prediction can help investors yield significant profits. Previous researchers have shown a strong correlation between financial news and its impact on the movements of stock prices. This paper proposes an approach that uses time series analysis and text mining techniques to predict daily stock market trends. The research is conducted using a database containing stock index prices and news articles collected from Vietnamese websites over 3 years, from 2010 to 2012. Robust feature selection and a strong machine learning algorithm are able to lift the forecasting accuracy. By combining Linear Support Vector Machine Weight and the Support Vector Machine algorithm, the proposed approach can enhance prediction accuracy significantly above those of related research approaches. The results show that the data set represented by 42 features achieves the highest accuracy using one-against-one Support Vector Machines (up to 75%), and the one-against-one method outperforms the one-against-all method in almost all case studies.
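The one-against-one scheme the results favour decomposes a k-class problem into k(k-1)/2 binary classifiers whose predictions are combined by majority vote. The sketch below stubs out the binary deciders with a placeholder rule, since the point is the voting structure, not a trained SVM.

```python
from itertools import combinations
from collections import Counter

# One-against-one decomposition: one binary classifier per pair of
# classes, final label by majority vote. Deciders are stubbed.

classes = ["up", "down", "steady"]        # hypothetical trend labels
pairs = list(combinations(classes, 2))    # k*(k-1)/2 binary problems
print(len(pairs))                         # 3 classes -> 3 classifiers

def decide(pair, x):
    """Placeholder binary decider: a real system uses a trained SVM."""
    return pair[0] if x > 0 else pair[1]

votes = Counter(decide(p, x=1.0) for p in pairs)
print(votes.most_common(1)[0][0])
```

One-against-all, by contrast, trains only k classifiers, but each faces a more imbalanced problem, which is one common explanation for one-against-one's edge in results like these.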



2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks, statistical and computational intelligence modelling, are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on a large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the performed experiments we find that the optimal population size is likely 20, giving the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy level is lower but still acceptable to managers.
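A hedged toy rendering of the evolutionary-training idea (not the paper's setup): a population of 20 weight vectors for a single linear neuron, matching the population size the abstract reports as optimal, evolves by truncation selection and Gaussian mutation to fit a synthetic series one step ahead.

```python
import random

# Toy evolutionary training of a linear neuron on a synthetic trend.
# Population size 20 mirrors the abstract's finding; everything else
# (data, selection scheme, mutation scale) is invented for illustration.

rng = random.Random(1)
series = [0.1 * t + rng.gauss(0, 0.01) for t in range(50)]   # synthetic series
X = [(series[t - 1], series[t - 2]) for t in range(2, 50)]   # two lagged inputs
y = [series[t] for t in range(2, 50)]                        # one-step target

def mse(w):
    """Fitness: mean squared one-step-ahead error of weights (w0, w1, bias)."""
    return sum((yt - (w[0] * x1 + w[1] * x2 + w[2])) ** 2
               for (x1, x2), yt in zip(X, y)) / len(y)

pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
for _ in range(200):
    pop.sort(key=mse)
    elite = pop[:5]                                  # truncation selection
    pop = elite + [[g + rng.gauss(0, 0.05) for g in rng.choice(elite)]
                   for _ in range(15)]               # Gaussian mutation

best = min(pop, key=mse)
print(mse(best))
```

The trade-off the abstract reports (fast evolutionary training, slightly worse accuracy than gradient-based BP) is visible in miniature here: selection plus mutation needs no gradients, at the cost of a noisier search.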



2019 ◽  
Vol 33 (3) ◽  
pp. 187-202
Author(s):  
Ahmed Rachid El-Khattabi ◽  
T. William Lester

The use of tax increment financing (TIF) remains a popular, yet highly controversial, tool among policy makers in their efforts to promote economic development. This study conducts a comprehensive assessment of the effectiveness of Missouri’s TIF program, specifically in Kansas City and St. Louis, in creating economic opportunities. We build a time-series data set spanning 1990 through 2012 of detailed employment levels, establishment counts, and sales at the census block-group level to run a set of difference-in-differences with matching estimates of the impact of TIF at the local level. Although we analyze the impact of TIF on a wide set of indicators and across various industry sectors, we find no conclusive evidence that the TIF program in either city has a causal impact on key economic development indicators.
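The core estimator in a difference-in-differences design is one subtraction of subtractions; the sketch below uses invented employment figures purely to show the arithmetic, not the study's matching procedure or data.

```python
# Difference-in-differences: the treatment effect is the change in the
# treated group's outcome minus the change in the control group's.
# Employment figures are hypothetical.

def did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Made-up mean block-group employment before/after TIF designation.
effect = did(treat_pre=210.0, treat_post=240.0, ctrl_pre=200.0, ctrl_post=228.0)
print(effect)   # (240-210) - (228-200) = 2.0
```

Matching (as in the study) serves to pick control block groups whose pre-trend resembles the treated ones, so that the second subtraction credibly removes the common trend.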



AI ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 48-70
Author(s):  
Wei Ming Tan ◽  
T. Hui Teo

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data which are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies across recorded time steps and between sensors. Many existing prognostic algorithms have started to explore Deep Neural Networks (DNNs) and their effectiveness in the field. Although Deep Learning (DL) techniques outperform traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with an attention mechanism. The convolution filters work to extract the abstract temporal patterns from the multiple time series, while the attention mechanism reviews the information across the time axis and selects the relevant information. The results suggest that the proposed method not only produces superior accuracy of RUL estimation but also trains many times faster than the reported works. The advantages of deploying the network on a lightweight hardware platform are also demonstrated: it is not just more compact, but also more efficient in resource-restricted environments.
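A hedged sketch of the attention step described above: given features extracted per time step (random stand-ins here for the CNN output), scores are softmaxed along the time axis and used to form a weighted summary that emphasises informative steps. Shapes, the scoring vector, and all values are assumptions for illustration.

```python
import numpy as np

# Attention over the time axis: score each time step, softmax the
# scores, and take the weighted sum of per-step features.

rng = np.random.default_rng(0)
T, F = 8, 4                        # time steps x feature channels
H = rng.standard_normal((T, F))    # stand-in for convolutional features

w = rng.standard_normal(F)         # scoring vector (learned in a real model)
scores = H @ w                     # one relevance score per time step
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()               # softmax attention weights over time

context = alpha @ H                # attention-weighted feature summary
print(alpha.sum(), context.shape)
```

The `context` vector is what a downstream regression head would map to an RUL estimate; because `alpha` is data-dependent, different engines can emphasise different portions of their degradation history.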



2021 ◽  
Vol 13 (3) ◽  
pp. 67
Author(s):  
Eric Hitimana ◽  
Gaurav Bajpai ◽  
Richard Musabe ◽  
Louis Sibomana ◽  
Jayavel Kayalvizhi

Many countries worldwide face challenges in implementing building fire-prevention measures. The most critical issues are the localization, identification, and detection of room occupants. The Internet of Things (IoT), along with machine learning, has been shown to increase the smartness of buildings by providing real-time data acquisition using sensors and actuators for prediction mechanisms. This paper proposes the implementation of an IoT framework to capture indoor environmental parameters for occupancy multivariate time-series data. The Long Short-Term Memory (LSTM) deep learning algorithm is applied to infer the presence of human beings. An experiment is conducted in an office room using multivariate time series as predictors in a regression forecasting problem. The results obtained demonstrate that with the developed system it is possible to obtain, process, and store environmental information. The information collected was applied to the LSTM algorithm and compared with other machine learning algorithms: Support Vector Machine, Naïve Bayes Network, and Multilayer Perceptron Feed-Forward Network. The outcomes, based on the parametric calibrations, demonstrate that LSTM performs better in the context of the proposed application.
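A sketch of how the multivariate sensor streams can be framed as a supervised forecasting problem before feeding any model (LSTM or otherwise): each sample is a window of past readings and the target is the next occupancy-related value. Sensor names, the window length, and the readings are fabricated for illustration.

```python
# Sliding-window framing of multivariate sensor logs for supervised
# forecasting. Each row of `readings` is one time step of hypothetical
# [co2, temperature, humidity] values.

def make_windows(rows, window):
    """Return (X, y): past-`window` multivariate slices and next targets."""
    X, y = [], []
    for t in range(window, len(rows)):
        X.append(rows[t - window:t])   # the past `window` multivariate steps
        y.append(rows[t][0])           # predict next CO2 as occupancy proxy
    return X, y

readings = [[400 + 5 * t, 21.0, 40.0] for t in range(10)]   # synthetic log
X, y = make_windows(readings, window=3)
print(len(X), len(X[0]), y[0])
```

The same (samples, time steps, features) layout is what an LSTM consumes directly, while flat models such as an SVM would first flatten each window into a single feature vector.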



Author(s):  
Gudipally Chandrashakar

In this article, we used historical time series data up to the current-day gold price. In this study of predicting the gold price, we consider a few correlated factors: the silver price, copper price, Standard & Poor’s 500 value, dollar-rupee exchange rate, and Dow Jones Industrial Average value. The prices of each correlated factor and the gold price data range from January 2008 to February 2021. The machine learning algorithms used to analyze the time series data are Random Forest Regression, Support Vector Regressor, Linear Regressor, ExtraTrees Regressor and Gradient Boosting Regression. Examining the results, the ExtraTrees Regressor algorithm predicts the value of gold prices most accurately.
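A sketch of assembling the kind of feature table the article describes: each row pairs same-day values of the correlated factors with the gold price to be predicted, followed by a chronological train/test split. All figures are invented placeholders, not the article's data.

```python
# Build a (samples x factors) feature matrix from hypothetical daily
# values of the correlated series, with the gold price as the target.

factors = {
    "silver":  [22.1, 22.4, 22.0, 22.7],
    "copper":  [3.9, 4.0, 3.8, 4.1],
    "sp500":   [3900.0, 3950.0, 3925.0, 3980.0],
    "usd_inr": [73.1, 73.4, 73.2, 73.6],
    "djia":    [31000.0, 31200.0, 31100.0, 31400.0],
}
gold = [1810.0, 1822.0, 1815.0, 1830.0]   # target series

names = sorted(factors)                    # fixed column order
X = [[factors[n][i] for n in names] for i in range(len(gold))]

# Chronological split: never train on days after the test period.
X_train, X_test = X[:3], X[3:]
y_train, y_test = gold[:3], gold[3:]
print(len(X[0]), len(X_train), len(X_test))
```

Any of the listed regressors can then be fit on `X_train`/`y_train`; the chronological split matters for time series, since a random shuffle would leak future information into training.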


