stochastic gradient descent
Recently Published Documents





2022 ◽  
Vol 40 (4) ◽  
pp. 1-32
Jinze Wang ◽  
Yongli Ren ◽  
Jie Li ◽  
Ke Deng

Factorization models have been successfully applied to recommendation problems and have had a significant impact on both academia and industry in the field of Collaborative Filtering (CF). However, the intermediate data generated during a factorization model's decision-making process (its training process, or footprint) have been overlooked, even though they may provide rich information for further improving recommendations. In this article, we introduce the concept of Convergence Pattern, which records how ratings are learned step by step in factorization models in the field of CF. We show that the Convergence Pattern exists from both the model perspective (e.g., classical Matrix Factorization (MF) and deep-learning factorization) and the training (learning) perspective (e.g., stochastic gradient descent (SGD), alternating least squares (ALS), and Markov Chain Monte Carlo (MCMC)). By utilizing the Convergence Pattern, we propose a prediction model to estimate the prediction reliability of missing ratings and then improve the quality of recommendations. Two applications have been investigated: (1) how to evaluate the reliability of predicted missing ratings and thus recommend those ratings with high reliability, and (2) how to use the estimated reliability to adjust the predicted ratings and further improve prediction accuracy. Extensive experiments have been conducted on several benchmark datasets on three recommendation tasks: decision-aware recommendation, rating prediction, and Top-N recommendation. The experimental results verify the effectiveness of the proposed methods in various aspects.
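As a rough illustration of what "recording how ratings are learned step by step" could look like, the following minimal sketch trains a plain matrix factorization with SGD and stores the predicted rating of selected (user, item) pairs after every epoch. It is not the authors' model: the function name `train_mf_with_trace`, the `track` argument, the toy ratings, and all hyperparameters are assumptions made for this example.

```python
import random

def train_mf_with_trace(ratings, n_users, n_items, track, k=2, lr=0.05,
                        reg=0.02, epochs=30, seed=0):
    """Train matrix factorization with SGD and record, after every epoch,
    the current prediction for each (user, item) pair listed in `track`."""
    rng = random.Random(seed)
    # Small random initial latent factors for users (P) and items (Q).
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    trace = {pair: [] for pair in track}
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # SGD step on the regularized squared error of one rating.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        # Per-epoch snapshot: one point of the (hypothetical) convergence trace.
        for (u, i) in trace:
            trace[(u, i)].append(sum(P[u][f] * Q[i][f] for f in range(k)))
    return P, Q, trace

# Toy data: (user, item, rating). The pair (0, 2) is unobserved (a missing rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q, trace = train_mf_with_trace(ratings, n_users=3, n_items=3,
                                  track=[(0, 0), (0, 2)])
```

Each list in `trace` holds one predicted value per epoch; how such a sequence is turned into a reliability estimate is the subject of the article and is not reproduced here.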

Ahmad AL Smadi ◽  
Atif Mehmood ◽  
Ahed Abugabah ◽  
Eiad Almekhlafi ◽  
Ahmad Mohammad Al-smadi

In computer vision, image classification is one of the prominent image processing tasks. Fish classification is now a widely considered problem within machine learning and image segmentation, and it has been extended to a variety of domains, such as marketing strategies. This paper presents an effective fish classification method based on convolutional neural networks (CNNs). The experiments were conducted on a new dataset of Bangladesh's indigenous fish species with three kinds of splitting: 80-20%, 75-25%, and 70-30%. We provide a comprehensive comparison of several popular CNN optimizers. In total, we perform a comparative analysis of 5 state-of-the-art gradient descent-based optimizers: adaptive delta (AdaDelta), stochastic gradient descent (SGD), adaptive moment estimation (Adam), Adamax (a variant of Adam based on the infinity norm), and root mean square propagation (RMSprop). Overall, the experimental results show that RMSprop, Adam, and Adamax performed well compared to the other optimization techniques, while AdaDelta and SGD performed the worst. Furthermore, the Adam optimizer attained the best performance measures in the 70-30% and 80-20% splitting experiments, while the RMSprop optimizer attained the best performance measures in the 75-25% splitting experiments. Finally, the proposed model is compared with state-of-the-art deep CNN models, among which it attained the best accuracy, 98.46%.
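To make the difference between some of the compared optimizers concrete, here is a minimal sketch of the textbook update rules for SGD, RMSprop, and Adam applied to a one-dimensional toy objective. It is not the paper's CNN setup; the objective, step counts, and hyperparameter values are assumptions chosen for illustration, and AdaDelta and Adamax are omitted for brevity.

```python
import math

def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, minimized at w = 3.
    return 2.0 * (w - 3.0)

def run_sgd(w, lr=0.1, steps=200):
    # Plain gradient step with a fixed learning rate.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def run_rmsprop(w, lr=0.01, rho=0.9, eps=1e-8, steps=5000):
    # Scales each step by a running average of squared gradients.
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = rho * v + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(v) + eps)
    return w

def run_adam(w, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=5000):
    # Combines a running mean of gradients (m) with a running mean of
    # squared gradients (v), both corrected for their zero initialization.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

results = {
    "SGD": run_sgd(0.0),
    "RMSprop": run_rmsprop(0.0),
    "Adam": run_adam(0.0),
}
```

On a problem this simple all three end up near the minimum; the ranking reported in the abstract concerns a CNN on image data and cannot be inferred from this toy case.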

PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262009
Rui Zhang ◽  
Hejia Song ◽  
Qiulan Chen ◽  
Yu Wang ◽  
Songwang Wang ◽  

Objectives: This study intends to build and compare two kinds of forecasting models at different time scales for hemorrhagic fever incidence in China. Methods: Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory Neural Network (LSTM) models were adopted to fit the monthly, weekly and daily incidence of hemorrhagic fever in China from 2013 to 2018. The two models, combined and uncombined with rolling forecasts, were used to predict the incidence in 2019 to examine their stability and applicability. Results: ARIMA (2, 1, 1) (0, 1, 1)₁₂, ARIMA (1, 1, 3) (1, 1, 1)₅₂ and ARIMA (5, 0, 1) were selected as the best fitting ARIMA models for the monthly, weekly and daily incidence series, respectively. The LSTM models with 64 neurons and Stochastic Gradient Descent with Momentum (SGDM) for monthly incidence, 8 neurons and Adaptive Moment Estimation (Adam) for weekly incidence, and 64 neurons and Root Mean Square Prop (RMSprop) for daily incidence were selected as the best fitting LSTM models. The values of root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of the models combined with rolling forecasts in 2019 were lower than those of the direct forecasting models for both ARIMA and LSTM. The forecasting performance in 2019 showed that, among the rolling forecasting models, ARIMA was better than LSTM for monthly and weekly forecasting, while LSTM was better than ARIMA for daily forecasting. Conclusions: Both ARIMA and LSTM could be used to build a prediction model for the incidence of hemorrhagic fever. Different models might be more suitable for incidence prediction at different time scales. The findings can provide a good reference for the future selection of prediction models and the establishment of early warning systems for hemorrhagic fever.
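The difference between direct and rolling forecasts, and the three error measures named above, can be sketched as follows. The forecaster here is a deliberately naive "repeat the last observed value" placeholder, not ARIMA or LSTM, and the numbers are made-up toy data rather than incidence figures.

```python
import math

def rmse(actual, pred):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def mae(actual, pred):
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def mape(actual, pred):
    # Mean absolute percentage error; assumes no actual value is zero.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def direct_forecast(history, horizon):
    # Placeholder model: repeat the last training value for every future step.
    return [history[-1]] * horizon

def rolling_forecast(history, future):
    # One-step-ahead forecasts: each newly observed value is appended to the
    # window before the next step is predicted.
    window = list(history)
    preds = []
    for actual in future:
        preds.append(window[-1])
        window.append(actual)
    return preds

history = [10.0, 12.0, 11.0, 13.0]   # toy training series
future = [14.0, 16.0, 15.0, 18.0]    # toy hold-out series
direct = direct_forecast(history, len(future))
rolling = rolling_forecast(history, future)
```

On this toy series the rolling forecasts have lower RMSE, MAE, and MAPE than the direct ones, which mirrors the direction of the reported result without reproducing it.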

2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Mahesh Babu Mariappan ◽  
Kanniga Devi ◽  
Yegnanarayanan Venkataraman ◽  
Ming K. Lim ◽  
Panneerselvam Theivendren

Purpose: This paper aims to address the pressing problem of predicting shipment times of therapeutics, diagnostics and vaccines during the ongoing COVID-19 pandemic using a novel artificial intelligence (AI) and machine learning (ML) approach. Design/methodology/approach: The present study used organic real-world therapeutic supplies data of over 3 million shipments collected during the COVID-19 pandemic through a large real-world e-pharmacy. The researchers built various ML multiclass classification models, namely, random forest (RF), extra trees (XRT), decision tree (DT), multilayer perceptron (MLP), XGBoost (XGB), CatBoost (CB), linear stochastic gradient descent (SGD) and linear Naïve Bayes (NB), and trained them on striped datasets of (source, destination, shipper) triplets. The study stacked the base models and built stacked meta-models. Subsequently, the researchers built a model zoo with a combination of the base models and stacked meta-models trained on these striped datasets. The study used 10-fold cross-validation (CV) for performance evaluation. Findings: The findings reveal that the turn-around-time provided by therapeutic supply logistics providers is only 62.91% accurate when compared to reality. In contrast, the solution provided in this study is up to 93.5% accurate compared to reality, resulting in up to 48.62% improvement, with performance clearly improving each week as more historical data accumulate. Research limitations/implications: The study shows the efficacy of an ML model zoo with a combination of base models and stacked meta-models trained on striped datasets of (source, destination, shipper) triplets for predicting the shipment times of therapeutics, diagnostics and vaccines in the e-pharmacy supply chain. Originality/value: The novelty of the study lies in its focus on the real-world e-pharmacy supply chain under post-COVID-19 lockdown conditions and in its novel ML ensemble stacking-based model zoo for predicting the shipment times of therapeutics. Through this work, it is expected that there will be greater adoption of AI and ML techniques for shipment time prediction of therapeutics in the logistics industry in pandemic situations.
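The reported improvement figure appears to be consistent with a relative gain over the provider baseline rather than a difference in percentage points. The short check below uses only the two accuracies quoted in the abstract; reading the figure as a relative gain is an assumption of this note, not something the abstract states.

```python
baseline_accuracy = 62.91   # provider turn-around-time accuracy (%), from the abstract
model_accuracy = 93.5       # best reported model accuracy (%), from the abstract

# Difference in percentage points.
absolute_gain = model_accuracy - baseline_accuracy
# Gain expressed as a percentage of the baseline accuracy.
relative_gain = 100.0 * absolute_gain / baseline_accuracy

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative gain: {relative_gain:.2f}% of the baseline")
```

The relative gain comes out at roughly 48.6%, close to the quoted 48.62%, whereas the absolute gain is about 30.6 percentage points.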

2022 ◽  
Vol 8 (1) ◽  
pp. 9
Bruno Sauvalle ◽  
Arnaud de La Fortelle

The goal of background reconstruction is to recover the background image of a scene from a sequence of frames showing this scene cluttered by various moving objects. This task is fundamental in image analysis and is generally the first step before more advanced processing, but it is difficult because there is no formal definition of what should be considered background or foreground, and the results may be severely impacted by challenges such as illumination changes, intermittent object motion, and highly cluttered scenes. We propose in this paper a new iterative algorithm for background reconstruction, in which the current estimate of the background is used to guess which image pixels are background pixels, and a new background estimate is then computed using those pixels only. We then show that the proposed algorithm, which uses stochastic gradient descent for improved regularization, is more accurate than the state of the art on the challenging SBMnet dataset, especially for short videos with low frame rates. It is also fast, reaching an average of 52 fps on this dataset when parameterized for maximal accuracy, using a Python implementation accelerated with a graphics processing unit (GPU).
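The iterative idea described above (use the current background estimate to guess which pixels are background, then re-estimate from those pixels only) can be sketched in a few lines. This sketch replaces the paper's stochastic gradient descent step with a simple per-pixel average, and the function name, the `threshold` parameter, and the toy frames are all assumptions made for illustration.

```python
def reconstruct_background(frames, threshold=50.0, iterations=5):
    """frames: list of equally sized lists of pixel intensities (one list per frame)."""
    n_pixels = len(frames[0])
    # Initial estimate: per-pixel mean over all frames.
    background = [sum(f[p] for f in frames) / len(frames) for p in range(n_pixels)]
    for _ in range(iterations):
        updated = []
        for p in range(n_pixels):
            # Values close to the current estimate are guessed to be background.
            kept = [f[p] for f in frames if abs(f[p] - background[p]) <= threshold]
            # Keep the previous estimate if no frame passes the test.
            updated.append(sum(kept) / len(kept) if kept else background[p])
        background = updated
    return background

frames = [
    [100.0, 50.0, 200.0],
    [100.0, 50.0, 200.0],
    [100.0, 255.0, 200.0],   # a moving object covers pixel 1 in this frame
    [100.0, 50.0, 200.0],
    [100.0, 50.0, 200.0],
]
background = reconstruct_background(frames)
```

In this toy case the outlier value at pixel 1 is excluded after the first pass, so the estimate settles on the unobstructed intensities.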

2022 ◽  
Vol 15 ◽  
Sarada Krithivasan ◽  
Sanchari Sen ◽  
Swagath Venkataramani ◽  
Anand Raghunathan

Training Deep Neural Networks (DNNs) places immense compute requirements on the underlying hardware platforms, expending large amounts of time and energy. We propose LoCal+SGD, a new algorithmic approach to accelerate DNN training by selectively combining localized or Hebbian learning within a Stochastic Gradient Descent (SGD) based training framework. Back-propagation is a computationally expensive process that requires 2 Generalized Matrix Multiply (GEMM) operations to compute the error and weight gradients for each layer. We alleviate this by selectively updating some layers' weights using localized learning rules that require only 1 GEMM operation per layer. Further, since localized weight updates are performed during the forward pass itself, the layer activations for such layers do not need to be stored until the backward pass, resulting in a reduced memory footprint. Localized updates can substantially boost training speed, but need to be used judiciously in order to preserve accuracy and convergence. We address this challenge through a Learning Mode Selection Algorithm, which gradually selects and moves layers to localized learning as training progresses. Specifically, for each epoch, the algorithm identifies a Localized→SGD transition layer that delineates the network into two regions. Layers before the transition layer use localized updates, while the transition layer and later layers use gradient-based updates. We propose both static and dynamic approaches to the design of the learning mode selection algorithm. The static algorithm utilizes a pre-defined scheduler function to identify the position of the transition layer, while the dynamic algorithm analyzes the dynamics of the weight updates made to the transition layer to determine how the boundary between SGD and localized updates is shifted in future epochs. We also propose a low-cost weak supervision mechanism that controls the learning rate of localized updates based on the overall training loss. 
We applied LoCal+SGD to 8 image recognition CNNs (including ResNet50 and MobileNetV2) across 3 datasets (CIFAR-10, CIFAR-100, and ImageNet). Our measurements on an Nvidia GTX 1080Ti GPU demonstrate up to 1.5× improvement in end-to-end training time with ~0.5% loss in Top-1 classification accuracy.
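The GEMM accounting described in the abstract (1 GEMM per layer for localized updates, 2 per layer for back-propagation) can be illustrated with a small count. The linear schedule in `transition_layer` is a hypothetical stand-in for the paper's static scheduler, whose actual form is not given in the abstract, and the layer and epoch counts are arbitrary.

```python
def transition_layer(epoch, total_epochs, n_layers):
    # Hypothetical linear schedule: the boundary moves deeper as training
    # progresses, but the last layer always stays gradient-based.
    return min(n_layers - 1, (epoch * n_layers) // total_epochs)

def update_gemms(n_layers, transition):
    # Layers before the transition layer use a localized rule (1 GEMM each);
    # the transition layer and later layers use back-propagation (2 GEMMs each).
    localized = transition
    gradient_based = n_layers - transition
    return localized * 1 + gradient_based * 2

n_layers, total_epochs = 8, 4
costs = [update_gemms(n_layers, transition_layer(e, total_epochs, n_layers))
         for e in range(total_epochs)]
print(costs)  # GEMMs spent on weight updates per training step, by epoch
```

Under these assumptions the update cost falls from 16 to 10 GEMMs per step over four epochs; forward-pass GEMMs and memory savings are not modeled.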

2022 ◽  
Vol 17 ◽  
Xinyi Liao ◽  
Xiaomei Gu ◽  
Dejun Peng

Background: Many malaria infections are caused by Plasmodium falciparum. Accurate classification of the proteins secreted by the malaria parasite is essential for the development of anti-malarial drugs. Objective: To accurately classify the proteins secreted by the malaria parasite. Methods: To improve the accuracy of predicting Plasmodium secreted proteins, we established a classification model, MGAP-SGD. MonodikGap features (k=7) of the secreted proteins were extracted, and the optimal features were then selected by the AdaBoost method. Finally, based on the optimal feature set, the secreted proteins were predicted using the stochastic gradient descent (SGD) algorithm. Results: The SGD classifier was validated with 10-fold cross-validation and with an independent test set, giving accuracy rates of 98.5859% and 97.973%, respectively. Conclusion: These results indicate that the effectiveness and robustness of the MGAP-SGD model can meet the prediction needs for the secreted proteins of Plasmodium.

Healthcare ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 109
Mohammad T. Abou-Kreisha ◽  
Humam K. Yaseen ◽  
Khaled A. Fathy ◽  
Ebeid A. Ebeid ◽  
Kamal A. ElDahshan

In this paper, we approach the problem of detecting and diagnosing COVID-19 infections using multisource scan images, including CT and X-ray scans, to assist the healthcare system during the COVID-19 pandemic. Here, a computer-aided diagnosis (CAD) system is proposed that analyzes the CT or X-ray scan to diagnose the extent of damage in the respiratory system of each infected case. The CAD was optimized through hyper-parameter tuning for shallow learning (e.g., SVM) and for deep learning. For the deep learning, mini-batch stochastic gradient descent was used to overcome fitting problems during transfer learning. The optimal parameter list values were found using the naïve Bayes technique. Our contributions are (i) a comparison among the detection rates of pre-trained CNN models, (ii) a suggested hybrid of deep learning with shallow machine learning, (iii) an extensive analysis of the results of COVID-19 transition and informative conclusions through developing various transfer techniques, and (iv) a comparison of the accuracy of previous models with the systems of the present study. The effectiveness of the proposed CAD is demonstrated using three datasets, either using a deep learning model as a fully end-to-end solution or using a hybrid deep learning model. Six experiments were designed to illustrate the superior performance of our suggested CAD when compared to other similar approaches. Our system achieves accuracies of 99.94%, 99.6%, 100%, 97.41%, 99.23%, and 98.94% for binary and three-class labels on the CT and two CXR datasets.
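For readers unfamiliar with the term, mini-batch stochastic gradient descent updates the parameters from the average gradient of a small random subset of the data at each step. The sketch below fits a one-variable linear model on noiseless toy data; it is unrelated to the paper's CNN or transfer-learning setup, and the batch size, learning rate, and data are assumptions.

```python
import random

def minibatch_sgd(xs, ys, batch_size=4, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average gradient of the squared error over the current batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

xs = [i / 10 for i in range(20)]     # inputs in [0, 1.9]
ys = [2.0 * x + 1.0 for x in xs]     # noiseless targets with w = 2, b = 1
w, b = minibatch_sgd(xs, ys)
```

With a batch size of 1 this reduces to classic SGD, and with a batch size equal to the dataset size it becomes full-batch gradient descent.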

2022 ◽  
Vol 2022 ◽  
pp. 1-16
Sandhya Sharma ◽  
Sheifali Gupta ◽  
Deepali Gupta ◽  
Sapna Juneja ◽  
Gaurav Singal ◽  

The challenges involved in traditional cloud computing paradigms have prompted the development of architectures for next-generation cloud computing. The new cloud computing architectures can generate and handle huge amounts of data, which traditional architectures could not handle. Deep learning algorithms have the ability to process this huge amount of data and, thus, can now solve the problems of next-generation computing. Therefore, deep learning has become the state-of-the-art approach for solving various tasks, most importantly in the field of recognition. In this work, recognition of city names is proposed. Recognition of handwritten city names is a promising research application area in the field of postal automation, and it is addressed here using a segmentation-free (holistic) approach. The proposed work demystifies the role of the convolutional neural network (CNN), which is one of the deep learning methods. The proposed CNN model is trained, validated, and analyzed using the Adam and stochastic gradient descent (SGD) optimizers with batch sizes of 2, 4, and 8 and learning rates (LR) of 0.001, 0.01, and 0.1. The model is trained and validated on 10 different classes of handwritten city names written in Gurmukhi script, where each class has 400 samples. Our analysis shows that the CNN model using the Adam optimizer, a batch size of 4, and an LR of 0.001 achieved the best average validation accuracy of 99.13%.
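The hyperparameter values listed above span a small search grid. Assuming every optimizer is paired with every batch size and learning rate (the abstract does not state this explicitly), the configurations can be enumerated as follows; the dictionary keys are names chosen for this sketch.

```python
from itertools import product

optimizers = ["Adam", "SGD"]
batch_sizes = [2, 4, 8]
learning_rates = [0.001, 0.01, 0.1]

# One dictionary per combination of optimizer, batch size, and learning rate.
grid = [
    {"optimizer": opt, "batch_size": bs, "lr": lr}
    for opt, bs, lr in product(optimizers, batch_sizes, learning_rates)
]

best_reported = {"optimizer": "Adam", "batch_size": 4, "lr": 0.001}
print(len(grid), best_reported in grid)
```

Under that assumption the grid has 2 × 3 × 3 = 18 configurations, and the best configuration reported in the abstract is one of them.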

Water ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 99
Won Jin Lee ◽  
Eui Hoon Lee

Runoff in urban streams is the most important factor influencing urban inundation. It also affects inundation in other areas, as various urban streams and rivers are connected. Current runoff predictions obtained using a multi-layer perceptron (MLP) exhibit limited accuracy. In this study, the runoff of urban streams was predicted by applying an MLP using a harmony search (MLPHS) to overcome the shortcomings of MLPs using existing optimizers, and the predictions were compared with the observed runoff and with the runoff predicted by an MLP using a real-coded genetic algorithm (RCGA). Furthermore, the results of the MLPHS were compared with the results of the MLP with existing optimizers such as stochastic gradient descent, adaptive gradient, and root mean squared propagation. The runoff of urban streams was predicted based on the discharge of each pump station and rainfall information. The results obtained with the MLPHS exhibited the smallest error, 39.804 m³/s, when compared to the peak value of the observed runoff. The MLPHS gave more accurate runoff predictions than the MLP using the RCGA and the MLP using existing optimizers. Accurate prediction of the runoff in an urban stream using an MLPHS based on the discharge of each pump station is therefore possible.
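Harmony search is a gradient-free metaheuristic, which is why it can stand in for optimizers such as SGD. The sketch below shows a basic version of the algorithm minimizing a toy function rather than training an MLP. The parameter names (`hmcr` for the memory-considering rate, `par` for the pitch-adjusting rate, `bandwidth`) follow common harmony search terminology, but their values, the objective, and the function name are assumptions and do not come from the paper.

```python
import random

def harmony_search(objective, dim, bounds, memory_size=10, hmcr=0.9, par=0.3,
                   bandwidth=0.1, iterations=2000, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Harmony memory: a pool of candidate solutions and their objective values.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(memory_size)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d in range(dim):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this dimension.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: perturb the reused value slightly.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection from the full range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        new_score = objective(new)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(memory_size), key=lambda i: scores[i])
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(memory_size), key=lambda i: scores[i])
    return memory[best], scores[best]

def sphere(x):
    # Toy objective with its minimum of 0 at the origin.
    return sum(v * v for v in x)

best, best_score = harmony_search(sphere, dim=3, bounds=(-5.0, 5.0))
```

To train an MLP in this style, each harmony would hold the network's weights and the objective would be the prediction error on the training data.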
