Acceleration of Reinforcement Learning by Estimating State Transition Probability Model

Author(s):  
Shinji FUJII ◽  
Kei SENDA ◽  
Syusuke MANO

2021 ◽  
Vol 17 (1) ◽  
pp. e1008598
Author(s):  
Samuel Planton ◽  
Timo van Kerkoerle ◽  
Leïla Abbih ◽  
Maxime Maheu ◽  
Florent Meyniel ◽  
...  

Working memory capacity can be improved by recoding the memorized information in a condensed form. Here, we tested the theory that human adults encode binary sequences of stimuli in memory using an abstract internal language and a recursive compression algorithm. The theory predicts that the psychological complexity of a given sequence should be proportional to the length of its shortest description in the proposed language, which can capture any nested pattern of repetitions and alternations using a limited number of instructions. Five experiments examined the capacity of the theory to predict human adults’ memory for a variety of auditory and visual sequences. We probed memory using a sequence violation paradigm in which participants attempted to detect occasional violations in an otherwise fixed sequence. Both subjective complexity ratings and objective violation detection performance were well predicted by our theoretical measure of complexity, which simply reflects a weighted sum of the number of elementary instructions and digits in the shortest formula that captures the sequence in our language. A simpler transition probability model, when tested as a single predictor in the statistical analyses, accounted for significant variance in the data; however, the goodness of fit improved significantly when the language-based complexity measure was added to the statistical model, and the variance explained by the transition probability model largely decreased. Model comparison also showed that shortest description length in a recursive language provides a better fit than six alternative previously proposed models of sequence encoding. The data support the hypothesis that, beyond the extraction of statistical knowledge, human sequence coding relies on an internal compression using language-like nested structures.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Yuling Hong ◽  
Yingjie Yang ◽  
Qishan Zhang

Purpose
The purpose of this paper is to solve the problems of topic popularity prediction in online social networks and to advance a fine-grained, long-term prediction model that copes with a lack of sufficient data.

Design/methodology/approach
Based on GM(1,1) and neural networks, a co-training model for topic tendency prediction is proposed in this paper. Interpolation based on GM(1,1) is employed to generate fine-grained prediction values of the topic popularity time series, and two neural network models are trained to convergence by transmitting training parameters via their loss functions.

Findings
The experimental results indicate that the integrated model can effectively predict dense sequences with higher performance than other algorithms, such as NN and RBF_LSSVM. Furthermore, the Markov chain state transition probability matrix model is used to improve the prediction results.

Practical implications
The model enables fine-grained, long-term topic popularity prediction; further improvement could be made by predicting any interpolation point within the time interval between popularity data points.

Originality/value
The paper succeeds in constructing a co-training model with GM(1,1) and neural networks. A Markov chain state transition probability matrix is deployed to further improve popularity tendency prediction.

