We proposed an improved variational Bayesian exploration-based active Sarsa (VBE-ASAR) algorithm, which tries to balance the exploration and exploitation dilemma, and speeds up the convergence rate. First, in the learning process, variational Bayesian method is adopted to measure the information gain, which is used as an exploration factor to construct an internal reward function for heuristic exploration. In addition, before the learning process, in order to improve the exploration performance, transfer learning is used to initialize the value function, where Bisimulation metric is introduced to measure the distance between two states from the source MDP and the target MDP, respectively. Finally, we apply the proposed algorithm to the cliff walking problem, and compare with the Sarsa algorithm, the Q-Learning algorithm, the VFT-Sarsa algorithm and the Bayesian Sarsa (BS) algorithm. Experimental results show that the VBE-ASAR algorithm has a faster learning rate.