To optimize polymer flooding, we establish an optimal control model whose objective function is the net present value (NPV), including the effect of polymer injection. The standard Monte Carlo gradient approximation (MCGA) algorithm suffers from strongly fluctuating perturbation gradients. To address this, we propose an improved MCGA based on the idea of the ensemble-based optimization (EnOpt) scheme, in which the covariance matrix of the control vectors is introduced to filter and smooth the search direction. A synthetic heterogeneous reservoir model is built to compare the performance of the improved MCGA, the standard MCGA, and the finite difference stochastic approximation (FDSA) algorithm. The improved MCGA gets closer to the optimal NPV obtained by FDSA than the standard MCGA does, and it requires considerably less computation time than FDSA. With the improved algorithm, the NPV increases by more than 20%, and the optimal production rates, injection rates, polymer concentrations, and polymer slug sizes are obtained simultaneously. We then discuss the influence of different time step sizes and oil prices, and conclude that a moderate step size and a relatively low oil price are applicable. Finally, applying the improved MCGA to an actual block yields an 11.3% increase in NPV and a 6.5% increase in field oil production total (FOPT), demonstrating its feasibility for optimizing real reservoirs.
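The covariance-based smoothing idea can be illustrated with a minimal sketch. It follows the general EnOpt pattern (correlated perturbations, a sample cross-covariance between controls and objective, then premultiplication by the control covariance) and is not necessarily the exact formulation used in this paper. The names `control_covariance`, `smoothed_mc_direction`, and `toy_npv`, the exponential correlation model, and all parameter values are assumptions made for illustration; `toy_npv` is a hypothetical stand-in for a reservoir simulation run.

```python
import numpy as np


def control_covariance(n, corr_length, sigma):
    """Assumed exponential covariance between control entries i and j."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return sigma**2 * np.exp(-dist / corr_length)


def smoothed_mc_direction(objective, u, cov, n_samples, rng):
    """EnOpt-style search direction: cov @ (sample cross-covariance of u and J)."""
    # Draw correlated perturbations du ~ N(0, cov) via a Cholesky factor.
    chol = np.linalg.cholesky(cov)
    du = rng.standard_normal((n_samples, u.size)) @ chol.T
    # Objective change caused by each perturbation.
    j0 = objective(u)
    dj = np.array([objective(u + d) for d in du]) - j0
    # Sample cross-covariance; approximates cov @ grad J(u) to first order.
    cross = du.T @ dj / n_samples
    # Premultiplying by cov filters and smooths the noisy estimate.
    return cov @ cross


def toy_npv(u):
    """Hypothetical stand-in for NPV: concave, maximized at u = 1."""
    return -np.sum((u - 1.0) ** 2)
```

In this sketch, each call to `objective` corresponds to one simulation run, so the cost per iteration is set by `n_samples` rather than by the number of control variables, which is where the time saving over a finite-difference scheme would come from.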