This chapter presents an application of reinforcement learning to personalized drug dosing in the treatment of chronic conditions. Reinforcement learning is a machine learning paradigm that mimics the trial-and-error skill acquisition typical of humans and animals. In the treatment of chronic illnesses, finding the optimal dose for an individual patient is likewise usually a trial-and-error process. The author focuses on the challenge of personalized anemia treatment with recombinant human erythropoietin and demonstrates how a standard reinforcement learning method, Q-learning, can guide the physician in selecting the optimal erythropoietin dose. The author further addresses the issue of random exploration in Q-learning from the drug dosing perspective and proposes a “smart” exploration method. Finally, the author performs computer simulations to compare the outcomes of reinforcement learning-based anemia treatment with those achieved by a standard dosing protocol used at a dialysis unit.