Effective disaster management is essential for people trapped in a disaster scenario, but when a disaster occurs, infrastructure support is no longer available to the rescue team. Ad hoc networks, which are infrastructure-less, can be deployed easily in such situations. In the disaster area mobility model, the disaster area is divided into zones such as the incident zone, casualty treatment zones, transport areas, and hospital zones. To cope with the high mobility of nodes and the frequent link failures in such a network, an adaptive routing protocol is needed. Reinforcement learning is used to design such an adaptive routing protocol, which shows good improvement in packet delivery ratio, delay, and average energy consumed.
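To illustrate how reinforcement learning can make routing adaptive, the following is a minimal sketch of the classic Q-routing update, in which each node learns an estimated delivery time per destination and neighbor. It is not necessarily the protocol proposed here: the class name `QRouter`, its methods, the learning rate of 0.5, and the optimistic initial estimate of 0.0 are all assumptions made for illustration.

```python
from collections import defaultdict


class QRouter:
    """Hypothetical per-node table of estimated delivery times (Q-routing style)."""

    def __init__(self, node_id, learning_rate=0.5):
        self.node_id = node_id
        self.learning_rate = learning_rate  # assumed value, not from the source
        # q[dest][neighbor] = estimated time to reach dest when forwarding via neighbor
        self.q = defaultdict(dict)

    def choose_next_hop(self, dest, neighbors):
        """Pick the currently reachable neighbor with the lowest estimate.

        Unknown neighbors default to 0.0 (optimistic), so new links get tried.
        """
        return min(neighbors, key=lambda n: self.q[dest].get(n, 0.0))

    def update(self, dest, neighbor, hop_delay, neighbor_estimate):
        """Move the estimate toward the observed hop delay plus the
        neighbor's own best estimate of the remaining time to dest."""
        old = self.q[dest].get(neighbor, 0.0)
        target = hop_delay + neighbor_estimate
        new = old + self.learning_rate * (target - old)
        self.q[dest][neighbor] = new
        return new


# Usage: node "a" learns that "c" is the faster route to "d",
# then falls back to "b" when the link to "c" disappears.
router = QRouter("a")
router.update("d", "b", hop_delay=1.0, neighbor_estimate=3.0)
router.update("d", "c", hop_delay=1.0, neighbor_estimate=0.0)
preferred = router.choose_next_hop("d", ["b", "c"])
fallback = router.choose_next_hop("d", ["b"])
```

Because the next hop is chosen only among neighbors that are currently reachable, and estimates are refreshed with every forwarded packet, a scheme of this kind can react to node movement and link failures without recomputing full routes.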