Adaptive revisit interval selection (RIS) in multifunction radars is an integral part of efficient time budget management (TBM). In this paper, the RIS problem is formulated as a Markov decision process (MDP) with unknown state transition probabilities and reward distributions. A reward function is proposed that minimizes the tracking load (TL) while keeping the track loss probability (TLP) at a tolerable level. The resulting reinforcement learning (RL) problem is solved using the Q-learning algorithm with an epsilon-greedy policy. Compared to a baseline algorithm, the RL approach maintains the tracks while significantly reducing the tracking load.
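The general shape of the approach can be sketched with tabular Q-learning and an epsilon-greedy policy. The sketch below is purely illustrative: the discretized states, the candidate revisit intervals, the toy transition dynamics in `step`, and the reward shaping are all hypothetical stand-ins, not the paper's actual MDP model or reward function.

```python
import random

# Hypothetical state/action spaces (not from the paper):
N_STATES = 5                     # assumed discretized track-quality levels
ACTIONS = [0.5, 1.0, 2.0, 4.0]   # assumed candidate revisit intervals (s)

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Tabular Q-function, initialized to zero.
Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]

def step(state, action_idx):
    """Toy environment: a longer revisit interval lowers the tracking
    load but raises the chance the track degrades. Illustrative only."""
    interval = ACTIONS[action_idx]
    if random.random() < 0.1 * interval:
        next_state = min(state + 1, N_STATES - 1)   # track quality worsens
    else:
        next_state = max(state - 1, 0)              # track quality improves
    track_lost = next_state == N_STATES - 1
    # Reward rewards low load (long interval) and penalizes track loss.
    reward = interval - (10.0 if track_lost else 0.0)
    return next_state, reward

def choose_action(state):
    # Epsilon-greedy: explore with probability EPSILON, otherwise exploit.
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[state][a])

random.seed(0)
state = 0
for _ in range(10000):
    a = choose_action(state)
    next_state, r = step(state, a)
    # Standard Q-learning update.
    Q[state][a] += ALPHA * (r + GAMMA * max(Q[next_state]) - Q[state][a])
    # Restart the episode after a track loss.
    state = 0 if next_state == N_STATES - 1 else next_state
```

After training, the greedy action `max(range(len(ACTIONS)), key=lambda a: Q[s][a])` for each state `s` gives the learned revisit-interval policy, trading tracking load against track loss risk in this toy setting.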