Establishing a line-of-sight link between autonomous underwater vehicles (AUVs) is an unavoidable challenge in realizing high-data-rate optical communication for ocean exploration. We propose a method that establishes the link by maintaining the relative position and orientation between AUVs. Using a reinforcement learning algorithm, we search for a policy that suppresses external disturbances and optimizes link establishment efficiency. To evaluate the proposed method, we prepared a hovering AUV and conducted link establishment experiments. The reinforcement learning policy was trained in a simulation environment and then deployed on the AUV in real environments. In field experiments, our approach successfully established a link from the hovering AUV to an autonomous surface vehicle. Based on the experimental results, we evaluate the performance of the AUV in executing the link establishment policy and present comparisons with existing optical search-based link establishment methods.