A multi-hop energy harvesting wireless sensor network (EH-WSN) is a key enabler for future communication systems such as the Internet of Things. Optimal power management and route selection are important for the operation and successful deployment of EH-WSNs. However, the complexity of characterizing the optimal policies grows significantly with the number of nodes in the network. In this paper, an optimal control policy for minimum-delay transmission in a multi-hop EH-WSN is devised using reinforcement learning (RL). The WSN consists of M EH sensor nodes aiming to transmit their data to a sink node with minimum delay. Each sensor node is equipped with a battery of limited capacity to store the harvested energy and a data buffer of limited size to store both its own sensed data and the data relayed from neighboring nodes. Centralized and distributed RL algorithms are considered for EH-WSNs. In the centralized RL algorithm, the control action is taken at a central unit using the state information of all sensor nodes. In the distributed RL algorithm, the control action is taken locally at each sensor node using its own state information and that of its neighboring nodes. Both proposed RL algorithms are based on the state-action-reward-state-action (SARSA) algorithm. Simulation results demonstrate the merits of the proposed algorithms.
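To make the learning rule concrete, the sketch below shows tabular SARSA, the on-policy temporal-difference update Q(s,a) ← Q(s,a) + α[r + γQ(s',a') − Q(s,a)] on which the proposed algorithms are based. The environment here is a hypothetical single-node toy (battery levels as states, idle/transmit as actions, a small per-slot delay penalty), chosen only to illustrate the update; it is not the paper's multi-hop EH-WSN model.

```python
import random

def sarsa(n_states, n_actions, step, episodes=500,
          alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular SARSA with an epsilon-greedy behavior policy."""
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def pick(s):
        # Epsilon-greedy action selection over the current Q-table.
        if random.random() < eps:
            return random.randrange(n_actions)
        return max(range(n_actions), key=lambda a: Q[s][a])

    for _ in range(episodes):
        s, a, done = 0, pick(0), False
        while not done:
            s2, r, done = step(s, a)
            a2 = pick(s2)
            # On-policy TD target uses the action a' actually chosen next.
            target = r + (0.0 if done else gamma * Q[s2][a2])
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
    return Q

def toy_node(s, a):
    """Hypothetical EH node: state = battery level 0..3.
    Action 1 = transmit (succeeds only with energy, ends the episode);
    action 0 = stay idle and harvest one energy unit (delay penalty)."""
    if a == 1:
        if s > 0:
            return s - 1, 1.0, True   # packet delivered
        return s, -1.0, False         # transmission attempt with empty battery
    return min(s + 1, 3), -0.1, False # harvest; small delay cost
```

After training, the greedy policy learned by `sarsa(4, 2, toy_node)` harvests when the battery is empty and transmits when it is charged, mirroring the power-management trade-off the paper optimizes at network scale.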
- action-value-function approximation
- energy harvesting
- reinforcement learning
- wireless sensor network