TY - JOUR

T1 - Optimal Radio Frequency Energy Harvesting with Limited Energy Arrival Knowledge

AU - Zou, Zhenhua

AU - Gidmark, Anders

AU - Charalambous, Themistoklis

AU - Johansson, Mikael

PY - 2016/12/1

Y1 - 2016/12/1

N2 - We develop optimal sleeping and harvesting policies for radio frequency (RF) energy harvesting devices, formalizing the following intuition: when the ambient RF energy is low, devices consume more energy being awake than what can be harvested and should enter sleep mode; when the ambient RF energy is high, on the other hand, it is essential to wake up and harvest. Toward this end, we consider a scenario with intermittent energy arrivals described by a two-state Gilbert-Elliott Markov chain model. The challenge is that the state of the Markov chain can only be observed during the harvesting action, and not while in sleep mode. Two scenarios are studied under this model. In the first scenario, we assume that the transition probabilities of the Markov chain are known and formulate the problem as a partially observable Markov decision process (POMDP). We prove that the optimal policy has a threshold structure and derive the optimal decision parameters. In the practical scenario where the ratio between the reward and the penalty is neither too large nor too small, the POMDP framework and the threshold-based optimal policies are very useful for finding non-Trivial optimal sleeping times. In the second scenario, we assume that the Markov chain parameters are unknown and formulate the problem as a Bayesian adaptive POMDP and propose a heuristic posterior sampling algorithm to reduce the computational complexity. The performance of our approaches is demonstrated via numerical examples.

AB - We develop optimal sleeping and harvesting policies for radio frequency (RF) energy harvesting devices, formalizing the following intuition: when the ambient RF energy is low, devices consume more energy being awake than what can be harvested and should enter sleep mode; when the ambient RF energy is high, on the other hand, it is essential to wake up and harvest. Toward this end, we consider a scenario with intermittent energy arrivals described by a two-state Gilbert-Elliott Markov chain model. The challenge is that the state of the Markov chain can only be observed during the harvesting action, and not while in sleep mode. Two scenarios are studied under this model. In the first scenario, we assume that the transition probabilities of the Markov chain are known and formulate the problem as a partially observable Markov decision process (POMDP). We prove that the optimal policy has a threshold structure and derive the optimal decision parameters. In the practical scenario where the ratio between the reward and the penalty is neither too large nor too small, the POMDP framework and the threshold-based optimal policies are very useful for finding non-Trivial optimal sleeping times. In the second scenario, we assume that the Markov chain parameters are unknown and formulate the problem as a Bayesian adaptive POMDP and propose a heuristic posterior sampling algorithm to reduce the computational complexity. The performance of our approaches is demonstrated via numerical examples.

KW - ambient radio frequency energy

KW - Bayesian inference

KW - Energy harvesting

KW - learning

KW - partially observable Markov decision process

UR - http://www.scopus.com/inward/record.url?scp=85009732782&partnerID=8YFLogxK

U2 - 10.1109/JSAC.2016.2600364

DO - 10.1109/JSAC.2016.2600364

M3 - Article

AN - SCOPUS:85009732782

VL - 34

SP - 3528

EP - 3539

JO - IEEE Journal on Selected Areas in Communications

JF - IEEE Journal on Selected Areas in Communications

SN - 0733-8716

IS - 12

M1 - 7543484

ER -