Book contents
- Frontmatter
- Contents
- Preface
- Part I Cognitive radio communications and cooperation
- Part II Resource awareness and learning
- 10 Reinforcement learning for energy-aware communications
- 11 Repeated games and learning for packet forwarding
- 12 Dynamic pricing games for routing
- 13 Connectivity-aware network lifetime optimization
- 14 Connectivity-aware network maintenance and repair
- Part III Securing mechanism and strategies
- References
- Index
10 - Reinforcement learning for energy-aware communications
from Part II - Resource awareness and learning
Published online by Cambridge University Press: 06 December 2010
Summary
This chapter considers the problem of maximizing average throughput relative to the total energy consumed in packetized sensor communications. It presents a near-optimal transmission strategy that chooses the modulation level and transmit power while adapting to the incoming traffic rate, buffer occupancy, and channel condition. Many approaches require the state-transition probabilities, which may be hard to obtain in a practical setting. We are therefore motivated to use a class of learning algorithms, reinforcement learning (RL), to obtain a near-optimal policy in point-to-point communication and a good transmission strategy in multi-node scenarios. For comparison, stochastic models are developed to obtain the optimal strategy in point-to-point communication, and we show that the learned policy is close to the optimal policy. We further extend the algorithm to the multi-node optimization problem through independent learning. We compare the learned policy with a simple policy in which the agent chooses the highest possible modulation and selects the transmit power that achieves a predefined signal-to-interference ratio (SIR) for that modulation. The learning algorithm achieves more than twice the throughput per unit energy of the simple policy, particularly in the high-packet-arrival-rate regime. Beyond its good performance, the RL algorithm yields a simple, systematic, self-organized, and distributed way to decide the transmission strategy.
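The transmission-strategy learning described above can be illustrated with a minimal tabular Q-learning sketch. Everything below is an assumption for illustration, not the chapter's model: the discretization of buffer and channel states, the modulation and power levels, the toy reward (packets delivered per unit energy), and the i.i.d. state transitions (the chapter builds proper stochastic models; a real node would observe its actual buffer, channel, and ACK feedback instead).

```python
import random
from collections import defaultdict

# Hypothetical discretization (illustrative, not from the chapter).
BUFFER_LEVELS = range(4)        # queued packets, coarsely binned
CHANNEL_STATES = range(3)       # 0 = bad, 1 = fair, 2 = good
MODULATIONS = [1, 2, 4, 6]      # bits per symbol (e.g. BPSK .. 64-QAM)
POWERS = [1.0, 2.0, 4.0]        # transmit power levels (arbitrary units)
ACTIONS = [(m, p) for m in MODULATIONS for p in POWERS]

def reward(state, action):
    """Toy reward: throughput per unit energy. A deployed node would
    measure this from ACKs and energy draw rather than a formula."""
    buffer_lvl, channel = state
    mod, power = action
    # Assumed model: success probability rises with power and channel
    # quality, falls with modulation order.
    p_success = min(1.0, 0.2 * power * (channel + 1) / mod)
    throughput = mod * p_success * min(buffer_lvl, 1)  # nothing to send if empty
    return throughput / power

def q_learning(episodes=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over (buffer, channel) states."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    state = (rng.choice(BUFFER_LEVELS), rng.choice(CHANNEL_STATES))
    for _ in range(episodes):
        if rng.random() < eps:
            action = rng.choice(ACTIONS)             # explore
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit
        r = reward(state, action)
        # Simplifying assumption: next state drawn independently; the
        # chapter instead models traffic and channel dynamics explicitly.
        next_state = (rng.choice(BUFFER_LEVELS), rng.choice(CHANNEL_STATES))
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
        state = next_state
    return Q

Q = q_learning()
# Greedy (modulation, power) choice for a good channel, non-empty buffer:
best = max(ACTIONS, key=lambda a: Q[((2, 2), a)])
```

Because the reward divides throughput by power, the learned greedy policy tends toward energy-frugal choices rather than always transmitting at the highest power, which is the qualitative behavior the chapter contrasts with the fixed SIR-target policy.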
Introduction
Recent advances in micro-electro-mechanical-system (MEMS) technology and wireless communications have made possible the large-scale deployment of wireless sensor networks (WSNs), which consist of small, low-cost sensors with powerful processing and networking capabilities.
- Type: Chapter
- Information: Cognitive Radio Networking and Security: A Game-Theoretic View, pp. 249–269
- Publisher: Cambridge University Press
- Print publication year: 2010