Abstract:
In packet switches, packets queue at switch inputs and contend
for outputs. The contention arbitration policy directly affects
switch performance. The best policy depends on the current state
of the switch and current traffic patterns. Finding this policy is hard
because the state space, possible transitions, and set of actions
all grow exponentially with the size of the switch. We present a
reinforcement learning formulation of the problem that decomposes
the value function into many small independent value functions
and enables efficient action selection.
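
As a rough illustration of why a decomposed value function can make action
selection tractable, the sketch below assumes details that the abstract does
not state: each input keeps one queue per output, each queue has its own small
value function (the hypothetical `queue_cost`, here simply the squared queue
length), and an action is a one-to-one matching of inputs to outputs. Under
those assumptions the best action is the matching with the largest summed
per-queue gain. The brute-force search is for clarity only and is not the
paper's selection method.

```python
from itertools import permutations


def queue_cost(length):
    """Hypothetical per-queue value: estimated future cost of a queue."""
    return length ** 2


def service_gain(length):
    """Drop in estimated cost from serving one packet from a queue."""
    if length == 0:
        return 0  # nothing to serve, so nothing to gain
    return queue_cost(length) - queue_cost(length - 1)


def select_matching(queues):
    """Return the input-to-output matching with the largest total gain.

    queues[i][j] is the number of packets at input i bound for output j.
    The result is (m, gain), where input i is connected to output m[i].
    """
    n = len(queues)
    best, best_gain = None, -1
    # Exhaustive search over all n! matchings; only viable for small n.
    for m in permutations(range(n)):
        gain = sum(service_gain(queues[i][m[i]]) for i in range(n))
        if gain > best_gain:
            best, best_gain = m, gain
    return best, best_gain


queues = [
    [3, 0, 1],
    [0, 2, 0],
    [4, 0, 0],
]
matching, gain = select_matching(queues)
print(matching, gain)  # (2, 1, 0) 11
```

Because the total gain is a sum of independent per-queue terms, the same
choice is a maximum-weight bipartite matching, so for larger switches the
exhaustive loop could be replaced by a polynomial-time assignment solver such
as the Hungarian algorithm.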