Stochastic Games with One-Step Delayed Sharing Information Pattern

Abstract

In this paper we generalize the one-step delayed sharing information pattern from team theory to the context of non-cooperative games with constraints. We give two different, equivalent completely observable stochastic game formulations for the partially observable stochastic game with the one-step delayed sharing information pattern; one of these formulations has the smallest possible state and action spaces. Because of the way the state space is defined in these two formulations, their transition structure has certain important characteristics that prevent some standard results on the existence of stationary-strategy equilibria from applying to this case. We analyze these characteristics and identify conditions under which equilibria in stationary strategies exist for the players. We then turn to the constrained team problem and address the necessity of joint randomization by suggesting the use of time-sharing policies, which, although not stationary, are essentially pure at each stage. We illustrate our framework for the game problem through an example in communication networks, namely non-cooperative Slotted Aloha with the one-step delayed sharing information pattern, for which we compute the Nash equilibrium strategies.

I. INTRODUCTION

Dynamic team decision problems with decentralized information and Markovian state transitions can be formulated as a Partially Observable Markov Decision Process (PO-MDP), which can be solved by dynamic programming once it is transformed into an equivalent Completely Observable Markov Decision Process (CO-MDP); see [7], [4], [5], [6]. The difficulty is that this transformation comes at the cost of enlarging the state space. In many problems involving decentralized information, the entire history must be taken as the state, so the state space grows exponentially in the time horizon. An important challenge has been to identify information structures for which the dimension of the state space does not grow. The one-step delayed sharing (OSDS) information pattern has been shown to be one such information structure [6]. Hsu and Marcus [4] established a framework for solving the decentralized partially observable Markov decision process with the OSDS information pattern. Our first contribution is to provide such a framework for constrained non-cooperative stochastic games with the OSDS information pattern. As a by-product, we are also able to extend the framework to the constrained decentralized PO-MDP, the so-called team problem. We propose a completely observable stochastic game that is equivalent to the original one and that has the minimal set of states and actions. Formally, we show that the original partially observable stochastic game can be transformed into an equivalent completely observable stochastic game whose state space is defined over the sufficient statistic for the estimation of the current global state.
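As a concrete illustration of the PO-MDP to CO-MDP transformation mentioned above (a minimal sketch, not taken from the paper), the snippet below shows the standard Bayes-filter update of the belief over the hidden state for a generic single-agent PO-MDP; the belief vector is the sufficient statistic that serves as the fixed-dimension state of the equivalent completely observable problem. The function name `belief_update` and all kernels and numerical values are illustrative assumptions, not notation from the original text.

```python
# Minimal sketch (assumed, illustrative): belief-state update for a generic PO-MDP.
# The belief over the hidden state is a sufficient statistic, so the equivalent
# CO-MDP uses this fixed-dimension vector as its state instead of the full history.
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step of the belief vector.

    belief      : (S,) prior probability over hidden states
    action      : index a of the action just taken
    observation : index o of the observation just received
    T           : (A, S, S) transition kernel, T[a, s, s'] = P(s' | s, a)
    O           : (A, S, Obs) observation kernel, O[a, s', o] = P(o | s', a)
    """
    # Predict: push the belief through the transition kernel of the chosen action.
    predicted = belief @ T[action]                       # shape (S,)
    # Correct: weight by the likelihood of the received observation.
    unnormalized = predicted * O[action][:, observation]
    norm = unnormalized.sum()
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / norm

# Toy example: 2 hidden states, 1 action, 2 observations (values are made up).
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])          # T[a=0]
O = np.array([[[0.7, 0.3],
               [0.1, 0.9]]])          # O[a=0]
b = np.array([0.5, 0.5])
b = belief_update(b, action=0, observation=1, T=T, O=O)
print(b)   # posterior belief after acting and observing
```

Under the OSDS information pattern discussed in the paper, the analogous sufficient statistic is shared among the players with a one-step delay, which is what keeps the state space of the equivalent completely observable game from growing with the horizon.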