Game theory is increasingly relevant in reinforcement learning where we have multiple agents. Understand the concept of Nash Equilibrium.

## Game Theory¶

- Mathematics of conflict.
- Multiple agents.
- Used in economics.
- Increasingly a part of AI/ML.

**1. Two-player zero-sum finite deterministic game of perfect information**

- A's points: leaves.
- B's points: negative of A.
- Pure strategy: 2
- A: maximize
- B: minimize

- Pure strategy: 2

**Minimax $\equiv$ Maximax Strategy**

- Both A and B would consider the worst case counter strategy.
- A: maximize points.
- B: minimize points.

- There would always have an optimal pure strategy for each player.
- You are assuming they are doing the same thing.
- You are also assuming they are assuming everyone else are doing the same thing.

**2. Two-player zero-sum finite non-deterministic game of perfect information**

- Pure Strategy: -2
- A would maximize
- B would minimize

- Pure Strategy: -2

**3. Two-player zero-sum finite non-deterministic game of hidden information**

- Mini-poker
- A is dealt a card, red or black with a probability of 50% each
- Red is bad for A
- Black is good for A

- A may resign if given a red card: - 20 for A
- Else A can hold:
- B resigns: + 10
- B sees:
- If red: - 40
- If black: + 30

- Else A can hold:

- A is dealt a card, red or black with a probability of 50% each

- Minimax is not equal to Maximax in this scenario.
- A's strategy depends on what B would do.
- B's strategy depends on what A would do.
- We can use a
**mixed strategy**here.- There's a distribution over strategies.

- Assume p is the probability of being a "holder" and A has a mixed strategy.
- B: resigner
- A's expected profit:
- 10p + (1-p)(-5) = 15p - 5

- A's expected profit:
- B: seer
- A's expected profit:
- -5p + (1-p)(5) = -10p + 5

- A's expected profit:
**We equate the equations to find out that we have p=0.4, where the expected value of game is +1.**

- B: resigner

**4. Two-player non-zero-sum finite non-deterministic game of hidden information**

- Typical scenario where the district attorney offers a deal to both criminals to turn on one another.
- No matter the response of the other prisoner, both would choose to defect. There is a dominance of a particularly strategy here.
- This is also known as the Prisoner's Dilemma.

**Nash Equilibrium**

- n players with strategies $S_1, S_2 ... S_n$
- $S^*_1 \in S_1 ... S^*_n \in S_n$ are in
**Nash Equilibrium**if- $\forall_i \ \ S^*_i = argmax_{s_i} utility_i(S^*_1 ... S^*_n)$ there is no reason for anyone to change.

**Nash Equilibrium Notes**

- In the n-player pure strategy game, if elimination of strictly dominated strategies elimates all but one combination, that combination is the unique NE.
- Any NE will survive elimination of strictly dominated strategies.
- If n is finite and for all strategies which are finite, then there exist at least one NE.
- If you have n-repeated game, you have n-repeated NE.