## Abstract

In this chapter, we consider a class of two-player nonzero-sum stochastic games with incomplete information, inspired by recent applications of game theory in network security. We develop fully distributed reinforcement learning algorithms that require each player to have only a minimal amount of information about the other player. At each time, each player can be in an active mode or in a sleep mode. A player in active mode updates its strategy and its estimates of unknown quantities using a specific pure or hybrid learning pattern. The players' intelligence and rationality are captured by a weighted linear combination of different learning patterns. We use stochastic approximation techniques to show that, under appropriate conditions, the pure or hybrid learning schemes with random updates can be studied using their deterministic ordinary differential equation (ODE) counterparts. Convergence to state-independent equilibria is analyzed for special classes of games, namely games with two actions and potential games. The results are applied to network security games between an intruder and an administrator, where the noncooperative behaviors are well characterized by the features of distributed hybrid learning.
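
The learning loop described in the abstract can be illustrated with a minimal sketch. This is not the chapter's algorithm: it assumes a stateless two-action game with payoffs in [0, 1], a Boltzmann-Gibbs pattern and a payoff-scaled reinforcement pattern as the two learning patterns being mixed, and a decaying step size. The names `run`, `softmax`, `p_active`, `mix`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(values, temperature):
    """Boltzmann-Gibbs distribution over estimated payoffs."""
    m = max(values)
    weights = [math.exp((v - m) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def sample(strategy, rng):
    """Draw an action index (0 or 1) from a two-action mixed strategy."""
    return 0 if rng.random() < strategy[0] else 1


def run(payoffs, steps=5000, p_active=0.7, mix=0.5, temperature=0.1, seed=0):
    """Simulate two players who each observe only their own realized payoff.

    payoffs[i][a0][a1] is player i's payoff (assumed to lie in [0, 1])
    when player 0 plays a0 and player 1 plays a1.
    """
    rng = random.Random(seed)
    strategies = [[0.5, 0.5], [0.5, 0.5]]
    estimates = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(1, steps + 1):
        actions = [sample(strategies[0], rng), sample(strategies[1], rng)]
        step = 1.0 / (t ** 0.6)  # assumed decaying learning rate
        for i in range(2):
            if rng.random() >= p_active:
                continue  # sleep mode: this player skips the update
            a = actions[i]
            r = payoffs[i][actions[0]][actions[1]]
            # Update the payoff estimate of the action that was just played.
            estimates[i][a] += step * (r - estimates[i][a])
            # Pattern 1: move toward the Boltzmann-Gibbs response to the estimates.
            bg = softmax(estimates[i], temperature)
            x = strategies[i]
            for k in range(2):
                # Pattern 2: reinforce the played action in proportion to its payoff.
                played = 1.0 if k == a else 0.0
                drift = mix * (bg[k] - x[k]) + (1.0 - mix) * r * (played - x[k])
                x[k] += step * drift
    return strategies, estimates
```

Each player uses only its own action and realized payoff, so no information about the opponent's payoff function is needed. The weight `mix` plays the role of the linear combination of learning patterns, and `p_active` models the random switching between active and sleep modes.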

Original language | English (US) |
---|---|
Title of host publication | Reinforcement Learning and Approximate Dynamic Programming for Feedback Control |
Publisher | John Wiley and Sons |
Pages | 303-329 |
Number of pages | 27 |
ISBN (Print) | 9781118104200 |
DOIs | |
State | Published - Feb 7 2013 |

## Keywords

- Games and learning algorithms, attacker/IDS
- Hybrid learning in games, network security
- Multiagent games, learning and control
- New paradigm of hybrid learning CODIPAS-RL
- Players' information limits, of payoff functions

## ASJC Scopus subject areas

- Engineering(all)