TY - CHAP

T1 - Introduction to Partially Observed MDPs

AU - Zhu, Quanyan

AU - Xu, Zhiheng

N1 - Publisher Copyright:
© 2020, The Author(s), under exclusive license to Springer Nature Switzerland AG.

PY - 2020

Y1 - 2020

N2 - In our cross-layer design, we use different models to capture the properties of different layers. As stated in Chap. 8, we can use an MDP model to capture the dynamic movements of the cyber layer. However, that scenario assumes that the defender can observe the cyber state at each cyber time instant. In real applications, it is challenging to obtain full information about the cyber state directly, so the MDP cannot capture the defender's incomplete knowledge of the cyber states. In this chapter, we introduce a Partially Observed Markov Decision Process (POMDP) to capture the uncertainty of the cyber state. In a POMDP, instead of observing the states, we receive an observation whose distribution depends on the state. We use this information to build a Hidden Markov Model (HMM) filter, which constructs a belief over the states. Based on the belief, we aim to find an optimal policy that minimizes an expected cost.

AB - In our cross-layer design, we use different models to capture the properties of different layers. As stated in Chap. 8, we can use an MDP model to capture the dynamic movements of the cyber layer. However, that scenario assumes that the defender can observe the cyber state at each cyber time instant. In real applications, it is challenging to obtain full information about the cyber state directly, so the MDP cannot capture the defender's incomplete knowledge of the cyber states. In this chapter, we introduce a Partially Observed Markov Decision Process (POMDP) to capture the uncertainty of the cyber state. In a POMDP, instead of observing the states, we receive an observation whose distribution depends on the state. We use this information to build a Hidden Markov Model (HMM) filter, which constructs a belief over the states. Based on the belief, we aim to find an optimal policy that minimizes an expected cost.

UR - http://www.scopus.com/inward/record.url?scp=85096373017&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85096373017&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-60251-2_10

DO - 10.1007/978-3-030-60251-2_10

M3 - Chapter

AN - SCOPUS:85096373017

T3 - Advances in Information Security

SP - 139

EP - 145

BT - Advances in Information Security

PB - Springer

ER -