Abstract
Summary form only given. Uniformization permits a semi-Markov decision process (SMDP) to be replaced by a Markov chain that exhibits the same average rewards under simple (nonrandomized) policies. However, this equivalence can be accepted as valid only for simple policies. Uniformization is generalized here to yield consistent results for stationary (randomized) policies as well. These results are applied to the constrained optimization of SMDPs, in which stationary randomized policies arise naturally.
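For readers unfamiliar with the uniformization step the abstract builds on, the sketch below illustrates only the classical construction for a fixed simple policy, in the special case where holding times are exponential so the SMDP reduces to a continuous-time Markov chain with generator Q. It is not the paper's generalized construction for stationary randomized policies; the function name, the bound Λ, and the toy generator are illustrative assumptions.

```python
import numpy as np

def uniformize(Q, lam=None):
    """Classical uniformization: turn a CTMC generator Q into a DTMC.

    P = I + Q / lam is a stochastic matrix whenever lam >= max_i |Q[i, i]|.
    Its stationary distribution coincides with that of the original chain,
    which is the reason time-average (rate) rewards agree under a fixed
    simple policy.
    """
    Q = np.asarray(Q, dtype=float)
    if lam is None:
        lam = np.max(-np.diag(Q))  # smallest valid uniformization rate
    P = np.eye(Q.shape[0]) + Q / lam
    return P, lam

# Toy 3-state generator (rows sum to zero); values are illustrative only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 1.0, 2.0, -3.0]])
P, lam = uniformize(Q)

# Stationary distribution of the uniformized DTMC (pi P = pi).
# Since pi P = pi is equivalent to pi Q = 0, it is also the stationary
# distribution of the original chain, so sum_i pi_i * r(i) is unchanged
# for reward rates r(i).
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
pi = pi / pi.sum()
print("P =\n", P, "\nlam =", lam, "\npi =", pi)
```

The abstract's point is that this reward equivalence, while valid policy by policy for simple policies, does not automatically carry over to randomized stationary policies, which is why the paper generalizes the construction.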
| Original language | English (US) |
| --- | --- |
| Title of host publication | Unknown Host Publication Title |
| Publisher | IEEE |
| Pages | 86 |
| Number of pages | 1 |
| State | Published - 1986 |
ASJC Scopus subject areas
- General Engineering