Ústav teorie informace a automatizace

Bibliography

Conference Paper (Czech conference)

Second Order Optimality in Markov and Semi-Markov Decision Processes

Sladký Karel

: Conference Proceedings. 37th International Conference on Mathematical Methods in Economics 2019, p. 338-343, Eds: Houda M., Remeš R.

: MME 2019: International Conference on Mathematical Methods in Economics /37./, (České Budějovice, CZ, 20190911)

: GA18-02739S, GA ČR

: semi-Markov processes with rewards, discrete and continuous-time Markov reward chains, risk-sensitive optimality, average reward and variance over time

: http://library.utia.cas.cz/separaty/2019/E/sladky-0517875.pdf

(eng): Semi-Markov decision processes can be considered an extension of discrete- and continuous-time Markov reward models. Unfortunately, traditional optimality criteria such as the long-run average reward per unit time may be insufficient to characterize the problem from the point of view of a decision maker. To this end, it may be preferable, if not necessary, to select more sophisticated criteria that also reflect the variability-risk features of the problem. Perhaps the best-known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. we optimize the weighted sum of the average or total reward and its variance. Such an approach has already been studied for very special classes of semi-Markov decision processes, in particular for Markov decision processes in the discrete- and continuous-time setting. In this note these approaches are summarized and possible extensions to the wider class of semi-Markov decision processes are discussed. Attention is mostly restricted to uncontrolled models in which the chain is aperiodic and contains a single class of recurrent states. For finite time horizons, explicit formulas for the first and second moments of the total reward, as well as for the corresponding variance, are produced.
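
As an illustration of the finite-horizon moments mentioned in the abstract, the following is a minimal sketch for the simplest case only: an uncontrolled discrete-time Markov reward chain with transition rewards. It uses the standard one-step recursions for the first and second moments of the total reward and is not taken from the paper; the function name `reward_moments` and the arguments `P` (transition matrix), `R` (transition rewards) and `horizon` are assumptions made for this example.

```python
def reward_moments(P, R, horizon):
    """Moments of the total reward earned in `horizon` transitions.

    P[i][j] is the probability of moving from state i to state j,
    R[i][j] is the reward earned on that transition (hypothetical inputs).
    Returns three lists indexed by the starting state:
    mean, second moment and variance of the total reward.
    """
    n = len(P)
    v = [0.0] * n  # first moments for horizon 0
    s = [0.0] * n  # second moments for horizon 0
    for _ in range(horizon):
        # E[r + X] = r + E[X], averaged over the next state j
        v_new = [
            sum(P[i][j] * (R[i][j] + v[j]) for j in range(n))
            for i in range(n)
        ]
        # E[(r + X)^2] = r^2 + 2 r E[X] + E[X^2], averaged over j
        s_new = [
            sum(P[i][j] * (R[i][j] ** 2 + 2 * R[i][j] * v[j] + s[j])
                for j in range(n))
            for i in range(n)
        ]
        v, s = v_new, s_new
    variance = [s[i] - v[i] ** 2 for i in range(n)]
    return v, s, variance


# Two-state chain in which each step earns reward 1 with probability 0.5,
# so the total reward over 4 steps is Binomial(4, 0.5).
P = [[0.5, 0.5], [0.5, 0.5]]
R = [[1.0, 0.0], [1.0, 0.0]]
mean, second, variance = reward_moments(P, R, 4)
```

In this toy chain the mean and variance from either starting state agree with the binomial values 2 and 1. A mean-variance criterion of the Markowitz type would then combine the two quantities in a weighted sum; the choice of weight is left open here.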

: BB

: 10103

07.01.2019 - 08:39