Institute of Information Theory and Automation


Can optimal decision making with learning win over multi-armed bandits?

Type of Work: 
ÚTIA AV ČR, v.v.i., department AS, 266052274
Decision making under uncertainty, Bayesian learning, adaptive control, exploitation and exploration

1. Learn the basics of dynamic decision making under uncertainty.
2. Learn the basics of Bayesian learning.
3. Review existing approaches balancing exploitation with exploration.
4. Select or propose the most promising ones and experimentally verify their properties.
5. Draw conclusions or hypotheses, as general as possible, about the inspected decision strategies as a basis for further research.
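As a hint of what step 2 involves, the following sketch (an illustration, not part of the topic text) shows the simplest case of Bayesian learning: a conjugate Beta-Bernoulli model of an unknown success probability, where the posterior update reduces to adding observed counts to the prior parameters.

```python
# Hypothetical illustration: Bayesian learning of an unknown success
# probability via the conjugate Beta-Bernoulli model. With prior
# Beta(a, b), observing s successes and f failures gives the posterior
# Beta(a + s, b + f); the posterior mean serves as a point estimate.

def beta_posterior(a, b, successes, failures):
    """Return posterior Beta parameters after Bernoulli observations."""
    return a + successes, b + failures

def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Start from a uniform prior Beta(1, 1) and observe 7 successes, 3 failures:
a, b = beta_posterior(1.0, 1.0, 7, 3)
print(posterior_mean(a, b))  # posterior Beta(8, 4), mean 2/3
```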


Probabilistic dynamic systems are applied in technology, transportation, economics, medicine, electronic democracy, etc. They can model complex technologies, lymphatic systems, or groups of automata known as one-armed bandits. Often the structure of the model is known but its parameters are unknown and have to be learnt. Often this has to be done jointly with influencing the system, i.e. with control. This creates an interesting and difficult problem, as the chosen decisions (inputs, actions) influence both the system and the efficiency of learning. Reaching a feasible balance between the corresponding exploitation and exploration is an interesting, long-standing problem that can and should be addressed with the novel fully probabilistic design of decision strategies.
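The exploitation-exploration balance described above can be made concrete on the bandit example. A minimal sketch, assuming a Bernoulli multi-armed bandit (the example and all names here are illustrative, not part of the topic), of Thompson sampling, one classical Bayesian strategy: keep a Beta posterior per arm, sample a plausible success rate from each posterior, and pull the arm whose sample is largest. Sampling from the posterior (rather than pulling the arm with the best point estimate) is what produces exploration automatically.

```python
import random

# Illustrative sketch of Thompson sampling on a Bernoulli bandit.
# Each arm i keeps a Beta(wins[i], losses[i]) posterior over its unknown
# success probability; uncertain arms occasionally produce large samples,
# which drives exploration, while well-estimated good arms dominate
# in the long run, which is exploitation.

def thompson_bandit(true_probs, steps, seed=0):
    rng = random.Random(seed)
    n_arms = len(true_probs)
    wins = [1.0] * n_arms    # Beta prior parameter alpha = 1 per arm
    losses = [1.0] * n_arms  # Beta prior parameter beta = 1 per arm
    total_reward = 0
    for _ in range(steps):
        # Sample one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Pull the chosen arm and update its posterior with the outcome.
        reward = 1 if rng.random() < true_probs[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        total_reward += reward
    return total_reward, wins, losses

reward, wins, losses = thompson_bandit([0.2, 0.5, 0.8], steps=2000)
# Over time, the best arm (true probability 0.8) receives most pulls.
```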

The topic is offered at FJFI ČVUT, but it can also be solved at other faculties or universities.

Selected parts:
1. V. Peterka, Bayesian approach to system identification, in P. Eykhoff (ed.), Trends and Progress in System Identification, pp. 239-304, Pergamon Press, Oxford, 1981.
2. M. Kárný, T. V. Guy, Fully probabilistic control design, Systems & Control Letters, 55:4, 259-265, 2006.
3. M. Kárný et al., Optimized Bayesian Dynamic Advising: Theory and Algorithms, Springer, London, 2006.
4. M. Kárný et al., Dynamic Decision Making: Fully Probabilistic Design.

2018-08-13 09:18