Colloquium in Mathematics and Computer Science

Trading value and information in MDPs

Tuesday, August 16 | 14:00 | Science Building 8, Room 424

Interactions between an organism and its environment are commonly modeled as Markov Decision Processes (MDPs). While the standard MDP formulation aims to maximize expected future rewards (“value”), it generally ignores the flow of information between the agent and its environment. In this talk, I will focus on the information involved in the process of action selection (“control information”) and show how it can be combined with the reward measure in a unified way.