CS 4804 Homework #7
Date Assigned: November 27, 2006
Date Due: December 4, 2006, in class, before class starts
- (100 points) Consider a 13-state Markov chain identified by states s_0, s_1, s_2, ..., s_12.
For every state, there is only one possible action. From state s_i, where i is between 2 and 12
(inclusive), this action reaches s_(i-1) or s_(i-2) with equal probability (0.5 each).
State s_1 deterministically transitions to s_0. State s_0 is an absorbing state,
meaning the action causes a transition from s_0 to itself. Every transition has a reward of -3,
with two exceptions: the transition from s_1 to s_0 has a reward of -2, and the transition from s_0
to itself has a reward of -1. Assume a discount factor
of gamma=0.9, and determine the value V of each of the 13 states.
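
One possible way to set up the computation is sketched below in Python. It assumes the usual infinite-horizon discounted formulation in which the reward is collected on each transition, so that V(s) = sum over successors s' of P(s'|s) * (R(s,s') + gamma * V(s')). Under that assumption, the self-loop at s_0 gives V(s_0) = -1 / (1 - gamma), and because every other state only transitions to lower-numbered states, the remaining values follow by substitution in increasing order of i. The function name state_values and its parameters are illustrative choices, not part of the assignment.

```python
GAMMA = 0.9      # discount factor given in the problem
N_STATES = 13    # states s_0 through s_12


def state_values(gamma=GAMMA, n_states=N_STATES):
    """Return [V(s_0), ..., V(s_{n-1})], assuming rewards are earned on transitions."""
    v = [0.0] * n_states

    # s_0 is absorbing with reward -1 per step:
    # V(s_0) = -1 + gamma * V(s_0)  =>  V(s_0) = -1 / (1 - gamma)
    v[0] = -1.0 / (1.0 - gamma)

    # s_1 moves deterministically to s_0 with reward -2.
    v[1] = -2.0 + gamma * v[0]

    # s_i (i >= 2) moves to s_(i-1) or s_(i-2) with probability 0.5 each, reward -3.
    for i in range(2, n_states):
        v[i] = -3.0 + gamma * 0.5 * (v[i - 1] + v[i - 2])

    return v


values = state_values()
for i, val in enumerate(values):
    print(f"V(s_{i}) = {val:.4f}")
```

Since there is only one action per state, no maximization is needed; the same numbers could also be obtained by solving the 13 linear Bellman equations directly or by iterating the update until it converges.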