CS 4804 Homework #7

Date Assigned: November 27, 2006
Date Due: December 4, 2006, in class, before class starts

(100 points) Consider a 13-state Markov chain with states s_0, s_1, s_2, ..., s_12. Every state has exactly one possible action. From state s_i, where i is between 2 and 12 (inclusive), this action reaches s_(i-1) or s_(i-2) with equal probability (0.5 each). State s_1 transitions deterministically to s_0. State s_0 is an absorbing state, meaning the action causes a transition from s_0 to itself. The reward is -3 for every transition, with two exceptions: the transition from s_1 to s_0 has reward -2, and the transition from s_0 to itself has reward -1. Assume a discount factor of gamma = 0.9, and determine the value V of each of the 13 states.
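
One way to check a hand calculation is the short Python sketch below. It is not an official solution and rests on an assumption the problem statement does not spell out: that V is the infinite-horizon expected discounted return, so V(s) = sum over next states s' of P(s'|s) * (r(s, s') + gamma * V(s')). Under that reading, the self-loop at s_0 gives V(s_0) = -1 + gamma * V(s_0), and since each s_i depends only on lower-numbered states, the remaining values follow in a single pass. The names state_values, GAMMA, and N_STATES are illustrative only.

```python
GAMMA = 0.9      # discount factor from the problem statement
N_STATES = 13    # states s_0 .. s_12


def state_values(gamma=GAMMA, n_states=N_STATES):
    """Return [V(s_0), ..., V(s_{n-1})] under the assumed Bellman equation."""
    v = [0.0] * n_states
    # s_0 is absorbing with reward -1: V = -1 + gamma * V, so V = -1 / (1 - gamma).
    v[0] = -1.0 / (1.0 - gamma)
    # s_1 moves deterministically to s_0 with reward -2.
    v[1] = -2.0 + gamma * v[0]
    # s_i (i >= 2) moves to s_(i-1) or s_(i-2) with probability 0.5 each, reward -3.
    for i in range(2, n_states):
        v[i] = -3.0 + gamma * 0.5 * (v[i - 1] + v[i - 2])
    return v


values = state_values()
for i, val in enumerate(values):
    print(f"V(s_{i}) = {val:.4f}")
```

Because the chain only ever moves toward s_0, no iteration to convergence is needed here; a general value-iteration loop would also work but is more machinery than this structure requires.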