Assignment #2

The papers for this week focus on the problem of planning or, more generally, action selection given a current state, a transition model, and a goal set. We can turn such problems into a special case of an MDP in which:

Consider a domain with:

Both value iteration and state-space search methods can be used to solve these problems. Let's try to understand their relative computational complexity. Please provide brief answers to the following questions in your writeup.

Value iteration:

Breadth-first search:
If you're doing any kind of state-space search, such as Breadth-First Search (BFS), if at all possible you use some form of visited list or dynamic programming(DP).

Comparison

Heuristic
Heuristics can reduce the "effective branching factor" of BFS by pruning or postponing consideration of some action choices at each state in the search. Consider a heuristic that cuts the branching factor from m down to a*n for 0 < a <= 1.

In the next two classes, we'll consider model-based decision-making in stochastic domains and continuous domains. A bit later in the term, we'll consider learning with long-horizons and the relative merits of model-based vs model-free learning, and the importance of good explorations strategies.

Papers
Provide a short discussion of each of the assigned papers (listed under Course Materials). Keep in mind the implications of your answers from above. Also, include questions for discussion in class.

Upload a single PDF file through Stellar by Feb 12 at 10 am.