Warm up
You may want to refer to background texts on breadth-first search, value iteration, and dynamic programming to brush up on those concepts (e.g., see Sutton & Barto: Chapter 4).
Bonet & Geffner: Heuristic Search Planner
Heuristics tend to be derived from “relaxed” versions of the problem that are:
Iterated Width (IW) Search
This paper introduces a breadth-first search method based on a measure of “novelty.”
Explain in your own words what novelty means. How is this definition related to the intuitive meaning of novelty?
Why does this method accelerate search on real problems? What property of real problems is this method exploiting? Why does pruning based on low novelty make sense?
Planning with Pixels
This paper uses simple pixel colors as the features for an IW search. What might be better features to use?
This method works surprisingly well on Atari games. What issues do you think it might encounter if you tried to apply it to more complex video games, or real world robotics problems?
In the next two classes, we'll consider model-based decision-making in stochastic domains and continuous domains. A bit later in the term, we'll consider learning with long-horizons and the relative merits of model-based vs model-free learning, and the importance of good explorations strategies.
Upload a single PDF file through Canvas by Feb 13 at 1pm.