050.680 Learning Theory

The fifth (and final) problem set

April 19, 2007 by Bob Frank

A final problem set is now available, ready for downloading. This problem set invites you to explore the application of Markov chains (in the guise of bigram models) to modeling phonotactic knowledge. This archive contains Matlab functions and data files you will find useful in doing this problem set.
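For readers who want a concrete picture of what "a Markov chain in the guise of a bigram model" means for phonotactics, here is a minimal sketch. It is written in Python rather than Matlab and is not taken from the problem set or its archive: the boundary symbol, the function names, and the toy lexicon are all made up for illustration, and each letter stands in for one segment. Transition probabilities are estimated by relative frequency with no smoothing, so a word containing an unseen bigram gets probability zero.

```python
import math
from collections import Counter, defaultdict

BOUNDARY = "#"  # hypothetical word-boundary symbol, not from the problem set


def train_bigram(words):
    """Estimate P(next segment | current segment) by relative frequency."""
    pair_counts = defaultdict(Counter)
    for word in words:
        segments = [BOUNDARY] + list(word) + [BOUNDARY]
        for current, nxt in zip(segments, segments[1:]):
            pair_counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in pair_counts.items()
    }


def log_prob(model, word):
    """Log probability of a word under the chain; -inf if a bigram is unseen."""
    segments = [BOUNDARY] + list(word) + [BOUNDARY]
    total = 0.0
    for current, nxt in zip(segments, segments[1:]):
        p = model.get(current, {}).get(nxt, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


# Toy lexicon, invented for illustration only.
toy_lexicon = ["blik", "brik", "bik", "lib"]
model = train_bigram(toy_lexicon)

print(log_prob(model, "blik"))  # finite: every bigram was attested
print(log_prob(model, "bnik"))  # -inf: "bn" never occurs in the toy lexicon
```

Each row of the estimated table is a probability distribution over next segments, which is exactly the transition matrix of a Markov chain whose states are segments plus the boundary symbol.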

Markov Chains

April 8, 2007 by Bob Frank

The handout for our discussion of Markov chains can be downloaded here. For a more detailed treatment of the properties of Markov chains, with lots of useful examples, not to mention proofs of the theorems discussed in class, see chapter 11 of Grinstead and Snell's Introduction to Probability textbook. On n-gram models and HMMs, you should read through chapter 4, chapter 5 (especially section 5.5) and chapter 6 (at least through section 6.5) of the second edition of Jurafsky and Martin's Speech and Language Processing textbook.

For a discussion of part of speech tagging from the MaxEnt perspective we talked about a couple of weeks ago, have a read through Ratnaparkhi's 1996 ACL paper A Maximum Entropy Model for Part of Speech Tagging.

The fourth problem set

March 19, 2007 by Bob Frank

A new problem set is ready for you, to add to your clustering and Matlab pleasure. It's available for download here. To do this problem set, you will need to make use of the Matlab functions contained in this archive. If you are interested in hearing examples of the vowels that you'll be clustering, take a stroll to the UCLA phonetics lab.

Readings for next week

Feb 24, 2007 by Bob Frank

Next week, we'll be starting our discussion of clustering. As we mentioned in class yesterday, you should read the papers by Kuhl and by Maye & Gerken (both on the readings page) for some CogSci motivations. We'll begin our technical discussion on Monday.

The second problem set

Feb 18, 2007 by Bob Frank

The next problem set is available here (and due on Friday). This will require some Matlab programming on your part, but don't panic! Adam has prepared some helpful handouts, and you may find useful the links listed under Resources to the left on this page.

Readings for next week

Feb 3, 2007 by Bob Frank

The articles for next week are all available on the readings page. Look first at the article by Resnik, then the one by Brill and Kapur, and finally the one by Peperkamp et al. As announced yesterday, we probably won't be getting to these until next Friday.

Update on problem set

Jan 29, 2007 by Paul Smolensky

In Problem 3 of the first problem set, the “error set” refers to the set of sample points where the hypothesis diverges from the truth.
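As a small illustration of this definition (a Python sketch, not the setup of Problem 3; the interval concepts and sample points below are invented), the error set can be computed by checking each sample point against both the hypothesis and the target:

```python
def error_set(hypothesis, target, points):
    """Return the sample points on which hypothesis and target disagree."""
    return [x for x in points if hypothesis(x) != target(x)]


# Hypothetical concepts: both are closed intervals on the real line.
def target(x):
    return 2 <= x <= 8


def hypothesis(x):
    return 3 <= x <= 9


sample = [1, 2, 3, 5, 8, 9, 10]
errors = error_set(hypothesis, target, sample)
print(errors)                     # [2, 9]
print(len(errors) / len(sample))  # fraction of the sample in the error set
```

Here 2 is in the error set because the target accepts it and the hypothesis does not, and 9 is in it for the opposite reason.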

It might be a good idea to take a serious crack at Problem 3 before Adam’s problem session (Wed at 2) so you can get help if needed.

Finally, the most relevant bits of the two chapters from Machine Learning are these:
Chapter 2: Sections 2.5-2.7, pp. 29-45
Chapter 7: through Section 7.4, pp. 201-220

PAC learning and problem set

Jan 26, 2007 by Bob Frank

Background reading for Paul's discussion of PAC learning can be found in two chapters from Mitchell's text Machine Learning. For your reading pleasure, here are chapter 2 and chapter 7.

The first problem set, distributed today and due next Friday, is available here.

Gold's theorem

Jan 24, 2007 by Bob Frank

There is a very nice and readable introductory paper on Identification in the Limit, Gold's theorem and its (mis)interpretation in Cognitive Science by Kent Johnson entitled (appropriately enough) Gold's Theorem and Cognitive Science, Philosophy of Science 71, 571 – 592. I encourage everyone to have a look!