Introduces the theory and application of modern, computationally based methods for exploring and drawing inferences from data. Covers resampling methods, non-parametric regression, prediction, dimension reduction, and clustering. Specific topics include Monte Carlo simulation, the bootstrap, cross-validation, splines, locally weighted regression, CART, random forests, neural networks, support vector machines, and hierarchical clustering. De-emphasizes proofs in favor of extended discussion of how to interpret results, with simulation and data analysis used for illustration.
After completing this course, a student will be able to understand the theoretical basis for the current methods used in statistical analysis.
140.646-648 or 140.611-12 or 140.621-24 or 140.651-54 or 140.671-74; working knowledge of calculus
- Hastie, T., Tibshirani, R., and Friedman, J.H. (2001) The Elements of Statistical Learning. Springer-Verlag: New York.
- Venables, W.N. and Ripley, B.D. (2002) Modern Applied Statistics with S. Springer-Verlag: New York.
- Ripley, B.D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press.
Student evaluation is based on homework, quizzes, and a final project.