Day 41 - More course overview

A master list of algorithms, sorted by response/predictor type (n = numerical, c = categorical, - = none):
                     Response  Predictor  Visualization        Sorting         Abstraction            Model
1num                 n         -          histogram, boxplot   -               beyond, within         -
1cat                 c         -          hierarchy            -               balance                -
contingency table    c         -          mosaic               correspondence  predictive             binomial, lift
                                                               analysis        (homogeneity)
1aov                 n         1c         boxplots             by median       predictive             normal
                                                                               1. same variance
                                                                               2. different variance
regression tree      n         c          -                    -               predictive             tree
classification tree  c         c          -                    -               predictive             tree
response table       n         2c         profile              by effects      -                      additive, bilinear, normal
linear regression    n         n          scatter              -               -                      linear, additive, bilinear, normal
classification       c         n          cplot (colored       -               discriminative         logistic regression,
                                          scatterplot)                         projection             nearest neighbor
multivariate         n         -          star, parallel       PCA             PCA projection,        multivariate normal
                                                                               clustering


Classification methods

Histogram method
  • For classifying batches
  • L(pop) = sum_x n_x log p(x | pop), where n_x is the number of batch items in bin x
  • Anomaly detection: flag the batch when L(pop) < t for a chosen threshold t
  • Fast to train and test
  • Special case of a linear classifier (L is linear in the counts n_x)
  • Highly constrained
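
The histogram method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin names, the toy populations, and the add-alpha smoothing (used to keep log p finite) are hypothetical and not part of the course material.

```python
import math
from collections import Counter

def fit_histogram(items, bins, alpha=1.0):
    """Estimate p(x | pop) over a fixed set of bins.
    Add-alpha smoothing is an assumption, used so log p is always finite."""
    counts = Counter(items)
    total = len(items) + alpha * len(bins)
    return {x: (counts[x] + alpha) / total for x in bins}

def batch_loglik(batch, p):
    """L(pop) = sum_x n_x log p(x | pop), where n_x counts batch items in bin x."""
    n = Counter(batch)
    return sum(n_x * math.log(p[x]) for x, n_x in n.items())

def classify_batch(batch, models, t=None):
    """Return (label, scores). The label is the population with the largest
    L(pop), or None when even the best score falls below the threshold t."""
    scores = {pop: batch_loglik(batch, p) for pop, p in models.items()}
    best = max(scores, key=scores.get)
    if t is not None and scores[best] < t:
        return None, scores
    return best, scores

# Hypothetical toy data: two populations observed over three bins.
bins = ["a", "b", "c"]
models = {
    "pop1": fit_histogram(list("aaaaaabbc"), bins),
    "pop2": fit_histogram(list("abbcccccc"), bins),
}
label, scores = classify_batch(list("aaab"), models)
```

Here the batch "aaab" is assigned to pop1. Training is a single counting pass and testing is one weighted sum per population, which is where the speed comes from. Note that L(pop) scales with batch size, so a threshold t only makes sense for batches of comparable size.
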
Tree
  • Recursive partitioning
  • Flexible but crude
  • Fast
  • Concise
  • Unstable
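
A minimal sketch of recursive partitioning, assuming numerical predictors, string class labels, and Gini impurity as the split criterion. These are illustrative choices, not necessarily the ones used in the course.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Try every predictor and every midpoint threshold; return the
    (predictor, threshold) with the lowest weighted impurity, or None
    if no split improves on the unsplit node."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (j, thr), score
    return best

def grow(X, y, depth=0, max_depth=3):
    """Recursive partitioning: split the data, then grow each side the same way."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    j, thr = split
    left = [(r, yi) for r, yi in zip(X, y) if r[j] <= thr]
    right = [(r, yi) for r, yi in zip(X, y) if r[j] > thr]
    return (j, thr,
            grow([r for r, _ in left], [yi for _, yi in left], depth + 1, max_depth),
            grow([r for r, _ in right], [yi for _, yi in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if row[j] <= thr else right
    return tree

# Hypothetical toy data: the first predictor separates the classes.
X = [[1, 5], [2, 4], [3, 7], [6, 5], [7, 8], [8, 6]]
y = ["a", "a", "a", "b", "b", "b"]
tree = grow(X, y)
```

On this data the whole fitted model is the tuple (0, 4.5, "a", "b"), which illustrates the conciseness. The crudeness shows in the same place: every decision is a hard threshold on a single predictor.
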
Logistic regression
  • Slow to train, fast to test
  • Linear classifier
  • Somewhat constrained
  • Feature expansion weakens constraints
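
The effect of feature expansion can be sketched with a small pure-Python logistic regression. The fitting method (plain batch gradient ascent), the step size, the iteration count, and the toy data are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, steps=5000, lr=0.1):
    """Batch gradient ascent on the log-likelihood.
    Training is iterative, which is why it is slow relative to testing."""
    w = [0.0] * (len(X[0]) + 1)  # last weight is the intercept
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, yi in zip(X, y):
            feats = list(row) + [1.0]
            err = yi - sigmoid(sum(wj * fj for wj, fj in zip(w, feats)))
            for j, fj in enumerate(feats):
                grad[j] += err * fj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict(w, row):
    """Testing is one dot product and a sign check."""
    feats = list(row) + [1.0]
    return int(sum(wj * fj for wj, fj in zip(w, feats)) > 0)

def accuracy(w, X, y):
    return sum(predict(w, r) == yi for r, yi in zip(X, y)) / len(y)

# Hypothetical toy data: class 1 sits in the middle of the x axis.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y = [0, 0, 0, 1, 1, 1, 0, 0, 0]

X_linear = [[x] for x in xs]            # predictor x only
X_expanded = [[x, x * x] for x in xs]   # expanded with x squared

acc_linear = accuracy(fit_logistic(X_linear, y), X_linear, y)
acc_expanded = accuracy(fit_logistic(X_expanded, y), X_expanded, y)
```

With x alone, no single threshold isolates the middle class, so the linear model cannot classify every point correctly. Adding x squared makes the classes separable by a linear boundary in the expanded space, which is the sense in which expansion weakens the constraint.
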
K-nearest neighbor
  • Flexible
  • Slow at test time (each query scans the stored data)
  • Not concise
  • Stable
  • Hurt by irrelevant predictors
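
A small sketch of k-nearest neighbor, assuming Euclidean distance and majority voting. The toy data, including the made-up second predictor, is hypothetical and exists only to show how an irrelevant predictor can change the answer.

```python
from collections import Counter

def distance(a, b):
    """Euclidean distance over all predictors, relevant or not."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def knn_predict(train_X, train_y, row, k=3):
    """Majority vote among the k stored points closest to `row`.
    There is no training step: the full scan of the data happens per query."""
    order = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], row))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

train_y = ["a", "a", "a", "b", "b", "b"]

# One relevant predictor: small values are class "a", large values class "b".
clean_X = [[1], [2], [3], [7], [8], [9]]

# Same data plus an irrelevant predictor on a larger scale (made-up values).
noisy_X = [[1, 50], [2, 10], [3, 90], [7, 52], [8, 48], [9, 55]]

clean_label = knn_predict(clean_X, train_y, [3.5])
noisy_label = knn_predict(noisy_X, train_y, [3.5, 50])
```

The query is classified as "a" from the relevant predictor alone, but as "b" once the irrelevant predictor enters the distance, because it pushes two of the true neighbors far away. The model is the stored training set itself, which is why the method is not concise.
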

Tom Minka
Last modified: Tue Dec 04 13:18:56 Eastern Standard Time 2001