36-315 Statistical Graphics and Visualization, Spring 2003
Instructor:
Tom Minka, Statistics Dept, Baker Hall 228D, minka@stat.cmu.edu
Teaching Assistant:
Fang Chen, Baker Hall A60D, fangc@stat.cmu.edu
Lectures: Monday and Wednesday, 12:30-1:20, Doherty 2105
Computer labs: Tuesday, 12:30-1:20, Wean 5202
Overview
Graphs are not decorations; they are a powerful mechanism for representing
and interpreting data. They can provide more information than statistical
tests and are often more convincing. Graphs are often the quickest path to
winning an argument and producing action. This course teaches the methods
and principles which will allow you to realize the full potential of graphics.
Course Objectives
In this course you will learn:
- How to critically interpret graphics appearing in the popular press,
academic publications, and software packages.
- How to choose the right graph for the point you are trying to make
or, if necessary, how to design a new kind of graph.
- How to create statistical graphics using the R software package.
- How to analyze data and answer statistical questions with graphs.
Relation to Data Mining
Data Mining (36-350) is a companion course offered in the fall which
focuses on data analysis through visualization and modeling. It
covers more methods than this course and is more advanced,
requiring greater knowledge of statistics. However, 36-350 does not
cover graphical perception, maps, dynamic graphics, or interactive
graphics.
Schedule
The schedule is organized around data of increasing dimension:
1-D, 2-D, 3-D, and beyond.
- univariate data
  - histograms, dot strips, density estimates, boxplots, quantile-quantile plots
  - pies, bars, dotcharts
  - visual perception of magnitudes
- bivariate data, time series
  - scatterplots, curve fitting, line graphs
  - visual perception of curves
- three-dimensional data
  - using perspective, glyphs, and colors
  - surface plots vs. contour plots
  - visual perception of color
- maps
  - map projections, map coloring, map smoothing
  - animated maps
- hyper-variate data
  - dynamic graphics, fly-throughs
  - interactive graphics, brushing
  - projection and slicing
Format
- The course is taught through lectures on Monday and Wednesday and
hands-on practice in a Tuesday computer lab.
- Lectures will contain the material needed to complete the homework and lab
assignments. There is no text for the course.
Consequently, ATTENDANCE and participation in class are
critical for learning.
- There will be a weekly OFFICE HOUR
where you can meet one-on-one with the instructor and/or teaching assistants.
Its schedule will be determined at the beginning of the semester.
- In the computer labs, you will learn how to create statistical graphics
under the supervision of the lab assistants.
Computer labs are mandatory. Each Tuesday, a LAB ASSIGNMENT will be handed
out that must be completed during the lab period.
There is nothing to hand in for the assignment; instead,
you must get the attention of the lab assistants,
who will check your results and give you credit for the lab.
If you can't finish in time,
you will get partial credit for what you have completed.
Computer labs will use a free software package called R, which is similar to S-PLUS.
Unlike Data Desk or Minitab, R is a full-fledged programming language, so
it is not limited to a fixed set of built-in operations.
- HOMEWORK will be assigned weekly. The purpose of these assignments is to
improve your understanding of the methods and their results.
The weekly schedule is also meant to encourage you to keep up with the class.
Homework will involve answering questions
related to the lectures and creating graphics similar to those you practiced
in lab.
It will be posted on the web
and can be handed in or emailed to us.
Reading and homework should take about six hours per week.
- There will be a FINAL PROJECT for the course,
due during the final exam period.
Grading
Final grade breakdown:
- Homework: 30%
- Labs: 30%
- Final project: 40%
Each homework assignment will be worth 100 points. These points will be
divided approximately equally among the parts of the assignment.
The lowest homework grade will be dropped, unless it is the last
assignment of the semester, which is mandatory. The remaining homework
grades will be used to compute the homework average.
The same procedure is used for computer lab grades.
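As an illustration of the arithmetic above, here is a minimal Python sketch of how a final grade could be computed from the stated weights. It is not the official grading procedure. The function names and sample scores are hypothetical, the final project is assumed to be scored out of 100, and the drop rule is read as "the last assignment always counts, and the lowest of the other grades is dropped", which is only one possible interpretation of the policy.

```python
# Hypothetical sketch of the grade arithmetic; not the official grading script.

def average_with_drop(scores):
    """Average a list of 0-100 scores after dropping the lowest one.

    Assumption: the last score is mandatory and is never dropped, so the
    dropped score is the lowest among all scores except the last.
    """
    if not scores:
        raise ValueError("need at least one score")
    if len(scores) == 1:
        # Only the mandatory last assignment exists; nothing can be dropped.
        return float(scores[0])
    droppable = scores[:-1]
    drop_index = droppable.index(min(droppable))
    kept = [s for i, s in enumerate(scores) if i != drop_index]
    return sum(kept) / len(kept)


def final_grade(homework, labs, project):
    """Combine the three components: homework 30%, labs 30%, project 40%."""
    hw_avg = average_with_drop(homework)
    lab_avg = average_with_drop(labs)
    return (30 * hw_avg + 30 * lab_avg + 40 * project) / 100


# Made-up scores, in the order the assignments were given.
homework = [90, 60, 80, 70]
labs = [100, 50, 100]
project = 85

print(final_grade(homework, labs, project))
```

Under the other reading of the policy, where nothing is dropped when the last assignment happens to be the lowest, only `average_with_drop` would need to change.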
Extensions:
- The standard extensions
(medical, university event, or religious holiday)
must be accompanied by an official form as described in the student handbook.
- One unofficial homework/lab extension may be taken
during the term.
- Other late homework will not be accepted.
All work and computer code must be your own.
Sharing code or answers will result in zero credit and a letter to your dean.
See the CMU Student Handbook on Cheating and Plagiarism.