# 36-315 Statistical Graphics and Visualization, Spring 2003

Instructor:
Tom Minka, Statistics Dept, Baker Hall 228D, minka@stat.cmu.edu

Teaching Assistant:
Fang Chen, Baker Hall A60D, fangc@stat.cmu.edu

Lectures: Monday and Wednesday, 12:30-1:20, Doherty 2105
Computer labs: Tuesday, 12:30-1:20, Wean 5202

## Overview

Graphs are not decorations; they are a powerful mechanism for representing and interpreting data. They can provide more information than statistical tests and are often more convincing. Graphs are often the quickest path to winning an argument and producing action. This course teaches the methods and principles which will allow you to realize the full potential of graphics.

## Course Objectives

In this course you will learn:
1. How to critically interpret graphics appearing in the popular press, academic publications, and software packages.
2. How to choose the right graph for the point you are trying to make or, if necessary, how to design a new kind of graph.
3. How to create statistical graphics using the R software package.
4. How to analyze data and answer statistical questions with graphs.

## Relation to Data Mining

Data Mining (36-350) is a companion course offered in the fall which focuses on data analysis through visualization and modeling. It covers more methods than this course and is more advanced, requiring a greater knowledge of statistics. However, 36-350 does not delve into graphical perception, maps, dynamic graphics, or interactive graphics.

## Schedule

The schedule is organized around data of increasing dimension: 1-D, 2-D, 3-D, and beyond.
• univariate data
  • histograms, dot strips, density estimates, boxplots, quantile-quantile plots
  • pies, bars, dotcharts
  • visual perception of magnitudes
• bivariate data, time series
  • scatterplots, curve fitting, line graphs
  • visual perception of curves
• three-dimensional data
  • using perspective, glyphs, and colors
  • surface plots vs. contour plots
  • visual perception of color
• maps
  • map projections, map coloring, map smoothing
  • animated maps
• hyper-variate data
  • dynamic graphics, fly-throughs
  • interactive graphics, brushing
  • projection and slicing

## Format

• The course is taught in a lecture format on Monday and Wednesday, with hands-on practice in a Tuesday computer lab.

• Lectures will contain the material needed to complete the homework and lab assignments. There is no text for the course. Consequently, ATTENDANCE and participation in class are critical for learning.

• There will be a weekly OFFICE HOUR where you can meet one-on-one with the instructor and/or teaching assistants. Its schedule will be determined at the beginning of the semester.

• In the computer labs, you will learn how to create statistical graphics under the supervision of the lab assistants. Computer labs are mandatory. Each Tuesday, a LAB ASSIGNMENT will be handed out that must be completed during the lab period. There is nothing to hand in for the assignment; instead, you must get the attention of the lab assistants, who will check your results and give you credit for the lab. If you can't finish in time, you will get partial credit for what you have completed.

Computer labs will use a free software package called R, which is similar to S-plus. Unlike Data Desk or Minitab, R is a full-fledged programming language, so you are not limited to a fixed menu of operations.

• HOMEWORK will be assigned weekly. The purpose of these assignments is to improve your understanding of the methods and their results. They are also scheduled to encourage you to keep up with the class.

Homework will involve answering questions related to the lectures and creating graphics similar to what you practiced in lab. It will be posted on the web and can be handed in or emailed to us. Reading and homework should take about six hours per week.

• There will be a FINAL PROJECT for the course, due during the final exam period.

## Grading

• Homework: 30%
• Labs: 30%
• Final project: 40%

Each homework assignment will be worth 100 points. These points will be divided approximately equally among the parts of the assignment.

The lowest homework grade will be dropped, unless it is the last assignment of the semester, which is mandatory. The remaining homework grades will be used to compute the homework average. The same procedure is used for computer lab grades.
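The weights and the drop rule above can be illustrated with a small sketch, written here in Python rather than R. It is not an official grade calculator, and it rests on several assumptions that the syllabus does not state: labs and the final project are scored on the same 0-100 scale as homework, and when the lowest score is the final (mandatory) assignment, nothing is dropped. The function names `average_with_drop` and `course_grade` and the example scores are hypothetical.

```python
def average_with_drop(scores):
    """Average a list of scores (in assignment order) after dropping the lowest.

    Assumption: if the lowest score is the last assignment, which is
    mandatory, no score is dropped at all.
    """
    if not scores:
        raise ValueError("need at least one score")
    # index() returns the first occurrence, so on a tie an earlier
    # assignment is dropped rather than the final one.
    lowest = scores.index(min(scores))
    if len(scores) > 1 and lowest != len(scores) - 1:
        kept = scores[:lowest] + scores[lowest + 1:]
    else:
        kept = list(scores)
    return sum(kept) / len(kept)


def course_grade(homework, labs, project):
    """Combine the components with the 30% / 30% / 40% weights.

    Assumption: all three components are on a 0-100 scale.
    """
    return (0.30 * average_with_drop(homework)
            + 0.30 * average_with_drop(labs)
            + 0.40 * project)


# Made-up scores: the lowest homework (60) is dropped, but the lowest
# lab (50) is the last one, so all lab scores count.
print(round(course_grade([90, 60, 80], [100, 100, 50], 85), 1))
```

Under a different reading of the rule (for example, dropping the next-lowest score when the lowest is the final assignment), only `average_with_drop` would need to change.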

Extensions:

• The standard extensions (medical, university event, or religious holiday) must be accompanied by an official form as described in the student handbook.
• One unofficial homework/lab extension may be taken during the term.
• Other late homework will not be accepted.

All work and computer code must be your own. Sharing code or answers will result in zero credit and a letter to your dean. See the CMU Student Handbook on Cheating and Plagiarism.