Initiative in Innovative Computing @ Harvard
Seminar Series
Wednesday, October 10, 2007; 4:00pm
60 Oxford Street, Room 330
Leland Wilkinson, Systat Software Inc. and University of Illinois at
Chicago
Seminar Title: Automated Visualization of Large Datasets Using the
Grammar of Graphics Foundation
Abstract: Visualization has at least three purposes: 1) the
inspection of raw data, 2) the assessment of assumptions underlying
fitted models, and 3) the presentation of fitted models. Automated
visualization (AV) is an attempt to serve these purposes through
intelligent automation of visualization and statistical methods.
While AV might be designed to serve all three purposes equally well,
its most suitable applications involve the early stages of a
discovery process. AV, however sophisticated, should not replace the
interactive process underlying the development and fitting of models
themselves.
AV involves the development of autonomous agents capable of creating
appropriate and informative visualizations based on a rich variety of
data sources. It can be especially useful for providing initial views
of data sources that are too large to comprehend in a single grasp.
AV can help in discerning missing values, irregularities, anomalies,
coding errors, and other effects that might bias the fitting of
models or refinement of judgments based on data.
Even the simplest visualization rests on a formal model. Thus, the
development of AV for data discovery and exploration requires methods
for devising and applying algebraic, semantic, statistical, and
aesthetic components. We find such a system in the Grammar of Graphics.
The Grammar of Graphics is both the title of a book and a framework for
developing intelligent visualizations of statistical and scientific
data. Joint work with Graham Wills, Dan Rope, and a team at SPSS has
led to the implementation of a scalable visualization library based
on the book. Joint work with Anushka Anand and Robert Grossman at
UIC has also led to the development of a novel algorithm (originally
proposed by John Tukey) for detecting patterns in high-dimensional
datasets. Both lines of work have been combined in an application
called AutoVis, which can detect and display significant patterns in
an unusually wide variety of small and large datasets.
Upcoming IIC seminars
Continue to stay up to date with our IIC Seminar Schedule.
Parking is available in the 52 Oxford Street Garage. Please tell the
attendant that you are attending the IIC Seminar.