You are cordially invited to the next IIC Colloquium on Wednesday, Feb. 10:
************************
Machine Learning and Statistics on Massive Astronomical Datasets
Feb. 10, 2010, 4:00 pm
Room G115, Maxwell Dworkin, 33 Oxford Street, Cambridge
Alexander Gray
Professor of Mathematics and Computer Science
Computational Science and Engineering Division
College of Computing
Georgia Institute of Technology
Abstract
I’ll describe algorithms and data structures that allow the most
powerful machine learning and multivariate statistical methods, which
often scale quadratically or even cubically with the number of data
points, to be performed many orders of magnitude faster than naive
implementations. Such techniques can make previously impossible
statistical analyses tractable on the scale of entire astronomical sky
surveys. I will discuss scalable algorithms we have developed for
n-point correlations, friends-of-friends, nearest-neighbors, kernel
density estimation, nonparametric Bayes classification, principal
component analysis, local linear regression, isometric non-negative
matrix factorization, hidden Markov models, k-means, support vector
machine-like classifiers, Gaussian process regression, and Gaussian
graphical model inference, among others. In addition to techniques
inspired by computational geometry, fast multipole methods, and Monte
Carlo integration, we employ a distributed framework which can be
thought of as a higher-order version of Google’s MapReduce. Our
algorithms have enabled several first-of-a-kind large-scale
cosmological analyses.
Bio
Alexander Gray received bachelor's degrees in Applied Mathematics and
Computer Science from UC Berkeley and a PhD in Computer Science from
Carnegie Mellon University, and worked in the Machine Learning Systems
Group of NASA's Jet Propulsion Laboratory for 6 years. He currently
directs the FASTlab (Fundamental Algorithmic and Statistical Tools
Laboratory, http://www.fast-lab.org/) at Georgia Tech, consisting of
25 people including 14 PhD students, which works on the problem of how
to perform machine learning/data mining/statistics on massive
datasets, and related problems in scientific computing and applied
mathematics. Employing a multi-disciplinary array of technical ideas
(from machine learning, nonparametric statistics, convex optimization,
linear algebra, discrete algorithms and data structures, computational
geometry, computational physics, Monte Carlo methods, distributed
computing, and automated theorem proving), the lab has developed the
current fastest algorithms for several fundamental statistical
methods, and also develops new machine learning methods for difficult
aspects of real-world data, such as in astrophysics and biology. This
work has enabled high-profile scientific results which have been
featured in Science and Nature and has earned an NSF CAREER award, two
best paper awards, and two best paper award nominations. He has given
tutorials for the field and invited talks on efficient algorithms for
machine learning at venues including ICML, NIPS and SIAM Data Mining,
as well as in applied mathematics and astronomy.
---------------
Refreshments will be served at 3:45 pm.
Mark your calendar for these upcoming events:
Monday, Feb. 22, noon, Maxwell Dworkin 333: SciGPU Seminar, Ron
Babich, Boston University
Wednesday, Feb. 24, 4 pm, Maxwell Dworkin G115: IIC Colloquium, Bruce
Boghosian, Tufts University
Wednesday, March 4, 4 pm, Maxwell Dworkin G115: Distinguished Lecture
in Computational Science, Erik Winfree, Caltech
For more information about IIC colloquia and other events:
http://iic.harvard.edu/events/upcoming