Date: Friday, March 27, 2015
Time: Lunch: 12:30pm; Talk: 1pm
Location: Science Center Hall E, 1 Oxford Street, Cambridge, MA 02138
Speaker: Jeff Bilmes, Professor of Electrical Engineering at the University of Washington
Title: Summarizing Large Data Sets
Abstract: The recent growth of available data is both a blessing and a curse for the field
of data science. While large data sets can lead to improved predictive accuracy and can
motivate research in parallel computing, they can also be plagued with redundancy, leading
to wasted computation. In this talk we will discuss a class of approaches to data
summarization and subset selection based on submodular functions. We will see how a form
of "combinatorial dependence" over data sets can be naturally induced via
submodular functions, and how resulting submodular programs (that often have approximation
guarantees) can yield practical and high-quality data summarization strategies. The
effectiveness of this approach will be demonstrated with results from a wide range of
applications, including document summarization, machine learning training data subset
selection (for speech recognition, machine translation, and handwritten digit
recognition), image summarization, and assay selection in functional genomics.
Speaker Bio: Jeffrey A. Bilmes is a professor in the Department of Electrical Engineering
at the University of Washington, Seattle and an adjunct professor in the Department of
Computer Science and Engineering and the Department of Linguistics. He received his Ph.D.
in Computer Science from the University of California, Berkeley. He is a 2001 NSF CAREER
award winner, a 2002 CRA Digital Government Fellow, a 2008 NAE Gilbreth Lectureship award
recipient, and a 2012/2013 ISCA Distinguished Lecturer. Prof. Bilmes has been working on
submodularity in machine learning for more than twelve years. He received the best paper
award at ICML 2013 and a best paper award at NIPS 2013 for work in this area. Prof.
Bilmes is also a recipient of a 25-year paper award from the International Conference on
Supercomputing for his 1997 paper on high-performance matrix optimization. Prof. Bilmes
has authored the Graphical Models Toolkit (GMTK), a software system based on dynamic
graphical models that is widely used in speech and language processing, bioinformatics, and
human-activity recognition.
Free and open to the public. No registration required.
***********************
UPCOMING SEMINARS
4/10 Budhendra Bhaduri <http://web.ornl.gov/sci/gist/staff_bios/staff_bhaduri.shtml>
(Oak Ridge National Laboratory <http://www.ornl.gov/>, Geographic Information Science
and Technology)
4/24 Christian Rudder <http://www.okcupid.com/about> (OkCupid)
To subscribe to our events list, visit
<https://lists.seas.harvard.edu/mailman/listinfo/iacs-events>.