Talk Title:
How Not to Read a Million Books: Text Mining, and Reading the Unreadable
Speaker:
John M. Unsworth, Dean and Professor, Graduate School of Library and
Information Science, University of Illinois at Urbana-Champaign
When:
Monday, October 20, 2008; 4:30pm
Where:
Barker Center, Thompson Room
Abstract:
“The Spectacles”
Christian Morgenstern
Korf reads avidly and fast.
Therefore he detests the vast
bombast of the repetitious,
twelvefold needless, injudicious.
Most affairs are settled straight
just in seven words or eight;
in as many tapeworm phrases
one can prattle on like blazes.
Hence he lets his mind invent
a corrective instrument:
Spectacles whose focal strength
shortens texts of any length.
Thus, a poem such as this,
so beglassed one would just -- miss.
Thirty-three of them will spark
nothing but a question mark.
(“Die Brille” from Galgenlieder, 1905)
Korf is the kind of reader for whom some text-mining tools are
intended: someone who surely would profoundly approve of text-
summarization technology, for example--the sort of thing that tells
you what a newspaper article is about, so you don't have to go
through the tiresome and inkstained exercise of actually reading it.
In the Mellon-funded MONK project (MONK stands for Metadata Offer New
Knowledge), we seek to use text-mining techniques as a provocation
for reading, and to cast the net for that provocation much more
broadly than one could do without computers.
In other words, although users may end up reading, even reading
closely, they begin by not reading, or by doing what Franco Moretti
calls distant reading, pointing out that when we begin by reading, we
can only take into account "a minimal fraction of the literary
field . . . a canon of two hundred novels, for instance, sounds very
large for nineteenth-century Britain . . . but is still less than one
per cent of the novels that were actually published: twenty thousand,
thirty, more, no one really
knows—and close reading won’t help here, a novel a day every day of
the year would take a century or so" (Graphs, Maps, Trees).