Workshop: Book scanning, proofreading, and advanced reuse

Friday 11th of November, 2011
L100 (Språkbanken)

by Lars Aronsson

This workshop presents new ways to use old books once they have been digitized, based on experience from Wikisource and Project Runeberg.

Digitizing old books and journals lets the Internet provide access not only to current knowledge, but also to our history. But you can do more with electronic text than just read it. Freely licensed or public domain works that have been digitized are a quickly growing playground for open source experiments.

Digitization is often described in terms of its problems: textual quality and copyright. Two large-scale projects, Google Books and the Internet Archive, take different approaches to these problems. This workshop instead focuses on the potential reuse of the digitized works:

Indexing, cataloging, and cross referencing
Collaborative proofreading
Old encyclopedias and their reuse in Wikipedia
Full text search
Time travel and language history, Google n-gram viewer
Who wrote what
Who reviewed what, citation analysis
Parallel corpus alignment

So that we can plan the workshop sessions and assign rooms, please let us know by October 28th which workshops you will attend.

