Edit: Caveat -- while this information was given at a staff meeting, it was announced by Andrew Herkovic (see below) that it was a 'public forum' and that employees could talk about any issues/facts/etc. that were communicated in a public forum with people outside of Stanford. On the basis of this, I posted the following information.
Today at work, there was an all-hands meeting for all library employees who wanted to learn more about the Google/Stanford deal. The talk was conducted by Catherine Tierney (Technical Services) and Andrew Herkovic (Foundation Relations & Strategic Projects) of SUL/AIR.
Google book digitization project -- 5 partners (Harvard, Stanford, UMich, NYPL, Oxford)
Stanford has NOT made a commitment to digitize all of its books, but we "will do as much as we can for as long as we can."
No exact number of books to be digitized given
Oversized materials (such as atlases) have been digitized in the prototype and have been discussed as part of the project; accompanying materials to books (maps, fold-outs, etc.) will be as well, at least in theory
What Google will provide to the public --
- Works in copyright won't be fully available
- For copyrighted works -- there will be a click-through to the appropriate OCLC WorldCat record
Approximately 10% of Stanford's overall collection is clearly out of copyright; other material in the public domain (such as U.S. government documents) will be included in the project
Google will be responsible for determining what's in copyright and what's not if there are any questionable materials, and copyright will drive what will be fully displayed
There's no special provision to fully display material in the last 20 years of copyright
Foreign language texts, including non-Roman languages, will be included
Google will be digitizing Stanford's material on Google's property, using their equipment/protocols and with their staff; the company has not yet been forthcoming as to how the process of digitization will be implemented in detail; however, Google's process is characterized as "industrial-strength digitization"
Google will be responsible for quality control of the scans
A format for the scans has not been decided
De-duplication is not a part of the process at this time; Stanford is interested in having multiple copies of the same material across various partners
Google is being "coy" about standards and specs; minimums have been given, but little to no fixed specs
We believe that Google will be doing full OCR and indexing of everything they scan for us
Stanford may not mount everything that Google gives to us, but we won't reject scans for having less than perfect accuracy, either
Stanford will receive copies of Stanford's books but won't necessarily be getting the scans from Google's other partners; SUL is under contractual obligations to Google, so we won't/can't give away the digital materials to other projects, such as the Internet Archive or Project Gutenberg; however, we may be able to share our copies with other educational institutions
SUL isn't sure how the digitization will impact ILL
Funding: the scanning will be funded by Google, but the transfer of books to and from Google will have separate library funding
There are currently no plans to publicize the protocols/process to outside institutions -- it will depend in part on the legal landscape that Google/Stanford faces prior to and during implementation
Factors in choosing which collections (or parts of collections) will be digitized:
- Current physical space of materials
- Percent of material out of copyright
- Will the collections end up in SAL3 [Stanford Auxiliary Library 3 in Livermore]
- How other projects (such as the Hoover monographs move) could be impacted
- Interest from publishers in making copyrighted works fully available
The plan is to start with just a few thousand books; the project will be implemented in stages
Material that is already in electronic format will not be excluded, ideally, but it may later become a factor in what material is chosen
In the short term, Stanford users will get access to the digitized texts via Google Print, just like general users -- in the long term, the scans/digital page images will be mounted on Stanford's servers with enhanced access, as part of SUL's Digital Repository
Impact on Stanford's users:
+ Materials to be scanned will be officially checked out; any books that aren't currently barcoded will have to be routed internally for barcoding before being sent to Google
+ SUL is considering arrangements for alternative access to materials that are in the process of being digitized, but there are no hard plans yet
+ Material may require metered access by Stanford users depending upon copyright issues
+ Each book that is digitized will be KEPT -- our patrons will continue to need those physical books and we will provide for them
+ We've retained the right to refuse to send for digitization any material that we believe is too brittle/fragile to survive the process
More information about the project, including a FAQ, will be available at the SUL/AIR website in the near future