Rawblog: Dan Clancy from Google Book Search

SLA Public Policy Update - 6/16/09
Dan Clancy
Google Book Search

Quick overview of Google Book Search
Google's Mission
To organize the world's info and make it universally accessible & useful

1 Consumer Product, 2 sources
Partner Program - working directly with publishers, permission granted expressly from rightholders
Library Project - working with libraries

Typical library collection
Public Domain
Less than 20%

Unclear/orphan works/out-of-print, post-1923

Definitely in copyright/in-print
Less than 10%

3 views: full (public domain), partner, snippet

Blending - adding Google Book Search results into the general search results.
Most books are getting discovered quickly, especially partner books

Google Book Settlement
The agreement settles existing US lawsuits against Google Book Search
Agreement is btw. Google and a class including all US copyright holders (authors, publishers, etc.) for books that Google has scanned or intends to scan
Agreement only pertains to uses of these books in the US
includes intl books
does not include periodical/serials

Extensive process to notify rightholders of the agreement
Rightholders have choice as to their participation
* Opt out of the settlement
* Remove books from scanning
* Select desired access models
Google is authorized to scan, index & make non-display uses for all books
A set of access models is defined for providing access to the content of the books
For in-print books, the access models are off by default; for out-of-print books, they are on by default.
Book Rights Registry formed as an indy org to represent rightholders and to collect and distribute income (held for 5 years)

Access Models

Preview Uses (current model)

Online Consumer Access
- Enable user to buy online access to single work
- Default pricing set by Google using algorithmic pricing model to maximize revenue for each individual work

Institutional Subscription
FTE pricing for universities and other groups to buy for their users

Public Access Terminal
One free on-site terminal for all public and university libraries in US

Additional models:
Print-on-demand, etc.

Expanding Access to Knowledge
Anyone, anywhere in the US will be able to search, preview & purchase millions of out-of-print books through Book Search
From snippet view (3 snippets only, user must get physical book) to preview & purchase (institutional subscriptions & public access service license)
Fewer people are hunting down the physical books, which aren't easy to access

Additional Topics / Components of the Settlement
Book Rights Registry
- Claiming data
Research Corpus (text and book research methods)
Public Domain and Government Works
What happens if Google goes away
All the library partners have the right to get the scans of the books and find an alternative partner to make them available

What does a hit on the Book Rights Registry mean?
The registry is a superset of metadata from various sources, but doesn't mean that the book HAS been scanned or WILL be scanned.
There is some underclustering of various editions of the same work

If a rightsholder wants to opt out, does he/she/it have to cite every title & edition?

How does the Google Book Rights Registry compare with what the Copyright Office is doing?
BRR is an independent rights database, but it does not supersede/supplant the registration info held by the Copyright Office.

Can international users access books via the GBS?
Google wants to provide access to international users of international books

The settlement was fostered to help develop a comprehensive rights database that will also help other providers
Google wants to foster competition and is pro open-source, open-access, open-s
The BRR is non-exclusive

Problem of orphan works bigger with photos, pamphlets, ephemera than with books, but Google still supports orphan work legislation

What does "85% access" mean?
Failsafe provision: certain services that Google has to make available - search, Institutional Sub, free terminal, Find in Library
If less than 85% of the books are made accessible/available, then an alternative provider may step in and take

Who and what will decide what ends up available? How to deal with censorship?
Google has no intention to censor their material
Plans to make all scans, barring opt-out by rightsholders, available

Creative Commons?
Still figuring out a solution, but plans to recognize CC-licensed works for appropriate level of display/access

Government works?
Hasn't been fully displaying fed docs because of potential liability over copyrighted works excerpted in a gov doc
Settlement allows Google to treat fed docs as being in the public domain

Google also working on making works available in different languages and for those with "print disabilities"

What is the likelihood of libraries allowing other providers to scan works? Would those alternatives be covered by the Settlement?
There is lots of replication and there are other scanning initiatives. Google thinks this encourages more scanning initiatives and competition, but they would not be covered under the settlement

As material goes into the public domain, how will Google manage the change?
Google has already distributed the rights renewals registry for works between 1923 and 1960
Once material goes into the public domain, it's not part of the BRR ...
Still looking for a scalable method of finding more public-domain books

Google Book Search will become a monopoly because no one will be able to catch up (question from audience)
Google is investing in this and they believe that others should
They believe that this project is covered by fair use, and the settlement does not erode Google's position on fair use
There are other players in this,

What of privacy?
Privacy wasn't discussed in the Settlement because it didn't seem like the right conversation to hold with publishers and authors
The right way to do privacy is an agreement between Google and users
Libraries aren't rolling over on the privacy issue, even with the current agreements
Finding a balance btw. privacy, security and user features
Google is still designing the system
Not planning to have different privacy policies with different organizations - engineering nightmare

What will happen with user data?
No individual authentication for the Institutional Subscription

Library trust towards Google - are we partners or parts of the PR campaign?
Google is used to having their products embraced by so many users so quickly
Will be working on developing relationships with partners and user institutions

Details about the public access terminal?
It is a partial solution for access in public libraries.
1 terminal per building, minimum, per Settlement
The BRR may ultimately provide more
Will not be a dedicated terminal (i.e. only for GBS), but will be authenticated as the free access terminal

Pricing - what will keep the Institutional Subscription affordable?
Pricing needs to satisfy 2 objectives: 1) get revenue for rightsholder and 2) ensure broad distribution
Long tail product -- getting students to see and like the product, eventually become consumer purchasers
Google sees the brand being part of the bottom line, doesn't want to lose customer loyalty

Clancy: one of the things that's most exciting about this is the ability to expand the collections of libraries so that someone in Brownsville, TX can access some of the same material as a student at Harvard, Michigan, Stanford, etc. Hopes people will be encouraged to go to the primary source.

What is Internet Archive doing? Does the settlement affect what it's doing?
Most of what IA has scanned is public domain, but it has started scanning post-1923 orphan works. Google doesn't know what an orphan work is and doesn't want to create new ones.

What happens in 10-15 years if Google isn't doing as well?
The largest costs are the upfront/scanning costs, so the question in 10-15 years is whether Google will continue to provide access.

What happens with new works?
Will probably work with publishers via the partner program


This is helpful - I'm glad you went. I'm happy to see that they'll be able to treat fed documents as public domain. Thanks.
