Strategic value of XML

I've been remiss. I went to hear a talk given by Roy Tennant over a month ago, wrote copious notes ... and promptly forgot all about it.

Following is a not-at-all comprehensive summation, with a link to the slide presentation he used in tandem with his talk. This probably isn't new to many, but I haven't studied XML yet, so I hope to go back to this if and when I get a better sense of how XML can be and is used.

SLA San Andreas Chapter Meeting
January 21, 2004 at Exponent
Guest Speaker: Roy Tennant, California Digital Library

XML: The Strategic Opportunity

XML holds the same potential now that the Internet had 15 years ago
Focus of the talk: the good, useful things that libraries are doing (or should be doing) to solve problems (caveat: examples drawn from academic environments, not special libraries)

The 5-cent tour of XML:
* XML is a way to make up our own tags in an online metadata environment; the tags have an innate hierarchy and must be well-formed (opening and closing tags that nest properly, etc.)
* Valid XML is made up of specified tags with specified rules that match a schema (the set of rules for validation)
* XML is stricter than HTML -- bad HTML can still display content in a web browser (the browser will ignore the bad code), while bad XML will prevent any display of content.
* XSLT: stylesheets for XML
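
To make the "well-formed" point concrete, here is a minimal sketch in Python using only the standard library. The sample records and the function name is_well_formed are made up for illustration, and this only checks well-formedness; validating against a schema would need an additional library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Unlike a forgiving HTML browser, an XML parser refuses bad markup.
        return False


# Hypothetical sample records (not from the talk).
GOOD = "<book><title>Sample Title</title><author>Sample Author</author></book>"
BAD = "<book><title>Sample Title</author></title></book>"  # tags do not nest

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```

The second record fails because the closing tags are out of order, which is the strictness the third bullet describes.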

XML Challenges

  • Only librarians like to search, everyone else likes to find

  • Our users want more information about books

  • Our users want services tailored to their needs and desires

  • We must do more with less

  • Our bibliographic infrastructure is increasingly unable to get the job done

  • We must deal with a variety of metadata systems now

Users want more information about books -- via web services
SOAP & REST (protocols/standards for web services)
SOAP -- Simple Object Access Protocol: a lightweight way to exchange encoded information between applications
REST -- REpresentational State Transfer: a URL-based (HTTP GET) way of sending a request and receiving an XML-encoded response
Amazon and Google can be searched via web services
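
A rough sketch of what a REST-style lookup might look like: the request is just a URL with query parameters, and the reply is XML that gets parsed. The endpoint, parameter names and response layout below are all hypothetical (not Amazon's or Google's actual services), and the sample response is canned so nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://example.org/books/search"


def build_request_url(base: str, **params: str) -> str:
    """Encode the query parameters into a URL suitable for an HTTP GET."""
    return base + "?" + urlencode(sorted(params.items()))


# Made-up XML reply standing in for what such a service might return.
SAMPLE_RESPONSE = (
    "<results>"
    "<book><title>Example Title One</title></book>"
    "<book><title>Example Title Two</title></book>"
    "</results>"
)


def parse_titles(xml_text: str) -> list:
    """Pull the text of every <title> inside a <book> element."""
    root = ET.fromstring(xml_text)
    return [book.findtext("title") for book in root.iter("book")]


if __name__ == "__main__":
    print(build_request_url(BASE_URL, q="xml libraries", format="xml"))
    print(parse_titles(SAMPLE_RESPONSE))
```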

We must do more with less
Automated content draws people to new content
RSS feeds - RSS (Rich Site Summary/Really Simple Syndication): useful for current awareness
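
Since an RSS feed is itself XML, a current-awareness list can be pulled out of one with a few lines. This is only a sketch: the feed below is invented sample data in the common RSS 2.0 layout (channel, item, title, link), and read_items is a name chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample feed in RSS 2.0 layout.
SAMPLE_FEED = (
    '<rss version="2.0"><channel>'
    "<title>New Acquisitions</title>"
    "<item><title>First new item</title>"
    "<link>https://example.org/items/1</link></item>"
    "<item><title>Second new item</title>"
    "<link>https://example.org/items/2</link></item>"
    "</channel></rss>"
)


def read_items(feed_text: str) -> list:
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(feed_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_FEED):
        print(title, link)
```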

The current bibliographic infrastructure is not working
Foundation: MARC syntax, MARC elements and AACR2 application rules -- non-intuitive, not used outside of the library world
The fundamental question: Does it get the job done?
The answer: The job has changed --
* Former sole function -- inventory control; now -- resource discovery
* Multiple, diverse metadata streams
* Online delivery
* Multiple file formats
Major mission creep
Non-ILS metadata systems are spawning everywhere: electronic research databases, archival systems, institutional repositories

Can we do better? YES
Technological advances, cheap storage and different needs are all on our side

A new bibliographic infrastructure is needed:

  • Multiple bibliographic schemata: it won't all fit into MARC

  • A transfer schema: an XML schema for ingesting, storing and transferring bibliographic data, such as METS

  • Application rules

  • A review of best practices

  • Concentrating on enrichment services

  • Tools

  • Crosswalks
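
As I understand it, a crosswalk is a mapping from the fields of one metadata scheme to those of another. Here is a very simplified sketch that maps a handful of MARC fields to Dublin Core-style element names; the mapping table covers only four fields, the sample record is invented, and the output wrapper element is an assumption rather than any official format.

```python
import xml.etree.ElementTree as ET

# Tiny, simplified mapping of (MARC tag, subfield) to Dublin Core element names.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}

# Invented record, keyed by (tag, subfield) for simplicity.
SAMPLE_RECORD = {
    ("245", "a"): "Sample Title",
    ("100", "a"): "Author, Sample",
    ("260", "b"): "Sample Press",
    ("260", "c"): "2004",
    ("500", "a"): "A note with no mapping in this sketch",
}


def marc_to_dc(record: dict) -> ET.Element:
    """Build an XML element holding the mapped fields; skip unmapped ones."""
    root = ET.Element("record")  # hypothetical wrapper element
    for key, value in record.items():
        name = CROSSWALK.get(key)
        if name is None:
            continue  # field has no equivalent in this small table
        ET.SubElement(root, name).text = value
    return root


if __name__ == "__main__":
    print(ET.tostring(marc_to_dc(SAMPLE_RECORD), encoding="unicode"))
```

The unmapped note field simply drops out, which hints at why "it won't all fit" when moving between schemata.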

Is changing worth it?
We can encompass more information and do things for more people, but we need to recreate our bibliographic foundations to deal with different systems (e.g., DSpace, eScholarship, OAIster)


Thank you for posting this summary and the link to his PowerPoint presentation. I am planning to take XML through SJSU SLIS this fall because it is listed as one of the top 9 technology trends by the ALA LITA group.