Some libraries close books to Google, Microsoft

Indexing by competing search engines remains an obstacle

Some libraries are choosing to pay to have their content digitized by the Open Content Alliance rather than having it scanned for free by Google or Microsoft, which refuse to allow access to the materials by rival search engines.

The Boston Library Consortium (BLC) is teaming with the Open Content Alliance (OCA) to build a library of digital materials that will be freely available via the Internet.

The BLC is composed of 19 academic and research libraries in Massachusetts, Connecticut, New Hampshire and Rhode Island. The consortium is digitizing all its content published before 1923. Content published before that date is considered to be in the public domain and is not subject to copyright restrictions.

The cost for digitizing is US$0.10 per page, and the BLC is funding the effort at a cost of US$845,000 over two years. The work is also being supplemented by the OCA, which received a US$2 million grant from the Alfred P. Sloan Foundation. Part of that grant will be used to digitize the John Adams Collection at the Boston Public Library, a member of the consortium.

The OCA was developed by the Internet Archive and search company Yahoo in early 2005 as a way to preserve a variety of content, such as digitized collections and multimedia. Yahoo doesn't have a stand-alone book-search service.

At issue is access to the digitized material. Search companies such as Google and Microsoft will scan the books for free, but want to restrict access for competitive reasons. The consortium wants its books to be available to anyone, through any search engine.

BLC Executive Director Barbara Preece said her organization selected the OCA because it kept the content search-engine neutral.

The OCA allows "you to hold onto your content and do whatever you want to do to your content, and it can be searched by any search engine whatsoever," Preece said. "OCA was the best way for us to go to keep our content open. Google pretty much decides who you can share your content with. With OCA, it doesn't matter what search engine you use to search the material. Google and Microsoft are interested in search, and the OCA is more interested in content and helping libraries handle their content the way they want to."

Google spokesman Gabriel Stricker said the company designed its Book Search to promote the sharing and use of the content the company is digitizing, where appropriate. He said for books in the public domain, Google provides full access to the material, including the ability to read a book in its entirety, download a PDF to a computer and print a work for free. He said there are restrictions for books still under copyright to ensure that copyright holders are protected.

"The libraries we work with receive copies of all the digital files that they can use to serve their students, faculties and partners," Stricker said in an e-mail. He added that libraries are also free to work with other organizations to digitize their content. Stricker did not directly respond to concerns that Google refuses to allow the material it digitizes to be available through other search engines.

Jay Girotto, group program manager for Microsoft's Live Book Search, said his company has been involved with the OCA since October 2005.

"Microsoft put in much more than US$2 million to fund the creation of a mass digitization program that could actually work," Girotto said. "We digitized about 100,000 books under the OCA principles, and we were hoping there would be other significant financial contributors." However, that didn't happen, he said.

"We saw many people in the library community willing to adopt Google's more restrictive stance around book search and sign up with Google, and we were faced with a decision about what to do," he said. "We were essentially providing most of the capital that was building out [the program], but there were really no restrictions on Google taking the output of the process -- the image file, the [optical character recognition] file and the metadata -- and simply having the same use to it that Microsoft had."

Girotto said Microsoft decided last November to put one restriction on the use of the material it was digitizing: the material couldn't be used by its commercial competitors, including Google, Yahoo and Ask.com. But Microsoft still doesn't restrict the distribution of copies of the books it digitizes among institutions for academic use, he said, whereas Google does.

Linda Rosencrance

Computerworld
