Google: 129 million different books have been published

Google estimates about 129 million books have been published, and it plans to digitize them all

For those who have ever wondered how many different books are out there in the world, Google has an answer for you: 129,864,880, according to Leonid Taycher, a Google software engineer who works on the Google Books project.

Estimating the number of books in the world is more than an exercise in curiosity for the search giant: It also provides a roadmap of some of the work still left to be done in meeting the company's ambitious goal of organizing all the world's information.

"When you are part of a company that is trying to digitize all the books in the world, the first question you often get is: 'Just how many books are out there?'," Taycher explained in a blog post announcing the estimate.

To come up with a reasonable approximation, the company started by ingesting book information from multiple cataloging systems, such as the International Standard Book Numbers (ISBN).

Such catalogues, while helpful, do not provide a definitive count. For instance, ISBNs have only been assigned to books since the 1960s, and tend to be used only in Western countries.

Also, multiple books have been assigned to individual ISBNs, and publishers have assigned ISBNs to items other than books, such as T-shirts and DVDs.

So Google engineers wrote programs to comb through about 150 such catalogues and directories and eliminate as many duplicate entries as could be found.
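To illustrate the general idea of merging catalogues and dropping duplicates, here is a minimal Python sketch. It is not Google's method, which the blog post does not spell out here. The record fields (`title`, `author`, `year`, `binding`) and the helper names are hypothetical, chosen only for the example, and the binding is part of the key so that different editions stay separate.

```python
import re
import unicodedata


def normalize(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def record_key(record):
    """Build a comparison key from hypothetical catalogue fields."""
    return (
        normalize(record["title"]),
        normalize(record["author"]),
        record.get("year"),
        normalize(record.get("binding", "")),
    )


def deduplicate(catalogues):
    """Merge several catalogues, keeping the first record seen for each key."""
    seen = {}
    for catalogue in catalogues:
        for record in catalogue:
            seen.setdefault(record_key(record), record)
    return list(seen.values())


catalogue_a = [
    {"title": "Hamlet", "author": "William Shakespeare", "year": 1992, "binding": "paperback"},
    {"title": "Hamlet", "author": "William Shakespeare", "year": 1992, "binding": "hardcover"},
]
catalogue_b = [
    {"title": " HAMLET ", "author": "William  Shakespeare", "year": 1992, "binding": "Paperback"},
]

unique_books = deduplicate([catalogue_a, catalogue_b])
print(len(unique_books))
```

In this toy data the entry in the second catalogue collapses into the paperback record from the first, while the hardcover edition stays separate. Real catalogue records are far messier, so a production system would need fuzzier matching than an exact key comparison.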

The company also had to make a number of tough decisions about what is and isn't a book, Taycher explained.

For instance, soft cover and hard cover editions of a text are counted as two books, as are the many different versions of a popular text, such as Shakespeare's "Hamlet," due to the forewords and commentaries they may contain. Serials may count as individual books or as a collected work.

As of June, the company had scanned 12 million books, according to a presentation given by Google Books engineering manager Jon Orwant at the USENIX Annual Technical Conference in Boston. These books were written in about 480 languages (including three books in the Star Trek-originated Klingon language).

The company plans to complete the scanning of existing books within a decade. The resulting virtual collection will consist of four billion pages and two trillion words, Orwant said.

About 20 percent of the world's books are in the public domain, Orwant explained. About 10 to 15 percent of all books are in print. The remaining books -- the vast majority of all titles -- are still under copyright but out of print. Google is in the process of borrowing copies of these books from about 40 large libraries worldwide in order to digitize them.
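The breakdown above can be worked through as simple arithmetic. The Python sketch below assumes that both percentages refer to all published books, and it applies Orwant's shares to Taycher's estimate of 129,864,880 titles. The article does not itself combine these figures, so treat the result as a rough illustration only.

```python
TOTAL_BOOKS = 129_864_880      # Taycher's estimate of all published books
PUBLIC_DOMAIN_PCT = 20         # Orwant's figure for public-domain books
IN_PRINT_PCT_RANGE = (10, 15)  # Orwant's range for books still in print


def remaining_pct(public_domain_pct, in_print_pct):
    """Share of books that are in copyright but out of print, in percent."""
    return 100 - public_domain_pct - in_print_pct


def count_for(total, pct):
    """Number of titles for a whole-number percentage share (integer math)."""
    return total * pct // 100


# The largest in-print share gives the smallest remainder, and vice versa.
low_pct = remaining_pct(PUBLIC_DOMAIN_PCT, IN_PRINT_PCT_RANGE[1])
high_pct = remaining_pct(PUBLIC_DOMAIN_PCT, IN_PRINT_PCT_RANGE[0])

low_count = count_for(TOTAL_BOOKS, low_pct)
high_count = count_for(TOTAL_BOOKS, high_pct)

print(f"{low_pct}-{high_pct} percent, about {low_count:,} to {high_count:,} titles")
```

Under these assumptions, 65 to 70 percent of all titles, or roughly 84 million to 91 million books, would be out of print but still under copyright.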

It's this act of scanning in books that are out-of-print but still covered by copyright that has been met with some resistance by the publishing industry.

The company is now waiting for a judgement from the U.S. District Court for the Southern District of New York on whether it can scan these books.

In 2005, the Authors Guild and the Association of American Publishers separately filed class-action lawsuits against the search giant, asserting that the company is infringing on author copyrights by scanning in the books.

Google has claimed it wants to sell digital copies of these otherwise out-of-print books, and set aside royalties for the authors to claim. The company also hopes to reveal snippets of these books in Web searches, and claims this use falls under the U.S. Fair Use doctrine.

Scanning in all the world's books will lead to other benefits in addition to improving searches, Orwant explained. Once all these volumes are digitized, their contents can be subjected to analysis, which can lead to new insights. Linguists can discover when certain words came into widespread use, or who first started using these words.

The Google Book Search could also help answer some outstanding historical questions: For instance, it could inform the debate over whether Isaac Newton and Gottfried Leibniz -- or someone else entirely -- invented calculus.

"We can search not just for a phrase but for a concept," Orwant explained. "We can take all the different ways [that the idea of] infinity can be inflected, translate that into different languages, and do a search in parallel."

"My hope is that as we start to expose a lot more of this collection, it will allow people to ask questions like this that they haven't been able to ask before," he said.

IDG News Service editor Juan Carlos Perez contributed to this report.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson.
