Alexa opens up Web search database and API

Alexa Internet is offering online computing capacity for US$1 an hour -- and throwing in access to the database of billions of Web pages that lurk behind its Alexa toolbar search service.

Programmers who register for the beta version of Alexa Web Search Platform, released Tuesday, can use it to create specialized search engines for vertical markets, drawing results from the database of 4 billion Web pages crawled by Alexa, the company said. Alexa is a subsidiary of

Following in the footsteps of Google, Alexa is opening up the API (application programming interface) to parts of its search engine, but going one better by offering to host applications that build on its database -- for a fee. Programmers remixing Google's search utilities must organize their own application hosting.

Alexa Web Search Platform gives programmers a way to specify a subset of documents from the archive, develop an application to search those documents, and publish the results as an XML (Extensible Markup Language) feed or a specialized search engine.
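The three steps described above (pick a subset, search it, publish the results as XML) can be illustrated with a minimal sketch. It does not use Alexa's actual API, which the article does not describe in detail: the in-memory `documents` list stands in for the archive, and the function names, dictionary keys and XML element names are all hypothetical.

```python
import xml.etree.ElementTree as ET


def search(documents, term, content_type=None):
    """Pick a subset by content type, then keep documents whose text contains term.

    documents: list of dicts with hypothetical keys "url", "type" and "text".
    """
    subset = [d for d in documents
              if content_type is None or d["type"] == content_type]
    needle = term.lower()
    return [d for d in subset if needle in d["text"].lower()]


def to_xml_feed(results):
    """Serialize search results as a small XML document (made-up schema)."""
    root = ET.Element("results")
    for doc in results:
        item = ET.SubElement(root, "result", url=doc["url"], type=doc["type"])
        item.text = doc["text"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    documents = [
        {"url": "http://example.com/a", "type": "text/html", "text": "Camera reviews"},
        {"url": "http://example.com/b", "type": "audio/midi", "text": "camera song"},
        {"url": "http://example.com/c", "type": "text/html", "text": "Printers"},
    ]
    print(to_xml_feed(search(documents, "camera", content_type="text/html")))
```

A real application built on the platform would run the selection and search against Alexa's hosted data rather than a local list; only the overall shape of the workflow is shown here.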

The results returned can include simple text or HTML (Hypertext Markup Language) documents, or graphics, audio or video files.

As an example of how to use the service, Alexa has built a photo search engine at that allows visitors to refine their search for photographs according to technical details such as the size of the image, the make and model of camera it was taken with, and even the aperture setting used.

While the photo search engine shows how the platform can be used to build a live service, a one-off search of the database content can also be used to seed another service. That's how Rainer Typke, a researcher at the University of Utrecht in the Netherlands, used the platform to expand his searchable melody directory,

Typke used the platform to extract around 1,000 MIDI files from Alexa's database, converted them to a monophonic form and stored them on his own server to make them easier to search. Musipedia doesn't use Alexa for its live search service, Typke said in an e-mail response to questions.
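The article does not say how Typke reduced the files to a monophonic form. One common heuristic, often called the "skyline" approach, keeps only the highest note at each onset. The sketch below applies it to notes given as plain `(onset, duration, pitch)` tuples; that representation and the function name are assumptions made for illustration, and parsing of actual MIDI files is left out.

```python
def skyline(notes):
    """Reduce polyphonic notes to a single melodic line.

    notes: iterable of (onset, duration, pitch) tuples, where pitch is a
    MIDI note number. Keeps the highest pitch at each onset, then trims
    each kept note so it ends no later than the next kept note begins.
    """
    # Step 1: for every onset time, remember only the highest-pitched note.
    highest = {}
    for onset, duration, pitch in notes:
        if onset not in highest or pitch > highest[onset][1]:
            highest[onset] = (duration, pitch)

    # Step 2: walk the onsets in time order and remove overlaps.
    onsets = sorted(highest)
    melody = []
    for i, onset in enumerate(onsets):
        duration, pitch = highest[onset]
        if i + 1 < len(onsets):
            duration = min(duration, onsets[i + 1] - onset)
        melody.append((onset, duration, pitch))
    return melody


if __name__ == "__main__":
    chords = [(0, 2, 60), (0, 1, 64), (1, 1, 67), (2, 1, 55), (2, 1, 62)]
    print(skyline(chords))
```

The heuristic is cheap but crude: it assumes the melody is always the top voice, which is one reason a more elaborate melody-extraction algorithm could need the extra computing capacity.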

Using the Alexa computer cluster, Typke plans to identify hundreds of thousands of MIDI files in the database and process them using an algorithm that extracts their characteristic melody. Those melody files will be used to expand the Musipedia directory. Later, he hopes to be able to process files containing audio recordings in the same way.

"For the more computationally expensive preprocessing that would be required, especially by audio, Alexa's fast and large computers will come in handy," he said.

Alexa will charge for hosting applications that use the platform. The charges include US$1 per processor per hour for computing capacity, US$1 a year per gigabyte of storage, US$1 per 50 gigabytes of data processed by the system, US$1 per gigabyte of data transferred into or out of the system, and US$1 for every 4,000 search requests the system responds to from published search engines using the service.
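To get a feel for how these rates add up, the following sketch turns them into a simple estimator. The rates are the ones quoted above; the function and parameter names are made up for illustration, and the linear pro-rating of partial units is an assumption, since the article does not say how fractions of an hour or a gigabyte are billed.

```python
# Rates quoted in the article, all in US dollars.
CPU_HOUR_RATE = 1.0            # per processor per hour
STORAGE_GB_YEAR_RATE = 1.0     # per gigabyte stored per year
PROCESSED_GB_PER_DOLLAR = 50   # gigabytes processed for US$1
TRANSFER_GB_RATE = 1.0         # per gigabyte moved in or out
REQUESTS_PER_DOLLAR = 4000     # search requests answered for US$1


def estimate_cost(processors, hours, storage_gb, storage_years,
                  processed_gb, transferred_gb, search_requests):
    """Return an itemized cost estimate in US dollars.

    Assumes every charge scales linearly (no minimums, no rounding up
    to whole billing units), which the article does not confirm.
    """
    items = {
        "compute": processors * hours * CPU_HOUR_RATE,
        "storage": storage_gb * storage_years * STORAGE_GB_YEAR_RATE,
        "processing": processed_gb / PROCESSED_GB_PER_DOLLAR,
        "transfer": transferred_gb * TRANSFER_GB_RATE,
        "requests": search_requests / REQUESTS_PER_DOLLAR,
    }
    items["total"] = sum(items.values())
    return items


if __name__ == "__main__":
    # Hypothetical job: 2 processors for 3 hours, 10 GB kept for a year,
    # 100 GB processed, 5 GB transferred, 8,000 searches served.
    for name, dollars in estimate_cost(2, 3, 10, 1, 100, 5, 8000).items():
        print(f"{name}: US${dollars:.2f}")
```

For that hypothetical job the components come to US$6 for computing, US$10 for storage, US$2 for processing, US$5 for transfer and US$2 for search requests, or US$25 in total.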

Typke said he expects the pricing will "be okay for people like me." He has identified a number of ways to control the cost of his melody search, including updating the core data less frequently or restricting the search to a smaller subset of Alexa's total data.

"I still need to get a feeling for how much I can do with one hour of computing power," he said. "Getting the 1,000 files for the prototype took just minutes."

The API is designed for the C programming language. It can be used to build "Web services" which can be integrated into other systems or published through's Web services platform, Alexa said.

Peter Sayer

IDG News Service