Alexa opens up Web search database and API

Alexa Internet is offering online computing capacity for US$1 an hour -- and throwing in access to the database of millions of Web pages that lurk behind its Alexa toolbar search service.

Programmers who register for the beta version of Alexa Web Search Platform, released Tuesday, can use it to create specialized search engines for vertical markets, drawing results from the database of 4 billion Web pages crawled by Alexa, the company said. Alexa is a subsidiary of Amazon.com.

Following in the footsteps of Google, Alexa is opening up the API (application programming interface) to parts of its search engine, but going one better by offering to host applications that build on its database -- for a fee. Programmers remixing Google's search utilities must organize their own application hosting.

Alexa Web Search Platform gives programmers a way to specify a subset of documents from the archive, develop an application to search those documents, and publish the results as an XML (Extensible Markup Language) feed or a specialized search engine.

The results returned can include simple text or HTML (Hypertext Markup Language) documents, or graphics, audio or video files.
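The three steps described above (select a subset, search it, publish the results as XML) can be illustrated with a minimal sketch. It does not use Alexa's API, which the article does not detail; the in-memory `ARCHIVE`, its field names, and the functions `select_subset`, `search` and `to_xml_feed` are all hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a crawled archive; the real platform's data model is not described here.
ARCHIVE = [
    {"url": "http://example.com/jazz.html", "type": "text/html", "text": "A history of jazz piano"},
    {"url": "http://example.com/tune.mid", "type": "audio/midi", "text": ""},
    {"url": "http://example.org/rock.html", "type": "text/html", "text": "Rock guitar lessons"},
]

def select_subset(archive, content_type):
    """Step 1: keep only documents of one content type."""
    return [doc for doc in archive if doc["type"] == content_type]

def search(docs, term):
    """Step 2: case-insensitive substring match on the document text."""
    term = term.lower()
    return [doc for doc in docs if term in doc["text"].lower()]

def to_xml_feed(results):
    """Step 3: serialize the matches as a small XML document (element names are assumptions)."""
    root = ET.Element("results")
    for doc in results:
        item = ET.SubElement(root, "result")
        ET.SubElement(item, "url").text = doc["url"]
        ET.SubElement(item, "type").text = doc["type"]
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    html_docs = select_subset(ARCHIVE, "text/html")
    print(to_xml_feed(search(html_docs, "jazz")))
```

A hosted service would run the same kind of pipeline over billions of documents rather than a three-item list, which is where the rented computing capacity comes in.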

As an example of how to use the service, Alexa has built a photo search engine that allows visitors to refine their search for photographs according to technical details such as the size of the image, the make and model of camera it was taken with, and even the aperture setting used.
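Refining a photo search by technical details amounts to filtering on image metadata. The sketch below assumes each photo record already carries its dimensions, camera make and model, and aperture; the `Photo` class, the `refine` function and the sample records are invented for illustration and are not Alexa's implementation.

```python
from dataclasses import dataclass

@dataclass
class Photo:
    url: str
    width: int          # pixels
    height: int         # pixels
    camera_make: str
    camera_model: str
    aperture: float     # f-number, e.g. 2.8 for f/2.8

def refine(photos, min_width=0, min_height=0, camera_make=None, camera_model=None, aperture=None):
    """Return the photos that satisfy every criterion that was supplied."""
    results = []
    for p in photos:
        if p.width < min_width or p.height < min_height:
            continue
        if camera_make is not None and p.camera_make.lower() != camera_make.lower():
            continue
        if camera_model is not None and p.camera_model.lower() != camera_model.lower():
            continue
        if aperture is not None and abs(p.aperture - aperture) > 1e-9:
            continue
        results.append(p)
    return results

# Made-up sample records, for illustration only.
SAMPLE = [
    Photo("http://example.com/a.jpg", 1600, 1200, "Canon", "EOS 20D", 2.8),
    Photo("http://example.com/b.jpg", 640, 480, "Nikon", "D70", 5.6),
    Photo("http://example.com/c.jpg", 3008, 2000, "Nikon", "D70", 2.8),
]

if __name__ == "__main__":
    for photo in refine(SAMPLE, camera_make="nikon", aperture=2.8):
        print(photo.url)
```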

While the photo search engine shows how the platform can be used to build a live service, a one-off search of the database content can also be used to seed another service. That's how Rainer Typke, a researcher at the University of Utrecht in the Netherlands, used the platform to expand his searchable melody directory, Musipedia.

Typke used the platform to extract around 1,000 MIDI files from Alexa's database, converted them to a monophonic form and stored them on his own server to make them easier to search. Musipedia doesn't use Alexa for its live search service, Typke said in an e-mail response to questions.

Using the Alexa computer cluster, Typke plans to identify hundreds of thousands of MIDI files in the database and process them using an algorithm that extracts their characteristic melody. Those melody files will be used to expand the Musipedia directory. Later, he hopes to be able to process files containing audio recordings in the same way.

"For the more computationally expensive preprocessing that would be required, especially by audio, Alexa's fast and large computers will come in handy," he said.
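The article does not say which algorithm Typke uses to reduce a MIDI file to a single melody line. One simple, well-known heuristic for that task is the "skyline" method, which keeps only the highest-pitched note at each onset time. The sketch below shows that heuristic on notes represented as (onset, pitch) pairs; the representation and the function name `skyline` are assumptions, and parsing real MIDI files is left out.

```python
def skyline(notes):
    """Reduce polyphonic notes to one note per onset by keeping the highest pitch.

    notes: iterable of (onset, pitch) pairs, where pitch is a MIDI note number (0-127).
    Returns a list of (onset, pitch) pairs sorted by onset.
    """
    highest = {}
    for onset, pitch in notes:
        if onset not in highest or pitch > highest[onset]:
            highest[onset] = pitch
    return sorted(highest.items())

if __name__ == "__main__":
    # A C major chord, then a two-note chord, then a single high C.
    chords = [(0, 60), (0, 64), (0, 67), (1, 62), (1, 65), (2, 72)]
    print(skyline(chords))
```

The heuristic is cheap but crude: it ignores note durations and will pick an accompaniment note whenever that note is pitched above the tune.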

Alexa will charge for hosting applications that use the platform. The charges include US$1 per processor per hour for computing capacity, US$1 a year per gigabyte of storage, US$1 per 50 gigabytes of data processed by the system, US$1 per gigabyte of data transferred into or out of the system, and US$1 for every 4,000 search requests the system responds to from published search engines using the service.
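To see how those rates combine, here is a small estimator. The rates are the ones quoted above; the function name, parameter names and the sample workload are illustrative, and the sketch assumes simple linear pricing with no minimum fees or rounding, which the article does not confirm.

```python
# Rates as quoted in the article, in US dollars.
RATE_PER_CPU_HOUR = 1.0
RATE_PER_GB_STORED_PER_YEAR = 1.0
GB_PROCESSED_PER_DOLLAR = 50
RATE_PER_GB_TRANSFERRED = 1.0
REQUESTS_PER_DOLLAR = 4000

def estimate_cost(cpu_hours, gb_stored_years, gb_processed, gb_transferred, search_requests):
    """Return an estimated charge in US dollars, assuming purely linear pricing."""
    return (
        cpu_hours * RATE_PER_CPU_HOUR
        + gb_stored_years * RATE_PER_GB_STORED_PER_YEAR
        + gb_processed / GB_PROCESSED_PER_DOLLAR
        + gb_transferred * RATE_PER_GB_TRANSFERRED
        + search_requests / REQUESTS_PER_DOLLAR
    )

if __name__ == "__main__":
    # Hypothetical workload: 10 processor-hours, 5 GB stored for a year,
    # 200 GB processed, 2 GB transferred, 20,000 search requests.
    cost = estimate_cost(10, 5, 200, 2, 20000)
    print(f"Estimated charge: US${cost:.2f}")  # prints "Estimated charge: US$26.00"
```

In this made-up workload the processor time is the largest single item, which fits the levers Typke mentions: refresh the data less often, or process a smaller subset.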

Typke expects the pricing will "be okay for people like me," he said. He's identified a number of ways to control the cost of his melody search, including updating the core data less frequently, or restricting the search to a smaller subset of Alexa's total data.

"I still need to get a feeling for how much I can do with one hour of computing power," he said. "Getting the 1,000 files for the prototype took just minutes."

The API is designed for the C programming language. It can be used to build "Web services" that can be integrated into other systems or published through Amazon.com's Web services platform, Alexa said.

Peter Sayer

IDG News Service