Google upgrades Search Appliance
03 June, 2004 10:17
Google has upgraded its Search Appliance, improving the capacity and performance of the device, which combines hardware and software to provide, in a single box, the search functionality employed by the Google.com Web site.
Introduced in early 2002, the appliance is aimed at companies, educational institutions and government agencies that want to make their sites searchable using Google technology. Thus, a university might buy a Google Search Appliance to provide search capabilities for its student and employee intranets, as well as its public Web sites.
The appliance has been revamped to index more documents, do so more intelligently and perform more queries per minute, said Dave Girouard, Google's enterprise unit general manager. The new version of the product also features improved security and allows for collections of indexed documents to be partitioned with more flexibility and granularity, he said.
"We launched this product quietly in 2002 and it has grown nicely and become a successful business for Google," he said. "This is our first major new upgrade of the product."
The feedback Jupiter Research has gotten from customers about the first version of the product has been mixed, said David Schatsky, a Jupiter Research analyst. "What dissatisfaction there is probably comes from mismatched expectations between what clients need from search and what the Google appliance was able to deliver in its first version," he said.
Schatsky, who hadn't yet been briefed on the new version by Google, said, after hearing a quick rundown of its key enhancements, that the company seems to have added features the market requires. "It sounds like in this upgrade Google is moving forward and has focused on key areas for improvement that early adopters have cited as needing development," Schatsky said.
Providing good search functionality for their Web sites is complicated for many organizations, he said. "The difficulties are related to various elements, such as technology, operational processes and user understanding. As a result, many companies feel they don't have the internal competencies to make search an effective tool and are thus attracted by the brand name and reputation of Google in a box."
In terms of performance, the new version can index as many as 1.5 million documents, five times as many as the first version, and execute 300 queries per minute, also a fivefold improvement, Girouard said.
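The fivefold figures imply first-version limits that the article does not state directly. The short Python sketch below only works through that arithmetic; the variable names are illustrative, and the derived first-version numbers are inferences from the quoted ratios, not figures supplied by Google.

```python
# Figures quoted in the article for the new version.
NEW_MAX_DOCUMENTS = 1_500_000
NEW_QUERIES_PER_MINUTE = 300
IMPROVEMENT_FACTOR = 5  # "five times as many" / "fivefold"

# Implied first-version figures (derived here, not stated in the article).
old_max_documents = NEW_MAX_DOCUMENTS // IMPROVEMENT_FACTOR
old_queries_per_minute = NEW_QUERIES_PER_MINUTE // IMPROVEMENT_FACTOR

# The same query rate expressed per second.
new_queries_per_second = NEW_QUERIES_PER_MINUTE / 60

print(old_max_documents)        # 300000
print(old_queries_per_minute)   # 60
print(new_queries_per_second)   # 5.0
```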
The new version also features more intelligent and efficient document crawling. The first version crawled documents in batch fashion, meaning it would scan and index the entire collection of documents every time the administrator scheduled a refresh. The new version only scans and indexes documents that have changed since the last crawl, an improvement that speeds up the process and reduces consumption of bandwidth and processing power, Google said.
In addition, administrators no longer have to schedule updates, because the new version crawls the collection continuously, so changes are indexed more promptly. With the first version, the Search Appliance would be configured to run a batch update once a day or once every two days, so a change could go unindexed until the next update ran; the new version detects changes soon after they're made, Girouard said.
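The difference between the two crawling approaches can be illustrated with a minimal Python sketch. The article does not say how the appliance detects changes, so the content hashing used here is an assumption made for illustration, and the function names (`fingerprint`, `batch_crawl`, `incremental_crawl`) are hypothetical. Note that this sketch saves only indexing work: it still reads every document, so the bandwidth savings Google describes would need some other mechanism that the article does not detail.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a digest that changes whenever the document content changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def batch_crawl(documents: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Re-index every document, as the first version is described as doing."""
    index = {url: fingerprint(body) for url, body in documents.items()}
    return index, sorted(documents)


def incremental_crawl(documents: dict[str, str], index: dict[str, str]) -> list[str]:
    """Re-index only new or changed documents; return the URLs re-indexed."""
    reindexed = []
    for url, body in documents.items():
        digest = fingerprint(body)
        if index.get(url) != digest:
            index[url] = digest
            reindexed.append(url)
    # Drop index entries for documents that no longer exist.
    for url in list(index):
        if url not in documents:
            del index[url]
    return sorted(reindexed)


site = {"/a": "hello", "/b": "world"}
index, touched = batch_crawl(site)        # touches both documents

site["/b"] = "world, updated"             # one document changes
site["/c"] = "new page"                   # one document is added
changed = incremental_crawl(site, index)  # touches only /b and /c
print(changed)
```

In a continuous setup, a loop would call something like `incremental_crawl` repeatedly rather than waiting for a scheduled refresh.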
Google also enhanced the product's security by improving its ability to prevent users from viewing documents they're not authorized to access, he said. After executing a query, the upgraded product rounds up all the documents that contain the keywords and then filters those documents based on the user who made the query, showing only the documents that the user has permission to view, he said.
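The two-step process described above (match on keywords first, then filter by the querying user's permissions) can be sketched in Python as follows. This is an illustration under stated assumptions, not Google's implementation: the `Document` type, the per-document `allowed_users` set, and the rule that a document must contain every keyword are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    url: str
    text: str
    allowed_users: frozenset[str]  # hypothetical per-document access list


def search(documents: list[Document], query: str, user: str) -> list[str]:
    keywords = query.lower().split()
    # Step 1: round up every document that contains the keywords
    # (assumed here to mean all of them).
    matches = [
        doc for doc in documents
        if all(keyword in doc.text.lower() for keyword in keywords)
    ]
    # Step 2: keep only the documents this user has permission to view.
    return [doc.url for doc in matches if user in doc.allowed_users]


docs = [
    Document("/hr/salaries", "salary bands for 2004", frozenset({"alice"})),
    Document("/hr/handbook", "salary review policy", frozenset({"alice", "bob"})),
    Document("/eng/wiki", "build instructions", frozenset({"bob"})),
]

print(search(docs, "salary", "alice"))
print(search(docs, "salary", "bob"))
```

Filtering after matching means the same index can serve every user, at the cost of checking permissions on each query.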
Another new feature is the ability to create different collections of documents, whereas the first version allowed only for the creation of one collection of documents, he said. Thus, with the new version, a company might create a collection of searchable documents for its sales and marketing employees, a different one for its call center employees, and so on.
A related new feature is the product's ability to support different user interfaces for a single collection. Thus, the administrator might set up a user interface for the sales and marketing employees that is different from the user interface for the call center employees, while having both sets of users access the same collection of documents, he said.
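The relationship between collections and user interfaces can be modeled with a small Python sketch. The article does not describe how the appliance defines either one, so the URL-prefix membership rule and the `Collection` and `FrontEnd` types are hypothetical, chosen only to show several collections existing side by side and two interfaces sharing one collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    name: str
    url_prefixes: tuple[str, ...]  # hypothetical membership rule

    def contains(self, url: str) -> bool:
        # str.startswith accepts a tuple of candidate prefixes.
        return url.startswith(self.url_prefixes)


@dataclass(frozen=True)
class FrontEnd:
    name: str
    collection: Collection
    page_title: str


# Separate collections for different groups of employees.
sales_docs = Collection("sales_marketing", ("/sales/", "/marketing/"))
support_docs = Collection("call_center", ("/support/",))

# Two different interfaces that search the same collection.
sales_ui = FrontEnd("sales_ui", support_docs, "Sales Search")
call_center_ui = FrontEnd("call_center_ui", support_docs, "Call Center Search")

print(sales_docs.contains("/sales/q2.html"))
print(sales_ui.collection is call_center_ui.collection)
```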
The new version of the Search Appliance is twice as tall as the first version because it has more powerful hardware, which in turn generates more heat and requires more space for cooling, Girouard said. That means it is 2U (3.5 inches) high, and 19 inches wide. Google doesn't reveal which vendor makes the appliance's hardware. "It's commodity hardware -- the same general hardware we use in our Google data centers," he said.
A basic installation of the Search Appliance can be completed in as little as 30 minutes, allowing an IS department to have it up and running in a matter of hours, he said. Installations that involve deeper customization will take longer to complete.
The Search Appliance can crawl and index documents in more than 250 file formats, as long as the documents are accessible via HTTP (Hypertext Transfer Protocol), he said. It supports 28 languages, and search formats such as natural language, keywords and Boolean, he said.
It delivers cached page results, allows for document sorting by date, lets users search within results and features a self-learning spell checker that suggests alternate spellings for queries. For administrators, the product generates usage reports and crawl analysis.
The product is sold as a stand-alone device under the model number GB 1001. A GB 1001 with a capacity of 150,000 documents starts at US$32,000, while one with the maximum capacity of 1.5 million documents costs US$175,000, he said. The price includes two years of customer support, he said. The new version of the GB 1001 is available now.
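For comparison, the two quoted price points can be expressed as a cost per indexable document. The Python sketch below uses only the figures in the article; the per-document breakdown is a derived illustration, not a pricing model published by Google.

```python
# Price points quoted in the article: (document capacity, price in US$).
entry_capacity, entry_price = 150_000, 32_000
max_capacity, max_price = 1_500_000, 175_000

entry_cost_per_doc = entry_price / entry_capacity
max_cost_per_doc = max_price / max_capacity

print(f"Entry configuration: ${entry_cost_per_doc:.3f} per document")
print(f"Maximum configuration: ${max_cost_per_doc:.3f} per document")
```

On these figures, the per-document cost at maximum capacity is a little over half that of the entry configuration.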
The Search Appliance is also sold in pre-configured stacks of multiple GB 1001s. The GB 5005 is a stack of five devices, while the GB 8008 is a stack of 12 devices. (In the first version, the GB 8008 was a stack of eight devices.) Google pre-configures these stacked devices to work together, he said.
At press time, Google didn't have statistics for the performance improvements that the stacked products of the new version will offer, he said. Prices for the stacked products are determined by the number of documents they're able to index, he said.
The upgraded Google Search Appliance is a 2U, Intel architecture server running Linux. (The earlier version was 1U.)