Enterprise search is much like air and water: Users expect it to be available without a second thought. Google and ISYS continue to perfect their enterprise offerings to do just that.
Both the Google Search Appliance 4.6.4 and ISYS:web 8 produce quality Web search results, and both vendors offer desktop components to index local hard-disk content. The battle lines are drawn in how each crawls and federates content from enterprise systems -- portals, databases, legacy systems, and external Web services -- and the value enterprises receive.
Google Search Appliance 4.6.4
Much has improved since I first reviewed the GB-1001 2U server in October 2004, especially how deeply the system now reaches into relational databases and file shares. Furthermore, you can push non-Web-accessible information from portals and other internal systems to the appliance by employing code based on Google Enterprise APIs.
I found the new OneBox for Enterprise most intriguing. This set of APIs enables users to securely access business applications, such as CRM or BI systems, from the Google search box -- and have this information presented separately from public search results.
Dell now manufactures the Google Search Appliance; my test system was a re-badged PowerEdge 2950 with two dual-core Intel Xeon 5140 (2.33GHz) processors and 16GB of RAM. The system is still physically locked, and initial setup remains plug-and-play.
I connected the box to my network and temporarily plugged in my laptop to perform the initial configuration using a Web form. In about an hour, I was using my remote PC to create search collections, manage crawls, and customize search layout pages.
The Admin Console UI remains a collection of basic Web pages and forms accessed from a straightforward navigation tree. I had no trouble entering URLs to index and specifying continuous crawling to ensure that new content would be found right away and be included in search results.
I also set up KeyMatches, to give preference to specific results for common queries; Query Expansion, to broaden a query to include words with the same meaning; and Synonym lists. Changing the basic look of the search box and results was quick; more extensive changes didn't take much longer using the XSLT style-sheet editor.
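To make the KeyMatch and Query Expansion ideas concrete, here is a minimal Python sketch of the general technique. It is not the appliance's implementation or configuration format; the names `SYNONYMS`, `KEYMATCHES`, `expand_query`, and `keymatch`, along with the sample terms and URL, are hypothetical.

```python
# Hypothetical data: these terms and the URL are illustrative only,
# not taken from the appliance or its configuration files.
SYNONYMS = {
    "laptop": ["notebook"],
    "holiday": ["vacation", "leave"],
}
KEYMATCHES = {
    "holiday": "http://intranet.example.com/hr/leave-policy",
}


def expand_query(query):
    """Return one list per query term: the term followed by its synonyms."""
    expanded = []
    for term in query.lower().split():
        expanded.append([term] + SYNONYMS.get(term, []))
    return expanded


def keymatch(query):
    """Return promoted URLs for every query term that has a KeyMatch entry."""
    return [KEYMATCHES[t] for t in query.lower().split() if t in KEYMATCHES]


if __name__ == "__main__":
    print(expand_query("holiday laptop"))
    print(keymatch("Holiday schedule"))
```

A search front end built this way would OR together the terms in each inner list and show any KeyMatch URLs above the ordinary results.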
Crawling structured content in database systems follows much the same formula. I easily completed a form with the connection information for a Microsoft SQL Server 2000 database and designated the database rows and fields to crawl.
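The sketch below shows, in rough terms, what designating rows and fields to crawl amounts to: each selected row becomes a small document for the indexer. It is an illustration under stated assumptions, not how the appliance works internally. An in-memory SQLite database stands in for SQL Server, and the `products` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def rows_to_documents(conn, table, key_field, text_fields):
    """Turn each row of a table into a dict that an indexer could consume."""
    columns = ", ".join([key_field] + text_fields)
    # Table and column names are assumed to come from trusted admin
    # configuration, not from end-user input.
    cursor = conn.execute(
        f"SELECT {columns} FROM {table} ORDER BY {key_field}"
    )
    documents = []
    for row in cursor:
        key, *values = row
        documents.append({
            "id": f"{table}/{key}",
            # Skip NULL fields so they do not appear as the text "None".
            "text": " ".join(str(v) for v in values if v is not None),
        })
    return documents


def demo_connection():
    """Build an in-memory database with a hypothetical 'products' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (sku TEXT, name TEXT, notes TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("A1", "Rack server", "2U chassis"), ("B2", "Laptop", None)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for doc in rows_to_documents(conn, "products", "sku", ["name", "notes"]):
        print(doc)
```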
Because enterprise use of Microsoft SharePoint is so prevalent, I put indexing of WSS (Windows SharePoint Services) or SPS (SharePoint Portal Server) sites on my requirements list. Google currently handles this with an open source SharePoint Connector. For now, this is only sample code, and it takes a bit of configuring to make it work. Google representatives said they plan to release a new API and Connector framework in the first quarter of 2007. The new framework will build the SharePoint connector into the appliance's software and enable easier crawling of Documentum, OpenText, and other enterprise document repositories.
The Google Search Appliance provides a solid range of security and access control, omitting documents from search results if users aren't entitled to see them. The system indexes both public and restricted information -- and enforces document-level security policies at search time. Google also serves secure results with X.509 client certificates, a common requirement in government agencies.
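Enforcing document-level security at search time can be pictured as a filter applied to the result list just before it is shown. The following Python sketch illustrates that idea only; it assumes a simple group-based model, and the `Document` class, `allowed_groups` field, and `visible_results` function are hypothetical rather than part of any Google API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    title: str
    # Groups entitled to see the document; an empty set means public.
    allowed_groups: set = field(default_factory=set)


def visible_results(results, user_groups):
    """Keep public documents and those that share a group with the user."""
    user_groups = set(user_groups)
    return [
        doc for doc in results
        if not doc.allowed_groups or doc.allowed_groups & user_groups
    ]


if __name__ == "__main__":
    index = [
        Document("http://example.com/news", "Company news"),
        Document("http://example.com/payroll", "Payroll", {"hr"}),
        Document("http://example.com/roadmap", "Roadmap", {"eng", "exec"}),
    ]
    for doc in visible_results(index, ["eng"]):
        print(doc.title)
```

Because the check runs per query, a change to a user's entitlements takes effect on the next search without re-indexing.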
Search results were consistently top-quality. At the simplest level, I searched information protected by basic HTTP authentication, and I integrated the appliance with Lotus Notes to crawl a Lotus Domino server. The best pages were shown first, with similar results grouped into one cluster. New conveniences include number and date ranges that users can specify to narrow down results.
I also examined several third-party OneBox for Enterprise solutions, which are quickly loaded through the appliance's admin interface. The OneBox technology creates a trigger that determines whether the search is relevant to a OneBox module, such as finding customer information within your Salesforce.com account. Google then passes appropriate security credentials to the provider, gets the results in XML, transforms the data into HTML based on an XSL template, and presents the results to the user in line with their other search results.
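The OneBox flow described above (trigger, credentialed call to a provider, XML response, transformation to HTML) can be sketched roughly as follows. This is an illustration only, not the OneBox API: the trigger pattern, the XML element names, and every function name are hypothetical, a local function stands in for the remote provider, and plain Python string formatting stands in for the XSL template.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical trigger: queries of the form "customer <name>".
TRIGGER = re.compile(r"^customer\s+(?P<name>.+)$", re.IGNORECASE)


def match_trigger(query):
    """Return the customer name if the query fires the trigger, else None."""
    m = TRIGGER.match(query.strip())
    return m.group("name") if m else None


def fake_provider(name, credentials):
    """Stand-in for the external provider; returns its answer as XML."""
    if credentials.get("user") is None:
        raise PermissionError("no credentials supplied")
    root = ET.Element("results")
    item = ET.SubElement(root, "customer")
    ET.SubElement(item, "name").text = name
    ET.SubElement(item, "owner").text = credentials["user"]
    return ET.tostring(root, encoding="unicode")


def render_html(xml_text):
    """Turn provider XML into an HTML list (used here instead of XSL)."""
    root = ET.fromstring(xml_text)
    rows = []
    for customer in root.findall("customer"):
        rows.append("<li>{} (owner: {})</li>".format(
            escape(customer.findtext("name", "")),
            escape(customer.findtext("owner", "")),
        ))
    return "<ul>" + "".join(rows) + "</ul>"


def onebox(query, credentials):
    """Run the whole flow; return None when the trigger does not fire."""
    name = match_trigger(query)
    if name is None:
        return None
    return render_html(fake_provider(name, credentials))


if __name__ == "__main__":
    print(onebox("customer Acme & Co", {"user": "pat"}))
    print(onebox("weather today", {"user": "pat"}))
```

Queries that do not fire the trigger return `None`, so the ordinary search results would be shown on their own.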
This type of mashup is one of the more important developments in enterprise search. Users get relevant information from document management systems, Oracle purchase requisitions, SAS reports, and others within the featured area of the search results -- all without any special steps.