Network Computing

Technology Business Applications
REVIEW
Panning for Gold

  September 18, 2003
  By Sean Doherty


In this article:
Introduction
CSIRO Panoptic Enterprise Search Engine 4.2.0
Kanisa Site Search 5.0
Mondosoft MondoSearch 5.1
dtSearch Web 6.20
Executive Summary | Web Links
How We Tested
Report Card

Content is king, as valuable as gold. Enterprises create it, save it, license it and sell it. Wherever possible, they reuse data and even refresh it. But what happens when an important bit of data is misplaced? It takes time and money to search for, re-create or reacquire that information. Increasingly, companies are turning to enterprise search engines to index and manage content and avoid the costs associated with recovering lost data.

However, finding the money in the IT budget to buy a search engine can be tough. Search engines don't create content, so they may be perceived as a low priority. But at many sites, current content can be reused and even regenerated into profit. If you look at a search engine as an ongoing maintenance cost for content, you may just get the funds you need.

Perhaps the biggest advantage of using an enterprise search engine is that a single product can find many of your documents, even when they are stored in multiple formats. As long as data can be displayed as text in a browser, it can be indexed and searched using an enterprise search tool.

These search engines also let you re-energize other systems. Processors driving file systems won't spend needless cycles looking for files or content in files. Databases won't have to crunch as many queries, and legacy systems will gain a new lease on life because they're not spending an inordinate amount of time in the search cycle. Better yet, you won't have to train your employees in SQL.


Search-engine software has two components: an indexer and the actual search engine. Indexers retrieve content, extract words and index them for fast retrieval. Engines interpret queries and locate words, concepts or phrases relevant to the question in the index, then format the output in HTML or XML and send it to the user or device that initiated the question.
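To make the split between indexer and engine concrete, here is a minimal Python sketch of an inverted index. It is an illustration only, not how any of the reviewed products work; the function names and the sample documents are hypothetical, and it skips the HTML or XML output formatting a real engine would add.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexer: map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Engine: return the IDs of documents that contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample content, invented for this sketch.
docs = {
    "page1": "Enterprise search engines index content.",
    "page2": "Databases answer SQL queries.",
    "page3": "Search engines interpret queries.",
}
index = build_index(docs)
print(sorted(search(index, "search engines")))
```

The index is built once, so each query becomes a few set lookups and intersections rather than a scan of every document, which is the "fast retrieval" the indexer exists to provide.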

We went looking for enterprise-class search engines--those that work behind a firewall or secure VPN. The vendors had to supply search-engine software or an appliance that supported it. We did not want it bundled with portal software or content-management software. Our contestants also had to be able to search both structured data in databases and unstructured data on Web servers and file stores. And we required support for a variety of document formats, including those produced by word processors, presentation software and graphics editors.

We required indexers to retrieve content from secure Web pages (HTTPS) and standard HTTP servers and file systems, and to remove duplicate pages. We also required them to extract words from HTML, XML, Microsoft Office and PDF documents, and index the content. Finally, they had to support ODBC or JDBC (Java Database Connectivity) connectors or gateways.
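One simple way an indexer might remove duplicate pages is to fingerprint each page's normalized text and keep only the first URL seen for each fingerprint. The Python sketch below assumes exact-duplicate detection after whitespace and case normalization; the reviewed products may use different or more tolerant techniques, and the URLs and helper names are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def remove_duplicates(pages):
    """Keep the first URL seen for each distinct content fingerprint."""
    seen = set()
    unique = {}
    for url, text in pages.items():
        fp = fingerprint(text)
        if fp not in seen:
            seen.add(fp)
            unique[url] = text
    return unique


# Hypothetical crawl results: the same page reached over HTTP and HTTPS.
pages = {
    "http://www.example.com/a": "Panning  for Gold",
    "https://www.example.com/a": "panning for gold",
    "http://www.example.com/b": "How We Tested",
}
unique_pages = remove_duplicates(pages)
print(list(unique_pages))
```

A hash comparison only catches pages whose text is identical after normalization; pages that differ by a date stamp or an ad block would need a near-duplicate method instead.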

As for the search engines, we asked that they include a spellchecker and support for phrase searching and stemming (grammatical variations) in addition to keyword searching. We also required a prebuilt search form or user interface to test the indexers and search engines.
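The three query features we asked for can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions: the suffix-stripping stemmer is far simpler than production stemmers, the spelling suggestion relies on the standard library's `difflib.get_close_matches`, and every function name and sample string is hypothetical.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word):
    """Crude stemming: strip one common suffix, keeping at least 3 letters."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def contains_phrase(text, phrase):
    """Phrase search: True if the phrase's words appear consecutively, in order."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )


def suggest(word, vocabulary):
    """Spellcheck: return the closest vocabulary word, or None if none is close."""
    matches = difflib.get_close_matches(word, vocabulary, n=1)
    return matches[0] if matches else None


# Hypothetical sample data for the sketch.
text = "Enterprise search engines index content"
vocabulary = ["search", "index", "engine"]

print(stem("searching"), stem("searched"), stem("searches"))
print(contains_phrase(text, "search engines"))
print(suggest("serch", vocabulary))
```

Stemming lets a query for "searching" match documents that say "searched", phrase searching insists on word order and adjacency, and the spellchecker offers a nearby indexed term when the query word is not in the vocabulary.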

We sent invitations to 11 vendors. Four stepped up to the table: CSIRO (Commonwealth Scientific and Industrial Research Organisation), Kanisa, Mondosoft and dtSearch Corp. Each sent software products to our Syracuse University Real-World Labs®.

The companies that dropped out, declined or just didn't qualify ran the gamut from small to large. Copernic Technologies didn't qualify because its product doesn't support ODBC or JDBC. Autonomy Corp. and EasyAsk declined to participate but gave no reason. Convera, Dieselpoint and Fast Search & Transfer each declined, saying it was working on a new version of its software. Both Verity and Google declined to participate on the basis of company policy, though Verity was changing its policy as this article went to press.



[Figure: Navigational Searching]

As for our four contestants, we tested their ability to satisfy navigational searches by using Network Computing's production Web site (www.nwc.com), which contains almost 35,000 pages (see "How We Tested"). We also tested indexing and searching capabilities using informational searches taken directly from the log files on www.nwc.com. Three of the four products we tested performed above average. Only dtSearch came in under par.

We judged the search engines on their ability to retrieve content using an indexer, also called a spider or crawler. We put a heavy emphasis on the search process, assessing how much control the administrator could exert over it as well as the overall performance in navigational searches. We also looked at each vendor's management console and how it handled installation, configuration and customization of the search engine. And we considered log files and reporting capabilities. Prices were compared across the board.

Panoptic Enterprise Search Engine won our Editor's Choice award. Its secure and easy-to-use administrative interface, navigational deftness and indexing prowess put it on top.

