search engine that provides state-of-the-art text search and a rich structured query language for text collections of up to 50 million documents (single machine) or 500 million documents (distributed search).

Description

Features

Powerful Query Interface

  • Supports popular structured query operators from INQUERY
  • Suffix-based wildcard term matching
  • Field retrieval
  • Passage retrieval
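
As a rough illustration of what a structured query looks like, the sketch below assembles query strings in Python. The operator spellings (`#combine`, `#odN`, and the `term.field` restriction) are assumptions based on the INQUERY-style syntax commonly documented for this family of engines; the helper names are hypothetical and are not part of any shipped API. Check the engine's own query language reference before relying on them.

```python
# Hypothetical helpers that build INQUERY-style structured query strings.
# The operator syntax is an assumption, not taken from this listing.

def combine(*parts: str) -> str:
    """Wrap terms or sub-queries in a belief-combining operator."""
    return "#combine( " + " ".join(parts) + " )"


def ordered_window(width: int, *terms: str) -> str:
    """Require the terms to appear in order within a window of the given width."""
    return f"#od{width}( " + " ".join(terms) + " )"


def in_field(term: str, field: str) -> str:
    """Restrict a term to occurrences inside a named field."""
    return f"{term}.{field}"


if __name__ == "__main__":
    query = combine(ordered_window(1, "white", "house"), in_field("president", "title"))
    print(query)
```

Keeping query construction in small functions like these makes it easier to swap in the exact operator names once they have been confirmed against the documentation.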

Flexible Indexing and Document Support

  • Supports UTF-8 encoded text
  • Language-independent tokenization of UTF-8 encoded documents
  • Parses PDF, HTML, XML, and TREC documents
  • Word and PowerPoint parsing (Windows only)
  • Text Annotations
  • Document Metadata

Package Versatility

  • Open source, with a flexible BSD-inspired license
  • Includes both command line tools and a Java user interface
  • API can be used from Java, PHP, or C++
  • Works on Windows, Linux, Solaris and Mac OS X

Scalability and Efficiency

  • Best-in-class ad hoc retrieval performance
  • Can be used on a cluster of machines for faster indexing and retrieval
  • Scales to terabyte-sized collections

