Many document formats supported |
HTML, MHT and ASP
PDF
Microsoft Office (Word, Excel, PowerPoint)
Text (ASCII)
MySQL databases (textual fields); other databases on request
Scanned documents (with automatic optical character recognition)
|
| The domain to index |
The Administrator defines the domain to index as a list of shared network directories, usually spread across several servers in an intranet or extranet.
All subdirectories of each listed entry are indexed, but individual subdirectories can be excluded via a "blacklist".
The Administrator may also select or exclude document formats before an indexing session.
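As an illustration only, the selection rules above (listed directories, a blacklist of subdirectories, and a format filter) can be sketched in Python. The function name `collect_documents` and its parameters are hypothetical and are not part of REFLEXION; formats are approximated here by file extension.

```python
import os

def collect_documents(roots, blacklist=(), extensions=None):
    """Return sorted file paths under `roots`, skipping blacklisted subdirectories.

    Hypothetical helper: `extensions` is a set such as {".pdf", ".txt"},
    or None to accept every format.
    """
    excluded = {os.path.normpath(p) for p in blacklist}
    allowed = None if extensions is None else {e.lower() for e in extensions}
    found = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune blacklisted subdirectories in place so os.walk never enters them.
            dirnames[:] = [
                d for d in dirnames
                if os.path.normpath(os.path.join(dirpath, d)) not in excluded
            ]
            for name in filenames:
                ext = os.path.splitext(name)[1].lower()
                if allowed is None or ext in allowed:
                    found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Pruning `dirnames` in place excludes a blacklisted directory together with everything beneath it, which matches the "all subdirectories unless excluded" rule.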
|
| "Manual" or "planned" indexing sessions |
Indexing sessions may be started "manually" by the Administrator through the administration interface, but they are usually planned to start automatically according to a given schedule.
|
| Incremental indexing |
Every indexing session refers to the previous one in order to detect new, modified or deleted documents. Only new and modified documents are indexed: this is incremental indexing.
After the installation of REFLEXION, or after a re-initialization of the index, a complete indexing occurs.
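The comparison between two sessions can be sketched as follows. This is a minimal illustration, not REFLEXION's actual mechanism: it assumes each session records a snapshot mapping every document path to a fingerprint (for example a modification time and size), and the names `diff_snapshots` and `plan_session` are hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two {path: fingerprint} snapshots (assumed data model)."""
    new = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return new, modified, deleted

def plan_session(previous, current):
    """Decide what to index and what to drop from the index."""
    new, modified, deleted = diff_snapshots(previous, current)
    # Only new and modified documents are (re)indexed; deleted ones are removed.
    return {"index": sorted(new + modified), "remove": deleted}

previous = {"a.pdf": (100, 10), "b.doc": (200, 20), "c.txt": (300, 30)}
current = {"a.pdf": (100, 10), "b.doc": (250, 22), "d.html": (400, 40)}
plan = plan_session(previous, current)
```

With an empty `previous` snapshot, as after installation or a re-initialization, every document counts as new, so the same logic produces a complete indexing.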
|
| Automatic language detection |
The REFLEXION indexing engine detects the dominant language of each document before processing it.
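How REFLEXION detects the language is not described here. One simple and common approach, shown purely as an assumption, is to count how many very frequent words ("stop words") of each language occur in the text. The tiny word lists and the function `dominant_language` below are hypothetical.

```python
import re

# Deliberately tiny, illustrative stop-word lists (assumption, not REFLEXION data).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in"},
    "fr": {"le", "la", "les", "et", "des", "est"},
}

def dominant_language(text, stopwords=STOPWORDS):
    """Return the language whose stop words occur most often, or None."""
    # Extract runs of letters only (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(w in vocab for w in words)
        for lang, vocab in stopwords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A production detector would need far larger word lists or character n-gram statistics; the sketch only shows the idea of scoring a document against each language.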
|
| New or technical words |
When the indexing engine cannot find a word in any of the language dictionaries (or in any existing custom dictionary), it stores the word in a special "custom" dictionary, which is continuously updated. Such words are mostly proper nouns, new words, and specialized or technical terms.
This allows fully customized processing of the enterprise's document set.
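The lookup rule described above can be sketched in a few lines. The data structures (sets of lowercase words) and the function `update_custom_dictionary` are assumptions made for illustration; the real dictionaries are certainly richer.

```python
def update_custom_dictionary(words, language_dictionaries, custom_dictionary):
    """Add words unknown to every dictionary to `custom_dictionary`.

    `language_dictionaries` maps a language code to a set of known words;
    `custom_dictionary` is a set that is updated in place.
    Returns the list of newly added words, in order of appearance.
    """
    added = []
    for word in words:
        key = word.lower()
        known = key in custom_dictionary or any(
            key in vocab for vocab in language_dictionaries.values()
        )
        if not known:
            custom_dictionary.add(key)
            added.append(key)
    return added

language_dictionaries = {"en": {"the", "engine", "indexes"}, "fr": {"le", "moteur"}}
custom = {"intranet"}
added = update_custom_dictionary(
    ["The", "REFLEXION", "engine", "indexes", "intranet", "Reflexion"],
    language_dictionaries,
    custom,
)
```

Because the custom dictionary is updated as words are met, a term such as a product name is recorded once and recognized in every later document.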
|
| Administering indexes |
The administration module:
REFLEXION gives the website or intranet Administrator a set of tools and a palette of settings for administering and monitoring indexing and search activity.
Settings: selection of document formats to index, definition of the automatic indexing schedules, detailed settings of the indexing process, etc.
Index statistics: the Administrator may display the history of indexing sessions with their main characteristics (number of documents, index size, etc.).
Query statistics: history of user queries (date and time, content of the query, number of results, etc.).
Word statistics: a sorted list of the most frequently used words in the indexed document set, by language.
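As a rough sketch of what per-language word statistics involve, the following counts word frequencies separately for each language. The input shape (pairs of language code and word list) and the function `word_statistics` are hypothetical, chosen only to keep the example self-contained.

```python
from collections import Counter, defaultdict

def word_statistics(documents, top_n=10):
    """Return {language: [(word, count), ...]} with the most used words first.

    `documents` is an iterable of (language, list_of_words) pairs (assumed shape).
    """
    counts = defaultdict(Counter)
    for language, words in documents:
        counts[language].update(w.lower() for w in words)
    return {
        # Sort by descending count, then alphabetically for a stable order.
        language: sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        for language, c in counts.items()
    }

stats = word_statistics([
    ("en", ["index", "search", "index"]),
    ("fr", ["recherche"]),
    ("en", ["search", "Index"]),
])
```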
|