Noematics REFLEXION Search Engine: Detailed functionalities
 Features of the indexing and administration interface
Supported document formats
  • HTML and ASP
  • PDF
  • Microsoft Office (Word, Excel, PowerPoint)
  • Text documents (ASCII)
  • MySQL databases: textual contents
  • On request, other relational databases such as Access, SQL Server, Oracle, etc.
  • Images of scanned documents (including automatic character recognition)
Indexing perimeter
The administrator defines the scope of indexing and its subdivisions: directories, document types and file names. The documents may reside on a single server or on several computers in a network (Intranet). Selected directories may be excluded.
Indexing activation: manual or automatic
Indexing may be started manually by the administrator or triggered automatically (scheduled tasks).
Indexing: total or incremental
Depending on the selected settings, indexing covers all files, only those that have changed since the last indexing run, or only chosen directories.
Deleted and new files are taken into account.
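As an illustration only, incremental indexing can be pictured as a comparison between two snapshots of the indexed area. The sketch below assumes each snapshot is a mapping from file path to last modification time; the function name and the data layout are hypothetical and do not describe the actual Noematics implementation.

```python
def plan_incremental_index(previous, current):
    """Compare two {path: modification_time} snapshots (hypothetical layout).

    Returns the files to index for the first time, the files to re-index,
    and the files to drop from the index.
    """
    new = sorted(p for p in current if p not in previous)
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    deleted = sorted(p for p in previous if p not in current)
    return {"new": new, "changed": changed, "deleted": deleted}


# Snapshot taken at the last indexing run, and snapshot taken now.
previous = {"a.html": 100, "b.pdf": 100, "c.doc": 100}
current = {"a.html": 100, "b.pdf": 250, "d.txt": 300}

plan = plan_incremental_index(previous, current)
print(plan)
```

A total indexing run would simply treat every file in the current snapshot as new.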
Automatic language detection
The Noematics indexing engine automatically detects the language used in a file.
New words
The indexing engine flags every word not found in the linguistic databases and stores it in a database specific to the indexed work at that point in time. Such words are mostly names, neologisms or technical terms.
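A minimal sketch of the idea, assuming the linguistic database can be represented as a plain set of known word forms (the function name and sample data are hypothetical):

```python
def collect_new_words(words, known_forms):
    """Return the words absent from the known forms, in first-seen order."""
    new_words = []
    for word in words:
        key = word.lower()
        if key not in known_forms and key not in new_words:
            new_words.append(key)
    return new_words


known_forms = {"the", "horse", "won"}  # hypothetical miniature database
found = collect_new_words(
    ["The", "horse", "Noematics", "won", "noematics", "NTLM"], known_forms
)
print(found)
```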
Index administration
The administration interface gives the site or Intranet administrator a user-friendly tool offering a range of choices for indexing and search administration.

Settings: selection of the file formats to index, the automatic indexing options, and the display characteristics of the results pages (zone and font colors, font size, highlighting colors), etc.

Statistics of requests: the administrator can display a detailed report of the requests submitted to the search engine: date and time, questions, search options, number of matching responses.

Statistics on words: the administration interface can display a report of words sorted by their number of occurrences in the indexed work, for each language.
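For illustration, a per-language report of this kind can be sketched with a simple counter. The input layout (a mapping from language code to the list of indexed words) is an assumption made for the example, not the engine's storage format.

```python
from collections import Counter


def word_report(words_by_language):
    """Map each language to (word, occurrences) pairs, most frequent first."""
    return {
        lang: Counter(words).most_common()
        for lang, words in words_by_language.items()
    }


report = word_report({
    "en": ["horse", "race", "horse"],
    "fr": ["cheval", "course", "cheval", "cheval"],
})
print(report["en"])
print(report["fr"])
```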

 Search features and functionalities
Full text search
The Noematics engine searches the whole indexed work, as defined by the administrator before indexing. All words are indexed.
Access control and security
The Noematics search engine supports basic authentication and integrated authentication (NTLM). This ensures that users see in their search results only the documents they have the right to access.
Several methods of controlling user access to search results are available.
Number of user keywords
Unlimited
All the forms of words: inflected forms

The term "inflected forms" is used by linguists to designate the grammatical forms of a word, i.e. masculine/feminine (plus neuter in some languages), singular/plural, and verb conjugations.

When the "inflected forms" option is active (as it is by default), a Noematics search yields the same results whichever form of a given keyword the user enters. For example, a request with the word horse gives the same complete set of results as one with horses; likewise for ox and oxen.

This functionality applies to nouns, adjectives and verbs. It is active by default, but the user may disable it.
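The behaviour can be sketched as a keyword expansion step. The tiny form-to-family table below is a hypothetical stand-in for a linguistic database, and the function name is invented for the example.

```python
# Hypothetical miniature linguistic database: word form -> word family.
FORM_TO_FAMILY = {
    "horse": "horse", "horses": "horse",
    "ox": "ox", "oxen": "ox",
}


def expand_keyword(word, inflected_forms=True):
    """Return the set of forms to look up for one user keyword."""
    word = word.lower()
    if not inflected_forms or word not in FORM_TO_FAMILY:
        return {word}
    family = FORM_TO_FAMILY[word]
    return {form for form, fam in FORM_TO_FAMILY.items() if fam == family}


print(sorted(expand_keyword("horse")))
print(sorted(expand_keyword("oxen", inflected_forms=False)))
```

With the option active, horse and horses expand to the same set of forms, which is why both requests return the same results.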

Boolean operations (summary)
The user-friendly interface spares Noematics users the chore of manipulating Boolean operations, a difficult task that is often a source of errors. The engine itself translates the user request into the proper Boolean operators (AND, NOT, OR). See also: Boolean operations (for connoisseurs).
Nearness / distance between words
Among the documents containing the keywords entered by the user, this option retains only those in which the first word and the last word are separated by at most n words (the so-called banal words are counted; see below).

The idea is that the nearness of words in a document is a good clue to the document's relevance to the set of concepts represented by the user's keywords.
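One possible reading of this rule, given purely as a sketch: a document is kept if some choice of one occurrence per keyword has at most n words between its earliest and latest occurrence, every word counting. This interpretation and the function name are assumptions, and the brute-force search is chosen for clarity rather than speed.

```python
from itertools import product


def within_distance(tokens, keywords, n):
    """True if all keywords occur with at most n words between the
    earliest and the latest of the chosen occurrences."""
    if not keywords:
        return True
    positions = []
    for keyword in keywords:
        hits = [i for i, token in enumerate(tokens) if token == keyword]
        if not hits:
            return False  # a keyword is missing from the document
        positions.append(hits)
    # Try every combination of one occurrence per keyword.
    return any(
        max(combo) - min(combo) - 1 <= n for combo in product(*positions)
    )


doc = "the horse won the last race of the season".split()
print(within_distance(doc, ["horse", "race"], 3))
print(within_distance(doc, ["horse", "race"], 2))
```

In the sample sentence, three words (won, the, last) stand between horse and race, so the document passes with n = 3 and fails with n = 2.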
Expressions and phrases
This option retrieves only the documents containing the exact phrase entered by the user.
Excluded words
This option removes from the result list all documents containing one of the excluded words (or one of its forms if "inflected forms" is active).
Language handling
The Noematics engine presently handles French and English. The French linguistic database contains some 290,000 forms stemming from more than 62,000 families of words (nouns, adjectives, verbs, and other grammatical categories). The English linguistic database contains some 148,000 forms stemming from more than 80,000 families of words.
Databases for other European languages may be developed by Noematics on request.
"Banal" or "noise" words Banal words are usually pronouns, conjunctions, etc, which do not explicitly designate objects, concepts or actions: they are indexed by the Noematics indexing engine, but skipped from the search request. For each language, Noematics has a list of about 300 banal words, that the administrator is free to customize by adding, editing or removing.
Special and accented characters
The Noematics search engine takes words exactly as the user enters them, including special and accented characters. It also lets the user omit the accents and still obtain, in most cases, identical results.
"Ligature" characters For French language and some Latin words in English: these special characters are made by tying two vowels : æ, Æ, œ and Œ. They may appear in documents and are taken into account during indexing and stored as two vowels. The user may enter them as 2 vowels or as a "ligature" (if he/she can type it on his/her keyboard): the results of a request will be the same.
Mixing characters
The user may enter any combination of printable characters in a request.

The wild card '*' may be used to replace a sequence of characters, but only at the end of a word.
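A trailing wild card amounts to a prefix match against the indexed words. In this sketch, a '*' anywhere other than the end is simply treated as an ordinary character; that choice is an assumption, since the text above does not say how such input is handled.

```python
def match_keyword(pattern, indexed_words):
    """Match one keyword; a trailing '*' stands for any sequence of characters."""
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return sorted(w for w in indexed_words if w.startswith(prefix))
    # No trailing wild card: exact match only.
    return sorted(w for w in indexed_words if w == pattern)


words = {"horse", "horses", "horseman", "house"}
print(match_keyword("horse*", words))
print(match_keyword("horse", words))
```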

Word breakers (delimiters)
The Noematics search engine treats as a word any string of characters bounded on the left and right by a "word breaker" (delimiter) or by the beginning or end of a document. Noematics uses more than 50 word-breaker characters, of which only 10 to 15 are in common use.
It can thus index texts with varied punctuation, from the most common to the rarest.
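Splitting on a configurable set of delimiters can be sketched as follows. The delimiter string is a hypothetical subset chosen for the example, not the actual list of more than 50 characters.

```python
import re

# Hypothetical subset of word-breaker characters.
WORD_BREAKERS = " \t\n.,;:!?()[]{}\"'/"


def split_words(text, breakers=WORD_BREAKERS):
    """Split text into words at any run of word-breaker characters."""
    pattern = "[" + re.escape(breakers) + "]+"
    # The start and end of the text also bound a word; empty pieces are dropped.
    return [word for word in re.split(pattern, text) if word]


print(split_words("horse, race; jockey... (2007)"))
```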
Fast search through numeric coding of words
The Noematics engine uses a word-coding system that allows for excellent response times. Moreover, this system underlies the linguistic effectiveness of the engine. The coding is transparent to the user.
Ambiguity reduction
Noematics resolves some of the ambiguities between words during indexing, by prioritizing the grammatical categories most frequently used in queries.
Exhaustiveness
The Noematics search engine is exhaustive on any given work, thanks to its full text indexing, its extensive linguistic databases, and its treatment of new words. If a request with a keyword gives no result, this simply means that the word is not present in the work.
Boolean operations (for connoisseurs)
Every Boolean operation is defined with reference to the "element document" defined at indexing time (usually a file) and to the property "contains a given word", so it is often a complicated matter for anyone unfamiliar with formal logic. The words you enter in the keyword text area are linked together by the operator AND. For example: you wish to get the list of all documents that each contain horse AND race.
The inflected forms of each of these words, however, are linked by the operator INCLUSIVE OR. For example: you request the documents each containing (horse OR horses OR both forms) AND (race OR races OR both forms). The words you enter in the "exclusion" text box are linked together by the operator INCLUSIVE OR with respect to the element document. For example: exclude documents each containing butcher OR butchers OR slaughterhouse OR slaughterhouses OR any combination of these forms.
In the end, the same request may be formulated as follows:
"Find the set of documents each containing (horse OR horses) AND (race OR races), minus the set of documents containing (butcher OR butchers OR slaughterhouse OR slaughterhouses)."
This may yield different Boolean expressions depending on the selection method you choose.
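The formulation above maps naturally onto set operations: OR is a union, AND an intersection, and exclusion a set difference. The inverted index below is a hypothetical toy with made-up document numbers, and the function names are invented for the example.

```python
# Hypothetical inverted index: word form -> ids of documents containing it.
INDEX = {
    "horse": {1, 2}, "horses": {3, 4},
    "race": {1, 3}, "races": {2, 5},
    "butcher": {3}, "slaughterhouses": {6},
}


def docs_with_any(forms):
    """INCLUSIVE OR: documents containing at least one of the forms."""
    result = set()
    for form in forms:
        result |= INDEX.get(form, set())
    return result


def search(keyword_groups, excluded_forms):
    """AND across keyword groups, minus documents with an excluded form."""
    groups = [docs_with_any(group) for group in keyword_groups]
    matches = set.intersection(*groups) if groups else set()
    return matches - docs_with_any(excluded_forms)


result = search(
    [["horse", "horses"], ["race", "races"]],
    ["butcher", "butchers", "slaughterhouse", "slaughterhouses"],
)
print(sorted(result))
```

Documents 1, 2 and 3 contain a form of both keywords; document 3 also contains butcher and is therefore removed.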
Display of results
The Noematics search engine first presents the user with the list of documents matching the search criteria (keywords) and options. The layout of this list and the selection of the displayed elements are entirely customizable through the administration module.
Ranking
The Noematics engine sorts the list of documents matching the user's query by decreasing relevance.
For each document, the calculated relevance index (or rank) takes into account several factors, in particular:
  • the relative weight of each document within the indexed corpus and relative to its language;
  • the number of occurrences of the keywords in the document and in the corpus.
Highlighting of found search words
When the user clicks one of the listed documents, the selected document is displayed. The keywords present in the document are highlighted in color. This functionality applies to HTML, text and ASP documents.

© Noematics 2004-2007