-----------------------------------------------------------
Basic Concepts
Most of the following concepts come from the product overview.
| 
Concept | 
Description | 
| 
Document set description, indexes | |
| 
Applies algorithms or business rule-based ranking to the results | |
| 
Data Flow Overview | |
| 
Module Overview | 
A general overview of all ESP modules | 
| 
Basic Concepts | 
The concepts used in ESP | 
| 
Content | 
Data that has not yet been submitted to the FAST ESP system | 
| 
Document | 
Processed, searchable content is called a document | 
| 
Collections | 
Documents are grouped into collections. Each collection can have its own way of being processed and indexed (Index Profile). Setting a priority for each collection also determines the order in which documents are processed. | 
| 
Search Profile | 
Defines what to search and how queries and results should be processed and displayed | 
| 
Document and Document Element | 
Each piece of content is converted into a document. Each property of the content is converted into a document element. | 
| 
Index Schedule, Profile | 
The FAST Search Engine maps a document's elements to fields. Fields are the document elements that are defined to be searchable, and they are defined in the index profile. Multiple fields may be grouped into composite fields, allowing a query to be executed on several fields at the same time. | 
| 
Enterprise Crawler | 
Use the Enterprise Crawler to access content on web sites | 
| 
File Traverser | 
The file traverser scans specified directories on file servers. | 
| 
Pushing Content to Search Engine Using Content API | 
Use the Content API directly to push content to the Search Engine. | 
| 
Query Side | 
Three ways to submit queries: Search API, HTTP-based Query Interface, FAST Web Service Interface | 
| 
Content Interface | 
Integration of applications via C++, Java, .NET | 
| 
Search Interface | |
| 
Document Processing Interface | 
Allows inclusion of customer-defined document processors | 
| 
Query/Result Processing Interface | 
Provides an interface for dynamic linking of custom query and result processors | 
| 
Administration Interface | 
Supports API integration for system administration and collection configuration | 
| 
Security Integration | 
The Security Access Module provides document-level security capabilities for integration with your content and portal infrastructure | 
| 
SDKs | 
The ESP Content SDK, Search SDK, and Application SDK provide various interfacing capabilities. | 
| 
Web Service Interface | 
Web services are a collection of standards and protocols that allow computers to communicate across the internet using XML and the ubiquitous HTTP protocol | 
| 
Document processing is defined per collection | |
| 
Document Processing Engine, Pipeline, Stage | 
One search engine contains multiple pipelines, but a collection can have only one pipeline. A pipeline contains multiple stages, and each stage performs a particular document processing task. A stage takes one or more document elements as input and outputs new or modified elements, which may be further processed. | 
| 
Entity Extraction | 
Entity extraction means detecting, extracting, and normalizing entities from documents | 
| 
Extract other entities | 
Two ways to extract other entities: 
Using the Admin UI to specify an additional extractor 
Via a regular expression document processor | 
| 
Search Engine Clusters | 
Search Engine instances are grouped into search engine clusters. A search engine cluster is a group of Search Engine instances that share the same index schema, which is provided by an index profile. | 
| 
Search Columns and Rows | 
The indexed documents are partitioned across search columns to scale data volume, and every search engine instance within a search column stores the same set of indexed documents. A search row contains one instance from each column. When a query is sent to a cluster, it is sent to all search engine instances within one search row, so adding rows scales the query rate. | 
| 
Index profile | 
An index profile is an XML-based configuration file. It is an index schema that defines how documents are made searchable. It specifies search properties such as: 
Which document elements are to become searchable fields 
Which document elements are to become fields that are returned as part of a result 
How to calculate values that are used for sorting and ranking | 
| 
The relationship between Document Processing, Indexing and Search Engine Clusters | |
| 
Index Profile Structure | |
| 
Scope Search | 
Used for: 
Indexing customer XML content without any knowledge of the DTD/Schema. 
Indexing a more dynamic field structure using the Scope Search framework. | 
| 
Relevancy, Data mining | |
| 
Linguistic processing | |
| 
Sorting | |
| 
Rank value calculations | |
| 
Query context analysis | |
| 
Navigation | |
| 
Contextual Insight | |
| 
Ranking Concept | |
| 
Quality | |
| 
Freshness Boosting | |
| 
WebAnalyzer | 
The WebAnalyzer is a FAST ESP module that uses links between documents to improve search relevancy | 
| 
Tools to modify rank for individual documents | 
Two tools for modifying rank: 
Search Business Center 
Boost Bulk Tool | 
| 
3 boost mechanisms | 
Absolute Query Boost: Specifies an absolute ranking position for a document for a specified query, or excludes the document from the results for that query. 
Relative Query Boost: Ensures a document is always displayed within the first N results (N being a configured number) for a specified query. 
Relative Document Boost: Ensures a document is always displayed within the first N results (N being a configured number), whatever query the user submits. | 
| 
Proximity Ranking and Matching | 
The term proximity denotes the degree to which a query and a document match, based on the distance between the query terms within the document. 
Two types of proximity: 
Explicit Proximity 
Implicit Proximity | 
| 
Field Collapsing | 
Two kinds of field collapsing: 
Field collapsing that removes collapsed documents 
Field collapsing that does not remove collapsed documents (default) | 
| 
Boundary Matching | |
| 
Duplicate Removal | 
Different ways of detecting and removing duplicate documents: 
Crawler Duplicate Removal (the FAST Crawler) 
Dynamic (Result-Side) Duplicate Removal (may be used to detect and remove duplicates across collections, and also enables a more flexible definition of perceived duplicates) 
Field Collapsing | 
| 
GEO Search Overview | 
The Geo Search feature provides capabilities for filtering, sorting and boosting query results based on geographical location. | 
| 
Query Modifications | 
Query processing is configured globally. There are three ways to modify a query in FAST ESP: 
As an automatic rewrite of the query before execution against the index 
As a suggested rewrite, typically presented as a search tip on the result page 
A combination of the two above: the query is first executed in its original form. If there are no hits, the query is automatically resubmitted using the automatic rewrite option, and the new result is presented to the user | 
| 
Query Resubmission | 
Resubmission is set per query and controls whether the query switches to a suggested transformation of the user's query. There are three kinds of query transformation: 
Modify: The query is automatically modified. The modified query is executed and its result set is returned 
Conditional Modify: The query is automatically modified only if the executed query returns no hits 
Suggest: The query is never modified, but a suggested transformed query is returned together with the result set. | 
| 
FAST Query Language (FQL) | |
| 
Navigator | 
Navigators provide functionality for drilling down into the query results based on the value distribution of one or more individual fields. | 
| 
Field Navigator | |
| 
Deep Navigator | |
| 
Shallow Navigator | |
| 
Scope Navigator | |
| 
Contextual Navigator | |
| 
Field Navigators for Values in Scope Fields | |
| 
Taxonomy | |
| 
FAST Classifier | |
| 
Unsupervised Clustering | 
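
The three query transformation kinds listed under Query Resubmission can be sketched in a few lines of Python. This is only an illustration of the decision logic, not ESP code: `run_query`, `SearchResponse`, `toy_search`, and `toy_rewrite` are hypothetical names made up for the example, and the "index" is a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SearchResponse:
    executed_query: str               # the query whose hits are returned
    hits: List[str]
    suggestion: Optional[str] = None  # suggested rewrite, if any


def run_query(
    query: str,
    mode: str,
    search: Callable[[str], List[str]],
    rewrite: Callable[[str], str],
) -> SearchResponse:
    """Apply one of the three resubmission modes.

    `search` and `rewrite` are hypothetical stand-ins for the index lookup
    and the query transformer; they are not ESP APIs.
    """
    rewritten = rewrite(query)
    if mode == "modify":
        # Always execute the modified query and return its result set.
        return SearchResponse(rewritten, search(rewritten))
    if mode == "conditional_modify":
        # Execute the original query; resubmit only when it has no hits.
        hits = search(query)
        if hits or rewritten == query:
            return SearchResponse(query, hits)
        return SearchResponse(rewritten, search(rewritten))
    if mode == "suggest":
        # Never modify; return the rewrite alongside the original result set.
        suggestion = rewritten if rewritten != query else None
        return SearchResponse(query, search(query), suggestion)
    raise ValueError(f"unknown resubmission mode: {mode}")


# Toy stand-ins so the sketch can be run end to end.
TOY_INDEX = {"enterprise search": ["doc1", "doc2"]}


def toy_search(query: str) -> List[str]:
    return TOY_INDEX.get(query, [])


def toy_rewrite(query: str) -> str:
    return query.replace("serach", "search")


if __name__ == "__main__":
    for mode in ("modify", "conditional_modify", "suggest"):
        print(mode, run_query("enterprise serach", mode, toy_search, toy_rewrite))
```

With the misspelled query, "modify" and "conditional_modify" both end up returning the hits for the rewritten query, while "suggest" returns an empty hit list plus the rewritten query as a search tip.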
Data Flow
In this section, I'll talk about the data flow in two parts. The first part covers how ESP crawls data.
The second part covers how ESP handles a user search.
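
As a rough illustration of the search side, here is a minimal sketch of how a cluster laid out in search columns and rows might store documents and answer a query. It assumes that columns partition the documents and that each row holds a replica of every column. The `SearchCluster` class, the hashing by document id, and the round-robin choice of row are all assumptions made for the example, not ESP behaviour.

```python
import zlib
from typing import Dict, List


class SearchCluster:
    """Toy model of a cluster with search columns and search rows."""

    def __init__(self, columns: int, rows: int) -> None:
        self.columns = columns
        self.rows = rows
        # nodes[row][column] maps document id -> text for one instance.
        self.nodes: List[List[Dict[str, str]]] = [
            [{} for _ in range(columns)] for _ in range(rows)
        ]
        self._next_row = 0

    def index(self, doc_id: str, text: str) -> None:
        # Pick one column for the document (scales data volume) ...
        column = zlib.crc32(doc_id.encode("utf-8")) % self.columns
        # ... and store it in that column of every row (replicas).
        for row in range(self.rows):
            self.nodes[row][column][doc_id] = text

    def query(self, term: str) -> List[str]:
        # Choose a single row, round-robin, to spread the query load.
        row = self._next_row
        self._next_row = (self._next_row + 1) % self.rows
        # Ask every instance in that row and merge the partial results.
        hits: List[str] = []
        for node in self.nodes[row]:
            hits.extend(
                doc_id for doc_id, text in node.items()
                if term.lower() in text.lower()
            )
        return sorted(hits)


if __name__ == "__main__":
    cluster = SearchCluster(columns=3, rows=2)
    cluster.index("d1", "Enterprise search platform")
    cluster.index("d2", "File traverser")
    cluster.index("d3", "Search profile")
    print(cluster.query("search"))  # answered by row 0
    print(cluster.query("search"))  # answered by row 1, same result
```

Because every row covers the whole index, consecutive queries can be served by different rows and still return the same hits.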
 
