Elastic Site Search

  • About 
  • Site Search
    • Search Items
  • Configuration 
    • Fields Configuration
      • Field Definition Properties
      • Field Types
    • Mapping of the Data
      • Index Document Preview
      • Debugging the Mapping
    • Configuring Search Endpoint 
    • Integration
      • Integrating Endpoints Naturally 
      • Using Headless Requests
      • Using a Block Function
      • Search Results object
      • Search Parameters and Configuration Files
      • Development mode

  • Using Fielded Search API 
    • General Syntax
      • HTTP Requests
        • Arrays
      • Endpoint Configuration Files
    • Basic Parameters
      • Returning Objects in Results
      • Removing Products from Results
      • Removing fields from Results
    • Filter Queries
      • Filtering by fields
      • Controlling fields in Search Results
      • Inclusive and Exclusive Queries
    • Full-Text Queries
      • Search Modes
        • standard
        • autocomplete
      • Controlling the Sensitivity of Full-Text Queries
        • Fuzziness
        • Sensitivity
      • Search Fields and the Weights
    • Controlling Search Results Scope
      • Pagination
    • The Standard Fields Definition
  • Development Environment 
  • Template Examples
    • search.html
    • search-item.html
    • search.json
  • Further Reading

About

Elastic Site Search (ESS) uses the Elasticsearch engine for indexing. It provides a flexible, redistributable way to configure the structure of the index and keeps the index current with real-time updates.

Site Search

The following items within Core dna are indexed:

Module        Object Returned    Endpoint
Pages         Page               /search/pages/action
Blogs         Blog, BlogPost     /search/blogs/action
Faq (Help)    Help               /search/help/action
Products      CatProduct         /search/products/action

To use ESS, navigate to "yoursite.com/search". By default, this URL loads the search.html template from the site directory (modules/search/templates/search.html).

Every time you update an item from the modules above, an ESS update is triggered, which keeps all records in the search index up to date. To trigger an ESS action for a particular kind of entity (e.g. reindex, preview), specify the endpoint of the corresponding module; refer to the table above.

Refer to the Template Examples section below to see some example code.

Search Items

An "item" is a result from one of the modules listed above. Items are returned when a search term is executed. Each item knows its own module, which is available through the method $item->getType().

It is also possible to define Highlights. This functionality shows snippets of matching text for each item. They can be retrieved using the method $item->getHighlights().

When displaying results, you can use $search->getGroupedItems() to return the results grouped by module. You can also use the normal $search->getResults(), which returns the weighted item results.

If you need more information about a search result, you can call $item->getObject(). This returns the original object related to the search item (for example, the full product object), so that all the functionality of the normal object is available. Do this only when needed, as every call to fetch these objects can be expensive.

Configuration

To start using Elastic Site Search (ESS), follow four simple steps:

  1. Define fields using a JSON configuration file
  2. Map catalogue data to the index using a mapping template
  3. Configure one or more endpoints to accept requests
  4. Send requests to the configured endpoint

Fields Configuration

Each item in the supported modules is stored in the Elasticsearch engine as an Index Document. The structure of the Index Document is defined in the JSON configuration file fields.json, located at

./modules/search/fields.json

A typical configuration file looks like this:

Sample fields.json

Each field in the Index Document must be given a unique key; all properties are optional. There are 16 standard or "core" fields. They do not need to be configured or mapped, and are ready for use.

All core fields can be re-defined in the fields.json file and then re-mapped in configuration.html, like any custom field. The full definition of the core fields can be found in The Standard Fields Definition section.

Field Definition Properties

  
Property: name
Default Value: "Humanized" version of the key, e.g. "My Field" for "my_field"
Description: Name of the field.

Property: display_name
Default Value: The value of the "name" property
Description: Displayed version of the field name. It can be something like "Select a Colour".

Property: type
Default Value: keyword
Description: Defines how the value is stored in the Elasticsearch index. The available types are discussed below.

Property: fulltext
Default Value: false
Description: When set to true, a text version of the same field is created in Elasticsearch. This is useful when the same field (e.g. of keyword type) needs to be used both as a field and for full-text search.

Property: case_sensitive
Default Value: true
Description: By default, the values of keyword fields are indexed as is, which makes field values case-sensitive, so a filter query ?size=xl will not match a document with size XL. This behavior can be changed by setting the case_sensitive property to false.

Property: hidden
Default Value: false
Description: All fields are assumed to be used as fields and to appear in search results. If there is no intention to display the field (for example, it is used only for filtering), it can be hidden.

Property: values_limit
Default Value: 10
Description: By default, only the 10 most popular values of the field are displayed. This number can be adjusted with this property.

Property: values
Default Value: (empty array)
Description: If a fixed list of field values is required, it can be set with this property.

Property: order
Default Value: value count
Description: By default, field values are ordered by the count of matching documents, with the most popular value coming first. The sorting order of field values can be changed with this property. For possible values refer to the Elasticsearch documentation.

Property: output_order
Default Value: (none)
Description: Re-orders field values after they have been fetched from Elasticsearch. For example, the order and values_limit properties can be used to fetch the 10 most popular values, and the output_order property can then sort these 10 values in alphabetical order.

Property: unit
Default Value: (none)
Description: Unit of measure. The value can be any string, such as "in" or "mm".

Property: props
Default Value: (empty array)
Description: Passes an array with arbitrary data to the SearchResults object. The arbitrary data may include CSS class names or HTML attributes hinting to the front-end how to render the field on the page.

Field Types

   
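Type: keyword
Use: The value is stored "as is". No analysis is applied. The default type for Fielded Search.
Can Be Used as a Field: Yes (recommended)
Can Be Used in Full-Text Search: No

Type: integer
Use: Used to store whole numbers, such as ID values.
Can Be Used as a Field: Yes
Can Be Used in Full-Text Search: No

Type: double
Use: Used to store floating-point numbers, such as price.
Can Be Used as a Field: Yes
Can Be Used in Full-Text Search: No

Type: date
Use: Used to store date fields.
Can Be Used as a Field: No
Can Be Used in Full-Text Search: No

Type: boolean
Use: true / false
Can Be Used as a Field: Yes
Can Be Used in Full-Text Search: No

Type: text
Use: Used to store long strings of text for full-text search.
Can Be Used as a Field: No
Can Be Used in Full-Text Search: Yes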

Mapping of the Data

Item data is mapped to the index document using a mapping template in the following location:

./modules/search/templates/configuration.html

The purpose of this template is to map the relevant data from the $item object to the $index array which represents the Index Document:

Sample configuration.html file
<{* Typically, product custom fields can be mapped to the index field *}>
<{$index['rating'] = $item->getCustomFieldValue('Rating')}>
<{* With the use of basic Smarty functions it is possible to expand a comma-separated list into an array ... *}>
<{$index['retailers'] = array_map('trim', explode(',', $item->getCustomFieldValue('Show_Retailers')))}>
<{* ... or create a range from the "start" and the "end" numbers *}>
<{$index['year_range'] = range($item->getCustomFieldValue('year_start'), $item->getCustomFieldValue('year_end'))}>
<{$index['tags'] = ($item->getTags()) ? array_map('trim', explode(',', $item->getTags())) : null}>
<{* The values should be set to null if they are not available *}>
<{$index['brand'] = ($item->getBrand()) ? $item->getBrand()->getName() : null}>
<{* A field in the index can be a completely new thing based on some calculations *}>
<{$index['on_sale'] = ($item->getFinalPrice() < $item->getBasePrice())}>
<{* Collect the names of related products in the 'fits' category, grouped by the lower-cased relation description *}>
<{foreach $item->getRelationInfo(true) as $relation}>
 <{if $relation.category eq 'fits'}>
 <{$something = $relation.description|strtolower}>
 <{$index['relation'][$something][] = $relation.product->getName()}>
 <{/if}>
<{/foreach}>

Even the core fields can be re-assigned if needed:

Sample configuration.html with the core field 'code' being re-mapped.
<{$index['code'] = [
 $index.code,
 $item->getCustomFieldValue('ProxyCode'),
 $item->getCustomFieldValue('VIN'),
 $item->getCustomFieldValue('ISDN'),
]}>

Index Document Preview

To ensure the fields have been configured and mapped correctly, you can check how the data is mapped to the Index Document. A special endpoint previews the Index Document based on the ID or URL Slug of the corresponding product:

/search/preview/products/{slug-or-id}

The URL below will display an Index Document in JSON format for the product with URL Slug "my-test-product":

/search/preview/products/my-test-product

If a product is not published, does not have public view permissions, is excluded from indexing by its SEO settings, or cannot be indexed for any other reason, an error will be displayed.

To see the actual reason why the product cannot be indexed, turn on development mode:

/search/preview/products/my-test-product?dev=1

The Routing module allows you to change the base path of the FS module from /search/ to anything else (e.g. /sitesearch/).

Debugging the Mapping

The mapping template, like any other Smarty template, can produce errors, or there might be a need to print out some variables. Fielded Search has a special endpoint that prints the output of the mapping template for the ID or URL Slug of the corresponding product:

/search/debug/products/{slug-or-id}

The URL below will display any output from the mapping template for the product with URL Slug "my-test-product":

/search/debug/products/my-test-product

The mapping template is only used to assign values. It must not produce any output! Use the /search/debug endpoint to make sure it produces none.

Configuring Search Endpoint 

A search endpoint is a profile used to accept search queries of the same type. For example, one endpoint may service requests from the auto-complete search bar, while another services filter requests from the product catalogue.

The endpoint configuration consists of the URL slug, an optional template to render search results, and the endpoint configuration file with preset parameters that apply to all queries to this endpoint. Any Search API parameter can be set in the endpoint configuration file.

Any valid URL Slug can be used as an endpoint name, except for the following reserved URLs:

mysite.com/search/reindex/products
mysite.com/search/reindex/pages
mysite.com/search/rebuild
mysite.com/search/preview/products/1234


If the endpoint is called "demo", it will have the following path to the endpoint configuration file:

./modules/search/templates/demo.json

... the results template:

./modules/search/templates/demo.html

... and the endpoint URL:

https://www.example.com/search/demo

The Routing module allows you to change the base path of the FS module from /search/ to anything else (e.g. /sitesearch/).

Integration

There are three ways to send requests to FS endpoints and process responses: integrate a template, make headless AJAX calls, or use a block function.

Integrating Endpoints Naturally 

Just integrate a template corresponding to the endpoint and use the Search Results object to display results, fields, pagination, etc. The Search Results object is available in the $search variable. 

Using Headless Requests

Call the headless API via AJAX. It returns a standard JSON response, so you can work with the data directly.

Using a Block Function

The block function search_search can be used to pull fielded search results into a template used by any other module. It accepts two arguments: "form" (mandatory) and "params" (optional). The first argument is the endpoint name, the second is an array with Search API request parameters. 

Sample use of the block function
<{$params = [
 'objects' => true,
 'recursive' => true,
 'category.slug' => $categorySlug
]}>
<{show_block module="search"
 block='search'
 form='demo'
 params=$params
 template_name='_blank'
}>

The block function does not take any parameters from the HTTP request (i.e. passed in the URL). It only works with the parameters passed to the block function in the params argument and those in the endpoint configuration file.

Search Results object

Search Parameters and Configuration Files

Any Fielded Search API parameter can be preset in the endpoint JSON configuration file. These parameters are used as default values for the endpoint and can be overwritten by the parameters passed in the URL or in the arguments of the block function. This applies to all values except arrays, which are merged instead.

Here is an example of a JSON configuration file for the "demo" endpoint and an HTTP request made to that endpoint:

demo.json
{
 "parent_category.slug": "products",
 "size": 30,
 "mode": "autocomplete",
 "no_fields": true,
 "fields": [
  "name^3",
  "description.short^2"
 ]
}

/search/demo?size=12&mode=standard&fields=category.name,brand

The request above results in the following query:

Resulting API Request
{
 "parent_category.slug": "products",
 "size": 12,
 "mode": "standard",
 "no_fields": true,
 "fields": [
  "name^3",
  "description.short^2",
  "category.name",
  "brand"
 ]
}

Use development mode to see the resulting requests.
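
The merge rule described above (scalars are overwritten, arrays are merged) can be sketched in a few lines of Python. This is only an illustration of the rule, not the module's implementation; the function name merge_params is hypothetical, and skipping duplicate array entries is an assumption.

```python
def merge_params(defaults: dict, overrides: dict) -> dict:
    """Combine endpoint defaults with request parameters.

    Scalar values from the request replace the defaults. Lists are merged:
    defaults come first, request values are appended (duplicates skipped,
    which is an assumption made for this sketch).
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged


# Values taken from the demo.json example and the sample request above.
demo_json = {
    "parent_category.slug": "products",
    "size": 30,
    "mode": "autocomplete",
    "no_fields": True,
    "fields": ["name^3", "description.short^2"],
}
request = {"size": 12, "mode": "standard", "fields": ["category.name", "brand"]}

resulting = merge_params(demo_json, request)
```

With these inputs, resulting has the same content as the "Resulting API Request" shown above.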

Development mode

Development mode can be enabled by passing ?dev=1 in the URL or as an argument to the block function. It shows the resulting API request and, on the preview endpoint, the reason why a product cannot be indexed.

Using Fielded Search API

General Syntax

HTTP Requests

The Search API Request parameters can be passed either in the query string of HTTP Requests (e.g. ?param1=value1&param2=value2) or in the POST data.

Arrays

Some parameters accept multiple values. There are two notations for this: the PHP array style and the comma-separated list. The following two requests are equivalent:

/search/demo/?fields=size,color,brand
/search/demo/?fields[]=size&fields[]=color&fields[]=brand
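
To show why the two notations are equivalent, here is a minimal Python sketch that reads a multi-value parameter from either form. It is a client-side illustration only; the helper name read_list_param is hypothetical and says nothing about how the server parses requests.

```python
from urllib.parse import parse_qs


def read_list_param(query_string: str, name: str) -> list[str]:
    """Return the values of a multi-value parameter in either notation."""
    parsed = parse_qs(query_string)
    # PHP array style: fields[]=size&fields[]=color
    values = list(parsed.get(f"{name}[]", []))
    # Comma-separated style: fields=size,color
    for chunk in parsed.get(name, []):
        values.extend(part.strip() for part in chunk.split(",") if part.strip())
    return values


comma_style = read_list_param("fields=size,color,brand", "fields")
array_style = read_list_param("fields[]=size&fields[]=color&fields[]=brand", "fields")
```

Both calls produce the same list of three field names.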

Endpoint Configuration Files

The endpoint configuration files use JSON format.

Basic Parameters

Search results normally contain an array of fields with their values and the matching Index Documents. An Index Document is an associative array of fields and values stored in the Elasticsearch index. Sometimes this is enough to display search results, but sometimes it is necessary to work with instances of the actual object classes.

Returning Objects in Results

When the objects parameter is set to true, the results will contain an array of CatProduct instances accessible through $search->getObjects().

/search/demo/?objects=true

Removing Products from Results

Sometimes there is no need to have any product data in search results at all (e.g. only fields are required). This can be achieved with the no_products parameter set to true:

/search/demo?no_products=true

The request above will return no products in search results, but the fields will be populated for the scope of the query.

Removing fields from Results

Sometimes fields are not required in the search results (e.g. for an autocomplete function). This can be achieved by setting no_fields to true:

/search/demo?no_fields=true

The request above will effectively make all fields hidden. This means the fields will not be included in the Elasticsearch query, which saves some resources. It also means that $search->getfields()->getVisible() and $search->getfields()->getAvailable() will return an empty array.

Filter Queries

A filter query is a request in the form of ?field1=value1&field2=value2 sent to the configured endpoint. It returns products matching the filter. A filter can be used in conjunction with a full-text query.

Filtering by fields

The following sample request will search for red, green or blue shoes in size "XL":

/search/sample?color=red,green,blue&size=XL&category.slug=shoes

Use the type filter to restrict a search to one or more modules:

/search/sample?type=products&q=mytestsearch
/search/sample?type=products,faq,blogs&q=mytestsearch

When a field is specified in the filter but no value is provided (e.g. ?field1=&field2=), the filter will return documents where this field exists but is set to null. To exclude a field from the filter, omit it from the query. Alternatively, set the ignore_empty_filters parameter to "true".
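
Because an empty value is itself a filter, a client that builds query strings from a form may want to leave out blank fields before sending the request. A minimal Python sketch of that idea follows; the helper name build_filter_query is hypothetical, and the field names come from the sample request above.

```python
from urllib.parse import urlencode


def build_filter_query(filters: dict) -> str:
    """Build a filter query string, leaving out filters that have no value."""
    cleaned = {}
    for field, value in filters.items():
        if isinstance(value, (list, tuple)):
            # Multiple values use the comma-separated notation.
            value = ",".join(str(v) for v in value)
        if value is None or value == "":
            continue  # omit the field entirely instead of sending "field="
        cleaned[field] = value
    # Keep commas readable; everything else is escaped as needed.
    return urlencode(cleaned, safe=",")


query = build_filter_query(
    {"color": ["red", "green", "blue"], "size": "XL", "brand": "", "category.slug": "shoes"}
)
```

Here the empty "brand" entry is dropped, so the request filters only by color, size and category.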

Controlling fields in Search Results

By default, all visible configured fields are sent to the Elasticsearch engine to be populated, although many fields may apply only to a subset of products. For example, the "Spindle Length", "Spindle Diameter" and "Spindle Material" fields may only apply to products in the "Spindles" category. Displaying all available fields in search results may be overwhelming, so it is sometimes beneficial to hide some of them. This can be achieved with the fields parameter in the search query. It accepts a list of fields which will be populated by the query and made available in the search results through $search->getfields()->getVisible(). The fields parameter can make a hidden field visible if required. However, the no_fields=true parameter overrides any setting of the fields parameter, and no fields will be displayed.

Inclusive and Exclusive Queries

By default, all the field/value pairs in the filter request are exclusive (i.e. they are combined with the "AND" operator).

The request below will match products from the "Classic" collection that are made of stainless steel and have extended warranty:

/search/products?collection=Classic&material=Stainless+Steel&warranty=Extended

The should parameter applies the "OR" operator to the specified list of fields.

The request below will match products having extended warranty either from the "Classic" collection or made of stainless steel.

/search/products?collection=Classic&material=Stainless+Steel&warranty=Extended&should=collection,material
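
The difference between the two requests can be illustrated with a small in-memory check in Python. This sketch only models the AND / should semantics described above; it is not how Elasticsearch evaluates the query, and the function name and sample documents are hypothetical.

```python
def matches(document: dict, filters: dict, should: tuple = ()) -> bool:
    """Check a document against filters using the AND / "should" semantics."""
    hits = {field: document.get(field) == value for field, value in filters.items()}
    required = [hit for field, hit in hits.items() if field not in should]
    optional = [hit for field, hit in hits.items() if field in should]
    # Every filter outside "should" must match; of the "should" filters, one is enough.
    return all(required) and (any(optional) if optional else True)


filters = {"collection": "Classic", "material": "Stainless Steel", "warranty": "Extended"}

# Hypothetical documents used only to exercise the rule.
brass_classic = {"collection": "Classic", "material": "Brass", "warranty": "Extended"}
brass_modern = {"collection": "Modern", "material": "Brass", "warranty": "Extended"}

strict = matches(brass_classic, filters)
relaxed = matches(brass_classic, filters, should=("collection", "material"))
unrelated = matches(brass_modern, filters, should=("collection", "material"))
```

The brass product from the "Classic" collection fails the default query but passes once collection and material are listed in should; the warranty filter stays mandatory in both cases.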

Full-Text Queries

A full-text query is a search phrase passed in the q parameter of the request (e.g. ?q=quick+brown+fox). Full-text queries are performed on all analysed fields, which include text fields and those fields where the fulltext property is set to true.

Search Modes

The search mode can be set using the optional mode parameter:

/search/demo?q=quick+brown+f&mode=autocomplete

Currently, two search modes are supported:

standard

The standard search mode breaks down the search phrase into search terms by stripping off all punctuation and special characters and splitting it by spaces. Then it applies several filters, which include lowercasing, removal of high-frequency words (such as articles) and stemming.

The search phrase "A Cow Jumped over the Moon" will be converted into the following search tokens: "cow", "jump", "over", "moon". Then the tokens will be matched against the inverted index, and the relevance of results will be determined on "the more matches the better" principle. According to the sensitivity setting, the least relevant results will be filtered out. According to the fuzziness setting, the query can match some more products by expanding search terms into variants. For example, fuzzy logic can expand the term "moon" into "noon", "soon", "mood", "moan", and even "moo" and "moron".
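
The token example above can be reproduced with a toy analyzer in Python. This is a deliberately crude sketch: the stop-word list and the suffix-stripping rule are assumptions made for illustration and are far simpler than the analysis Elasticsearch actually performs.

```python
import re

STOP_WORDS = {"a", "an", "the"}  # assumed list, for illustration only
SUFFIXES = ("ing", "ed", "s")    # crude stand-in for a real stemmer


def toy_tokens(phrase: str) -> list[str]:
    """Lowercase, strip punctuation, drop stop words and trim common suffixes."""
    tokens = []
    for word in re.findall(r"[a-z0-9]+", phrase.lower()):
        if word in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            # Only strip a suffix when at least three characters remain.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


tokens = toy_tokens("A Cow Jumped over the Moon")
```

For this phrase the sketch yields the same four tokens as the example: "cow", "jump", "over", "moon".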

autocomplete

In autocomplete mode the search phrase is taken as a prefix and is matched against any documents containing this prefix. This mode supports stemming and some basic transformations, but fuzziness is not available, and the sensitivity setting has no effect. A search phrase "cow j" will match "cow jumped", "cow jogged" and "cow jigged", but will not match "cow has jumped". However, the search phrase "cow jumping o" will match "cow jumped over the moon", because autocomplete mode still supports stemming.

Controlling the Sensitivity of Full-Text Queries

Fuzziness

Fuzziness can be enabled for full-text queries with the fuzzy parameter. By default, it is off.

The fuzzy query uses similarity based on Levenshtein edit distance. Effectively, it can auto-correct one or two typos in the search term, depending on the term length, or even expand the search term by a few characters:

/search/country_lookup?q=austrai&fuzzy=1

The query above will match "Austria" because the fuzzy logic can switch the last two letters. It will also match "Australia" because the Elasticsearch engine can expand the original term by adding an "l" and an additional "a".
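
To make the edit-distance idea concrete, here is a plain Levenshtein implementation in Python applied to the example terms. It is a textbook sketch, not the engine's code. Note that this plain version counts a swap of two adjacent letters as two edits, whereas a fuzzy matcher may treat such a transposition as a single edit.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


to_austria = levenshtein("austrai", "austria")
to_australia = levenshtein("austrai", "australia")
```

Both candidate terms are within two edits of "austrai", which is consistent with both countries appearing in the results.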

Sensitivity

The sensitivity parameter has a range of 1 to 100, with a default value of 70. It sets the minimum percentage of search terms in the query (words excluding articles and grammar constructions) that must match, rounded down. For a query of 4 words with the default sensitivity of 70%, only two words need to match for the product to appear in search results (70% of 4 is 2.8, which rounds down to 2). If sensitivity is increased to 75%, then 3 words from the query must match.

Unlike traditional search engines, which allow only "AND" and "OR" operators in the search phrase, the sensitivity parameter in Fielded Search allows the search behavior to be fine-tuned on a scale from 1 (effectively an "OR" query) to 100 (effectively an "AND" query, where all terms in the search phrase must match).

However, it is important to understand that not all words are considered search terms (see Search Modes). In the standard mode all high-frequency words are removed.

/search/demo?q=a+cow+jumped+over+the+moon&sensitivity=75

The query above will require at minimum three of the following four words to be found in the document: "cow", "jump", "over", "moon".
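
The arithmetic behind the sensitivity setting can be written down as a short Python function. This is a sketch of the rounding rule described above; the function name is hypothetical, and the floor of one required term for non-empty queries is an assumption based on the description of sensitivity 1 as an "OR" query.

```python
def required_matches(term_count: int, sensitivity: int = 70) -> int:
    """Number of search terms that must match for a given sensitivity."""
    if not 1 <= sensitivity <= 100:
        raise ValueError("sensitivity must be between 1 and 100")
    if term_count <= 0:
        return 0
    # Integer division rounds the percentage down; at least one term
    # is always required (assumption, see the lead-in).
    return max(1, term_count * sensitivity // 100)


default_case = required_matches(4)       # 70% of 4 terms
stricter_case = required_matches(4, 75)  # 75% of 4 terms
```

The two calls correspond to the examples in this section: two of four terms at the default sensitivity, three of four at 75.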

Search Fields and the Weights

By default, full-text queries are performed on all analysed fields (the text fields and the fields where the fulltext property is set to true), and all fields have equal weight. So, a document containing "almonds" in the product's "name" field will have the same weight as a document containing "almonds" in the "description" field.

The fields parameter specifies the list of fields to be used in the search and can boost the weight of each field using the caret ^ notation.

/search/demo?q=almonds&fields=name^3,code^3,description.short^2,description.long

In the query above the search phrase "almonds" will be searched only in four fields: code, name, description.short and description.long. The "name" and "code" fields will have the highest weight, followed by the short description, while the long description will have the least weight.
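
The caret notation is easy to take apart on the client side, for example to display which fields a query will search and with what boost. A minimal Python sketch follows; the function name is hypothetical, and treating an unboosted field as weight 1 is an assumption.

```python
def parse_weighted_fields(spec: str) -> dict[str, float]:
    """Split "name^3,description.long" into a field-to-weight mapping."""
    weights = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, boost = entry.partition("^")
        # Fields without an explicit boost get weight 1 (assumption).
        weights[name] = float(boost) if boost else 1.0
    return weights


weights = parse_weighted_fields("name^3,code^3,description.short^2,description.long")
```

For the sample request this gives name and code a weight of 3, description.short a weight of 2, and description.long the default weight.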

Controlling Search Results Scope

Pagination

Pagination of results is controlled in two possible ways.

By specifying size and from

/search/demo?size=16&from=32

OR by specifying size and page

/search/demo?size=16&page=3

The two requests above are equivalent and display the 3rd page of the search results with 16 products per page.
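
The relationship between the two forms is simple arithmetic, sketched here in Python. The function name is hypothetical, and pages are assumed to be numbered from 1, which is what the example above implies.

```python
def page_to_from(page: int, size: int) -> int:
    """Convert a 1-based page number into the "from" offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size


offset = page_to_from(page=3, size=16)
```

For page 3 with 16 results per page, the offset is the same value used in the size and from example.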

Sorting

The Standard Fields Definition


Development Environment

Elastic Site Search is using worker scripts running in the background and re-indexing 

Template Examples

A simple search template you can use to quickly set up ESS.

directory: sitedir/modules/search

search.html

modules/search/templates/search.html


search-item.html

modules/search/templates/search-item.html


search.json

modules/search/templates/search.json


Further Reading