ElasticSearch and the Mystery of Auto-completion

A few months ago, I was looking for information about auto-completion with ElasticSearch as a source. I found many solutions, but none of them really fit my needs. Nevertheless, they helped me build my request.

What’s the situation, Doc?

I needed to extract values from an indexed field in ElasticSearch based on a group of letters typed by the user. My indexed documents contain a field named map, which is an array of strings. The idea is the following: if the user is looking for the value "name", for instance, he will first type "na", and at that point we must already be able to suggest searches. With just these two letters, we must propose existing fields like "name", "native" or "nature".

I’m using Elastical to interact with ElasticSearch.
First, I’m building a regex which will be used later in the request.

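// Lower-case the input only to derive both cases of the first letter.
var search = req.body.searched.toLowerCase(),
    firstLetter = search.charAt(0),
    // The rest keeps the case the user typed.
    rest = req.body.searched.slice(1),
    // e.g. "na" gives "^[nN]a.*"
    reg = "^[" + firstLetter + firstLetter.toUpperCase() + "]" + rest + ".*";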

The regex is built to match either an upper-case or a lower-case character in first position.
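
One thing the snippet above does not handle is user input containing regex metacharacters such as "." or "+". Here is a minimal sketch of how the same pattern could be built with escaping. The helpers escapeRegex and buildPrefixRegex are hypothetical names of my own, not part of Elastical or ElasticSearch, and the JavaScript RegExp is only used to check the pattern locally.

```javascript
// Hypothetical helpers, not part of Elastical or ElasticSearch.
function escapeRegex(text) {
  // Put a backslash in front of every regex metacharacter.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildPrefixRegex(input) {
  if (!input) {
    return null; // nothing typed yet, nothing to suggest
  }
  var first = input.charAt(0);
  var rest = input.slice(1);
  var lower = first.toLowerCase();
  var upper = first.toUpperCase();
  // Only build a character class when the first character has two cases.
  var head = lower === upper ? escapeRegex(first) : "[" + lower + upper + "]";
  return "^" + head + escapeRegex(rest) + ".*";
}

var reg = buildPrefixRegex("na");
console.log(reg); // ^[nN]a.*
console.log(new RegExp(reg).test("Nature")); // true
console.log(new RegExp(reg).test("banana")); // false
```

As in the original snippet, only the first letter is case-insensitive; the rest of the input must match as typed.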

What about the request?

var request = {
  query: {
    query_string: {
      default_field: "map",
      default_operator: "AND",
      query: req.body.searched+"*"
    }
  },
  facets:{
    map:{
      terms:{
        field: "map.exact",
        regex: reg,
        size: 10
      }
    }
  }
};

First, I’m asking ElasticSearch to retrieve all documents which match req.body.searched + "*", where req.body.searched contains the user input. I’ve changed the default operator to "AND" rather than "OR" in order to deal with fields like "Nom de la gare" or "Name of the dog". By default, ElasticSearch uses the "OR" operator, so it would ask for "name" OR "of" OR "the" OR "dog", which is not what I wanted.
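
To make the difference between the two operators concrete, here is a toy model in plain JavaScript. It is only a sketch: it ignores everything a real ElasticSearch analyzer does (tokenizer, stop words, scoring), and the functions tokenize and matches are hypothetical names used for illustration.

```javascript
// Toy model of the default_operator choice; real analysis is ignored.
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

function matches(fieldValue, userInput, operator) {
  var fieldTokens = tokenize(fieldValue);
  var queryTokens = tokenize(userInput);
  var last = queryTokens.length - 1;
  var hits = queryTokens.map(function (token, index) {
    return fieldTokens.some(function (candidate) {
      // The trailing "*" only applies to the last word typed.
      return index === last ? candidate.indexOf(token) === 0 : candidate === token;
    });
  });
  // AND requires every word to be found, OR is satisfied by any of them.
  return operator === "AND" ? hits.every(Boolean) : hits.some(Boolean);
}

console.log(matches("Name of the dog", "name of the d", "AND")); // true
console.log(matches("Length of the road", "name of the d", "AND")); // false
console.log(matches("Length of the road", "name of the d", "OR")); // true
```

With "OR", the unrelated value "Length of the road" matches simply because it shares "of" and "the" with the input, which is exactly the noise I wanted to avoid.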

Then, I’m using facets to retrieve, from the documents found, the values of the field map that match the given regex. I’m using map.exact for the same reason I must use the "AND" operator: multi-word values such as "Name of the dog" must be handled as a whole.
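
As a rough mental model of what this facet returns, here is a small local simulation. It is only a sketch and not how ElasticSearch computes facets internally; the function termsFacet and the sample documents are made up for illustration.

```javascript
// Rough local model of a terms facet with a regex filter.
function termsFacet(docs, field, regex, size) {
  var counts = {};
  docs.forEach(function (doc) {
    (doc[field] || []).forEach(function (value) {
      // Count only the values that match the regex.
      if (regex.test(value)) {
        counts[value] = (counts[value] || 0) + 1;
      }
    });
  });
  return Object.keys(counts)
    .map(function (term) { return { term: term, count: counts[term] }; })
    // Most frequent first, ties broken by plain string order.
    .sort(function (a, b) { return b.count - a.count || (a.term < b.term ? -1 : 1); })
    .slice(0, size);
}

// Made-up documents, assumed to have already matched the query.
var docs = [
  { map: ["name", "age"] },
  { map: ["Name of the dog", "nature"] },
  { map: ["name", "native"] }
];
var facet = termsFacet(docs, "map", /^[nN]a.*/, 10);
console.log(facet.map(function (entry) { return entry.term; }));
// [ 'name', 'Name of the dog', 'native', 'nature' ]
```

The value "age" is dropped by the regex, and "name" comes first because it appears in two documents.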

This request works well in the tests I’ve run. It remains to be seen whether it can handle big indexes.

I can now query ElasticSearch through Elastical and build a clean response:

elastical.search(request, function (err, results, full) {
  // Without a usable response there is nothing to suggest.
  if (err || !full || !full.facets) {
    return res.send([]);
  }
  var terms = [];
  // Keep only the term string of each facet entry.
  async.forEach(full.facets.map.terms, function (data, callback) {
    terms.push(data.term);
    callback();
  }, function (err) {
    res.send(terms);
  });
});
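
Since nothing in that loop is actually asynchronous, the extraction could also be written as a plain function. This is a minimal sketch under the assumption that the response has the shape used above (full.facets.map.terms, each entry carrying a term); extractTerms is a hypothetical helper and the sample response is made up.

```javascript
// Hypothetical helper: pulls the suggestion strings out of a facet response.
function extractTerms(full) {
  var facet = full && full.facets && full.facets.map;
  var entries = (facet && facet.terms) || [];
  return entries.map(function (entry) { return entry.term; });
}

// Made-up response with the same shape as the one used above.
var sample = {
  facets: {
    map: {
      terms: [
        { term: "name", count: 2 },
        { term: "nature", count: 1 }
      ]
    }
  }
};
console.log(extractTerms(sample)); // [ 'name', 'nature' ]
console.log(extractTerms(undefined)); // []
```

Keeping the extraction separate from the HTTP handler also makes it easy to test without a running ElasticSearch.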

Author: Victor

A computer engineer by training and by trade, I administer this server and its domain, and I favor free software in my daily use. I am gradually building my personal "cloud", service by service, to keep some control over my digital data.
