Elasticsearch

Elasticsearch is a search engine based on Lucene with a distributed full-text search engine with an HTTP web interface storing JSON documents.

Getting a test installation via Docker

https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

docker pull docker.elastic.co/elasticsearch/elasticsearch:5.6.3
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:5.6.3

[$[Get Code]]

Elastic query

Introducing the query language

Finding Excat values

Your data in Elastic is stored in an so called "index". Which is not the same thing as an index in the database world but more like a table.

In this example the index is called "MY_INDEX"

With this you can search for all documents in the index

GET /MY_INDEX/_search

[$[Get Code]]

It starts with "GET" because you only give the URL and no extra payload

In the results you will find an ID field. You can use this ID field to directly access a document like this

GET /MY_INDEX/table/42

[$[Get Code]]

This also works for DELETE - get rid of document and PUT to over- write a document. For PUT you need to provide a body with {} where the new content is

PUT /MY_INDEX/table/42
{
"name": "john"
}

[$[Get Code]]

If we want to have more complex filter criteria we need to use POST instead of GET and also provide a body. Normally elastic only return the most relevent documents to you and therefore calculates a score for each document to sort them. If you do not need this you should use constant_score so every document has the same score.

POST /MY_INDEX/_search
{
"from": 0,
"size":3,
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "John"
}
}
]
}
}
}
}
}

[$[Get Code]]

So this is the equivalent of

SELECT * from MY_INDEX where firstName="John" and ROWNUM <= 3;

[$[Get Code]]

With more than one criteria that needs to be fulfilled (in this example we test with exists if a specific field exists and is non null)

POST /MY_INDEX/_search
{
"from": 0,
"size":3,
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "John"
}
},
{
"term": {
"secondName": "Doe"
}
},
{
"exists": { "field" : "userId" }
}
]
}
}
}
}
}

[$[Get Code]]

You can also have

conditions that are not allowed to be there
check if a timestamp is a certain range
test if an array contains a value

POST /MY_INDEX/_search
{
"from": 0,
"size":3,
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "John"
}
},
{
"script": {
"script": "doc['myArray'].values.contains('myValue')"
}
},
{
"range": {
"mytimestamp": {
"gte": "2017-10-22T00:00:00Z",
"lte": "2017-10-23T00:00:00Z"
}
}
}
],
"must_not": [
{
"term": {
"secondName": "Doe"
}
}
]
}
}
}
}
}

[$[Get Code]]

The script keyword opens you a way to use the result programs in different programing languages as a value for your query. But with great power ...

GET /foo/_search
{
"from": 0,
"size": 1,
"query": {
...
},
"aggs": {
"foobar": {
"terms": {
"script": "def res=[]; def m=params['_source'].productDna; if(m!=null){for(x in m.keySet()){def f=m.get(x); if(f>0){res.add(x);}}} return res",
"size": 10
}
}
}
}

[$[Get Code]]

Or regular expressions (don't work with must but only with should)

POST /MY_INDEX/_search
{
"from": 0,
"size": 0,
"query": {
"constant_score": {
"filter": {
"bool": {
"should": [
{
"regexp": {
"firstName": "s22.*"
}
}
]
}
}
}
}
}

[$[Get Code]]

You can also aggregate results like with GROUP BY in the SQL world.

This gives you 10 values that the field "secondName" has within the search results. None of the actual search results is shown to you (size 0 / size 10)

Aggreations are limited to 10 by default!

POST /MY_INDEX/_search
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "John"
}
}
]
}
}
}
},
"aggs": {
"allSecondNames": {
"terms": {
"field": "secondName",
"size": 10
}
}
}
}

[$[Get Code]]

This get the average value of the field timeSpent and also does not show you any document

POST /MY_INDEX/_search
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "John"
}
}
]
}
}
}
},
"aggs": {
"myResult": {
"avg": {
"field": "timeSpent"
}
}
}
}

[$[Get Code]]

Instead of just getting the average you can also bucket all values of a field, in this example the field timeSpent and bucket size is 500

POST /MY_INDEX/_search
{
"size" : 0,
"query" : {
"bool" : {
"must" : [
{
"term" : {
"firstName": "John"
}
},
{
"term" : {
"secondName": "Doe"
}
}
]
}
},
"aggs": {
"myResult": {
"histogram": {
"field": "timeSpent",
"interval" : 500
}
}
}
}

[$[Get Code]]

You can also talk directly to the REST endpoint of your Elastic instance via curl, very useful to post process results

curl -X POST 'http://127.0.0.1:9200/YOUR_INDEX/_search' -d '{
"from": 0,
"size":9999,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"regexp": {
"type": "from.*"
}
}
}
}
}' | grep _id | sed 's/_id/\n_id/g' | grep ^_id | sed s/\",\".*// | sed s/^.*\"//

[$[Get Code]]

Show Delete Entry

You need the type and the id of an entry to show or delete it (which you can get from a query result) So assume this is the result from a query

{
...,
"_type": "YOUR_TYPE",
"_id": "YOUR_ID",
...
}

[$[Get Code]]

Try this (in GET mode)

/YOUR_INDEX/YOUR_TYPE/YOUR_ID

[$[Get Code]]

Or via curl

curl -X GET 'http://127.0.0.1:9200/YOUR_INDEX/YOUR_TYPE/YOUR_ID'

[$[Get Code]]

For deletion just replace by delete

Maintanance

Get status

GET /_cat/nodes

GET _cluster/health?pretty

GET _cluster/allocation/explain

GET _cluster/settings

GET /_cluster/settings?pretty&include_defaults

[$[Get Code]]

Exclude node from cluster

Exclude one node and allow all nodes again

PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._ip": "10.20.30.40"
}
}

PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._ip": ""
}
}

[$[Get Code]]

Templates

Elastic uses templates when it creates an index. This lists all templates and prints one

GET /_cat/templates

GET /_template/my_template_name

[$[Get Code]]

Thorsten Gunkel

Elasticsearch

Links

Getting a test installation via Docker

Elastic query

Show Delete Entry

Maintanance

Get status

Exclude node from cluster

Templates

Menu

Actions

Search