
Calculate time difference with Painless in Elasticsearch & Kibana 8.9


1. Introduction

Elasticsearch, as a document database, can store fields of many different types, and one of the most common is the date type. Imagine Spark logs with a start and an end timestamp, and you want to calculate the duration, i.e. the time difference between them. There are several contexts in which this can be done: you can define runtime fields in the mapping that calculate the duration with a Painless script, or you can put the Painless script directly inside your search request; which option you choose depends on your needs and permissions. In this tutorial I will show you how to calculate a time difference with a Kibana data view, with a search request, and finally with a mapping update.

2. Start Elastic and Kibana

As a first step, as usual, you need an Elasticsearch cluster and Kibana to practice on. Run the commands below to start them in Docker containers.

				
docker network create kibana

docker run --rm \
--name elk \
--net kibana \
-p 9200:9200 \
docker.elastic.co/elasticsearch/elasticsearch:8.9.1

docker run --rm \
--name kibana \
--net kibana \
-p 5601:5601 \
docker.elastic.co/kibana/kibana:8.9.1
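If you want to double-check that both containers are up before continuing, a quick listing is enough. Run it from a separate terminal, since the containers above stay in the foreground:

# both containers should be listed as running
docker ps --filter "name=elk" --filter "name=kibana"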

				
			

Once they are started, set up a password for the elastic user and generate an enrollment token that you will use to connect Kibana to Elasticsearch.

				
# reset password
docker exec -it elk /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic

# apply for new enrollment token
docker exec -it elk /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
				
			

Copy the enrollment token into Kibana to connect it to Elasticsearch. For details, check my previous article.
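Before moving on, it is worth checking that the cluster responds. A quick sanity check, assuming the default port mapping from the docker run command above and the password you just set for the elastic user:

# should return the cluster name and version details
curl -k -u elastic https://localhost:9200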

3. Load test data

Time to load some sample data so we have a base to play with. I prepared two data sets: one uses the date type and the other uses date_nanos for nanosecond precision.

				
# skyscrappers dataset with date type
# +------------------------+--------------------+------------------+
# |     building_name      | start_construction | end_construction |
# +------------------------+--------------------+------------------+
# | One World Trade Center | 2006-04-27         | 2015-05-29       |
# | Warsaw Trade Tower     | 1997-06-01         | 1999-11-01       |
# | Varso Place            | 2016-12-01         | 2022-09-01       |
# +------------------------+--------------------+------------------+


# etljobs dataset with date_nanos type
# +-----------+----------------------------------+----------------------------------+
# | job_name  |         start_execution          |          end_execution           |
# +-----------+----------------------------------+----------------------------------+
# | extract   | "2023-09-10T12:00:00.000000000Z" | "2023-09-10T12:01:00.000000000Z" |
# | load      | "2023-09-10T12:00:00.000000000Z" | "2023-09-10T12:00:00.000000001Z" |
# | transform | "2023-09-10T12:00:00.000000000Z" | "2023-09-10T12:00:00.000001000Z" |
# +-----------+----------------------------------+----------------------------------+





				
			

Load the data in two steps: first create the mapping, then load the documents with the bulk API.

				
curl -k -u elastic -XPUT "https://localhost:9200/skyscrappers" \
-H 'content-type: application/json' -d'
{
  "mappings": {
    "properties": {
      "building_name": {
        "type": "text" 
      },
      "start_construction": {
        "type": "date" 
      },
      "end_construction": {
        "type": "date" 
      }
    }
  }
}'

curl -k -u elastic -XPOST "https://localhost:9200/skyscrappers/_bulk" \
-H 'content-type: application/json' -d'
{"index":{"_id":"onewtc"}}
{ "building_name": "One World Trade Center", "start_construction": "2006-04-27","end_construction": "2015-05-29" } 
{"index":{"_id":"wtt"}}
{ "building_name": "Warsaw Trade Tower", "start_construction": "1997-06-01","end_construction": "1999-11-01" } 
{"index":{"_id":"varso"}}
{ "building_name": "Varso Place", "start_construction": "2016-12-01","end_construction": "2022-09-01" } 
'

curl -k -u elastic -XPUT "https://localhost:9200/etljobs" \
-H 'content-type: application/json' -d'
{
  "mappings": {
    "properties": {
      "job_name": {
        "type": "text" 
      },
      "start_execution": {
        "type": "date_nanos" 
      },
      "end_execution": {
        "type": "date_nanos" 
      }
    }
  }
}'

curl -k -u elastic -XPOST "https://localhost:9200/etljobs/_bulk" \
-H 'content-type: application/json' -d'
{"index":{"_id":"1"}}
{ "job_name": "extract", "start_execution": "2023-09-10T12:00:00.000000000Z","end_execution": "2023-09-10T12:01:00.000000000Z" } 
{"index":{"_id":"2"}}
{ "job_name": "load", "start_execution": "2023-09-10T12:00:00.000000000Z","end_execution": "2023-09-10T12:00:00.000000001Z"}
{"index":{"_id":"3"}}
{ "job_name": "transform", "start_execution": "2023-09-10T12:00:00.000000000Z","end_execution": "2023-09-10T12:00:00.000001000Z" } 
'
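To confirm that both bulk requests went through, you can count the documents in each index. A minimal check, assuming the password 123456 used in the later examples (replace it with your own):

# each index should report a count of 3
curl -k -u elastic:123456 -XGET "https://localhost:9200/skyscrappers/_count"
curl -k -u elastic:123456 -XGET "https://localhost:9200/etljobs/_count"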
				
			

4. Runtime query

Although queries over runtime fields are expensive, they do not require any reindex operations. Note that if expensive queries are disabled in the cluster settings

				
PUT _cluster/settings
{
    "persistent":{
        "search.allow_expensive_queries":false
    }
}
				
			

then running a query against a runtime field will cause an error:

				
GET skyscrappers/_search
{
  "query": {
    "match": {
      "Construction Years": 9
    }
  },
  "fields" : ["*"]
}
# ERROR
{
    "error": {
        "root_cause": [
            {
                "type": "exception",
                "reason": "queries cannot be executed against runtime fields while [search.allow_expensive_queries] is set to [false]."
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
				
			

but let's stick with the default setting, which allows them to run.
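If you disabled expensive queries while trying out the previous request, you can restore the default (they are allowed) by resetting the value to null:

curl -k -u elastic:123456 -XPUT "https://localhost:9200/_cluster/settings" \
-H 'content-type: application/json' -d'
{
    "persistent": {
        "search.allow_expensive_queries": null
    }
}'

With that out of the way, the first search below reads the Construction Years runtime field from the index mapping (added in section 6), while the second defines a Duration runtime field directly inside the search request.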

				
curl -k -u elastic:123456 -XGET "https://localhost:9200/skyscrappers/_search" \
-H 'content-type: application/json' -d'
{
  "fields": [
    "Construction Years"
  ],
  "_source": false
}'

curl -k -u elastic:123456 -XGET "https://localhost:9200/etljobs/_search" \
-H 'content-type: application/json' -d'
{
  "runtime_mappings": {
    "Duration": {
        "type": "long",
        "script": {
            "source": "emit(ChronoUnit.NANOS.between(doc[\"start_execution\"].value, doc[\"end_execution\"].value));"
        }
    }
  },
    "fields": [
    "Duration"
  ],
  "_source": false
}'
				
			

Aggregations are also possible. Keep in mind that date aggregations work at millisecond resolution even when using a date_nanos field, and this limitation also affects transforms; the duration runtime field below is a plain long holding nanoseconds, so a stats aggregation works on it directly.

				
curl -k -u elastic:123456 -XGET "https://localhost:9200/etljobs/_search" \
-H 'content-type: application/json' -d'
{
  "runtime_mappings": {
    "duration": {
      "type": "long",
      "script": {
        "source": "emit(ChronoUnit.NANOS.between(doc[\"start_execution\"].value, doc[\"end_execution\"].value));"
      }
    }
  },
  "aggs": {
    "duration_stats": {
      "stats": {
        "field": "duration"
      }
    }
  }
}'
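Nanosecond values can be hard to read on dashboards. If millisecond granularity is enough for you, the same request works with ChronoUnit.MILLIS; duration_ms below is just an illustrative field name, and note that the two sub-millisecond etljobs documents will then report 0:

curl -k -u elastic:123456 -XGET "https://localhost:9200/etljobs/_search" \
-H 'content-type: application/json' -d'
{
  "runtime_mappings": {
    "duration_ms": {
      "type": "long",
      "script": {
        "source": "emit(ChronoUnit.MILLIS.between(doc[\"start_execution\"].value, doc[\"end_execution\"].value));"
      }
    }
  },
  "fields": ["duration_ms"],
  "_source": false
}'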
				
			

5. Create data view

It is no different with data views in Kibana. You can use the same Painless script to create calculated fields. In the Kibana UI, add a new field to the data view and set its value with this script:

				
ZonedDateTime start = doc["start_construction"].value;
ZonedDateTime end = doc["end_construction"].value;
long years = ChronoUnit.YEARS.between(start, end);
emit (years);
				
			

or create the whole data view with a curl command:

				
curl -k -u elastic:123456 -XPOST "http://localhost:5601/api/data_views/data_view" \
-H 'content-type: application/json' -H 'kbn-xsrf: reporting' -d'
{
  "data_view": {
     "title": "skyscrappers",
     "name": "Sky Scrappers",
        "runtimeFieldMap": {
            "Construction Years": {
                "type": "long",
                "script": {
                    "source": "ZonedDateTime start = doc[\"start_construction\"].value;\nZonedDateTime end = doc[\"end_construction\"].value;\nlong years = ChronoUnit.YEARS.between(start, end);\nemit (years);"
                }
            }
        }
  }
}'
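To verify that the data view was created, you can list the existing data views through the Kibana API; the GET endpoint below assumes the data views API available in Kibana 8.x:

# the response should include the Sky Scrappers data view
curl -k -u elastic:123456 -XGET "http://localhost:5601/api/data_views" \
-H 'kbn-xsrf: reporting'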
				
			

6. Update mapping

Finally, the runtime field definition can be persisted in the index mapping. Runtime fields shadow mapped fields with the same name, which means that, for example, you can expose a text field containing a number as a long for aggregations. Run the mapping update below and the calculated field will be returned alongside the _source fields in every standard search operation.

				
curl -k -u elastic:123456 -XPUT "https://localhost:9200/skyscrappers/_mapping" \
-H 'content-type: application/json' -d'
{
 "runtime": {
            "Construction Years": {
                "type": "long",
                "script": {
                    "source": "ZonedDateTime start = doc[\"start_construction\"].value; ZonedDateTime end = doc[\"end_construction\"].value; long years = ChronoUnit.YEARS.between(start, end); emit (years);"
                }
            }
 }
}'
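You can confirm that the runtime field is now part of the index by fetching the mapping back; it shows up in a dedicated runtime section next to the regular properties:

# the response contains a "runtime" section with "Construction Years"
curl -k -u elastic:123456 -XGET "https://localhost:9200/skyscrappers/_mapping"

From now on any search against skyscrappers can reference Construction Years in fields, queries and aggregations without declaring it in the request.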
				
			

Runtime fields can also be removed if needed, by setting them to null in a mapping update.

				
curl -k -u elastic -XPUT "https://localhost:9200/skyscrappers/_mapping" \
-H 'content-type: application/json' -d'
{
 "runtime": {
   "Construction Years": null
 }
}'
				
			

7. Final thoughts

In this tutorial you have learned how to quickly set up Elasticsearch and Kibana, load sample data, and then create runtime fields with the help of Painless scripting. This will be useful whenever you need to build dashboards on indices where everything is mapped as text and you still need aggregations. It also opens up a lot of opportunities for exploring your data. I'm looking forward to your opinions, both on the YouTube channel and here in the comments.

Happy coding!
