1. Introduction
In Elasticsearch you can use Painless scripting to handle cases that the built-in processors do not cover. Scripts can be embedded directly in ingest pipelines, but to keep the two independent of each other you can store a script in Elasticsearch under a name and call it later from an ingest pipeline. This way you can reuse the same script in different places without duplicating code.
2. Start Elasticsearch
The free version is sufficient for this exercise.
docker run --rm \
--name elk01 \
-d \
-e node.name="elk01" \
-p 9200:9200 \
docker.elastic.co/elasticsearch/elasticsearch:8.11.1
Then set a password for the ‘elastic’ user:
docker exec -it elk01 bash -c "(mkfifo pipe1); ( (elasticsearch-reset-password -u elastic -i < pipe1) & ( echo $'y\n123456\n123456' > pipe1) );sleep 5;rm pipe1"
3. Store scripts
Below are two example scripts.
3.1. Store script decode base64
This script decodes a base64-encoded field from the source document into plain text. The script name is the last segment of the URL, here ‘decodebase64’.
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_scripts/decodebase64" \
-H 'content-type: application/json' -d'
{
"script": {
"description": "Decode base64",
"lang": "painless",
"source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\'']; ctx[target]=src.decodeBase64();"
}
}'
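To make the logic of the stored script easier to follow, here is a minimal Python sketch of the same steps, run against a plain dictionary instead of an ingest document. The function name `decode_base64_field` and the sample document are made up for illustration, and the sketch assumes the decoded bytes are UTF-8 text.

```python
import base64


def decode_base64_field(ctx, params):
    """Mirror of the stored script: decode ctx[field] into ctx[target_field]."""
    src = ctx.get(params["field"])
    if src is None:
        # Same early return as the Painless script: nothing to decode.
        return ctx
    # Assumption: the decoded bytes are UTF-8 text.
    ctx[params["target_field"]] = base64.b64decode(src).decode("utf-8")
    return ctx


doc = {"name_base64": "QXBwbGU="}
decode_base64_field(doc, {"field": "name_base64", "target_field": "name"})
print(doc["name"])  # Apple
```

As in the stored script, the original field is left in place and a missing field is skipped silently.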
3.1.1. Retrieve script decode base64
To confirm how the script is stored, you can retrieve its definition:
curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/decodebase64"
3.2. Store script decode HEX
This script converts a hex-encoded string into human-readable text. It reads the input two hex digits at a time and appends the matching character to the result.
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_scripts/decodehex" \
-H 'content-type: application/json' -d'{
"script": {
"description": "Decode HEX",
"lang": "painless",
"source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\''];StringBuilder sb = new StringBuilder();for (int i = 0; i < src.length(); i+=2) { String byteStr = src.substring(i, i + 2);char byteChar = (char) Integer.parseInt(byteStr, 16);sb.append(byteChar) } ctx[target] = sb.toString();"
}
}'
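The loop inside the stored script can be hard to read as a one-line string, so here is a rough Python equivalent of it. The function name `decode_hex_field` is hypothetical. Because each pair of hex digits becomes exactly one character, this approach only suits single-byte text such as ASCII. One difference to keep in mind: for an odd-length input, the Painless `substring` call would most likely fail, while Python slicing quietly returns a shorter piece.

```python
def decode_hex_field(ctx, params):
    """Mirror of the stored script: turn pairs of hex digits into characters."""
    src = ctx.get(params["field"])
    if src is None:
        # Same early return as the Painless script.
        return ctx
    chars = []
    # Walk the string two hex digits at a time, as the Painless loop does.
    for i in range(0, len(src), 2):
        byte_str = src[i:i + 2]
        chars.append(chr(int(byte_str, 16)))
    ctx[params["target_field"]] = "".join(chars)
    return ctx


doc = {"color_hex": "477265656e"}
decode_hex_field(doc, {"field": "color_hex", "target_field": "color"})
print(doc["color"])  # Green
```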
3.2.1. Retrieve script decode HEX
curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/decodehex"
4. Create pipeline
Both scripts are now stored in Elasticsearch, so you can refer to them by name.
In the pipeline below, the field ‘name_base64’ is decoded into ‘name’ using the stored script ‘decodebase64’, and the field ‘color_hex’ is decoded into ‘color’ using the stored script ‘decodehex’.
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_ingest/pipeline/decodehashes" \
-H 'content-type: application/json' -d'
{
"description": "Decode hash values",
"processors": [
{
"script": {
"id": "decodebase64",
"params": {
"field": "name_base64",
"target_field": "name"
}
}
},
{
"script": {
"id": "decodehex",
"params": {
"field": "color_hex",
"target_field": "color"
}
}
}
]
}'
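The pipeline definition above only carries a script id and its params; the code itself lives elsewhere. The following Python sketch imitates that lookup offline. `STORED_SCRIPTS`, `PIPELINE` and `run_pipeline` are hypothetical names, and the two lambdas are simplified stand-ins for the stored Painless scripts, not the scripts themselves.

```python
import base64

# Hypothetical stand-ins for the two stored scripts, keyed by script id.
STORED_SCRIPTS = {
    "decodebase64": lambda value: base64.b64decode(value).decode("utf-8"),
    "decodehex": lambda value: bytes.fromhex(value).decode("latin-1"),
}

# Same processor list as the pipeline definition, reduced to id + params.
PIPELINE = [
    {"id": "decodebase64", "params": {"field": "name_base64", "target_field": "name"}},
    {"id": "decodehex", "params": {"field": "color_hex", "target_field": "color"}},
]


def run_pipeline(doc, processors=PIPELINE):
    """Apply each processor in order, looking the script up by its id."""
    for processor in processors:
        params = processor["params"]
        value = doc.get(params["field"])
        if value is None:
            continue  # the stored scripts return early on a missing field
        doc[params["target_field"]] = STORED_SCRIPTS[processor["id"]](value)
    return doc


print(run_pipeline({"name_base64": "QW5hbmFz", "color_hex": "59656c6c6f77"}))
```

The point of the design is visible in `PIPELINE`: the same script id could be listed again with different params to decode another field, without copying any script code.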
5. Bulk load through pipeline
Now it is time to load some data. You can use the _bulk API and pass the pipeline as a URL parameter.
curl -k -u elastic:123456 -XPOST "https://localhost:9200/fruits/_bulk?pipeline=decodehashes" \
-H 'content-type: application/json' -d'
{"index":{"_id":"1"}}
{"name_base64":"QXBwbGU=","color_hex":"477265656e"}
{"index":{"_id":"2"}}
{"name_base64":"QW5hbmFz","color_hex":"59656c6c6f77"}
{"index":{"_id":"3"}}
{"name_base64":"Q2hlcnJ5","color_hex":"526564"}
'
As you can see, the data is encoded in two different formats: base64 for the name and hex for the color.
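If you want to prepare more test data in the same shape, a small Python sketch like the one below could generate it. The helper names `encode_fruit` and `build_bulk_body` are made up for this example, and it assumes the colors are plain ASCII. The bulk body is newline-delimited JSON: one action line followed by one source line per document, with a newline at the very end.

```python
import base64
import json


def encode_fruit(name, color):
    """Encode name as base64 and color as lowercase hex (ASCII assumed)."""
    return {
        "name_base64": base64.b64encode(name.encode("utf-8")).decode("ascii"),
        "color_hex": color.encode("ascii").hex(),
    }


def build_bulk_body(fruits):
    """Build an NDJSON body: one action line plus one source line per document."""
    lines = []
    for doc_id, (name, color) in enumerate(fruits, start=1):
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        lines.append(json.dumps(encode_fruit(name, color)))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


print(build_bulk_body([("Apple", "Green"), ("Ananas", "Yellow"), ("Cherry", "Red")]))
```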
6. Check loaded data
Search the ‘fruits’ index to see how the data was processed:
curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty"
Response:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "fruits",
"_id": "1",
"_score": 1.0,
"_source": {
"color_hex": "477265656e",
"name": "Apple",
"name_base64": "QXBwbGU=",
"color": "Green"
}
},
{
"_index": "fruits",
"_id": "2",
"_score": 1.0,
"_source": {
"color_hex": "59656c6c6f77",
"name": "Ananas",
"name_base64": "QW5hbmFz",
"color": "Yellow"
}
},
{
"_index": "fruits",
"_id": "3",
"_score": 1.0,
"_source": {
"color_hex": "526564",
"name": "Cherry",
"name_base64": "Q2hlcnJ5",
"color": "Red"
}
}
]
}
}
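Checking three hits by eye is easy, but for a larger index you may prefer an automated check. The sketch below is one possible way to do it in Python: it walks `hits.hits` of a search response that has already been parsed into a dictionary and reports the ids where the decoded fields do not match the encoded ones. The function name `find_mismatches` is hypothetical, and the sample is a trimmed copy of the first hit above.

```python
import base64


def find_mismatches(response):
    """Return the _id of every hit whose decoded fields disagree with the encoded ones."""
    bad_ids = []
    for hit in response["hits"]["hits"]:
        src = hit["_source"]
        expected_name = base64.b64decode(src["name_base64"]).decode("utf-8")
        expected_color = bytes.fromhex(src["color_hex"]).decode("latin-1")
        if src.get("name") != expected_name or src.get("color") != expected_color:
            bad_ids.append(hit["_id"])
    return bad_ids


# Trimmed copy of the first hit from the response above.
sample = {"hits": {"hits": [{"_id": "1", "_source": {
    "color_hex": "477265656e", "name": "Apple",
    "name_base64": "QXBwbGU=", "color": "Green"}}]}}
print(find_mismatches(sample))  # []
```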
7. Conclusion
In this tutorial you learned how to create stored scripts and how to retrieve them to check how they were stored. You then practiced calling a stored script from an ingest pipeline definition.
This approach reduces duplication and helps you avoid errors, because you define a script once and use it many times.
Happy coding!