
Elasticsearch – Use stored script for your ingestion pipeline


1. Introduction

In Elasticsearch you can use Painless scripting to achieve non-standard results. Scripts can be embedded directly in a pipeline, but to keep the two independent of each other you can store a script in Elasticsearch under a name of your choice and reference it later from an ingestion pipeline. This way the same script can be reused in different places without duplicating code.
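
To see the difference at a glance: a script processor can either embed the Painless source inline or reference a stored script by its id. The fragments below are only an illustrative sketch with placeholder names; the real scripts and pipeline used in this tutorial are created in the following sections.

{"script": {"lang": "painless", "source": "ctx.my_field = 'value computed inline'"}}

{"script": {"id": "my-stored-script", "params": {"field": "my_field"}}}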

2. Start Elasticsearch

For this exercise you can use the free version:

				
docker run --rm \
--name elk01 \
-d \
-e node.name="elk01" \
-p 9200:9200 \
docker.elastic.co/elasticsearch/elasticsearch:8.11.1
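
You can quickly check that the container is up before moving on:

docker ps --filter "name=elk01"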

Then set a password for the ‘elastic’ user:

				
docker exec -it elk01 bash -c "(mkfifo pipe1); ( (elasticsearch-reset-password -u elastic -i < pipe1) & ( echo $'y\n123456\n123456' > pipe1) );sleep 5;rm pipe1"
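
A quick request against the REST API confirms that the new password works (the -k flag skips TLS certificate verification, because the container uses a self-signed certificate):

curl -k -u elastic:123456 "https://localhost:9200"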

3. Store scripts

Below are two example scripts.

3.1. Store script decode base64

The purpose of this one is to decode a base64-encoded field from the source document into plain text. The script name is the last part of the URL; the field to read and the field to write are passed in as the ‘field’ and ‘target_field’ parameters.

				
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_scripts/decodebase64" \
-H 'content-type: application/json' -d'
{
    "script": {
        "description": "Decode base64",
        "lang": "painless",
        "source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\'']; ctx[target]=src.decodeBase64();"
    }
}'
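
As a quick sanity check you can reproduce the same base64 decoding on the command line; this is not part of the setup, just a way to see what the script will do with a value:

echo -n "Apple" | base64
# QXBwbGU=

echo -n "QXBwbGU=" | base64 -d
# Apple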

3.2. Retrieve script decode base64

To confirm how the script is stored you can retrieve its definition:

				
curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/decodebase64"
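
The response should look roughly like the following (trimmed here; the exact fields can differ slightly between versions):

{
    "_id": "decodebase64",
    "found": true,
    "script": {
        "lang": "painless",
        "source": "def src=ctx[params['field']];if(src == null){return;}..."
    }
}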

3.3. Store script decode HEX

The purpose of this script is to convert a HEX-encoded string into human-readable text.

				
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_scripts/decodehex" \
-H 'content-type: application/json' -d'{
    "script": {
        "description": "Decode HEX",
        "lang": "painless",
        "source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\''];StringBuilder sb = new StringBuilder();for (int i = 0; i < src.length(); i+=2) { String byteStr = src.substring(i, i + 2);char byteChar = (char) Integer.parseInt(byteStr, 16);sb.append(byteChar) } ctx[target] = sb.toString();"
    }
}'
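
As with base64, you can verify the HEX decoding outside Elasticsearch, for example with xxd (assuming it is available on your machine):

echo -n "477265656e" | xxd -r -p
# Green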

3.4. Retrieve script decode HEX

				
curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/decodehex"

4. Create pipeline

Both scripts are now stored in Elasticsearch, so you can reference them by name.

In the pipeline below the field ‘name_base64’ is decoded into ‘name’ using the stored script ‘decodebase64’, and the field ‘color_hex’ is decoded into ‘color’ using ‘decodehex’.

				
curl -k -u elastic:123456 -XPUT "https://localhost:9200/_ingest/pipeline/decodehashes" \
-H 'content-type: application/json' -d'
{
    "description": "Decode hash values",
    "processors": [
        {
            "script": {
                "id": "decodebase64",
                "params": {
                    "field": "name_base64",
                    "target_field": "name"
                }
            }
        },
        {
            "script": {
                "id": "decodehex",
                "params": {
                    "field": "color_hex",
                    "target_field": "color"
                }
            }
        }
    ]
}'
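
Before indexing real documents you can test the pipeline with the simulate API by passing a sample document through it:

curl -k -u elastic:123456 -XPOST "https://localhost:9200/_ingest/pipeline/decodehashes/_simulate?pretty" \
-H 'content-type: application/json' -d'
{
    "docs": [
        {
            "_source": {
                "name_base64": "QXBwbGU=",
                "color_hex": "477265656e"
            }
        }
    ]
}'

The response should show the decoded ‘name’ and ‘color’ fields next to the original encoded ones.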

5. Bulk load through pipeline

Now it is time to load some data. Here you can use the _bulk API and add the pipeline as a URL parameter.

				
curl -k -u elastic:123456 -XPOST "https://localhost:9200/fruits/_bulk?pipeline=decodehashes" \
-H 'content-type: application/json' -d'
{"index":{"_id":"1"}}
{"name_base64":"QXBwbGU=","color_hex":"477265656e"}
{"index":{"_id":"2"}}
{"name_base64":"QW5hbmFz","color_hex":"59656c6c6f77"}
{"index":{"_id":"3"}}
{"name_base64":"Q2hlcnJ5","color_hex":"526564"}
'

As you can see, the data is encoded in two different formats.

6. Check loaded data

Search the ‘fruits’ index to see how the data was processed:

				
curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty"

response:

				
{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "fruits",
                "_id": "1",
                "_score": 1.0,
                "_source": {
                    "color_hex": "477265656e",
                    "name": "Apple",
                    "name_base64": "QXBwbGU=",
                    "color": "Green"
                }
            },
            {
                "_index": "fruits",
                "_id": "2",
                "_score": 1.0,
                "_source": {
                    "color_hex": "59656c6c6f77",
                    "name": "Ananas",
                    "name_base64": "QW5hbmFz",
                    "color": "Yellow"
                }
            },
            {
                "_index": "fruits",
                "_id": "3",
                "_score": 1.0,
                "_source": {
                    "color_hex": "526564",
                    "name": "Cherry",
                    "name_base64": "Q2hlcnJ5",
                    "color": "Red"
                }
            }
        ]
    }
}
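
If you are only interested in the decoded values, you can limit the returned _source fields, for example:

curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty&_source=name,color"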

7. Conclusion

In this tutorial you have learned how to create stored scripts and how to retrieve them to check how they are stored. You then practiced calling a stored script from an ingestion pipeline definition.

This approach saves space and helps you avoid errors, because you define a script once and reuse it in multiple places.

Happy coding!
