Test painless scripts and pipelines – simulate evaluation

January 5, 2024
Tomasz Dzierżanowski

1. Introduction

Workflow matters in every job. In my opinion, how you solve a task is even more important than the final result. Instead of trying your code out in a live environment, you can perform a dry run and see the results through simulation. One obvious benefit is that you do not need to delete created objects every time, so the cluster stays clean. In this article I will show you how to execute Painless scripts and how to do a dry run of pipelines.

2. Start Elasticsearch

Using Docker, start a one-node cluster:

				
					docker run --rm \
--name elk01 \
-d \
-e node.name="elk01" \
-p 9200:9200 \
docker.elastic.co/elasticsearch/elasticsearch:8.11.3

and set the password for the ‘elastic’ user:

				
					docker exec -it elk01 bash -c "(mkfifo pipe1); ( (elasticsearch-reset-password -u elastic -i < pipe1) & ( echo $'y\n123456\n123456' > pipe1) );sleep 5;rm pipe1"

3. Execute painless script

If you have a script that decodes a base64 value, you can evaluate it without storing the script in Elasticsearch and without creating a pipeline simulation.

3.1. Test context

In the example below, the decodeBase64 method is called directly on a String literal:

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_scripts/painless/_execute" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "\"QXBwbGU=\".decodeBase64();"
    }
}'

+ assigning the String to a variable

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_scripts/painless/_execute" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "def x=\"QXBwbGU=\";x.decodeBase64();"
    }
}'

+ adding params

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_scripts/painless/_execute" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "params.name_base64.decodeBase64();",
        "params": {
            "name_base64": "QXBwbGU="
        }
    }
}'

+ adding NPE protection

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_scripts/painless/_execute" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "def src = params.name_base64;if(src == null){return '\'''\'';} def target = params.target_field; target = src.decodeBase64();",
        "params": {
            "name_base64": "QXBwbGU=",
            "target_field": "name_decoded"
        }
    }
}'
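Before sending a script to the cluster, it helps to know the value it should produce. The following is a minimal Python sketch, independent of Elasticsearch, that mirrors the guarded decode above. The helper name `decode_base64` is hypothetical, and UTF-8 text is assumed.

```python
import base64


def decode_base64(value):
    """Local mirror of the guarded script: empty string for a missing value."""
    if value is None:
        return ""
    # b64decode returns bytes; the sample values are plain UTF-8 text
    return base64.b64decode(value).decode("utf-8")


print(decode_base64("QXBwbGU="))  # Apple
print(repr(decode_base64(None)))  # ''
```

If the value returned by the cluster differs from the local one, the problem is more likely in the script or its params than in the data.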

3.2. Filter context

When you use a script to filter data in a search, you can test it in the filter context first.

3.2.1. Testing prototype

Define a test mapping for running the filter context:

				
					curl -k -u elastic:123456 -XPUT "https://localhost:9200/testindex1" \
-H 'content-type: application/json' -d'
{
  "mappings": {
    "properties": {
      "name_base64": {
        "type": "keyword"
      }
    }
  }
}'

Define a filtering script that answers a true/false question:

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_scripts/painless/_execute" \
-H 'content-type: application/json' -d'
{
  "script": {
    "source": "def src = doc[params.field];if(src.size() == 0){return false;} params.filteronvalue == src.value.decodeBase64();",
    "params": {
      "field": "name_base64",
      "filteronvalue" : "Apple"
    }
  },
  "context": "filter",
  "context_setup": {
    "index": "testindex1",
    "document": {
      "name_base64": "QXBwbGU="
    }
  }
}'

Because the document in context_setup holds exactly the value ‘Apple’ encoded in base64, the script returns true:

				
					{
    "result":true
}
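The calls to `_scripts/painless/_execute` in this article share one request shape, so they are easy to script. Below is a rough Python equivalent of the curl command, using only the standard library. The helper names `build_execute_request` and `send` are made up for this sketch, the Painless source is shortened, and certificate verification is switched off to match `curl -k`, which is acceptable only for a local test cluster.

```python
import base64
import json
import ssl
import urllib.request


def build_execute_request(script, context=None, context_setup=None,
                          base_url="https://localhost:9200",
                          user="elastic", password="123456"):
    """Build (but do not send) a POST to _scripts/painless/_execute."""
    body = {"script": script}
    if context is not None:
        body["context"] = context
    if context_setup is not None:
        body["context_setup"] = context_setup
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/_scripts/painless/_execute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )


def send(request):
    """Send the request; certificate checks are off, like curl -k."""
    tls = ssl.create_default_context()
    tls.check_hostname = False
    tls.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=tls) as response:
        return json.loads(response.read())


request = build_execute_request(
    script={
        # Shortened source: no guard for documents without a value
        "source": "params.filteronvalue == doc[params.field].value.decodeBase64()",
        "params": {"field": "name_base64", "filteronvalue": "Apple"},
    },
    context="filter",
    context_setup={"index": "testindex1",
                   "document": {"name_base64": "QXBwbGU="}},
)
print(request.get_method(), request.full_url)
```

Calling `send(request)` against the cluster started earlier should give back the same JSON body that curl prints.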

3.2.2. Loading test data

Load the test data with the command below:

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/fruits/_bulk" \
-H 'content-type: application/json' -d'
{"index":{"_id":"1"}}
{"name_base64":"QXBwbGU=","color_hex":"477265656e"}
{"index":{"_id":"2"}}
{"name_base64":"QW5hbmFz","color_hex":"59656c6c6f77"}
{"index":{"_id":"3"}}
{"name_base64":"Q2hlcnJ5","color_hex":"526564"}
'
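The encoded values in the bulk request are easier to trust once you see where they come from. This small Python sketch regenerates the same lines from plain text; the plain names and colors are simply the decoded forms of the values above, and `bulk_lines` is a hypothetical helper name.

```python
import base64
import json

# Plain values behind the encoded test data (name, color)
FRUITS = [("Apple", "Green"), ("Ananas", "Yellow"), ("Cherry", "Red")]


def bulk_lines(fruits):
    """Return the NDJSON lines for the _bulk request body."""
    lines = []
    for doc_id, (name, color) in enumerate(fruits, start=1):
        action = {"index": {"_id": str(doc_id)}}
        source = {
            "name_base64": base64.b64encode(name.encode("utf-8")).decode("ascii"),
            "color_hex": color.encode("utf-8").hex(),
        }
        # Compact separators keep each JSON object on one line without spaces
        lines.append(json.dumps(action, separators=(",", ":")))
        lines.append(json.dumps(source, separators=(",", ":")))
    return lines


print("\n".join(bulk_lines(FRUITS)))
```

When sending such a body yourself, remember that the bulk API expects a newline after the last line as well.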

and check that the data was loaded properly:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search"

3.2.3. Using tested script in query for filtering

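Now that the script is tested and the sample data is loaded, it is time to query the ‘fruits’ index for the document that holds the value ‘Apple’ encoded in base64. The index was created by dynamic mapping, so the script reads the ‘name_base64.keyword’ subfield.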

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search" \
-H 'content-type: application/json' -d'
{
    "query": {
        "bool": {
            "filter": {
                "script": {
                    "script": {
                        "source": "def src = doc[params.field];if(src.size() == 0){return false;} params.filteronvalue == src.value.decodeBase64();",
                        "params": {
                            "field": "name_base64.keyword",
                            "filteronvalue": "Apple"
                        }
                    }
                }
            }
        }
    }
}'

Example response:

				
					{
    "took": 7,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 0.0,
        "hits": [
            {
                "_index": "fruits",
                "_id": "1",
                "_score": 0.0,
                "_source": {
                    "name_base64": "QXBwbGU=",
                    "color_hex": "477265656e"
                }
            }
        ]
    }
}

Only the one document that matches the filtering criteria is returned.
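To double-check which documents the filter should let through, the same true/false question can be asked locally. This is only a sketch in Python over a copy of the sample data; `matches` is a hypothetical name, and the real evaluation of course happens inside Elasticsearch.

```python
import base64

# Copy of the relevant field from the three sample documents
DOCS = {
    "1": {"name_base64": "QXBwbGU="},
    "2": {"name_base64": "QW5hbmFz"},
    "3": {"name_base64": "Q2hlcnJ5"},
}


def matches(doc, field, wanted):
    """Same question as the script filter: does the decoded value equal wanted?"""
    encoded = doc.get(field)
    if encoded is None:
        return False
    return base64.b64decode(encoded).decode("utf-8") == wanted


matching_ids = [doc_id for doc_id, doc in DOCS.items()
                if matches(doc, "name_base64", "Apple")]
print(matching_ids)  # ['1']
```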

3.3. Field contexts

3.3.1. Testing

Before defining a runtime mapping, you can test its script in a field context such as keyword_field:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/painless/_execute?pretty" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "def src = doc[params.field];if(src.size()==0){return ;} emit(src.value.decodeBase64())",
        "params": {
            "field": "name_base64"
        }
    },
    "context": "keyword_field",
    "context_setup": {
        "index": "testindex1",
        "document": {
            "name_base64": "QXBwbGU="
        }
    }
}'

will return:

				
					{
    "result": [
        "Apple"
    ]
}

3.3.2. Using in query

Once you are sure the script is correct, you can run it inside a query to define a runtime field:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "runtime_mappings": {
        "Fruit name": {
            "type": "keyword",
            "script": {
                "source": "def src = doc[params.field];if(src.size()==0){return ;} emit(src.value.decodeBase64())",
                "params": {
                    "field": "name_base64.keyword"
                }
            }
        }
    },
    "fields": [
        "Fruit name"
    ],
    "_source": false
}'

This returns all fruits with the new field ‘Fruit name’:

				
					{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "fruits",
                "_id": "1",
                "_score": 1.0,
                "fields": {
                    "Fruit name": [
                        "Apple"
                    ]
                }
            },
            {
                "_index": "fruits",
                "_id": "2",
                "_score": 1.0,
                "fields": {
                    "Fruit name": [
                        "Ananas"
                    ]
                }
            },
            {
                "_index": "fruits",
                "_id": "3",
                "_score": 1.0,
                "fields": {
                    "Fruit name": [
                        "Cherry"
                    ]
                }
            }
        ]
    }
}
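Runtime field values arrive as arrays under each hit's `fields` key, which is easy to overlook when processing the response. A short Python sketch, working on a trimmed copy of the response above, shows one way to collect them; `runtime_values` is a hypothetical helper name.

```python
import json

# Trimmed copy of the search response shown above
RESPONSE = json.loads("""
{
  "hits": {
    "hits": [
      {"_index": "fruits", "_id": "1", "fields": {"Fruit name": ["Apple"]}},
      {"_index": "fruits", "_id": "2", "fields": {"Fruit name": ["Ananas"]}},
      {"_index": "fruits", "_id": "3", "fields": {"Fruit name": ["Cherry"]}}
    ]
  }
}
""")


def runtime_values(response, field):
    """Map each hit's _id to the list of values of one field."""
    # .get() with defaults tolerates hits where the field is absent
    return {hit["_id"]: hit.get("fields", {}).get(field, [])
            for hit in response["hits"]["hits"]}


print(runtime_values(RESPONSE, "Fruit name"))
```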

3.4. Score context

3.4.1. Testing score script

Suppose you have created a script that manipulates the score value. You can test it using the score context, as shown below:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/painless/_execute?pretty" \
-H 'content-type: application/json' -d'
{
  "script": {
    "source": "def src = doc[params.field];if(src.size() == 0){return 0.0;} if(params.filteronvalue == src.value.decodeBase64()){return 1.0}",
    "params": {
      "field": "name_base64",
      "filteronvalue" : "Apple"
    }
  },
  "context": "score",
  "context_setup": {
    "index": "testindex1",
    "document": {
      "name_base64": "QXBwbGU="
    }
  }
}'

which basically scores a decoded ‘Apple’ as 1.0 and missing values as zero.

3.4.2. Using new scoring

The tested scoring script can now be used in queries:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "query": {
        "script_score": {
            "query": {
                 "match_all": {}
            },
            "script": {
                "source": "def src = doc[params.field];if(src.size() == 0){return 0.0;} if(params.filteronvalue == src.value.decodeBase64()){return 1.0}",
                "params": {
                    "field": "name_base64.keyword",
                    "filteronvalue": "Apple"
                }
            }
        }
    }
}'

In return, it gives a score of 1.0 to the document with ‘Apple’ (‘QXBwbGU=’ in base64):

				
					{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "fruits",
                "_id": "1",
                "_score": 1.0,
                "_source": {
                    "name_base64": "QXBwbGU=",
                    "color_hex": "477265656e"
                }
            },
            {
                "_index": "fruits",
                "_id": "2",
                "_score": 0.0,
                "_source": {
                    "name_base64": "QW5hbmFz",
                    "color_hex": "59656c6c6f77"
                }
            },
            {
                "_index": "fruits",
                "_id": "3",
                "_score": 0.0,
                "_source": {
                    "name_base64": "Q2hlcnJ5",
                    "color_hex": "526564"
                }
            }
        ]
    }
}

4. Use Field API

The scoring script above, and the others in this article, can be rewritten to use the Field API. Compare the example below with the previous one; it looks simpler, doesn’t it?

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "query": {
        "script_score": {
            "query": {
                 "match_all": {}
            },
            "script": {
                "source": "def src=$(params.field,\"\");if(params.filteronvalue == src.decodeBase64()){return 1.0} else {return 0.0}",
                "params": {
                    "field": "name_base64.keyword",
                    "filteronvalue": "Apple"
                }
            }
        }
    }
}'

There is no need for null checking or for calling the doc.containsKey(field) method.

Other examples:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "runtime_mappings": {
        "Fruit name": {
            "type": "keyword",
            "script": {
                "source": "def x=$(params.field,\"\"); emit(x.decodeBase64());",
                "params": {
                    "field": "name_base64.keyword"
                }
            }
        }
    },
    "fields": [
        "Fruit name"
    ],
    "_source": false
}'

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "query": {
        "bool": {
            "filter": {
                "script": {
                    "script": {
                        "source": "def src = $(params.field,\"\"); params.filteronvalue == src.decodeBase64();",
                        "params": {
                            "field": "name_base64.keyword",
                            "filteronvalue": "Apple"
                        }
                    }
                }
            }
        }
    }
}'
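The default passed as the second argument matters because the script goes on to call a method on whatever comes back. The Python analogy below is only an illustration of that idea, not of how the Field API is implemented: `field_value` is a hypothetical stand-in for `$(name, default)`, built on a plain dictionary lookup.

```python
import base64


def field_value(doc, name, default):
    """Rough analogue of $(name, default): the value, or the default if absent."""
    return doc.get(name, default)


def decode(value):
    """Decode base64 text; an empty string decodes to an empty string."""
    return base64.b64decode(value).decode("utf-8")


present = {"name_base64": "QXBwbGU="}
missing = {}

print(decode(field_value(present, "name_base64", "")))        # Apple
print(repr(decode(field_value(missing, "name_base64", ""))))  # ''
```

With an empty-string default the chained decode stays safe for documents without a value, while a null-like default would make the following method call fail.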

5. Painless Debugging

You can wrap any object in Painless with Debug.explain, which throws an exception whose details reveal the data type of the object:

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/_scripts/painless/_execute?pretty" \
-H 'content-type: application/json' -d'
{
    "script": {
        "source": "Debug.explain(params.base64value.decodeBase64())",
        "params": {
            "base64value": "QXBwbGU="
        }
    }
}'

will return

				
					{
  "error" : {
    "root_cause" : [
      {
        "type" : "script_exception",
        "reason" : "runtime error",
        "painless_class" : "java.lang.String",
        "to_string" : "Apple",
        "java_class" : "java.lang.String",
        "script_stack" : [
          "Debug.explain(params.base64value.decodeBase64())",
          "                                ^---- HERE"
        ],
        "script" : "Debug.explain(params.base64value.decodeBase64())",
        "lang" : "painless",
        "position" : {
          "offset" : 32,
          "start" : 0,
          "end" : 48
        }
      }
    ],
    "type" : "script_exception",
    "reason" : "runtime error",
    "painless_class" : "java.lang.String",
    "to_string" : "Apple",
    "java_class" : "java.lang.String",
    "script_stack" : [
      "Debug.explain(params.base64value.decodeBase64())",
      "                                ^---- HERE"
    ],
    "script" : "Debug.explain(params.base64value.decodeBase64())",
    "lang" : "painless",
    "position" : {
      "offset" : 32,
      "start" : 0,
      "end" : 48
    },
    "caused_by" : {
      "type" : "painless_explain_error",
      "reason" : null
    }
  },
  "status" : 400
}

and this call

				
					curl -k -u elastic:123456 -XGET "https://localhost:9200/fruits/_search?pretty" \
-H 'content-type: application/json' -d'
{
    "query": {
        "bool": {
            "filter": {
                "script": {
                    "script": {
                        "source": "Debug.explain($(params.field,\"\"))",
                        "params": {
                            "field": "name_base64.keyword",
                            "filteronvalue": "Apple"
                        }
                    }
                }
            }
        }
    }
}'

will return

				
					{
  "error" : {
    "root_cause" : [
      {
        "type" : "script_exception",
        "reason" : "runtime error",
        "painless_class" : "java.lang.String",
        "to_string" : "QXBwbGU=",
        "java_class" : "java.lang.String",
        "script_stack" : [
          "Debug.explain($(params.field,\"\"))",
          "              ^---- HERE"
        ],
        "script" : "Debug.explain($(params.field,\"\"))",
        "lang" : "painless",
        "position" : {
          "offset" : 14,
          "start" : 0,
          "end" : 33
        }
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "fruits",
        "node" : "tYPOnT9aQGupk-TOpTfY3Q",
        "reason" : {
          "type" : "script_exception",
          "reason" : "runtime error",
          "painless_class" : "java.lang.String",
          "to_string" : "QXBwbGU=",
          "java_class" : "java.lang.String",
          "script_stack" : [
            "Debug.explain($(params.field,\"\"))",
            "              ^---- HERE"
          ],
          "script" : "Debug.explain($(params.field,\"\"))",
          "lang" : "painless",
          "position" : {
            "offset" : 14,
            "start" : 0,
            "end" : 33
          },
          "caused_by" : {
            "type" : "painless_explain_error",
            "reason" : null
          }
        }
      }
    ]
  },
  "status" : 400
}
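The interesting part of a Debug.explain response is small compared to the whole error body. As a sketch, the Python snippet below pulls the type details out of the first root cause, which both responses above contain; it works on a trimmed copy of the first response, and `explain_summary` is a hypothetical helper name.

```python
import json

# Trimmed copy of the error body from the first Debug.explain call above
ERROR_BODY = json.loads("""
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "painless_class": "java.lang.String",
        "to_string": "Apple",
        "java_class": "java.lang.String"
      }
    ],
    "type": "script_exception"
  },
  "status": 400
}
""")


def explain_summary(body):
    """Pick the type details out of the first root cause."""
    cause = body["error"]["root_cause"][0]
    keys = ("painless_class", "java_class", "to_string")
    return {key: cause.get(key) for key in keys}


print(explain_summary(ERROR_BODY))
```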

6. Dry run of pipeline

You can check your pipeline before using it with the ‘_simulate’ endpoint. The documents listed in the docs section are the input values the pipeline works on:

				
					curl -k -u elastic:123456 -XPOST "https://localhost:9200/_ingest/pipeline/_simulate?pretty" \
-H 'content-type: application/json' -d'
{
    "pipeline": {
        "processors": [
            {
                "script": {
                    "description": "Decode base64",
                    "lang": "painless",
                    "source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\'']; ctx[target]=src.decodeBase64();",
                    "params": {
                        "field": "name_base64",
                        "target_field": "name"
                    }
                }
            },
            {
                "script": {
                    "description": "Decode hex",
                    "lang": "painless",
                    "source": "def src=ctx[params['\''field'\'']];if(src == null){return;}def target=params['\''target_field'\''];StringBuilder sb = new StringBuilder();for (int i = 0; i < src.length(); i+=2) { String byteStr = src.substring(i, i + 2);char byteChar = (char) Integer.parseInt(byteStr, 16);sb.append(byteChar) } ctx[target] = sb.toString();",
                    "params": {
                        "field": "color_hex",
                        "target_field": "color"
                    }
                }
            }
        ]
    },
    "docs": [
        {
            "_source": {
                "name_base64": "QXBwbGU=",
                "color_hex": "477265656e"
            }
        }
    ]
}'

As output, you will get the decoded strings:

				
					{
    "docs": [
        {
            "doc": {
                "_index": "_index",
                "_version": "-3",
                "_id": "_id",
                "_source": {
                    "color_hex": "477265656e",
                    "name": "Apple",
                    "name_base64": "QXBwbGU=",
                    "color": "Green"
                },
                "_ingest": {
                    "timestamp": "2023-12-31T13:33:33.698900588Z"
                }
            }
        }
    ]
}
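The hex-decoding loop in the second processor is dense when written on one line. Here is the same pairwise idea as a Python sketch, which can also serve as a local check of the expected output; `decode_hex` is a hypothetical name.

```python
def decode_hex(hex_text):
    """Decode hex two characters at a time, one character per byte."""
    chars = []
    for i in range(0, len(hex_text), 2):
        byte_str = hex_text[i:i + 2]          # e.g. "47"
        chars.append(chr(int(byte_str, 16)))  # 0x47 -> "G"
    return "".join(chars)


print(decode_hex("477265656e"))  # Green
# For ASCII input the standard library gives the same result
print(bytes.fromhex("477265656e").decode("ascii"))  # Green
```

Because every byte becomes exactly one character, this approach suits ASCII values like the colors here; multi-byte UTF-8 text would need a proper byte-to-string decode instead.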

The tested pipeline can then be used as described in the article about stored scripts.

7. Summary

In this article you have discovered how to test a Painless script before using it as part of a pipeline or as a runtime script. This lets you improve your workflow, avoid runtime exceptions, and speed up your work.

Happy coding!