Updated Means Deleted in ELK

1. Introduction

Have you ever wondered what happens when you update a document in Elasticsearch? In this article I am going to show you, with a working example, what happens underneath.

2. Start ELK cluster

I have prepared a ready-to-use project for you. Simply unpack it and run the script start_docker_compose_cluster.sh. If you need more details, you can check the full article about this project and how to start secure Elasticsearch and Kibana together.

3. Enable Sample Dataset

Insert the sample dataset called kibana_sample_data_logs in Kibana. You can find it on the welcome screen under the section “Get started by adding integrations”: click the “Try sample data” button, then on the new page unfold “Other sample data sets” and add “Sample web logs”.


Figure: adding the “Sample web logs” dataset in Kibana

After that, you can see the new data stream added.
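You can confirm the new data stream in Dev Tools with the get data stream API, which also reveals the name of the backing index (the date in the name reflects when the dataset was added, so yours will differ):

GET _data_stream/kibana_sample_data_logs

The backing index in my case is .ds-kibana_sample_data_logs-2025.09.04-000001; it appears throughout the rest of this article.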

4. Perform update

4.1. Number of deletions before

You can check in Dev Tools how many documents were deleted so far.

GET _cat/indices/*kibana_sample_data_logs*?v&h=index,docs.count,docs.deleted,store.size

The above should return:

index                                         docs.count docs.deleted store.size
.ds-kibana_sample_data_logs-2025.09.04-000001      14074            0      8.4mb

Nothing was deleted yet, so it is time to update the index.

4.2. Perform update

Run the request below in the console:

POST kibana_sample_data_logs/_update_by_query
{
    "query": {
        "range": {
            "timestamp": {
                "gte": "now",
                "lte": "now+30m"
            }
        }
    },
    "script": {
        "source": "ctx._source['bytes'] *= 999; "
    }
}

The response should be:

{
  "took": 6,
  "timed_out": false,
  "total": 8,
  "updated": 8,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}

You can see that 8 documents were updated. But now, when you check the indices list with the _cat endpoint, you will notice that 8 documents were also deleted at the same time.

GET _cat/indices/*kibana_sample_data_logs*?v&h=index,docs.count,docs.deleted,store.size

Results:

index                                         docs.count docs.deleted store.size
.ds-kibana_sample_data_logs-2025.09.04-000001      14074            8      8.4mb

If you run the same _update_by_query multiple times, docs.deleted will grow even further.
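A side note: if you want to know up front how many documents such a query will touch, you can first run the same query through the _count API; the returned count should match the total field of the _update_by_query response.

GET kibana_sample_data_logs/_count
{
    "query": {
        "range": {
            "timestamp": {
                "gte": "now",
                "lte": "now+30m"
            }
        }
    }
}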

5. Why Update Deletes?

You may be wondering now why updating documents creates deleted documents. This is because documents in Elasticsearch are immutable, so you cannot actually update them in place. What happens in the background is that Elasticsearch creates a copy of the documents selected by the query, applies the given changes, and saves the copies under the same index.

Physical removal is a costly operation and would slow down the ingestion process. Therefore _update_by_query actually marks the old documents as deleted and inserts the new versions. You can see the effect in the docs.deleted column of the _cat API.
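The same mechanism applies to a plain single-document update, not only to _update_by_query. Here is a minimal sketch on a separate throwaway index (my-test-index is just an example name, not part of the sample dataset):

PUT my-test-index/_doc/1
{
  "bytes": 100
}

POST my-test-index/_update/1
{
  "doc": {
    "bytes": 200
  }
}

GET _cat/indices/my-test-index?v&h=index,docs.count,docs.deleted

Once the index refreshes, you should see docs.count equal to 1 and docs.deleted equal to 1: the old version of the document is still physically there, just marked as deleted until a merge removes it.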

5.1. Look at how a document changes

You can verify this behavior by selecting a document via its _id field. First, get some sample docs:

GET .ds-kibana_sample_data_logs-2025.09.04-000001/_search

You can see here what fields are inside the documents. Note down the _id of one document.

    "hits": [
      {
        "_index": ".ds-kibana_sample_data_logs-2025.09.04-000001",
        "_id": "kel2FJkB_UCnX8C1b1wU",
        "_score": 1,

You can search for the matching id:

GET .ds-kibana_sample_data_logs-2025.09.04-000001/_search
{
  "query": {
    "term": {
      "_id": {
        "value": "kel2FJkB_UCnX8C1b1wU"
      }
    }
  }
}

After a successful run, modify the update by query API call to use the same term query:

POST kibana_sample_data_logs/_update_by_query
{
  "query": {
    "term": {
      "_id": {
        "value": "kel2FJkB_UCnX8C1b1wU"
      }
    }
  },
  "script": {
    "source": "ctx._source['bytes'] *= 999; "
  }
}

The response:

{
  "took": 8,
  "timed_out": false,
  "total": 1,
  "updated": 1,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
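A remark about the version_conflicts field in the responses above: if another process modifies a matching document while _update_by_query is running, that document is skipped. By default the whole request aborts on the first such conflict; the conflicts=proceed parameter tells Elasticsearch to count conflicts and continue instead:

POST kibana_sample_data_logs/_update_by_query?conflicts=proceed
{
  "query": {
    "term": {
      "_id": {
        "value": "kel2FJkB_UCnX8C1b1wU"
      }
    }
  },
  "script": {
    "source": "ctx._source['bytes'] *= 999; "
  }
}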

Then you can check the same document by _id and verify that its version is higher:

GET .ds-kibana_sample_data_logs-2025.09.04-000001/_search?version=true
{
  "query": {
    "term": {
      "_id": {
        "value": "kel2FJkB_UCnX8C1b1wU"
      }
    }
  }
}

Response:

    "hits": [
      {
        "_index": ".ds-kibana_sample_data_logs-2025.09.04-000001",
        "_id": "kel2FJkB_UCnX8C1b1wU",
        "_version": 2,
        "_score": 1,
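As a shortcut, you can also fetch the document directly by id; the get document API returns the current _version without running a search:

GET .ds-kibana_sample_data_logs-2025.09.04-000001/_doc/kel2FJkB_UCnX8C1b1wU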

5.2. Segment merges

From time to time Elasticsearch merges segments. Segments are the building blocks of an Apache Lucene index, which is the equivalent of an Elasticsearch shard. As a consequence, the count of deleted documents decreases every time a merge operation happens. You can observe that via the _cat API:

GET _cat/indices/*kibana_sample_data_logs*?v&h=index,docs.count,docs.deleted,store.size
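To watch this at a lower level, you can also list the Lucene segments of the backing index together with their per-segment deleted document counts:

GET _cat/segments/.ds-kibana_sample_data_logs-2025.09.04-000001?v&h=index,segment,docs.count,docs.deleted,size

Segments that disappear from this list between checks have been merged away, together with the documents they held that were marked as deleted.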
You can still calculate how many documents were updated: it is the sum of the _version field values minus one for each document.

GET kibana_sample_data_logs/_search
{
  "runtime_mappings": {
    "my_version": {
      "type": "long",
      "script": {
        "source": "emit(doc['_version'].value-1)"
      }
    }
  },
  "aggs": {
    "version_sum": {
      "sum": {
        "field": "my_version"
      }
    }
  },
  "size": 0
}

The response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "version_sum": {
      "value": 49
    }
  }
}

If you do not subtract 1 from the version value, then your result will be equal to the total number of documents in the index plus all documents that were ever marked as deleted, including those already cleaned up by merges.

"source": "emit(doc['_version'].value)"

The response:

					{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "version_sum": {
      "value": 14123
    }
  }
}

Looking at the current docs.deleted:

index                                         docs.count docs.deleted store.size
.ds-kibana_sample_data_logs-2025.09.04-000001      14074           15      8.4mb

You can calculate how many of the updated documents were already part of the cleanup (merge):

Already_merged = 14123 - 14074 - 15 = 34

5.3. Force merge

There is a way to force Elasticsearch to merge segments:

POST /kibana_sample_data_logs/_forcemerge?only_expunge_deletes=true

Response:

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  }
}
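Note that only_expunge_deletes rewrites only the segments that contain deleted documents. The same API can also merge the whole index down to a fixed number of segments; this is heavier, and the Elasticsearch documentation advises running it only against indices that are no longer being written to:

POST /kibana_sample_data_logs/_forcemerge?max_num_segments=1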

If you check the indices again, you will see zero documents marked for deletion. All the gaps have been filled:

index                                         docs.count docs.deleted store.size
.ds-kibana_sample_data_logs-2025.09.04-000001      14074            0      8.4mb

And all the original documents that were updated have now been physically removed:

Already_merged = 14123 - 14074 - 0 = 49

6. Final thoughts

In this knowledge article you have learned how Elasticsearch handles updated documents and why it marks them as deleted. You have also practiced running queries on a working cluster.

Happy coding!
