
Export Data from Elasticsearch – Logstash

1. Introduction

Have you ever wondered how to quickly dump data from Elasticsearch?
If you want to move it to another Elasticsearch cluster, a good method is to take a backup and restore it using the _snapshot API. But what if you want to work with the data in other software? Then you can use the search API (_search) together with the curl command. Because every query response returns only a limited number of documents, you have to wrap the call in a script that loops until all documents are fetched.
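As a rough illustration of that loop, here is a minimal Python sketch built on the scroll API instead of curl. The names `scroll_all`, `http_post_json` and `PostJson` are hypothetical helpers made up for this example, and the transport assumes a cluster reachable over plain HTTP without authentication; the cluster used later in this article has TLS and a password, so it would need extra setup.

```python
import json
import urllib.request
from typing import Callable, Iterator

# Hypothetical transport type: takes a URL path and a JSON body,
# returns the parsed JSON response.
PostJson = Callable[[str, dict], dict]


def scroll_all(post_json: PostJson, index: str, page_size: int = 1000,
               keep_alive: str = "5m") -> Iterator[dict]:
    """Yield every hit of `index`, page by page, using the scroll API."""
    # The first request opens the scroll context and returns the first page.
    page = post_json(
        f"/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    # An empty page means all documents have been fetched.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each follow-up request sends the latest scroll id back.
        page = post_json(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


def http_post_json(base_url: str) -> PostJson:
    """Build a transport for base_url (assumption: no auth, no custom TLS)."""
    def post(path: str, body: dict) -> dict:
        request = urllib.request.Request(
            base_url + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return post
```

With a cluster at an assumed address such as http://localhost:9200, iterating over `scroll_all(http_post_json("http://localhost:9200"), "connections")` would walk through the whole index. Keeping the transport as a parameter makes the loop easy to try against canned responses first.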

Finally, if you have a Spark cluster, you can use the Elasticsearch for Apache Hadoop and Spark support to read and write data there. All of the methods mentioned will work, but they require coding. Fortunately there is Logstash, a member of the ELK Stack and a tool for ingesting logs into Elasticsearch, which can help you dump documents out of ELK without any coding. You only need a configuration file and that’s it.

2. Start Elasticsearch

In another article, Load 1M sample records to Elasticsearch with python and curl, I showed you how to load 1M documents into Elasticsearch. Please follow the steps there to load the data, then resume here with Elasticsearch up and running and populated with data.

3. Start Logstash

Time to export the loaded records into a file using Logstash. The first step is to create a pipeline configuration so Logstash knows how to do it.

				
cat <<EOF >elkdump.conf
input {
  elasticsearch {
    hosts => ["elk"]
    user => "elastic"
    password => "123456"
    ssl_certificate_authorities => "/elkconfig/certs/http_ca.crt"
    ssl_enabled => true
    ssl_verification_mode => "none"
    index => "connections"
    query => '{"query": {"match_all": {}}}'
    scroll => "5m"
    size => 1000
    docinfo => true
  }
}

filter {
}

output {
  file {
    path => "/elkconfig/output-file.json"
    codec => json_lines
  }
}
EOF
				
			

Notice the /elkconfig location. This is a volume from the Elasticsearch container, so you can access the certificates generated there. You can use the same volume to save the output.

				
docker run --rm -it \
--name logstash \
--net logstash \
-e "XPACK_MONITORING_ENABLED=false" \
-v elkconfig:/elkconfig \
-v ./elkdump.conf:/usr/share/logstash/pipeline/elkdump.conf \
docker.elastic.co/logstash/logstash:8.10.4
				
			

Logstash prints its execution log to stdout. Wait for the line containing ‘Closing file /elkconfig/output-file.json’.

				
{
    "start_connection" => "2029-07-07 01:15:00",
          "@timestamp" => 2023-10-19T11:50:52.630495179Z,
     "connection_name" => "RandomText999750",
            "@version" => "1"
}
[2023-10-19T11:51:13,946][INFO ][logstash.outputs.file    ][main][b7692cf97cc2162ec8f1dad2051a94b8fe4d568c3e488cdcbcac2d2454ca2878] Closing file /elkconfig/output-file.json
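
If you would rather not watch the log by eye, the check can be automated. The following is only a sketch: `find_closing_line` is a hypothetical helper name, and it assumes the Logstash log reaches the script as an iterable of text lines (for example, piped into standard input).

```python
from typing import Iterable, Optional


def find_closing_line(log_lines: Iterable[str], output_path: str) -> Optional[str]:
    """Return the first log line reporting that output_path was closed."""
    # Marker text taken from the log line shown in this article.
    marker = f"Closing file {output_path}"
    for line in log_lines:
        if marker in line:
            return line.rstrip("\n")
    # The stream ended without the marker appearing.
    return None
```

Passing `sys.stdin` as `log_lines` lets the function consume piped output line by line and return as soon as the marker shows up.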

				
			

You can copy this file from the elk container with:

				
docker cp elk:/usr/share/elasticsearch/config/output-file.json output-file.json
# Successfully copied 141MB to /output-file.json
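
Because the pipeline writes with the json_lines codec, the exported file holds one JSON document per line, which makes it easy to read back. Here is a small Python sketch for a sanity check of the dump; `read_json_lines` and `summarize` are hypothetical helper names, and the file path in the usage note is simply the one used in the copy command above.

```python
import json
from typing import Iterator, List, Tuple


def read_json_lines(path: str) -> Iterator[dict]:
    """Yield one parsed event per non-empty line of a json_lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path: str) -> Tuple[int, List[str]]:
    """Return the number of events and the sorted set of field names seen."""
    count = 0
    fields = set()
    for event in read_json_lines(path):
        count += 1
        # Updating a set with a dict adds its top-level keys.
        fields.update(event)
    return count, sorted(fields)
```

Calling `summarize("output-file.json")` should then report how many documents were exported, which you can compare with the number of documents you loaded.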
				
			

4. Final thoughts

In this knowledge article you have learned how to export data from Elasticsearch using Logstash. In the next article I will explain how to do it with other methods.
