1. Introduction
Have you ever wondered how to quickly dump data out of Elasticsearch?
If you want to move it to another Elasticsearch cluster, a good method is to take a backup and restore it using the _snapshot API. But what if you want to play with the data in other software? Then you can use the query API together with the curl command. Because every query response is limited to a fixed number of documents, you have to wrap it in a script that loops until all documents are fetched.
Finally, if you have a Spark cluster, you can use Elasticsearch for Apache Hadoop with its Spark support to read and write data there. All of the mentioned methods work, but they require coding. Fortunately there is Logstash – a member of the ELK Stack and a tool for ingesting logs into Elasticsearch – that can help you dump documents out of Elasticsearch without any coding. You write a configuration file and that’s it.
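The curl-based loop mentioned above can be sketched as a small Python function. This is only an illustration of the scroll pattern, not a real client: the names scroll_all, open_scroll, and next_page are my own, and the demo feeds in fake pages instead of calling a live cluster.

```python
def scroll_all(open_scroll, next_page):
    """Drain an Elasticsearch-style scroll.

    open_scroll() starts the scroll and returns (scroll_id, first_hits);
    next_page(scroll_id) returns the next batch of hits, an empty list
    once the index is exhausted. This is exactly the loop a curl wrapper
    script has to implement by hand.
    """
    docs = []
    scroll_id, hits = open_scroll()
    while hits:
        docs.extend(hits)
        hits = next_page(scroll_id)
    return docs

# Demo with fake pages standing in for HTTP responses:
pages = iter([[1, 2, 3], [4, 5], []])
docs = scroll_all(lambda: ("sid", next(pages)), lambda sid: next(pages))
print(docs)  # [1, 2, 3, 4, 5]
```

In a real script, open_scroll and next_page would be curl or requests calls to the _search and _search/scroll endpoints; the loop structure stays the same.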
2. Start Elasticsearch
In another article, Load 1M sample records to Elasticsearch with python and curl, I showed you how to load 1M documents into Elasticsearch. Please follow those steps to load the data, then resume here with Elasticsearch up and running and populated.
3. Start Logstash
Time to export the loaded records into a file using Logstash. The first step is to write a pipeline configuration so Logstash knows what to do.
cat <<EOF > elkdump.conf
input {
  elasticsearch {
    hosts => ["elk"]
    user => "elastic"
    password => "123456"
    ssl_certificate_authorities => "/elkconfig/certs/http_ca.crt"
    ssl_enabled => true
    ssl_verification_mode => "none"
    index => "connections"
    query => '{"query": {"match_all": {}}}'
    scroll => "5m"
    size => 1000
    docinfo => true
  }
}
filter {
}
output {
  file {
    path => "/elkconfig/output-file.json"
    codec => json_lines
  }
}
EOF
Notice the /elkconfig location. This is a volume from the Elasticsearch container, so you can access the certificates generated there. You can use the same volume to save the output.
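Since docinfo => true is set, the input plugin keeps the source index and document id under @metadata (in Logstash 8.x the default target is [@metadata][input][elasticsearch]). @metadata is dropped before output, so if you want those fields in the exported file you could copy them onto the event in the filter block. This is a sketch; the field names es_index and es_id are my own choice:

```
filter {
  mutate {
    add_field => {
      "es_index" => "%{[@metadata][input][elasticsearch][_index]}"
      "es_id"    => "%{[@metadata][input][elasticsearch][_id]}"
    }
  }
}
```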
docker run --rm -it \
--name logstash \
--net logstash \
-e "XPACK_MONITORING_ENABLED=false" \
-v elkconfig:/elkconfig \
-v ./elkdump.conf:/usr/share/logstash/pipeline/elkdump.conf \
docker.elastic.co/logstash/logstash:8.10.4
It will write the execution log to stdout; wait for the line about ‘Closing file /elkconfig/output-file.json’:
{
"start_connection" => "2029-07-07 01:15:00",
"@timestamp" => 2023-10-19T11:50:52.630495179Z,
"connection_name" => "RandomText999750",
"@version" => "1"
}
[2023-10-19T11:51:13,946][INFO ][logstash.outputs.file ][main][b7692cf97cc2162ec8f1dad2051a94b8fe4d568c3e488cdcbcac2d2454ca2878] Closing file /elkconfig/output-file.json
You can copy this file from the elk container with:
docker cp elk:/usr/share/elasticsearch/config/output-file.json output-file.json
# Successfully copied 141MB to /output-file.json
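To sanity-check the export, you can count the documents in the file. With the json_lines codec each document is one JSON object per line, so a quick count (and parse check) looks like this; the function name count_exported_docs is my own:

```python
import json

def count_exported_docs(path):
    """Count documents in a Logstash json_lines export file.

    Skips blank lines and parses each remaining line, so a truncated
    or corrupt line fails loudly instead of being silently counted.
    """
    count = 0
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                json.loads(line)  # raises ValueError on a bad line
                count += 1
    return count
```

For the dataset from the companion article, count_exported_docs("output-file.json") should report 1,000,000.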
4. Final thoughts
In this article you have learned how to export data from Elasticsearch using Logstash. In the next article I will show you how to do it with other methods.