Exporting data to S3-compatible storage
The export s3 command starts a server-side export of data and schema object information to S3-compatible storage, in the format described under File structure:
ydb [connection options] export s3 [options]
where [connection options] are database connection options
Command line parameters
[options]
: Command parameters:
S3 connection parameters
To run the command to export data to S3 storage, specify the S3 connection parameters. Since data is exported by the YDB server asynchronously, the specified endpoint must be available to establish a connection on the server side.
List of exported items
--item STRING
: Description of the item to export. You can specify the --item parameter multiple times if you need to export multiple items. STRING is set in <property>=<value>,... format with the following mandatory properties:
source, src, or s
: Path to the exported directory or table; . indicates the DB root directory. If you specify a directory, all of its items whose names do not start with a dot and, recursively, all subdirectories whose names do not start with a dot are exported.
destination, dst, or d
: Path (key prefix) in S3 storage to store exported items.
--exclude STRING
: Template (PCRE) to exclude paths from export. Specify this parameter multiple times for different templates.
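When several directories go under the same key prefix, the --item arguments described above can be assembled in a script. The following is a minimal sketch, not part of the YDB CLI: item_spec is a hypothetical helper, and the profile, endpoint, bucket, and directory names are the ones used in the examples on this page. It prints the resulting command instead of running it.

```shell
# Hypothetical helper: build one item spec in <property>=<value>,... format.
# $1: path in the database, $2: key prefix in the bucket
item_spec() {
  printf 'src=%s,dst=%s/%s' "$1" "$2" "$1"
}

# Collect one --item argument pair per directory in the positional parameters.
set --
for dir in dir1 dir2; do
  set -- "$@" --item "$(item_spec "$dir" export1)"
done

# Print the command for review; remove "echo" to actually run it.
echo ydb -p db1 export s3 \
  --s3-endpoint storage.yandexcloud.net --bucket mybucket "$@"
```

Keeping the arguments in the positional parameters preserves each item spec as a single word even if a path contains spaces.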
Additional parameters
Parameter | Description |
---|---|
--description STRING | Operation text description saved to the history of operations. |
--retries NUM | Number of export retries to be made by the server. Defaults to 10. |
--compression STRING | Compress exported data. With the default compression level of the Zstandard algorithm, data can be compressed by a factor of 5 to 10. Compressing data uses the CPU and may affect the speed of other DB operations. Possible values: |
--format STRING | Result format. Possible values include pretty (the default) and proto-json-base64. |
Running the export command
Export result
If successful, the export s3
command outputs summary information about the enqueued operation to export data to S3, in the format specified in the --format
option. The export itself is performed by the server asynchronously. The output summary shows the operation ID that you can use later to check the operation status and perform actions on it:
- In the default pretty output mode, the operation ID is displayed in the id field with semigraphics formatting:
┌───────────────────────────────────────────┬───────┬─────...
| id                                        | ready | stat...
├───────────────────────────────────────────┼───────┼─────...
| ydb://export/6?id=281474976788395&kind=s3 | true  | SUCC...
├╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴╴┴╴╴╴╴╴╴╴┴╴╴╴╴╴...
| StorageClass: NOT_SET
| Items:
...
-
In the proto-json-base64 output mode, the operation ID is in the "id" attribute:
{"id":"ydb://export/6?id=281474976788395&kind=s3","ready":true, ... }
Export status
Data is exported in the background. To find out the export status and progress, use the operation get
command with the operation ID enclosed in quotation marks and passed as a command parameter. For example:
ydb -p db1 operation get "ydb://export/6?id=281474976788395&kind=s3"
The operation get
output format is also set by the --format
option.
Although the operation ID is in URL format, there is no guarantee that this format will be retained in the future. Treat the ID only as an opaque string.
You can track the export progress by changes in the "progress" attribute:
- In the default pretty output mode, an export operation that completed successfully is displayed as "Done" in the progress field with semigraphics formatting:
┌───── ... ──┬───────┬─────────┬──────────┬─...
| id         | ready | status  | progress | ...
├──────... ──┼───────┼─────────┼──────────┼─...
| ydb:/...   | true  | SUCCESS | Done     | ...
├╴╴╴╴╴ ... ╴╴┴╴╴╴╴╴╴╴┴╴╴╴╴╴╴╴╴╴┴╴╴╴╴╴╴╴╴╴╴┴╴...
...
- In the proto-json-base64 output mode, the completed export operation is indicated with the PROGRESS_DONE value of the progress attribute:
{"id":"ydb://...", ...,"progress":"PROGRESS_DONE",... }
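The status check described above can be automated with a polling loop. The following is a minimal sketch under stated assumptions: it uses the db1 profile from the examples, it expects the proto-json-base64 output to contain the "progress" attribute, and export_is_done and wait_for_export are hypothetical helper names. The poll interval and attempt limit are arbitrary defaults.

```shell
# Succeed if the JSON on stdin reports a completed export.
export_is_done() {
  grep -Eq '"progress"[[:space:]]*:[[:space:]]*"PROGRESS_DONE"'
}

# Poll an export operation until it completes or the attempts run out.
# $1: operation ID, passed through unchanged as an opaque string
# $2: seconds between polls (assumed default: 10)
# $3: maximum number of polls (assumed default: 60)
wait_for_export() {
  _attempts=${3:-60}
  while [ "$_attempts" -gt 0 ]; do
    if ydb -p db1 operation get --format proto-json-base64 "$1" | export_is_done; then
      return 0
    fi
    _attempts=$((_attempts - 1))
    sleep "${2:-10}"
  done
  # Not completed within the allowed number of polls.
  return 1
}
```

For example, wait_for_export "ydb://export/6?id=281474976788395&kind=s3" 30 returns 0 once the operation reports PROGRESS_DONE and 1 if it has not completed after 60 polls. The sketch does not distinguish a failed export from one that is still running.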
Completing the export operation
When running the export operation, a directory named export_*
is created in the root directory, where *
is the numeric part of the export ID. This directory stores tables with a consistent snapshot of exported data as of the export start time.
Once the export is done, run the operation forget command to complete it: the operation is removed from the list of operations and all files created for it are deleted:
ydb -p db1 operation forget "ydb://export/6?id=281474976788395&kind=s3"
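Because operation forget deletes the files created for the operation, a script may want to confirm that the export is done first. The following is a minimal sketch, assuming the db1 profile from the examples and the PROGRESS_DONE value shown in the proto-json-base64 output; forget_if_done is a hypothetical helper name, not a YDB CLI command.

```shell
# Forget an export operation only if it reports PROGRESS_DONE.
# $1: operation ID, quoted and passed through as an opaque string
forget_if_done() {
  if ydb -p db1 operation get --format proto-json-base64 "$1" |
      grep -Eq '"progress"[[:space:]]*:[[:space:]]*"PROGRESS_DONE"'; then
    ydb -p db1 operation forget "$1"
  else
    echo "export not finished, keeping operation: $1" >&2
    return 1
  fi
}
```

For example, forget_if_done "ydb://export/6?id=281474976788395&kind=s3" leaves the operation in place and returns a non-zero status while the export is still in progress.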
List of export operations
To get a list of export operations, run the operation list export/s3
command:
ydb -p db1 operation list export/s3
The operation list
output format is also set by the --format
option.
Examples
The examples use a profile named db1. For information about how to create it, see the Getting started with the YDB CLI article in the "Getting started" section.
Exporting a database
Exporting all DB objects, except those whose names start with a dot or that are stored in directories whose names start with a dot, to the export1 directory in mybucket, using the S3 authentication parameters from environment variables or the ~/.aws/credentials file:
ydb -p db1 export s3 \
--s3-endpoint storage.yandexcloud.net --bucket mybucket \
--item src=.,dst=export1
Exporting multiple directories
Exporting items from DB directories named dir1 and dir2 to the export1
directory in mybucket
using the explicitly set S3 authentication parameters:
ydb -p db1 export s3 \
--s3-endpoint storage.yandexcloud.net --bucket mybucket \
--access-key VJGSOScgs-5kDGeo2hO9 --secret-key fZ_VB1Wi5-fdKSqH6074a7w0J4X0 \
--item src=dir1,dst=export1/dir1 --item src=dir2,dst=export1/dir2
Getting operation IDs
To get a list of export operation IDs in a format suitable for handling in bash scripts, use the jq utility:
ydb -p db1 operation list export/s3 --format proto-json-base64 | jq -r ".operations[].id"
You'll get an output where each new line shows an operation's ID. For example:
ydb://export/6?id=281474976789577&kind=s3
ydb://export/6?id=281474976789526&kind=s3
ydb://export/6?id=281474976788779&kind=s3
You can use these IDs, for example, to run a loop that forgets all the current operations. Quote the ID so that the shell passes it to the command unchanged:
ydb -p db1 operation list export/s3 --format proto-json-base64 | jq -r ".operations[].id" | while read -r line; do ydb -p db1 operation forget "$line"; done