Dump data form PostgreSQL
Warning
At the moment, YDB's compatibility with PostgreSQL is under development, so not all PostgreSQL constructs and functions are supported yet. PostgreSQL compatibility is available for testing in the form of a Docker container, which can be deployed by following these instructions.
Data from PostgreSQL can be migrated to YDB using utilities such as pg_dump, psql, and YDB CLI. The pg_dump and psql utilities are installed with PostgreSQL. YDB CLI is YDB's command-line client, which is installed separately.
To do this, you need to:
-
Create a data dump using pg_dump with the following parameters:
-
Convert the dump to a format supported by YDB using the
ydb tools pg-convert
command from YDB CLI. -
Load the result into YDB in PostgreSQL compatibility mode.
pg-convert command
The ydb tools pg-convert
command reads a dump file or standard input created by the pg_dump utility, performs transformations, and outputs to standard output a dump that can be sent to YDB's PostgreSQL-compatible middleware.
ydb tools pg-convert
performs the following transformations:
-
Moving the creation of the primary key into the body of the CREATE TABLE command.
-
Removing the
public
schema from table names. -
Deleting the
WITH (...)
section inCREATE TABLE
. -
Commenting out unsupported constructs (optionally):
SELECT pg_catalog.set_config.*
ALTER TABLE
If the CLI cannot find a table's primary key, it will automatically create a BIGSERIAL column named __ydb_stub_id
as the primary key.
The general form of the command:
ydb [global options...] tools pg-convert [options...]
global options
— global parameters.options
— subcommand parameters.
subcommand parameters
Name | Description |
---|---|
-i |
The name of the file containing the original dump. If the option is not specified, the dump is read from standard input. |
--ignore-unsupported |
When this option is specified, unsupported constructs will be commented out in the resulting dump and duplicated in standard error. By default, if unsupported constructs are detected, the command returns an error. This does not apply to ALTER TABLE expressions that define a table's primary key, as they are commented out in any case. |
Warning
When loading large dumps, reading from standard input is not recommended because the entire dump will be stored in RAM. It is advised to use the file option, in which case the CLI will only keep a small portion of the dump in memory.
Example of importing a dump into YDB
As an example, data generated by pgbench will be loaded.
-
Start Docker containers with PostgreSQL and YDB:
docker run --name postgres_container \ -e POSTGRES_USER=pgroot -e POSTGRES_PASSWORD=1234 \ -e POSTGRES_DB=local \ -p 5433:5433 -d postgres:14 -c 'port=5433' docker run --name ydb-postgres -d --pull always -p 5432:5432 -p 8765:8765 \ -e POSTGRES_USER=ydbroot -e POSTGRES_PASSWORD=4321 \ -e YDB_FEATURE_FLAGS=enable_temp_tables,enable_table_pg_types \ -e YDB_USE_IN_MEMORY_PDISKS=true \ ghcr.io/ydb-platform/local-ydb:latest
Information about the started Docker containers:
Database
PostgreSQL
YDB
Container name
postgres_container
ydb-postgres
Address
postgres://pgroot:1234@localhost:5433/local
postgresql://ydbroot:4321@localhost:5432/local
Port
5433
5432
User name
pgroot
ydbroot
Password
1234
4321
-
Generate data through pgbench:
docker exec postgres_container pgbench postgres://pgroot:1234@localhost:5433/local -i
-
Create a dump of the database using pg_dump:
docker exec postgres_container pg_dump postgres://pgroot:1234@localhost:5433/local --inserts \ --column-inserts --encoding=utf_8 --rows-per-insert=1000 > dump.sql
-
Load the dump into YDB:
ydb tools pg-convert --ignore-unsupported -i dump.sql | psql postgresql://ydbroot:4321@localhost:5432/local
This command uses YDB CLI to convert the dump.sql file to the format readable by the YDB PostgreSQL compatibility layer. The converted dump file is then redirected to the
psql
utility for loading the data into YDB via PostgreSQL protocol.