Reading data from an external table pointing to S3 (Object Storage)
Sometimes, the same data queries need to be executed regularly. To avoid specifying all the details of working with this data every time a query is called, use the mode with external tables. In this case, the query looks like a regular query to YDB tables.
Example query for reading data:
SELECT
*
FROM
`s3_test_data`
WHERE
version > 1
Creating an external table pointing to an S3 bucket (Object Storage)
To create an external table describing the S3 bucket (Object Storage), execute the following SQL query. The query creates an external table named s3_test_data
, containing files in the CSV
format with string fields key
and value
, located inside the bucket at the path test_folder
, using the connection credentials specified by the external data source object bucket
:
CREATE EXTERNAL TABLE `s3_test_data` (
key Utf8 NOT NULL,
value Utf8 NOT NULL
) WITH (
DATA_SOURCE="bucket",
LOCATION="folder",
FORMAT="csv_with_names",
COMPRESSION="gzip"
);
Where:
key, value
- list of data columns and their types;bucket
- name of the external data source to S3 (Object Storage);folder
- path within the bucket containing the data;csv_with_names
- one of the permitted data storage formats;gzip
- one of the permitted compression algorithms.
Data model
Reading data using external tables from S3 (Object Storage) is done with regular SQL queries as if querying a normal table.
SELECT
<expression>
FROM
`s3_test_data`
WHERE
<filter>;
Limitations
There are a number of limitations when working with S3 buckets (Object Storage).
Limitations:
- Only data read requests -
SELECT
andINSERT
are supported; other requests are not. -
If the date value stored in the external data source is outside the allowed range for YDB (all dates used must be later than 1970-01-01 but earlier than 2105-12-31), such a value in YDB will be converted to
NULL
.