App in Python

This page contains a detailed description of the code of a test app that is available as part of the YDB Python SDK.

Downloading and starting

The following execution scenario is based on git and Python3. Be sure to install the YDB Python SDK.

Create a working directory and use it to run from the command line the command to clone the GitHub repository and install the necessary Python packages:

git clone https://github.com/ydb-platform/ydb-python-sdk.git
python3 -m pip install iso8601

Next, from the same working directory, run the command to start the test app. The command will differ depending on the database to connect to.

To connect to a locally deployed YDB database according to the Docker use case, run the following command in the default configuration:

YDB_ANONYMOUS_CREDENTIALS=1 \
python3 ydb-python-sdk/examples/basic_example_v1/ -e grpc://localhost:2136 -d /local

To run the example against any available YDB database, you need to know the endpoint and the database path.

If authentication is enabled in the database, you also need to select the authentication mode and get secrets (a token or username/password pair).

Run the command as follows:

<auth_mode_var>="<auth_mode_value>" \
python3 ydb-python-sdk/examples/basic_example_v1/ -e <endpoint> -d <database>

where

  • <endpoint>: The endpoint.
  • <database>: The database path.
  • <auth_mode_var>: The environment variable that determines the authentication mode.
  • <auth_mode_value> is the authentication parameter value for the selected mode.

For example:

YDB_ACCESS_TOKEN_CREDENTIALS="t1.9euelZqOnJuJlc..." \
python3 ydb-python-sdk/examples/basic_example_v1/ -e grpcs://ydb.example.com:2135 -d /path/db )

Note

If you previously reviewed the articles of the "Getting started" section, you must have used the necessary parameters when getting started with the YDB CLI and can get them from the profile:

ydb config profile get db1

Initializing a database connection

To interact with YDB, create an instance of the driver, client, and session:

  • The YDB driver lets the app and YDB interact at the transport layer. The driver must exist throughout the YDB access lifecycle and be initialized before creating a client or session.
  • The YDB client runs on top of the YDB driver and enables the handling of entities and transactions.
  • The YDB session contains information about executed transactions and prepared queries, and is part of the YDB client context.

App code snippet for driver initialization:

def run(endpoint, database, path):
    driver_config = ydb.DriverConfig(
        endpoint, database, credentials=ydb.construct_credentials_from_environ(),
        root_certificates=ydb.load_ydb_root_certificate(),
    )
    with ydb.Driver(driver_config) as driver:
        try:
            driver.wait(timeout=5)
        except TimeoutError:
            print("Connect failed to YDB")
            print("Last reported errors by discovery:")
            print(driver.discovery_debug_details())
            exit(1)

App code snippet for creating a session:

session = driver.table_client.session().create()

Creating tables

Creating tables to be used in operations on a test app. This step results in the creation of DB tables of the series directory data model:

  • Series
  • Seasons
  • Episodes

Once the tables are created, the method for getting information about data schema objects is called and the result of its execution is output.

To create tables, use the session.create_table() method:

def create_tables(session, path):
    session.create_table(
        os.path.join(path, 'series'),
        ydb.TableDescription()
        .with_column(ydb.Column('series_id', ydb.PrimitiveType.Uint64))  # not null column
        .with_column(ydb.Column('title', ydb.OptionalType(ydb.PrimitiveType.Utf8)))
        .with_column(ydb.Column('series_info', ydb.OptionalType(ydb.PrimitiveType.Utf8)))
        .with_column(ydb.Column('release_date', ydb.OptionalType(ydb.PrimitiveType.Uint64)))
        .with_primary_key('series_id')
    )

The path parameter accepts the absolute path starting from the root:

full_path = os.path.join(database, path)

You can use the session.describe_table() method to output information about the table structure and make sure that it was properly created:

def describe_table(session, path, name):
    result = session.describe_table(os.path.join(path, name))
    print("\n> describe table: series")
    for column in result.columns:
        print("column, name:", column.name, ",", str(column.type.item).strip())

The given code snippet prints the following text to the console at startup:

> describe table: series
('column, name:', 'series_id', ',', 'type_id: UINT64')
('column, name:', 'title', ',', 'type_id: UTF8')
('column, name:', 'series_info', ',', 'type_id: UTF8')
('column, name:', 'release_date', ',', 'type_id: UINT64')

Adding data

Adding data to the created tables using an UPSERT statement of YQL. A data update request is sent within a single request to the server with transaction auto-commit mode enabled.

Code snippet for data insert/update:

def upsert_simple(session, path):
    session.transaction().execute(
        """
        PRAGMA TablePathPrefix("{}");
        UPSERT INTO episodes (series_id, season_id, episode_id, title) VALUES
            (2, 6, 1, "TBD");
        """.format(path),
        commit_tx=True,
    )

PRAGMA TablePathPrefix adds a specified prefix to the database table paths. It uses standard file system path concatenation: i.e., it supports parent folder referencing and requires no trailing slash. For example:

PRAGMA TablePathPrefix = "/cluster/database";
SELECT * FROM episodes;

For more information about PRAGMA YQL, see the YQL documentation.

Retrieving data with a Select

Retrieving data using a SELECT statement in YQL. Handling the retrieved data selection in the app.

To execute YQL queries, use the session.transaction().execute() method.
The SDK lets you explicitly control the execution of transactions and configure the transaction execution mode using the TxControl class.

In the code snippet below, the transaction is executed using the transaction().execute() method. The transaction execution mode set is ydb.SerializableReadWrite(). When all the queries in the transaction are completed, the transaction is automatically committed by explicitly setting the flag commit_tx=True. The query body is described using YQL syntax and is passed to the execute method as a parameter.

def select_simple(session, path):
    result_sets = session.transaction(ydb.SerializableReadWrite()).execute(
        """
        PRAGMA TablePathPrefix("{}");
        $format = DateTime::Format("%Y-%m-%d");
        SELECT
            series_id,
            title,
            $format(DateTime::FromSeconds(CAST(DateTime::ToSeconds(DateTime::IntervalFromDays(CAST(release_date AS Int16))) AS Uint32))) AS release_date
        FROM series
        WHERE series_id = 1;
        """.format(path),
        commit_tx=True,
    )
    print("\n> select_simple_transaction:")
    for row in result_sets[0].rows:
        print("series, id: ", row.series_id, ", title: ", row.title, ", release date: ", row.release_date)

    return result_sets[0]

When the query is executed, result_set is returned whose iteration outputs the following text to the console:

> SelectSimple:
series, Id: 1, title: IT Crowd, Release date: 2006-02-03

Parameterized prepared queries

Parameterized prepared queries are saved as templates where specially formatted names are replaced by relevant parameter values each time you execute the query. Use parameterized queries to improve performance by reducing how often queries that only differ in parameter values are compiled and recompiled. The prepared query is stored in the session context.

Code snippet for parameterized prepared queries:

def select_prepared(session, path, series_id, season_id, episode_id):
    query = """
    PRAGMA TablePathPrefix("{}");
    DECLARE $seriesId AS Uint64;
    DECLARE $seasonId AS Uint64;
    DECLARE $episodeId AS Uint64;
    $format = DateTime::Format("%Y-%m-%d");
    SELECT
        title,
        $format(DateTime::FromSeconds(CAST(DateTime::ToSeconds(DateTime::IntervalFromDays(CAST(air_date AS Int16))) AS Uint32))) AS air_date
    FROM episodes
    WHERE series_id = $seriesId AND season_id = $seasonId AND episode_id = $episodeId;
    """.format(path)

    prepared_query = session.prepare(query)
    result_sets = session.transaction(ydb.SerializableReadWrite()).execute(
        prepared_query, {
            '$seriesId': series_id,
            '$seasonId': season_id,
            '$episodeId': episode_id,
        },
        commit_tx=True
    )
    print("\n> select_prepared_transaction:")
    for row in result_sets[0].rows:
        print("episode title:", row.title, ", air date:", row.air_date)

    return result_sets[0]

The given code snippet prints the following text to the console at startup:

> select_prepared_transaction:
('episode title:', u'To Build a Better Beta', ', air date:', '2016-06-05')

Scan queries

Making a scan query that results in a data stream. Streaming lets you read an unlimited number of rows and amount of data.

def executeScanQuery(driver):
  query = ydb.ScanQuery("""
    SELECT series_id, season_id, COUNT(*) AS episodes_count
    FROM episodes
    GROUP BY series_id, season_id
    ORDER BY series_id, season_id
  """, {})

  it = driver.table_client.scan_query(query)

  while True:
    try:
        result = next(it)
        print result.result_set.rows
    except StopIteration:
        break

Managing transactions

Transactions are managed through TCL Begin and Commit calls.

In most cases, instead of explicitly using Begin and Commit calls, it's better to use transaction control parameters in execute calls. This helps you avoid unnecessary requests to YDB and run your queries more efficiently.

Code snippet for transaction().begin() and tx.Commit() calls:

def explicit_tcl(session, path, series_id, season_id, episode_id):
    query = """
    PRAGMA TablePathPrefix("{}");

    DECLARE $seriesId AS Uint64;
    DECLARE $seasonId AS Uint64;
    DECLARE $episodeId AS Uint64;

    UPDATE episodes
    SET air_date = CAST(CurrentUtcDate() AS Uint64)
    WHERE series_id = $seriesId AND season_id = $seasonId AND episode_id = $episodeId;
    """.format(path)
    prepared_query = session.prepare(query)

    tx = session.transaction(ydb.SerializableReadWrite()).begin()

    tx.execute(
        prepared_query, {
            '$seriesId': series_id,
            '$seasonId': season_id,
            '$episodeId': episode_id
        }
    )

    print("\n> explicit TCL call")

    tx.commit()