Terms and definitions

Database

A YDB database (YDB DB) is an isolated, consistent set of data that is accessed via YDB.

YDB:

  • Accepts network connections from clients and processes their queries to update and read data.
  • Authenticates and authorizes clients.
  • Provides transactional updates and consistent data reads.
  • Provides DB load scalability and high DB availability for clients.
  • Stores information about a data schema, processes queries to update it, checks the hosted data for compliance with the schema, and allows you to create data queries using schema elements, such as tables, fields, and indexes.
  • Provides consistent data updates in indexes when data is updated in tables.
  • Generates metrics that can be used to monitor the DB performance.

Resources for a YDB database (CPU, RAM, nodes, and disk space) are allocated within a YDB cluster.

Cluster

A YDB cluster is a set of interconnected YDB nodes that distribute the load among themselves. A single cluster may host multiple databases, and specific cluster nodes are allocated to each database. Each database is hosted entirely within a single cluster.

YDB clusters can be deployed in a single data center or across multiple geographically distributed data centers. The smallest single-datacenter cluster consists of a single node.

Regions and availability zones

An availability zone is a data center, or an isolated segment of one, in which nodes are physically close to each other and which has a minimal risk of failing at the same time as other availability zones.
A large geographic region is an area within which the distance between availability zones is 500 km or less.
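
The 500 km rule can be checked for a candidate set of availability zones with a short sketch like the one below. It is only an illustration: it assumes that "distance" means great-circle distance between zone coordinates, and the function names, the Earth radius constant, and any coordinates passed in are hypothetical and not part of YDB.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0       # mean Earth radius (assumption for this sketch)
MAX_ZONE_DISTANCE_KM = 500.0   # threshold taken from the definition above


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def fits_one_region(zones):
    """Return True if every pair of zones is at most 500 km apart.

    `zones` maps a hypothetical zone name to its (lat, lon) coordinates.
    """
    return all(
        distance_km(p, q) <= MAX_ZONE_DISTANCE_KM
        for p, q in combinations(zones.values(), 2)
    )
```

For example, calling fits_one_region with three made-up zones whose coordinates differ by about one degree returns True, while adding a zone five degrees of latitude away from another makes it return False.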

A geo-distributed YDB cluster contains nodes located in different availability zones within a large geographic region. YDB writes data synchronously to each of the availability zones, so the database keeps operating without interruption if one availability zone fails.

In geographically distributed clusters, you can choose a policy for distributing computing resources across data centers. This lets you find the right balance between minimum execution time and minimum downtime if a data center goes offline.

Computing resources

YDB clusters can be deployed on bare metal, VMs, or in a container virtualization environment.

Storage groups

A storage group is a redundant array of independent disks that are networked into a single logical unit. Storage groups increase fault tolerance through redundancy and improve performance.

Each storage group corresponds to a specific storage schema that determines the number of disks used, the failure model, and the redundancy factor. Single-datacenter clusters commonly use the block4-2 schema: storage is spread over 8 disks in 8 racks, survives the failure of any two disks, and has a redundancy factor of 1.5. Multiple-datacenter clusters use the mirror3dc schema: storage comprises 9 disks, 3 in each of three datacenters, survives the failure of a datacenter or a disk, and has a redundancy factor of 3.
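
The redundancy factors above can be reproduced with simple arithmetic. The sketch below assumes that block4-2 stores 4 data parts plus 2 parity parts and that mirror3dc keeps 3 full copies of the data. This reading follows from the schema names and the stated factors. The class and function names are hypothetical and are not a YDB API.

```python
from dataclasses import dataclass


def erasure_redundancy(data_parts: int, parity_parts: int) -> float:
    """Raw bytes stored per byte of user data for an erasure-coded schema."""
    return (data_parts + parity_parts) / data_parts


def mirror_redundancy(copies: int) -> float:
    """Raw bytes stored per byte of user data when full copies are kept."""
    return float(copies)


@dataclass(frozen=True)
class StorageSchema:
    name: str
    disks: int                # disks in one storage group, as stated in the text
    redundancy_factor: float  # raw storage consumed per unit of user data

    def raw_bytes(self, user_bytes: int) -> float:
        """Estimate the raw disk space needed to store user_bytes of data."""
        return user_bytes * self.redundancy_factor


# Assumption: 4 data parts + 2 parity parts, giving (4 + 2) / 4 = 1.5.
BLOCK4_2 = StorageSchema("block4-2", disks=8,
                         redundancy_factor=erasure_redundancy(4, 2))
# Assumption: 3 full copies, one per datacenter, giving a factor of 3.
MIRROR3DC = StorageSchema("mirror3dc", disks=9,
                          redundancy_factor=mirror_redundancy(3))
```

Under these assumptions, storing 100 GB of user data takes about 150 GB of raw disk space with block4-2 and about 300 GB with mirror3dc, which is the trade-off between storage cost and the ability to survive the loss of a whole datacenter.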

YDB lets you allocate additional storage groups to existing databases as your data grows.