YDB overview

YDB is a horizontally scalable, distributed, fault-tolerant DBMS. It is designed for high performance: a typical server can handle tens of thousands of queries per second. The system is designed to handle hundreds of petabytes of data. YDB can operate in single data center and geo-distributed (cross data center) modes on a cluster of thousands of servers.

YDB provides:

  • Strict consistency, which can be relaxed to increase performance.
  • Support for queries written in YQL (an SQL dialect for working with big data).
  • Automatic data replication.
  • High availability with automatic failover in case a server, rack, or availability zone goes offline.
  • Automatic data partitioning as data or load grows.
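
The last item, automatic partitioning, can be pictured with a toy model. The sketch below is not YDB's algorithm or API. It is a minimal, hypothetical illustration in plain Python of the general idea of splitting a key range once it grows past a threshold. The class name RangePartitionedTable and the max_rows threshold are assumptions made for this example.

```python
from bisect import bisect_right


class RangePartitionedTable:
    """Toy key-range partitioned table (hypothetical, not YDB's implementation)."""

    def __init__(self, max_rows=4):
        self.max_rows = max_rows   # assumed split threshold, in rows
        self.bounds = []           # sorted split keys between partitions
        self.partitions = [{}]     # partition i holds bounds[i-1] <= key < bounds[i]

    def _index(self, key):
        # The number of split keys <= key identifies the owning partition.
        return bisect_right(self.bounds, key)

    def upsert(self, key, value):
        i = self._index(key)
        self.partitions[i][key] = value
        if len(self.partitions[i]) > self.max_rows:
            self._split(i)

    def get(self, key):
        return self.partitions[self._index(key)].get(key)

    def _split(self, i):
        # Split at the median key so both halves remain contiguous key ranges.
        rows = self.partitions[i]
        keys = sorted(rows)
        mid = keys[len(keys) // 2]
        left = {k: rows[k] for k in keys if k < mid}
        right = {k: rows[k] for k in keys if k >= mid}
        self.partitions[i:i + 1] = [left, right]
        self.bounds.insert(i, mid)


table = RangePartitionedTable(max_rows=4)
for key in range(1, 11):
    table.upsert(key, f"row-{key}")
```

The caller only ever uses upsert and get. The number of partitions grows behind the scenes as rows are added, and lookups keep routing to the right partition by key. A real system would also split on load and merge partitions that become small, which this sketch leaves out.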

To interact with YDB, you can use the YDB CLI and SDKs for C++, C#, Go, Java, Node.js, PHP, Python, and Rust.

YDB supports a relational data model and manages tables with a predefined schema. To make tables easier to organize, you can create directories, as in a file system. In addition to tables, YDB supports topics, an entity for storing unstructured messages and delivering them to multiple subscribers.

Database commands are mainly written in YQL, an SQL dialect. This gives the user a powerful and already familiar way to interact with the database.

YDB supports high-performance distributed ACID transactions that may affect multiple records in different tables. It provides the serializable isolation level, which is the strictest transaction isolation level. You can also lower the isolation level to improve performance.
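
To make "a transaction that affects multiple records in different tables" concrete, here is a minimal sketch. It does not use YDB: SQLite from the Python standard library stands in for the database so that the example runs anywhere. The accounts and transfers tables and the transfer function are hypothetical names chosen for this illustration. With YDB, the equivalent statements would be written in YQL and sent through the CLI or an SDK.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here, not YDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("CREATE TABLE transfers (src INTEGER, dst INTEGER, amount INTEGER)")
with conn:  # commit the initial rows so a later rollback cannot undo them
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
    )


def transfer(conn, src, dst, amount):
    """Update two account rows and log the transfer, all or nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "INSERT INTO transfers (src, dst, amount) VALUES (?, ?, ?)",
                (src, dst, amount),
            )
        return True
    except ValueError:
        return False


ok = transfer(conn, 1, 2, 30)         # succeeds: both tables change together
rejected = transfer(conn, 1, 2, 500)  # fails: the partial debit is rolled back
```

The second call debits the source account before it detects the overdraft, yet the rollback leaves no trace of that debit in either table. This all-or-nothing behavior is the atomicity part of ACID. The sketch does not demonstrate isolation between concurrent transactions.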

YDB natively supports different processing options, such as OLTP and OLAP. However, the current version offers only limited analytical query support, so YDB is currently best described as an OLTP database.

YDB is an open-source system. The YDB source code is available under the Apache License 2.0. Client applications interact with YDB over gRPC, which has an open specification, so an SDK can be implemented for any programming language.

Use cases

YDB can be used as an alternative solution in the following cases:

  • When using NoSQL systems, if strong data consistency is required.
  • When using NoSQL systems, if you need to make transactional updates to data stored in different rows of one or more tables.
  • In systems that need to process and store large amounts of data and allow for virtually unlimited horizontal scalability (using production clusters of 5000+ nodes, processing millions of RPS, and storing petabytes of data).
  • In low-load systems, when maintaining a separate DB instance would not be cost-effective (consider using YDB in serverless mode instead).
  • In systems with unpredictable or seasonally fluctuating load (you can add or remove computing resources on demand and/or use serverless mode).
  • In high-load systems that shard load across relational DB instances.
  • When developing a new product with no reliable load forecast or with an expected high load beyond the capabilities of conventional relational databases.