YDB Platform Team

Product

Our team is responsible for building YDB, a Distributed SQL Database. While YDB is our flagship product, we are not limited to being an OLTP database, but rather are working to build a platform that includes:

  • Persistent data streams and queues;
  • Federated query engine;
  • Continuous stream processing engine;
  • An analytical store.

Our mission is to bring to the world a fully open-source solution for data processing proven at web scale. Today, teams building similar products either develop closed-source systems or have limited access to the hardest large-scale problems. We have both openness and access to such problems, so proven scalability and openness are our key benefits.

Technology

We believe that C/C++ is the best language for developing system software such as operating systems and databases, so YDB Core is written in C++. Modern C++ lets you handle memory management easily with the help of smart pointers, without limiting your ability to get the most out of the hardware you are working with.
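As a small illustration of the smart-pointer point, here is a minimal sketch. The names TPage, TReader, MakePage, and CountOwners are hypothetical and are not taken from the YDB code base; the sketch only shows ownership being handed from a unique owner to shared readers while the raw bytes stay directly accessible.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical page buffer; the name and layout are illustrative only.
struct TPage {
    std::vector<char> Data;
    explicit TPage(std::size_t size) : Data(size, 0) {}
};

// Sole ownership: the page is freed automatically when the
// unique_ptr holding it is destroyed.
std::unique_ptr<TPage> MakePage(std::size_t size) {
    return std::make_unique<TPage>(size);
}

// Shared ownership: the page lives as long as any reader refers to it.
struct TReader {
    std::shared_ptr<const TPage> Page;
};

// Builds a page, fills it through direct memory access, then shares it
// read-only between two readers. Returns the number of owners.
long CountOwners() {
    std::unique_ptr<TPage> page = MakePage(4096);
    page->Data[0] = 'x';  // plain access to the underlying bytes

    // Ownership moves from the unique_ptr into a shared_ptr;
    // no explicit delete is needed anywhere.
    std::shared_ptr<const TPage> shared = std::move(page);

    TReader a{shared};
    TReader b{shared};
    return shared.use_count();  // shared, a.Page, and b.Page
}
```

The page is released once the last owner goes away, so no code path has to remember to free it.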

YDB has a message-oriented design that reflects the distributed nature of the system, so YDB internals are built on the actor model.

More specifically, we implement a component as a C++ class that inherits the actor interface and is allowed to send messages to other actors, receive messages from other actors, and create new actors. We also have a coroutine engine built on top of the actor system, so we can use coroutines when they better suit the problem.
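The three capabilities described above (send, receive, create) can be sketched with a toy, single-threaded actor system. This is not the YDB actor library: every name here (IActor, TActorSystem, TMessage, TEcho, TClient, RunDemo) is a hypothetical stand-in chosen for illustration, and a real system would run mailboxes on thread pools and across nodes.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using TActorId = std::size_t;

struct TMessage {
    TActorId Sender;
    std::string Payload;
};

class TActorSystem;

// The interface every component inherits (hypothetical shape).
class IActor {
public:
    virtual ~IActor() = default;
    virtual void Receive(const TMessage& msg, TActorId self, TActorSystem& sys) = 0;
};

class TActorSystem {
public:
    // Creates a new actor and returns its address.
    TActorId Register(std::unique_ptr<IActor> actor) {
        Actors_.push_back(std::move(actor));
        return Actors_.size() - 1;
    }

    // Enqueues a message; it is delivered later by Run().
    void Send(TActorId to, TMessage msg) {
        Queue_.push_back(TEnvelope{to, std::move(msg)});
    }

    // Delivers messages one by one until the queue is empty.
    void Run() {
        while (!Queue_.empty()) {
            TEnvelope env = std::move(Queue_.front());
            Queue_.pop_front();
            Actors_[env.To]->Receive(env.Msg, env.To, *this);
        }
    }

private:
    struct TEnvelope {
        TActorId To;
        TMessage Msg;
    };
    std::vector<std::unique_ptr<IActor>> Actors_;
    std::deque<TEnvelope> Queue_;
};

// Replies to the sender with the same payload, prefixed by "echo:".
class TEcho : public IActor {
public:
    void Receive(const TMessage& msg, TActorId self, TActorSystem& sys) override {
        sys.Send(msg.Sender, TMessage{self, "echo:" + msg.Payload});
    }
};

// On "start", creates an echo actor and pings it; records any other message.
class TClient : public IActor {
public:
    explicit TClient(std::vector<std::string>& log) : Log_(log) {}

    void Receive(const TMessage& msg, TActorId self, TActorSystem& sys) override {
        if (msg.Payload == "start") {
            TActorId echo = sys.Register(std::make_unique<TEcho>());
            sys.Send(echo, TMessage{self, "ping"});
        } else {
            Log_.push_back(msg.Payload);
        }
    }

private:
    std::vector<std::string>& Log_;
};

// Wires the two actors together and returns what the client received.
std::vector<std::string> RunDemo() {
    std::vector<std::string> log;
    TActorSystem sys;
    TActorId client = sys.Register(std::make_unique<TClient>(log));
    sys.Send(client, TMessage{client, "start"});
    sys.Run();
    return log;
}
```

Actors never call each other directly; they only exchange messages through the system, which is what lets the same component code work whether the peer is local or remote.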

We are not limited to C++. We write e2e tests, stress tests, and chaos monkey tests in Python, and we extend Kubernetes by building YDB operators in Go. We also develop YDB SDKs for different programming languages, including:

  • Go
  • Python
  • Java
  • Rust
  • Node.js
  • C++
  • PHP

In addition to programming languages, another technology we care about is Kubernetes. Distributed systems are not just about development, but in many ways about setup, updates, and maintenance. We have a long history of managing deployments with Salt, Ansible, and even an in-house deployment system, but now we are moving by leaps and bounds towards Kubernetes. Managing a stateful application is always a tough job, and Kubernetes provides us with the flexibility of custom operators for our sophisticated needs.

Cloud Services

We cannot imagine the modern IT world without cloud infrastructure. Cloud services are used everywhere, and they have already changed the way people use computers.

The YDB Platform Team has unique experience with cloud services. First, before the Cloud era, we provided services for internal company users (in other words, our colleagues). This taught us the difference between just writing code and making a user happy.

Second, the YDB Platform Team participated in the development of a public cloud from the ground up. It was an awesome experience. YDB is used in critical parts of IaaS and PaaS layers:

  • As a storage layer for network block devices attached to virtual machines;
  • As a database for a number of services in the cloud.

We run a number of Cloud Services which are based on our technology.

Public cloud services require a special approach to security, personal data management and compliance. They are easier to understand when the team takes part in all these activities from the very beginning. Making our services compliant with the highest security standards is a technical challenge comparable to making the system scalable, reliable, and performant.

Third, our team provides a number of cloud services for end users. All of them are built as multi-tenant services, available for multiple users, and provide isolation guarantees. The first, Managed Service for YDB, is available in two options:

  • on more traditional dedicated virtual machines;
  • as a Serverless service, where a user does not need to provision resources for the database.

All other services are provided in Serverless form. Message Queue and Data Streams implement different types of persistent queues and streams with exactly-once delivery guarantees. Federated Query makes it possible to query multiple data sources, including Object Storage, PostgreSQL, and ClickHouse.

Team

We are a friendly team of highly motivated professionals who are passionate about databases and distributed systems and enthusiastic about solving top-level problems. Our core team has a strong background in building databases, decades of experience, and master's degrees and PhDs from leading universities.

Previous experience developing databases is not as important to us as passion and patience. We welcome experienced software engineers as well as mid-level and junior developers. Key people in our current team came to us as interns and grew to senior level over time.

Our team is quite non-hierarchical, so everyone can freely reach out to other team members. Everyone can find an area where they can shine. Our environment provides a good opportunity to grow as a professional, because the problems we deal with are complex and our backlog is huge. Leadership skills are highly valued, since our broad area and complex problems require ambitious leaders.

Subteams

We have two major products: YDB itself and Federated Query, which lets us run queries across data sources, including continuous queries over data streams. In total, nine subteams develop and maintain these products.

Below are some details about the subteams and their responsibilities.

  • The Distributed Storage Team is responsible for storing and replicating data. The team deals with low-level optimisations and works directly with block devices and network transport.
  • The Tablets Team is responsible for YDB’s core functionality: table infrastructure, user data organization, and distributed transactions.
  • The Queues Team provides customers with multiple queue and publish/subscribe services.
  • The Analytics Team is a new and growing subteam responsible for building column storage and efficient subquery execution based on columnar indexes.
  • The Query Processor Team is responsible for query execution, both OLTP and OLAP. It reuses the query processor from the YQL Core Team, and at the same time this team is the most significant contributor to the query processing facilities.
  • The Application Team makes it easy to use YDB. The team is responsible for all client SDKs.
  • The YQL Core Team develops core parts of YDB Query Language.
  • The Streaming Processing Team develops a part of Federated Query which allows us to process continuous queries over streams of data.
  • And finally, the DevOps Team is responsible for running our services in clouds and on premises, for monitoring, and for CI/CD processes. The team is also developing a Kubernetes operator for automated deployment and management.

Challenges

Distributed Storage and Network Interconnect

  • Seamless storage management, like migrating some part of a cluster from one availability zone to another
  • Actor system optimisations, like dynamic redistribution of CPU cores between thread pools
  • Cluster self-healing improvements and optimisations
  • Developing the Blob Depot component for seamless blob distribution over all storage groups. It is useful, for instance, for node decommissioning and for IOPS and size balancing

Tablets

  • Transactions should see their modifications before commit
  • Distributed transaction protocol optimisations (we know what to do)
  • Gather statistics about user data for query optimisations
  • Point-in-time recovery (PITR) support

Cluster Scaling

  • Improvements to internal subscription service to avoid bottlenecks;
  • Scaling State Storage.

Query Processor

  • Add support for PostgreSQL compatibility
  • Performance optimisations, such as caching programs closer to the shards that manage data
  • Add support for cost-based optimisation
  • Add support for Common joins
  • Add support for distributed data sorting without size limitations

Analytics

  • Optimise columnar store, add indexes;
  • Add HTAP support.

Cluster Management

  • Automated storage and compute scaling;
  • Integration with different cloud platforms (AWS/GCP/Azure).