Scale it easy: YDB's high performance in a nutshell

Evgenii Ivanov

Yandex Infrastructure

About speaker

Principal software developer at Yandex Infrastructure.

I've had a passion for programming and computer systems since my teenage years. Now, holding a Master's degree in Computer Science, I have been fortunate throughout my career to amass a breadth of experience. I participated in the Google Summer of Code, working on the KDE project, and contributed to several projects directly supervised by Prof. Andrew Tanenbaum in MINIX 3. I also worked in a fintech startup, focusing on the performance of its core components. Ultimately, I joined Yandex Infrastructure, where I now extensively focus on improving various aspects of its performance — a field I'm deeply passionate about. Outside of work, I enjoy spending time with my family, reading, and engaging in aerial photography.

About speaker's company

At Yandex, thousands of team members work on hundreds of products, producing tens of thousands of comments and pull requests every day. Building user-friendly infrastructure for developing and operating products at that scale is a serious challenge. We build the systems, services, and tools Yandex developers rely on. Our solutions are aimed at ensuring that every product Yandex delivers has ready-to-use infrastructure at every stage. We have our own version control system for storing source code; C++, Java, Python, and Go systems for distributed builds and seamless integration capable of processing hundreds of builds a minute; a distributed task management system; rollout systems; and app monitoring systems. We also develop products for supporting development processes, resource planning, and much more.

4 July, 13:30, «Hall 3»

Abstracts

Implementing a distributed database with strong consistency isn’t difficult; the challenge lies in ensuring speed and scalability. YDB excels in these aspects. In this talk, we’ll discuss YDB’s architecture and its high performance, present results of benchmarks, and compare YDB to top competitors.

Why are performance and scalability so important nowadays? We live in the era of big data, and performance is what makes big data practically useful. Imagine a petabyte database that can handle only 1K transactions per second, or the same database requiring 1M servers to operate. In both cases, the database would not be very useful. In today’s world, big data is everywhere; databases must be fast enough to handle this data and affordable for many companies or even individuals. Performance provides both speed and low cost. A truly high-performance database allows you to process more data using less hardware.

YDB is such a database: it is fast, affordable, and reliable. It combines high availability and scalability with strong consistency and ACID transactions. It is also time-tested: installations with thousands of servers and petabytes of data have been running for years. In an increasingly data-driven world, YDB’s blend of performance, scalability, and reliability sets it apart as a top choice for organizations seeking to maximize the value of their data.

In this talk, we’ll delve deeper into YDB’s architecture, starting with an overview of its key components and design principles, and discuss the role the architecture plays in achieving high performance and fault tolerance. Throughout the presentation, we’ll illustrate how the proper separation of key-value and query processing layers can improve performance on various request distributions. Additionally, we’ll examine YDB’s actor system, detailing its evolution and its impact on throughput and latency. To better illustrate YDB’s performance, we’ll provide comprehensive examples of various benchmarks, including:
* Low-level component testing to showcase the efficiency of individual modules
* Key-value YCSB tests to demonstrate YDB’s ability to handle high-throughput workloads
* Distributed transactions to highlight the system’s capacity for managing complex, multi-node operations

We’ll also share case studies of real-world optimization scenarios, enabling attendees to learn:
* How a simple yet effective configuration can increase throughput by up to 30%
* Strategies for addressing gRPC-layer bottlenecks and their impact on performance
* The crucial role of SDK implementations in shaping the end-user performance experience
* The tools and methods we employ in our performance optimization efforts, such as profiling and monitoring

In conclusion, we’ll compare YDB against top competitors such as CockroachDB and Yugabyte, analyzing aspects such as ease of deployment, performance, and scalability. By sharing these results, we’ll help the audience gain a better understanding of YDB’s capabilities and how it outperforms competing solutions in handling large-scale data processing tasks efficiently.

The talk was accepted to the conference program