• Architecture and Scalability (7)

    Vuk Marković

    EPAM

    Architecting real-time cloud collaboration systems

    Using cutting-edge technologies for real-time communication, you’ll learn how to architect and build a large-scale distributed system that acts as an expandable collaborative environment, capable of handling thousands of users in real time and allowing them to work together remotely.

    In this talk, you will learn how to architect and build a large-scale distributed system that acts as an expandable collaborative desktop environment, capable of handling thousands of users in real time and allowing them to work together remotely. By the end, the audience will be able to apply the same concept to engineering any multi-user system that requires robust state synchronization. The overall idea revolves around efficient server-side state management: finding a way to robustly update that state and distribute it to other machines and client devices. Finding such a strategy is not easy and requires research, benchmarking, and other relevant techniques to achieve the best performance for such a large number of users. Connecting the fields of systems engineering, software engineering, and engineering management, this talk will be an introduction to the many different projects that can be built on the foundation it lays out, presented in an interesting and challenging way.
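    To make the state-synchronization idea a little more concrete, here is a minimal sketch (an illustration only, not the speaker's implementation) of a server-side session state that accepts concurrent updates from many clients and resolves conflicts per key with a last-writer-wins rule; the class and field names are assumptions.

        import time
        import threading

        class SharedState:
            """Toy server-side state for a collaborative session.

            Each key stores (value, timestamp); concurrent updates are resolved
            per key with a last-writer-wins rule, and every apply() returns the
            snapshot that would be broadcast to connected clients.
            """

            def __init__(self):
                self._data = {}            # key -> (value, timestamp)
                self._lock = threading.Lock()

            def apply(self, key, value, ts=None):
                ts = ts if ts is not None else time.time()
                with self._lock:
                    current = self._data.get(key)
                    if current is None or ts >= current[1]:
                        self._data[key] = (value, ts)
                    return {k: v for k, (v, _) in self._data.items()}

        state = SharedState()
        state.apply("cursor:alice", [10, 20])
        print(state.apply("cursor:bob", [5, 7]))   # snapshot sent to all clients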

    The talk was accepted to the conference program

    Lia Yepremyan

    AMD Armenia

    AI embedded in the chip

    For a number of years, artificial intelligence has been at the top of the technology must-watch list for evolving trends and applications. With our ability to build smart machines that simulate human intelligence, the implications for technological advancement across numerous sectors are endless. So, what could be better than artificial intelligence? Embedded artificial intelligence.

    The Program Committee has not yet taken a decision on this talk

    Alexander Sibiryakov

    Zyte

    Kafka architecture: performance

    Kafka is a distributed messaging system capable of delivering high performance. In my talk I’ll explain the architecture of the broker and the client parts, with emphasis on the design concepts that enable high performance. It will be useful for system design practice and for an overall understanding of Kafka.

    There is not much information available on the net about Apache Kafka’s architecture, and I would like to fill this gap. In this talk, I’m going to roughly explain the design of distributed messaging in Kafka. To achieve state-of-the-art I/O performance, Kafka makes use of the page-writeback strategy and zero-copy, along with other optimisations on the client and server sides. To my understanding, these optimisations were the main driver of the design of the whole system, its protocol, and its client APIs. This talk will focus on the concepts responsible for high bandwidth and will leave out topics such as distributed agent coordination, cluster discovery, etc.

    This talk will be useful to architects and system programmers for training system design skills and improving overall understanding of Kafka, as well as to those who are using or considering using Apache Kafka.

    The main things we are going to discuss are:
    - Which architectural decisions make Kafka performant?
    - Meaning of efficiency for Kafka
    - How to benchmark Kafka
    - Theoretical throughput limit
    - Large diagram of a message write path
    - Overview and design of Kafka Producer client
    - Linux writeback
    - Read path diagram
    - Kafka Consumer client design
    - Summary: what makes Kafka efficient
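
    Not the speaker's material, just a rough illustration of the client-side knobs that the batching and acknowledgement discussion above implies, using the kafka-python library; the broker address, topic, and values are placeholders.

        # A minimal producer sketch (kafka-python) showing the client-side settings
        # that trade latency for throughput; the values are illustrative, not tuned.
        from kafka import KafkaProducer

        producer = KafkaProducer(
            bootstrap_servers="localhost:9092",  # placeholder broker address
            acks="all",                # wait for in-sync replicas before success
            compression_type="lz4",    # fewer bytes over the wire and on disk
            linger_ms=10,              # wait up to 10 ms to fill a batch
            batch_size=64 * 1024,      # larger batches favour sequential writes
        )

        for i in range(1000):
            producer.send("events", f"message-{i}".encode())

        producer.flush()  # block until all batched messages are acknowledged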

    The Program Committee has not yet taken a decision on this talk

    Edgar Mikayelyan

    Qrator Labs

    HTTP/2: The Good, The Bad and The Vulnerable

    4 July, 11:10, «Hall 1»

    HTTP/2 is a newer version of the HTTP protocol that enhances client-server communication with greater speed and efficiency. It also brings new vulnerabilities to the table. In this talk, we will explore the strengths and weaknesses of HTTP/2 and discuss how to protect against common DoS and DDoS attacks.

    HTTP/2 is a major upgrade to the HTTP protocol, introducing new features such as binary framing, multiplexing, and server push, which greatly improve the performance and efficiency of web communication. However, the new protocol also brings new vulnerabilities, especially in the context of DoS and DDoS attacks.

    In this talk, we will explore the inner workings of HTTP/2, including its benefits and drawbacks, as well as the potential impact of DoS and DDoS attacks on the protocol. We will look at common types of attacks that exploit HTTP/2 weaknesses, such as connection flooding and resource exhaustion, and provide practical tips on mitigating these risks, including specific vulnerabilities in the protocol and mitigation techniques such as rate limiting and traffic analysis. By the end of this talk, attendees will have a better understanding of HTTP/2 and how to protect their websites from DDoS attacks.
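
    As a hedged illustration of the rate-limiting mitigation mentioned above (not a recipe from the talk), here is a minimal token-bucket limiter in plain Python; in practice this logic lives in a reverse proxy or middleware in front of the HTTP/2 server, keyed by client IP or connection ID.

        import time
        from collections import defaultdict

        RATE = 50          # tokens added per second
        BURST = 100        # maximum bucket size

        buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

        def allow(client_id: str) -> bool:
            b = buckets[client_id]
            now = time.monotonic()
            b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
            b["last"] = now
            if b["tokens"] >= 1:
                b["tokens"] -= 1
                return True
            return False     # reject or delay: candidate for RST_STREAM / 429

        print(all(allow("10.0.0.1") for _ in range(100)))   # the burst fits
        print(allow("10.0.0.1"))                            # the 101st request is throttled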

    The talk was accepted to the conference program

    Karen Tovmasyan

    Uber

    Feet of clay: Achieving strong consistency over eventual

    4 July, 17:00, «Hall 1»

    Is it possible to build strongly consistent storage systems on top of eventual? The short answer is “yes”, and the long answer is “yes, but not really, but maybe, but it depends.”

    The industry introduced distributed transactions with asynchronous replication to overcome the limits of write performance. However, with distributed transactions and replica reads, eventual consistency came along as a caveat. In this talk, we will explore an opportunity to build a strongly consistent caching system on top of the eventually consistent Redis Cluster.
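
    To give a flavour of the kind of building block involved (an assumption, not the talk's actual design), here is a sketch with the redis-py client that uses Redis's WAIT command to block until a write has reached at least one replica before treating it as acknowledged; the host, port, and key names are placeholders.

        # Acknowledge a write only after at least one replica has it.
        # WAIT returns the number of replicas that acknowledged the write.
        import redis

        r = redis.Redis(host="localhost", port=6379)

        def consistent_set(key: str, value: str, min_replicas: int = 1, timeout_ms: int = 200) -> bool:
            r.set(key, value)
            acked = r.execute_command("WAIT", min_replicas, timeout_ms)
            return int(acked) >= min_replicas   # caller can retry or fail over if False

        if consistent_set("user:42:balance", "100"):
            print("write replicated, safe to read from replicas")
        else:
            print("replication lagging: fall back to reading from the primary")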

    The talk was accepted to the conference program

    Nick Shadrin

    NGINX

    HTTP/3: Shiny New Thing, or More Issues?

    4 July, 12:20, «Hall 1»

    The new iteration of the HTTP protocol brings more challenges than previous upgrades did. We now need UDP transport and a different negotiation method; another complication is the need for embedded security/encryption. You will learn how to implement this protocol in your network and, most importantly, why.

    Plan:

    – Intro.
    – Main differences between HTTP/3 and previous protocols.
    – UDP as a transport.
    – Encryption differences.
    – Connection ID.
    – Protocol negotiation and fallback to other protocols (illustrated with a small sketch below).
    – Use of HTTP/3 in real networks.
    – All the boxes in between a client and a server.
    – Security issues.
    – Implementation in NGINX.
    – Conclusions and questions.
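
    To illustrate the negotiation-and-fallback item above: clients usually discover HTTP/3 through an Alt-Svc header returned over HTTP/1.1 or HTTP/2 and only then attempt QUIC over UDP, falling back to TCP if that fails. A minimal sketch using only the Python standard library (the advertised port is a placeholder, and in real deployments the header is honoured only over HTTPS):

        # Advertise an HTTP/3 (QUIC, UDP) endpoint from an ordinary TCP server.
        # Clients that support h3 may switch for subsequent requests; everyone
        # else silently stays on HTTP/1.1 or HTTP/2, which is the fallback path.
        from http.server import BaseHTTPRequestHandler, HTTPServer

        class AltSvcHandler(BaseHTTPRequestHandler):
            def do_GET(self):
                self.send_response(200)
                self.send_header("Alt-Svc", 'h3=":443"; ma=86400')   # placeholder UDP port
                self.send_header("Content-Type", "text/plain")
                self.end_headers()
                self.wfile.write(b"served over TCP; HTTP/3 advertised via Alt-Svc\n")

        if __name__ == "__main__":
            HTTPServer(("127.0.0.1", 8080), AltSvcHandler).serve_forever()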

    The talk was accepted to the conference program

    Edoardo Vacchi

    Tetrate

    WebAssembly from the inside out

    4 July, 15:50, «Hall 2»

    A WebAssembly runtime is an embeddable virtual machine. It allows platforms to dynamically load and execute third-party bytecode without rebuilding their OS- and architecture-specific binaries. While WebAssembly is a W3C standard, a lot of machinery is required to make this work, and many critical aspects are out of scope, left to the implementation.

    Most presentations discuss WebAssembly at a high level, focusing on how end users write a program that compiles to Wasm, with high-level discussions of how a virtual machine enables this technology. This presentation goes the other way around: it follows a function call beginning with the bytecode, through its translation to machine code, and finally to how it is invoked from host code.

    The implementation used for the discussion is the wazero runtime, so the talk will include some implementation anecdotes that affect these stages, as well as some design aspects of user functions that allow them to behave more efficiently. We will also indulge in a high-level comparison with other runtimes and similar technologies that have tried to solve the same problem in the past.

    When you leave, you'll know how one WebAssembly runtime implements function calls from a technical point of view.
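
    wazero itself is a Go library, but the host-side flow described above (load bytecode, compile or interpret it, then invoke an exported function from host code) looks similar in any embedding. Here is a rough sketch using the wasmtime Python bindings; the exact API names are an assumption, and the WAT module stands in for real bytecode:

        # Host code loading a Wasm module and invoking an exported function.
        # API names follow the wasmtime Python package; treat them as an assumption.
        from wasmtime import Engine, Store, Module, Instance

        WAT = """
        (module
          (func (export "add") (param i32 i32) (result i32)
            local.get 0
            local.get 1
            i32.add))
        """

        engine = Engine()
        store = Store(engine)
        module = Module(engine, WAT)            # text or binary bytecode is compiled here
        instance = Instance(store, module, [])  # no imports for this tiny module
        add = instance.exports(store)["add"]    # exported function, callable from the host
        print(add(store, 2, 3))                 # -> 5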

    The talk was accepted to the conference program

  • Databases (11)

    Alexey Goncharuk

    Querify Labs

    JIT In the Wild: What Databases Do To Speed Up Your Queries

    4 July, 10:00, «Hall 1»

    Just-in-time (JIT) compilation is widely known, at least to Java developers and users, but is JIT limited to virtual machines only?

    In this talk, we will explore why and how modern databases and data management systems use JIT to make the most of the available computational resources and improve query performance, see how JIT and ahead-of-time compilation work together, and glance over some of the available JIT toolkits for Java and C++ and how they can be used in non-database projects.
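
    To give a flavour of the idea (a toy sketch, not any particular engine's implementation): instead of interpreting a query plan row by row, the engine generates code specialized for the query once and then runs the compiled form over the data. Here Python's compile() stands in for emitting LLVM IR or machine code:

        # Toy illustration of query JIT: specialize the predicate once, then run it.
        rows = [{"price": p, "qty": q} for p, q in [(10, 3), (250, 1), (99, 7)]]

        predicate_src = "row['price'] > 100 and row['qty'] >= 1"   # WHERE price > 100 AND qty >= 1
        compiled = compile(predicate_src, "<query>", "eval")        # one-time "JIT" cost

        matches = [row for row in rows if eval(compiled, {}, {"row": row})]
        print(matches)   # [{'price': 250, 'qty': 1}]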

    The talk was accepted to the conference program

    Oleg Bondar

    Yandex Infrastructure

    YDB is an open-source Distributed SQL Database

    4 July, 10:00, «Hall 3»

    YDB is an open-source, distributed, horizontally scalable, fault-tolerant database built from scratch. I would like to talk about YDB’s design and internals, the decisions made along the way, and its use cases.

    YDB is a distributed, horizontally scalable, fault-tolerant database built from scratch. Its key characteristics include:
    * strict consistency, with the option to relax guarantees for better performance;
    * horizontal scalability;
    * geo-replication for availability;
    * support for high-performance multi-row and multi-table ACID OLTP transactions;
    * support for row-based and column-based tables;
    * support for embedded streams;
    * support for the declarative query language YQL (a dialect of SQL);
    * distributed query execution over heterogeneous databases and storages, including PostgreSQL, S3, etc.

    YDB is used as a mission-critical database for many Internet-scale services. It has been designed as a platform for various data storage and processing systems and is aimed at solving a wide range of problems.

    YDB is currently used as a core component of:
    * a system for continuous stream processing;
    * persistent queues;
    * a high-performance network block store.

    This will be the first time a talk about YDB is presented at such a big international conference.

    The talk was accepted to the conference program

    Mons Anderson

    Exness

    Reliability of In-Memory Databases on the example of Tarantool

    4 July, 15:50, «Hall 3»

    An in-memory database is not a new concept. To this day, such solutions are strongly associated with the terms "cache", "non-persistent", and "unreliable".

    In-memory databases have a much wider range of applications than caching, and their reliability is no worse than that of the most proven relational databases.

    I will tell you about approaches that allow an in-memory database to be as reliable as sunrise.

    I will walk through the database engine, from the incoming request to synchronous replication and the transaction mechanism, at a speed of 1 million RPS.

    The purpose of my talk is to show that in-memory technologies are already mature and reliable enough to be the primary data storage for your product.

    The talk was accepted to the conference program

    Igor Latkin

    KTS

    Building the Perfect Queue Broker: Our Journey with Tarantool

    4 July, 17:00, «Hall 3»

    Microservice communication is a crucial part of most applications, but it is hard to find the one solution that satisfies every arising need. We’ll discuss how we at KTS created a queue broker service on top of a distributed Tarantool database cluster that provides everything we want from a queue broker, from channels & tenants to deferred tasks and strict per-user FIFO, while remaining highly performant and fault-tolerant. I’m going to outline why none of the existing solutions like Apache Kafka, RabbitMQ, NATS and others satisfied our needs, and share the implementation details, choices, and compromises we made. I’ll try to answer the question of why Tarantool proves (again?) to be the most flexible platform for data manipulation in the OLTP world.

    The talk was accepted to the conference program

    Evgenii Ivanov

    Yandex Infrastructure

    Scale it easy: YDB's high performance in a nutshell

    4 July, 13:30, «Hall 3»

    Implementing a distributed database with strong consistency isn’t difficult; the challenge lies in ensuring speed and scalability. YDB excels in these aspects. In this talk, we’ll discuss YDB’s architecture and its high performance, present benchmark results, and compare YDB with its top competitors.

    Why are performance and scalability so important nowadays? We live in the era of big data, and performance is what makes big data practically useful. Imagine a petabyte database that can handle only 1K transactions per second, or the same database requiring 1M servers to operate. In both cases, the database would not be very useful. In today’s world, big data is everywhere; databases must be fast enough to handle this data and affordable for many companies or even individuals. Performance provides both speed and low cost. A truly high-performance database allows you to process more data using less hardware.

    YDB is such a database: it is fast, affordable, and reliable. It combines high availability and scalability with strong consistency and ACID transactions. It is time-tested: installations with thousands of servers and petabytes of data have been running for years. In an increasingly data-driven world, YDB’s blend of performance, scalability, and reliability sets it apart as a top choice for organizations seeking to maximize the value of their data.

    In this talk, we’ll delve deeper into YDB’s architecture, starting with an overview of its key components and design principles, and discuss their role in achieving high performance and fault tolerance. Throughout the presentation, we’ll illustrate how the proper separation of the key-value and query-processing layers can improve performance under various request distributions. Additionally, we’ll examine YDB’s actor system, detailing its evolution and its impact on throughput and latency. To better illustrate YDB’s performance, we’ll provide comprehensive examples of various benchmarks, including:
    * Low-level component testing to showcase the efficiency of individual modules
    * Key-value YCSB tests to demonstrate YDB’s ability to handle high-throughput workloads
    * Distributed transactions to highlight the system’s capacity for managing complex, multi-node operations

    We’ll also share case studies of real-world optimization scenarios, enabling attendees to learn:
    * How a simple yet effective configuration can increase throughput by up to 30%
    * Strategies for addressing GRPC-layer bottlenecks and their impact on performance
    * The crucial role of SDK implementations in shaping the end-user performance experience
    * The tools and methods we employ in our performance optimization efforts, such as profiling and monitoring

    In conclusion, we’ll compare YDB against top competitors such as CockroachDB and Yugabyte, analyzing various aspects such as ease of deployment, performance and scalability. By sharing these results, we’ll help the audience gain a better understanding of YDB’s capabilities and how it outperforms competing solutions in handling large-scale data processing tasks efficiently.

    The talk was accepted to the conference program

    Grigory Reznikov

    YTsaurus

    Cypress: a distributed transactional file system in YTsaurus

    4 July, 11:10, «Hall 3»

    YTsaurus is a data storage and processing system that recently became open source. The system includes storage for huge tables, efficient data processing engines that allow executing ad-hoc analytical queries as well as building data processing pipelines, and an efficient OLTP key-value store. The result of 12 years of work by experienced developers, YTsaurus has proved itself a reliable, efficient, and convenient way to manage data for different purposes. YTsaurus scales well, letting you store exabytes of data and process them using millions of CPU cores.

    Cypress is a distributed file system used in the core of YTsaurus. In this talk, we will discuss the Cypress interface and its functionality, as well as demonstrate how the Cypress features can be used to simplify working with data.

    We will pay special attention to transactions and discuss how transactions allow using Cypress as a distributed coordination system.

    The talk was accepted to the conference program

    Daniil Zakhlystov

    Nebius

    Greenplum Physical Backups with WAL-G

    4 July, 18:10, «Hall 3»

    In 2022, we released the Yandex Managed Service for Greenplum® to the public. One of its features is physical backups via WAL-G instead of logical ones (gpbackup/pg_dump). I’ll tell you how we implemented cluster-wide consistent physical backups, PITR, delta backups, and other nice features that are now available in WAL-G.

    At a quick glance, Greenplum itself is just multiple Postgres instances wrapped together. Since we already have WAL-G for Postgres physical backups, it should be easy to create a physical backup of Greenplum instances, right? However, there were no existing solutions for Greenplum physical backups, so we decided to invent our own.

    I’ll go over the following topics:
    • how we were the first in the world to implement physical backups for Greenplum;
    • what challenges we’ve faced and how we solved them;
    • what benefits we’ve received as a result;
    • and, of course, I’ll tell you how to begin using physical backups with WAL-G right now!

    The talk was accepted to the conference program

    Ashot Vardanian

    Unum Cloud

    Vector Search and Databases at Scale

    4 July, 12:20, «Hall 3»

    Vector search databases appear on every corner. Most people have already heard about Pinecone, Weaviate, Qdrant, and the open-source libraries that preceded them ― Facebook's FAISS and Google's ScaNN. We will look into the algorithms for approximate nearest-neighbour search, profile them, and highlight the bottlenecks that stop most of them from scaling beyond a billion entries. This will help you navigate the increasingly complex space of vector and semantic search products, and choose the optimal configuration parameters for indexing, the right neural network to produce embeddings, and the appropriate hardware for the task.
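
    As a baseline for that profiling discussion, here is a hedged sketch of exact (brute-force) cosine search with NumPy: it is the reference that approximate indexes such as HNSW are measured against, and its one-pass-over-everything memory cost is precisely what stops naive approaches from scaling to billions of entries. Sizes and dimensions are illustrative.

        import numpy as np

        rng = np.random.default_rng(0)
        db = rng.normal(size=(100_000, 768)).astype(np.float32)     # stored embeddings
        db /= np.linalg.norm(db, axis=1, keepdims=True)             # normalize once

        def search(query: np.ndarray, k: int = 10) -> np.ndarray:
            q = query / np.linalg.norm(query)
            scores = db @ q                          # one pass over every vector: O(N*d)
            return np.argpartition(-scores, k)[:k]   # indices of the top-k neighbours

        print(search(rng.normal(size=768).astype(np.float32)))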

    The talk was accepted to the conference program

    Peter Zaitsev

    Percona

    From a database in a container to DBaaS on Kubernetes

    4 July, 14:40, «Hall 1»

    DBaaS is growing fast, but it’s typically proprietary and tied to one cloud vendor. We believe Kubernetes finally allows us to build a fully open-source DBaaS solution that can be deployed anywhere Kubernetes runs. Join us as we discuss running databases on Kubernetes in production.

    DBaaS is the fastest-growing way to deploy databases. It is fast and convenient, and it helps to reduce toil a lot, yet it is typically built on proprietary software and tightly coupled with the cloud vendor. We believe Kubernetes finally allows us to build a fully open-source DBaaS solution that can be deployed anywhere Kubernetes runs ― in the public cloud or in your private data center.

    In this presentation, we will talk about all the important steps it takes to run a database on Kubernetes in production. We will answer the following questions:
    1. Can you do it without operators?
    2. Can you work with k8s primitives only to run a production-grade DB, and then a DBaaS? (A tiny example of such a primitive follows below.)
    We will describe the most important user requirements and the typical problems you would encounter while building DBaaS solutions, and explain how you can solve them using the Kubernetes operator framework.
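
    For readers wondering what "k8s primitives only" looks like in code, here is a hedged sketch using the official Kubernetes Python client to request the kind of storage a database pod consumes; the names, size, and namespace are placeholders, and a real DBaaS wraps many such steps in an operator.

        # Create a PersistentVolumeClaim, the kind of raw primitive a database
        # StatefulSet consumes, using the official kubernetes Python client.
        from kubernetes import client, config

        config.load_kube_config()                      # or load_incluster_config()
        core = client.CoreV1Api()

        pvc = client.V1PersistentVolumeClaim(
            metadata=client.V1ObjectMeta(name="db-data-0"),
            spec=client.V1PersistentVolumeClaimSpec(
                access_modes=["ReadWriteOnce"],
                resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
            ),
        )
        core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)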

    The talk was accepted to the conference program

    Evgeny Ermakov

    Toloka.AI

    Modern Data Stack: Choose Suffer Love

    Data Lakehouse, Data Mesh, and Data Fabric are loud-sounding terms in the data world. But what if the desire to “implement hype tech” suddenly turns into reality? Be careful what you wish for, because that’s exactly what happened at Toloka.AI — the data platform team was given the task: ”Azure. Modern Data Stack. Tomorrow.”

    As part of our talk, we will go through all the stages of our work with the Modern Data Stack — choose, suffer, love — and answer the following questions:

    * What is a modern data platform, and on what pillars does it rest?
    * How do you avoid drowning in the world of Modern Data Stack solutions?
    * What integration issues between different systems can you encounter?
    * Which tools (of those that we have tried) are must-haves, and which ones can be replaced?
    * How has the work of analysts changed (and has it changed at all)?
    * What is worth repeating, and what isn’t, if you follow the same path?

    The Program Committee has not yet taken a decision on this talk

    Maxim Pchelin

    Nebius

    Vladimir Verstov

    Yandex Go

    Data Management Platform over YTsaurus: The Path from User to Contributor

    4 July, 14:40, «Hall 3»

    What do you do when your business wants a fast, reliable, and convenient data management platform (DMP) for your DWH without spending money on paid solutions? How do you build a high-quality system for processing petabytes of data using open-source tools, modifying them when necessary?

    In the talk, we will share our experience of creating a DMP that ensures the day-to-day operation of the DWH for Yandex Taxi, Eats, Grocery, Delivery, and others. We will present our technology stack, primarily based on YTsaurus ― the brand-new open-source alternative to Hadoop. We will tell you how we built our platform on top of it, when we used other tools (like ClickHouse or Greenplum), and how we contributed improvements back to YTsaurus.

    The talk was accepted to the conference program

  • Big Data and Machine Learning (2)

    Polina Komissarova

    Evocargo

    How to make a self-driving truck see: from the DARPA Challenge to today

    4 July, 10:00, «Hall 2»

    Join me to discover how to create a self-driving truck, how to solve perception tasks using deep learning and LiDARs, and what challenges LiDAR perception faces in unique driving conditions, from the UAE to beyond the Arctic Circle.

    Self-driving cars are set to change the future of transportation. LiDARs, along with cameras and other sensors, are the eyes of self-driving cars, providing unparalleled accuracy and precision in perceiving the environment around them and enabling safer, more efficient navigation. In my talk, I will explain how Evocargo solves object detection and semantic segmentation tasks, as well as the challenges that perception faces in diverse weather conditions across different locations: from the UAE to beyond the Arctic Circle.
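
    As a tiny, hedged illustration of where such a pipeline typically starts (not Evocargo's code): a LiDAR sweep is an unordered set of 3D points, and a common first step before a detection or segmentation network is to quantize it into a voxel or bird's-eye-view grid.

        import numpy as np

        # Fake LiDAR sweep: N points as (x, y, z) in metres around the vehicle.
        rng = np.random.default_rng(1)
        points = rng.uniform(low=[-50, -50, -2], high=[50, 50, 3], size=(120_000, 3))

        VOXEL = 0.5                                      # 0.5 m grid, an illustrative choice
        origin = points.min(axis=0)
        idx = np.floor((points - origin) / VOXEL).astype(np.int32)

        # Occupancy grid in the ground plane (a minimal bird's-eye-view feature map).
        bev = np.zeros(idx[:, :2].max(axis=0) + 1, dtype=np.int32)
        np.add.at(bev, (idx[:, 0], idx[:, 1]), 1)        # count points per cell
        print(bev.shape, bev.max())                      # grid size, densest cell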

    The talk was accepted to the conference program

    Nemanja Milicević

    SmartCat

    Real-time Personalization with Kafka Streams and TensorFlow

    4 July, 18:10, «Hall 2»

    This talk explores real-time recommender systems from an engineering perspective. The talk will cover a wide range of topics, including an overview of recommender systems in general, a case study on online sports betting personalization with Kafka Streams and TensorFlow, as well as considerations for system operations and performance.

    While batch recommendations can be useful in certain contexts, there are many situations where real-time recommendations are necessary for optimal user experience. The talk will explore the differences between offline and online environments for recommender systems and will provide an overview of the different components that make up a recommender system, including retrieval, filtering, scoring, and ordering.
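
    To make those four components concrete, here is a minimal, hedged sketch of the pipeline in plain Python; the scoring function stands in for a trained TensorFlow model, and all item data is made up.

        # Toy retrieval -> filtering -> scoring -> ordering pipeline for one user event.
        CATALOG = [
            {"id": 1, "league": "NBA", "live": True},
            {"id": 2, "league": "NBA", "live": False},
            {"id": 3, "league": "NHL", "live": True},
        ]

        def retrieve(event):                        # cheap candidate generation
            return [m for m in CATALOG if m["league"] in event["leagues"]]

        def filter_out(event, candidates):          # business rules, e.g. live-only
            return [m for m in candidates if m["live"]]

        def score(event, candidates):               # stand-in for a TensorFlow model
            return [(m, 0.9 if m["league"] == event["leagues"][0] else 0.5) for m in candidates]

        def order(scored, k=10):                    # final ranking returned to the user
            return [m for m, s in sorted(scored, key=lambda x: -x[1])][:k]

        event = {"user": 42, "leagues": ["NBA", "NHL"]}
        print(order(score(event, filter_out(event, retrieve(event)))))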

    The case study on online sports betting personalization with Kafka Streams and TensorFlow will be a major highlight of the talk. There will be a discussion of why Kafka Streams was chosen for this particular use case, as well as an exploration of alternatives such as Apache Flink. The talk will include an architecture diagram that outlines the various components of the system, including Apache Spark and TensorFlow for model training, MLflow as a model registry, and Apache Cassandra for serving recommendations. The talk will also cover considerations for the feature store, including the use of Apache Cassandra and Kafka Streams GlobalKTable abstraction with local RocksDB cache. Finally, there will be a discussion of how Apache Cassandra can be used for serving recommendations, including the use of Kafka Connect sink connectors.

    In addition to the case study, the talk will also cover various operational and performance considerations in recommender systems. There will be a discussion of Kafka consumer rebalancing, including adding/removing instances and how standby state replicas can help alleviate this problem. Finally, there will be a discussion of deploying new model versions, feature data quality pipelines, and the importance of model monitoring and retraining.

    Overall, the talk will provide a comprehensive overview of real-time recommender systems from an engineering perspective and will be of interest to anyone interested in building intelligent systems that can handle large volumes of real-time data.

    The talk was accepted to the conference program

  • Security (1)

    George Tarasov

    Qrator Labs

    What measure is a robot? Understanding and managing web bot activity

    4 July, 18:10, «Hall 1»

    Bots are much more active and varied than humans when it comes to using the Web. Among them are the complex ones imitating human-controlled web browsers for data scraping, fraud, or just to ruin your pet project’s day. What’s their gain? Why are they so popular? How can we stop them? Let’s find out.

    While useful bots make themselves instantly recognizable in your website’s traffic, the unwanted ones either don’t care or do everything to pass as real human users. The simplest bots involve scripted access to web pages to extract or stuff data. Some go directly for the site’s backend API whenever possible. They can make up to 33M requests in one go. We’ll inspect them briefly, because more exciting species are higher up the food chain.

    Browser bots controlled by an orchestrator, like headless Chrome with Playwright, have long been a favorite of web scrapers who need to bypass protection. If there is aggregated data of any value – credentials, stock prices, scientific measurements – there’s a high chance browser bots are collecting it for fun and profit. Why them? Most valuable data has a short lifetime, and lots of artificial users are needed to grab it fast while remaining unsuspicious. Being headless, these browsers can be optimized for minimal CPU & memory consumption, so you can keep thousands of them running in your virtual bot farm relatively cheaply. Thus, while constituting just about 2% of all the bot traffic we inspected last quarter, these robots punch well above their weight, affecting performance and business metrics alike.

    Recently, Chrome has received an update adding a --headless=new mode, making scraper bots nearly undetectable. The complexity of required countermeasures has also jumped up a notch, yet still doesn’t guarantee 100% detection. With that in mind, it’s time to look at the available damage control strategies, namely:

    – ‘Home defense’ against more primitive but persistent scripted web & API bots;
    – jury-rigging basic browser bot detection for your website (a small sketch follows below);
    – finding out whether you really need something bigger: specific cases when advanced browser bots can cause lots of pain.
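
    A tiny, hedged example of what such jury-rigged detection could look like (the thresholds and header checks are illustrative, not a recipe from the talk):

        # Naive per-client heuristics: request rate, missing browser headers, and
        # whether the client ever fetched the static assets a real browser loads.
        import time
        from collections import defaultdict

        clients = defaultdict(lambda: {"hits": 0, "first": time.monotonic(), "assets": 0})

        def observe(ip: str, path: str, headers: dict) -> None:
            c = clients[ip]
            c["hits"] += 1
            if path.endswith((".js", ".css", ".png", ".woff2")):
                c["assets"] += 1
            c.setdefault("ua", headers.get("User-Agent", ""))
            c.setdefault("lang", headers.get("Accept-Language"))

        def suspicion(ip: str) -> int:
            c = clients[ip]
            rate = c["hits"] / max(time.monotonic() - c["first"], 1e-3)
            score = 0
            score += rate > 10                              # hammering pages
            score += c["assets"] == 0 and c["hits"] > 20    # never loads JS/CSS/images
            score += not c.get("lang")                      # headless defaults often omit it
            return score                                    # e.g. >= 2 -> challenge or block

        observe("203.0.113.7", "/prices", {"User-Agent": "python-requests/2.31"})
        print(suspicion("203.0.113.7"))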

    I really hope this talk helps you get useful insights on how to treat bad bots in your traffic. Whether to observe, to manage, or to try and get rid of them – it’s up to you!

    The talk was accepted to the conference program

  • Video and Streaming (1)

    Ilya Mikhaelis

    All you want to know about WebRTC on the backend side

    4 July, 14:40, «Hall 2»

    Over the past several years, the share of live streaming content has increased dramatically. But the main problem with live content is latency. Waiting 5-10 seconds for the answer to a simple question gets annoying and kills all communication dynamics, doesn’t it? Yet these are typical latency values for standard live streaming protocols like HLS or DASH. However, here is one of the game changers in live streaming: WebRTC. This technology makes streaming with millisecond latency possible from all common browsers.

    While cooking up WebRTC on the frontend side is very simple, on the backend side it’s a real pain in the neck: most of the open-source solutions (and, to be honest, not only the open-source ones) aren’t suitable for real high load. At the same time, there isn’t enough information on how to develop your own WebRTC backend.

    In this talk, we will:

    - revise the theory: the definitions of the key objects in video streaming, an overview of the most popular live streaming protocols today, and what latency they have;
    - explain what WebRTC is from the backend point of view: why we failed with the open-source implementations, and how to build your own WebRTC implementation on the backend side (a minimal sketch follows after this list);
    - discuss the top problems you will face developing your own WebRTC and how to overcome them (picture freezes and artifacts, black screen, etc.);
    - and, last but not least, show the results of our development in numbers.
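
    As a taste of what "WebRTC on the backend" means in code, here is a heavily simplified sketch using the Python aiortc library (an assumption: the talk's own implementation is not aiortc). A real media server adds ICE/STUN/TURN configuration, congestion control, and RTP forwarding to many peers.

        # Minimal answering peer: accept a browser's SDP offer, attach handlers for
        # incoming media, and return the SDP answer over your signalling channel.
        import asyncio
        from aiortc import RTCPeerConnection, RTCSessionDescription

        async def handle_offer(offer_sdp: str) -> str:
            pc = RTCPeerConnection()

            @pc.on("track")
            def on_track(track):
                # In a real backend this is where RTP packets are forwarded, recorded
                # or transcoded; freezes and black screens usually start here.
                print("incoming", track.kind, "track")

            await pc.setRemoteDescription(RTCSessionDescription(sdp=offer_sdp, type="offer"))
            await pc.setLocalDescription(await pc.createAnswer())
            return pc.localDescription.sdp

        # asyncio.run(handle_offer(offer_sdp_from_browser))  # wire this to your signalling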

    The talk was accepted to the conference program

  • Internet of Things (1)

    Nenad Ilić

    AWS

    Orchestrating Application Workloads in Distributed Embedded Systems

    4 July, 17:00, «Hall 2»

    With the rise of edge computing, businesses are increasingly looking for ways to orchestrate application workloads in a multi-SoC (multi-virtual-machine) environment on embedded devices. In this talk, we’ll explore how to achieve this using open-source technologies such as AWS IoT Greengrass and HashiCorp Nomad. We’ll discuss the benefits of using these technologies and the considerations that need to be taken into account when choosing them for your use case. Additionally, we’ll provide real-world examples of how they can be used to achieve effective orchestration of application workloads in a multi-partition environment on embedded devices.

    The talk was accepted to the conference program

  • Infrastructure and Platforms (5)

    Andrei Kvapil (kvaps)

    Ænix

    LINSTOR Is Like Kubernetes, But for Block Devices

    4 July, 12:20, «Hall 2»

    LINSTOR is open-source storage from LINBIT (the maintainers of DRBD). It’s fast and fault-tolerant and has a lot of features. It looks like Kubernetes, but for block devices. How does it work? How do you configure and debug it?

    In this presentation, I’ll show you LINSTOR, an open-source storage from LINBIT (maintainers of DRBD).

    DRBD v9 changed course from “one large fault-tolerant device for all” to “a separate DRBD device per virtual machine”. It now supports diskless replicas, snapshots, encryption, and much more, and everything can be orchestrated via a Kubernetes-like API.

    The talk was accepted to the conference program

    Anastasiia Abrashitova

    Yandex Infrastructure

    Mom, I'm in love with a monorepo

    4 July, 15:50, «Hall 1»

    The world we live in today is all about separation and breaking things into smaller parts. And yet, at Yandex we still have a large-scale monorepository where the code of most of our projects is kept. It is 26 years old and really large, taking up more than 10 TB of code with history. So why do we love it? Let me tell you!

    First and foremost, I’ll talk about our philosophy. Having a single repository does not give you any benefits by itself. To get them, you need to reuse code and build common libraries and utilities. You need to link projects by code, not by artifacts. You need to maintain a green trunk and full-scale CI/CD. You need to limit languages and open-source libraries to prevent unmanageable code diversification. You need common code rules and styles. And you need developers ready to collaborate, roll up their sleeves, and fix each other’s code. Oh, and you also need tools capable of working with a very large code base efficiently. I’ll give a retrospective of Arcadia – the Yandex monorepository with 26 years of history. I’ll talk about the rules we chose and the tools we created to make it efficient, useful, and worthy of our developers’ love.

    The talk was accepted to the conference program

    George Melikov

    VK Cloud, OpenZFS contributor

    IO, we have a problem! Coping via ZFS

    4 July, 13:30, «Hall 2»

    Storage is always a pain, and this pain comes in different flavors: I/O throughput, latency, HDD, SSD, NVMe, or even in-memory.
    On top of all that sits the volume manager.
    We should remember about backups too, and keep them consistent.
    And, of course, downtime is usually not acceptable.
    We’ll talk about a less standard way to cope with these problems: OpenZFS, a file system and volume manager in one.
    A view from a contributor’s side.

    The talk was accepted to the conference program

    Viktor Vedmich

    Amazon Web Services

    Mike Golubev

    Amazon Web Services

    Is Arm for you? Patterns and anti-patterns for migration

    4 July, 13:30, «Hall 1»

    There is a lot of buzz about the Arm architecture lately, but should you really migrate, or is x86 a better choice for you? We will discuss patterns indicating when you are ready for migration and anti-patterns showing when it is better to hold off. Learn how to measure and troubleshoot performance: find out which application affects your system SLOs, which part of the source code is to blame, and whether it’s software- or hardware-related. Is it really straightforward to migrate databases? How do you build containers for Arm, and how should you never do it?

    The talk was accepted to the conference program

    Ivica Bogosavljević

    Johnny's Software Lab

    A software engineer's view into memory operations

    4 July, 11:10, «Hall 2»

    Not all programs are created equal: some use hardware resources optimally, others not so much. In this short talk, we’ll introduce two types of hardware bottlenecks: the CPU bottleneck and the memory bottleneck. We will show how to find out whether your code is memory-bound or CPU-bound and share some ideas on how to overcome these limitations.
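
    A quick, hedged way to feel the difference on your own machine (NumPy stands in for compiled code here, and exact numbers depend on your CPU and memory subsystem): the two runs below do the same arithmetic, but one streams through memory while the other gathers from effectively random addresses and stalls on cache misses.

        import time
        import numpy as np

        n = 20_000_000
        data = np.arange(n, dtype=np.float64)

        seq_idx = np.arange(n)                 # sequential: cache/prefetcher friendly
        rnd_idx = np.random.permutation(n)     # random: most accesses miss the cache

        for name, idx in [("sequential", seq_idx), ("random", rnd_idx)]:
            t0 = time.perf_counter()
            total = data[idx].sum()            # identical arithmetic work in both cases
            print(f"{name:10s} {time.perf_counter() - t0:.3f}s (sum={total:.0f})")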

    The talk was accepted to the conference program