Stateful Workloads on Kubernetes – Webinar by Datadog

Date: May 20, 2024

In a recent Datadog webinar, engineers shared the company’s practices and lessons learned from running stateful workloads such as Kafka and PostgreSQL on Kubernetes. The event, part of Datadog’s ‘Datadog on’ series, was the first of these otherwise online sessions to be held before a live audience, at the Datadog Summit in London.

The presenters, Edward and Martin, introduced themselves and provided context on Datadog’s scale and infrastructure. Datadog, an observability platform, manages telemetry data for over 27,000 customers, translating to tens of trillions of data points per day. This vast amount of data is processed across thousands of Kubernetes clusters, each with thousands of nodes, highlighting the complexity and scale at which Datadog operates.

The session delved into the technical challenges and solutions associated with running stateful workloads on Kubernetes. Stateless applications are relatively straightforward to manage on Kubernetes because pods can be killed and rescheduled freely; stateful workloads introduce complexity because their data must survive pod rescheduling and node failures, so every operational action has to preserve data integrity and avoid data loss.

Martin discussed Kafka, a distributed streaming platform critical to Datadog’s operations. Kafka supports high-throughput, low-latency workloads, making it well suited to Datadog’s needs. Martin detailed how Kafka clusters are managed using Kubernetes StatefulSets, PersistentVolumeClaims (PVCs), and node affinity rules to ensure data persistence and resilience.
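
For concreteness, a minimal sketch of that pattern might look like the manifest below; the names, image, node-pool label, and sizes are hypothetical, not Datadog’s actual configuration. The volumeClaimTemplate gives each broker its own PVC that outlives any individual pod, and the node affinity rule pins brokers to a dedicated node pool.

```yaml
# Hypothetical sketch of a Kafka broker StatefulSet; names, image,
# node-pool label, and sizes are illustrative, not Datadog's manifests.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless      # stable per-broker DNS identities
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      affinity:
        nodeAffinity:              # pin brokers to a dedicated node pool
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-pool # hypothetical node label
                    operator: In
                    values: ["kafka"]
      containers:
        - name: kafka
          image: apache/kafka:3.7.0
          volumeMounts:
            - name: data
              mountPath: /var/lib/kafka
  volumeClaimTemplates:            # one PVC per broker, survives pod restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Ti
```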

One significant topic covered was the distinction between local and remote storage. Local storage offers performance benefits but can complicate node replacements, as data needs to be copied back to new nodes. Remote storage, such as cloud block storage, simplifies this process by allowing data volumes to be reattached to new nodes, though it can introduce latency and performance trade-offs.
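
To make the trade-off concrete, here is a minimal sketch of how the two options could be exposed as StorageClasses; the class names are hypothetical, and the provisioners shown (the AWS EBS CSI driver and the static local-volume provisioner) are common examples rather than anything the webinar specified.

```yaml
# Remote block storage: volumes can detach from a failed node and
# reattach to its replacement, so no data copy is needed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: remote-block            # hypothetical name
provisioner: ebs.csi.aws.com    # AWS EBS CSI driver, as one example
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
---
# Local storage: faster, but the volume lives and dies with its node;
# replacing the node means re-copying the data onto a new disk.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme              # hypothetical name
provisioner: kubernetes.io/no-provisioner  # statically provisioned local PVs
volumeBindingMode: WaitForFirstConsumer
```

In both cases, WaitForFirstConsumer delays volume binding until the pod is scheduled, which matters when volumes are constrained to particular nodes or zones.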

Edward then shifted the focus to PostgreSQL, another critical component of Datadog’s infrastructure. PostgreSQL, with its single-leader architecture, requires careful management to ensure high availability and performance. Edward explained how Datadog leverages ZooKeeper and Patroni for cluster-state management and leader election, ensuring seamless failover and minimal downtime.
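
As a hedged illustration of that setup (not Datadog’s real configuration), a Patroni member pointed at ZooKeeper for distributed consensus might be configured roughly as follows; the hostnames, paths, and timing values are placeholders. Patroni holds a leader key in ZooKeeper with a TTL; if the leader stops refreshing it, the replicas hold an election and a sufficiently caught-up one is promoted.

```yaml
# Minimal Patroni sketch using ZooKeeper as the DCS; all hosts,
# paths, and timing values here are hypothetical placeholders.
scope: pg-cluster              # cluster name shared by all members
name: pg-node-1                # unique name of this member
restapi:
  listen: 0.0.0.0:8008
  connect_address: pg-node-1.example.internal:8008
zookeeper:
  hosts:
    - zk-1.example.internal:2181
    - zk-2.example.internal:2181
    - zk-3.example.internal:2181
bootstrap:
  dcs:
    ttl: 30                    # leader key TTL; expiry triggers failover
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576  # max bytes a replica may lag and still be promoted
postgresql:
  listen: 0.0.0.0:5432
  connect_address: pg-node-1.example.internal:5432
  data_dir: /var/lib/postgresql/data
```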

Datadog’s approach to node lifecycle management was another key topic. The company employs node lifecycle automation to regularly replace nodes, ensuring hardware freshness and reliability. This practice, while beneficial for maintaining performance, necessitates robust data recovery and backup mechanisms to minimize downtime and data loss.
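
A standard Kubernetes guardrail for this kind of rolling node replacement is a PodDisruptionBudget, which caps how many members of a stateful cluster a node drain may evict at once. The sketch below is illustrative, not Datadog’s actual policy.

```yaml
# Hypothetical PodDisruptionBudget: during node drains, allow at most
# one Kafka broker to be evicted at a time so quorum is preserved.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kafka-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: kafka
```

With this in place, a drain that would take down a second broker while one is still recovering blocks instead of proceeding.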

The webinar also explored the future of Datadog’s stateful workload management. Edward highlighted ongoing improvements in proxy management, aiming to streamline operations and enhance performance. The team is working on integrating authentication, automatic traffic routing, and observability features into their proxy setup, further optimizing their infrastructure.

Throughout the session, the presenters emphasized the importance of Kubernetes as an abstraction layer, providing a consistent operational environment across multiple cloud providers. This consistency allows Datadog to leverage Kubernetes’ extensibility and API-driven nature to build custom solutions tailored to their needs.
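
One concrete form that extensibility takes is the CustomResourceDefinition, which lets a team add its own API objects and reconcile them with a controller. Purely as an illustration of the mechanism (the group, kind, and schema below are invented, and the talk did not describe this specific resource):

```yaml
# Hypothetical CRD illustrating Kubernetes' API extensibility;
# the group, kind, and schema are invented for this example.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: kafkaclusters.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: KafkaCluster
    plural: kafkaclusters
    singular: kafkacluster
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                brokers:
                  type: integer       # desired broker count
                storageClassName:
                  type: string        # remote or local, per the trade-off above
```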

The webinar concluded with a Q&A session, addressing topics such as the use of open-source operators, the challenges of multi-cloud environments, and the implications of regional versus zonal clusters. The presenters shared valuable insights into Datadog’s practices and encouraged further engagement with the audience through Datadog’s online resources and future webinars.

Overall, the Datadog webinar provided a deep dive into the company’s innovative approaches to managing stateful workloads on Kubernetes, highlighting the challenges, solutions, and future directions in this critical aspect of their infrastructure.
