The summary of 'ObservabilityCON 2023 - Opening Keynote (Live)'

This summary of the video was created by an AI. It might contain some inaccuracies.

The video captures the highlights of ObservabilityCON 2023 in London, presented by Raj Dutt, co-founder and CEO of Grafana Labs, who emphasizes learning and sharing and remarks on the strong turnout. The central theme is Grafana's response to the changing macroeconomic landscape: adapting its product strategy, introducing adaptive telemetry, and making observability tooling more cost-effective. Key product updates include advancements in Grafana Cloud, the debut of Loki 3.0 for log aggregation, adaptive metrics, and eBPF-based instrumentation that collects data without code changes.

The keynote emphasizes Grafana's commitment to open-source projects such as Prometheus and OpenTelemetry, and its expansion into comprehensive observability solutions. The company adopts a "Big Tent" philosophy, promoting choice and interoperability. Adaptive logs, SLO (service level objective) management, new tools such as Beyla and Faro, and a cost management hub feature prominently.

Product demos cover application performance monitoring, Kubernetes integration, and the use of machine learning (ML) and AI to minimize DevOps toil and speed up data analysis. The video also covers new tools for incident analysis, including a query advisor and "Sift," a tool that identifies system changes, showcasing Grafana's investment in generative AI for more efficient log analysis and incident response.

The segment culminates with product announcements, reflecting on economic challenges and celebrating milestones including Grafana's 10th anniversary, and recent innovations in open-source observability technology.

00:00:00

In this segment of the video, Raj Dutt, the co-founder and CEO of Grafana Labs, welcomes attendees to ObservabilityCON 2023 in London. He expresses excitement about the packed venue, notes that attendees have traveled from around the world, and acknowledges customers and users of Grafana. Raj highlights the importance of learning and sharing at the event and thanks the Customer Advisory Board members from prominent companies for their participation. He mentions the event’s unexpectedly high turnout and outlines the keynote, including product updates and an acquisition announcement. Raj reflects on the past year’s challenges and achievements, including Grafana’s 10th anniversary and significant economic shifts that have turned companies’ focus toward profitability and sustainable growth.

00:05:00

In this part of the video, the speaker discusses the significant impact of the macroeconomic situation on their product strategy, specifically in the observability space. They acknowledge that the observability sector, including Grafana Labs, has historically not kept customer value aligned with the costs that rise as data volumes grow. The company has adapted its product strategy to address this, introducing adaptive telemetry, metrics, and logs. The speaker expresses gratitude to the community, users, and customers for their support and contributions. They review two core principles of their company: a “Big Tent” philosophy that advocates for interoperability and choice in the observability landscape, and a strong belief in open source, which they argue now leads cutting-edge technology in observability. The speaker highlights Grafana’s widespread use and their ongoing investment in open-source projects like Prometheus and OpenTelemetry.

00:10:00

In this part of the video, the speaker discusses the evolution and progress of Grafana Labs, noting significant milestones such as the expansion from visualization software to logs, metrics, and traces with projects like Loki, Tempo, and Mimir. They highlight the company’s innovation in open-source software development, fueled by community and team feedback, and emphasize the importance of Grafana Cloud, which now drives the majority of the company’s revenue. The Chief Technology Officer, Tom Wilkie, continues by discussing the company’s journey in building databases over the past six years and the broader strategy known as LGTM (Loki, Grafana, Tempo, Mimir; a play on “Looks Good to Me”), marking the transition to an observability organization. He provides insights into the development and success of the log aggregation system, Loki, and its upcoming version 3.0.

00:15:00

In this segment, the speaker discusses the evolution and recent improvements in Loki, a log aggregation system. Developers have generally appreciated Loki for its label-based log streams, scalability, and cost-effectiveness, but its performance in complex search queries, such as finding a specific order ID within extensive logs, has been challenging. To address this, the team has optimized Loki for such ‘needle in a haystack’ queries by using Bloom filters, significantly speeding up these searches. Moreover, they introduced adaptive metrics, which adjust storage based on query needs, saving costs and aligning more closely with user requirements. Looking ahead, they are experimenting with adaptive logs and seeking large users to collaborate in further development. Additionally, various cost management tools are being integrated into one platform for better control over usage and expenses.
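The Bloom filter idea mentioned above can be illustrated with a minimal Python sketch. This is not Loki's implementation: the `BloomFilter` class, the per-chunk layout, the whitespace tokenization, and the filter sizes are all assumptions chosen for illustration. The point it shows is that a small filter per chunk of log lines can say "definitely not here" for a token such as an order ID, so a search only scans the chunks that might contain it.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def build_chunk_filters(chunks):
    """Build one filter per chunk from the whitespace-separated tokens."""
    filters = {}
    for chunk_id, lines in chunks.items():
        bloom = BloomFilter()
        for line in lines:
            for token in line.split():
                bloom.add(token)
        filters[chunk_id] = bloom
    return filters


def search(chunks, filters, needle):
    """Return (chunk_id, line) pairs containing the token, skipping ruled-out chunks."""
    hits = []
    for chunk_id, lines in chunks.items():
        if not filters[chunk_id].might_contain(needle):
            continue  # the filter guarantees the token is absent from this chunk
        hits.extend((chunk_id, line) for line in lines if needle in line.split())
    return hits


# Hypothetical log data, invented for this example.
chunks = {
    "chunk-a": ["cart updated order=1111", "payment ok order=1111"],
    "chunk-b": ["checkout ok order=1234"],
}
filters = build_chunk_filters(chunks)
print(search(chunks, filters, "order=1234"))
```

Because a Bloom filter can return false positives, matching chunks are still scanned line by line; the saving comes from the chunks that are skipped entirely.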

00:20:00

In this part of the video, the speaker introduces a cost management hub designed to consolidate various cost management tools into one accessible location. They then discuss how their open-source projects, such as Grafana, Prometheus, and Loki, have gained popularity and how user evangelism has driven their widespread adoption. The focus shifts to addressing the needs of a more mainstream audience, leading to the development of comprehensive solutions that work out of the box. They outline three key areas: infrastructure observability, testing, and incident response, highlighting recent innovations such as a Kubernetes cost monitoring solution, integrated load testing, and a Grafana SLO management solution. Finally, they announce the general availability of their new application observability solution, which uses open-source telemetry libraries and standards to monitor applications.

00:25:00

In this part of the video, the speaker discusses Grafana’s progression through three acts: the initial open-source Grafana, the development of observability backends (Loki, Tempo, and Mimir), and the integration into a comprehensive end-to-end solution for application engineers. The focus is on Faro, Grafana’s frontend monitoring solution, which can be integrated into JavaScript frontends to collect and visualize performance data. Additionally, the use of OpenTelemetry SDKs for various programming languages is highlighted for backend observability. The segment also includes a demonstration of Grafana’s services overview, showing how to monitor application performance, view detailed metrics, and analyze upstream and downstream dependencies using tools like service maps and logs, with specific references to Java application monitoring and Kubernetes integration.

00:30:00

In this segment, the speaker addresses challenges with using the OpenTelemetry SDK, particularly with languages like C++ or Go, or when dealing with components without accessible code, mixed versions, or complex deployment scenarios. They introduce a solution based on eBPF (Extended Berkeley Packet Filter), which decorates application and kernel calls to track and send data to the cloud without needing code changes. The speaker proudly announces Grafana Beyla, an open-source tool under the Apache License 2.0 that uses the OpenTelemetry Protocol (OTLP) and works across Kubernetes, Docker, and bare-metal environments. The speaker expresses gratitude towards the Grafana teams and highlights ongoing improvements in search speed and data relevance within their solutions. Additionally, they announce that Manoj and the Asserts team are joining Grafana Labs, emphasizing their innovative approach to data correlation and observability.

00:35:00

In this part of the video, the speaker recounts their experience in the tech industry, attending conferences and observing projects like Grafana and Loki. They describe challenges faced at their former workplace, AppDynamics, where the on-call team struggled to debug systems due to a lack of understanding and incomplete dashboards. This motivated them to leave AppDynamics and look for a solution, leading to a new approach for handling metrics. Their approach leverages metrics, logs, and traces to build service graphs, specifically using Prometheus metrics. They demonstrate how their system can discover new services, nodes, and Kafka topics, integrating with metrics from various sources like Amazon CloudWatch. The demo showcases querying capabilities to visualize dependencies and data relationships in a complex system, illustrating how their solution helps navigate and manage large sets of data efficiently.

00:40:00

In this segment, the speaker explains the dynamic nature of app ownership and how it can change within an organization based on evolving team structures. Using the example of “robot shop services,” the speaker introduces a tool for monitoring software performance by applying assertions, which are checks on the software. They demonstrate changing the data view from the last 5 minutes to the last 24 hours, leading to the appearance of colored rings representing different assertion scores. The “Troubleshooting Workbench” feature is introduced to analyze incidents, displaying patterns and root causes of issues through a flame graph of assertions. A specific incident involving a Java service and its garbage collection latency is explored, revealing a spike in errors and latency without a corresponding traffic spike. The segment concludes by examining detailed logs and performance metrics to diagnose the problem.

00:45:00

In this part of the video, the speaker discusses launching the KPI dashboard using Grafana to monitor various metrics such as collections and pause durations. The segment highlights the ability to filter and troubleshoot incidents without switching between multiple tabs, emphasizing a unified workbench approach. The presenter expresses enthusiasm about innovations in the Grafana Cloud platform and invites viewers to check out their demo booth. Subsequently, Mark Chipperz, a Senior Engineering Director at Grafana Labs, talks about their investments in machine learning (ML) and AI, mentioning currently available tools such as an adaptive thresholder and outlier detection. He emphasizes the focus on minimizing DevOps toil, accelerating the understanding of logs and metrics, and reducing noise via generative AI. Grafana’s commitment to open-source development is reiterated, tying it to their approach to building generative AI tools.

00:50:00

In this segment of the video, the speaker discusses the rapid advancements in large language models (LLMs) and the importance of flexibility to work with various models. They introduce the “Grafana LLM app,” an open-source application for configuring and enabling generative AI features within Grafana across different environments. A demo shows how the app can auto-generate titles and descriptions for dashboards to reduce manual effort. Another feature demoed is the use of LLMs to generate summaries for incidents, enhancing communication and productivity. Additionally, a “query advisor” feature demonstrates how LLMs can assist in creating PromQL queries, simplifying complex tasks. Lastly, the speaker introduces “Sift,” a tool for identifying system changes by analyzing broad observability signals to provide comprehensive insights into incidents and patterns.

00:55:00

In this segment of the video, the speaker discusses using an algorithm called Drain to analyze log patterns and improve responses to service level objective (SLO) breaches. They explain how highlighting changing log patterns aids on-call engineers by providing crucial context and suggesting possible solutions using a large language model (LLM). Additionally, the segment highlights several new product announcements, including Loki 3.0, an application observability product, a cost management hub, and SLO solutions developed by the speaker’s team. The speaker expresses excitement about these innovations and acknowledges the contributions of the development team, customers, and event staff. The segment concludes by previewing upcoming conference sessions.
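The log-pattern idea above can be sketched in a few lines of Python. This is a heavily simplified illustration in the spirit of Drain, not the full algorithm (which uses a fixed-depth parse tree) and not Grafana's implementation. The `SimpleDrain` class, the whitespace tokenization, the 0.5 similarity threshold, and the sample log lines are all assumptions made for the example. It groups lines with the same token count, matches each line to its most similar template, and replaces differing tokens with a wildcard.

```python
from collections import defaultdict

WILDCARD = "<*>"


def similarity(template, tokens):
    """Fraction of positions where the template and the line share a token."""
    same = sum(1 for t, tok in zip(template, tokens) if t == tok)
    return same / len(tokens)


class SimpleDrain:
    """Simplified, Drain-inspired log template miner (illustrative only)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        # token count -> list of [template_tokens, match_count]
        self.groups = defaultdict(list)

    def add(self, line):
        tokens = line.split()
        if not tokens:
            return None
        candidates = self.groups[len(tokens)]
        best, best_score = None, 0.0
        for group in candidates:
            score = similarity(group[0], tokens)
            if score > best_score:
                best, best_score = group, score
        if best is not None and best_score >= self.threshold:
            # Keep tokens that agree; generalize the rest to a wildcard.
            best[0] = [t if t == tok else WILDCARD for t, tok in zip(best[0], tokens)]
            best[1] += 1
            return " ".join(best[0])
        candidates.append([tokens, 1])
        return " ".join(tokens)

    def patterns(self):
        """Map each template string to the number of lines it matched."""
        return {
            " ".join(template): count
            for groups in self.groups.values()
            for template, count in groups
        }


# Hypothetical log lines, invented for this example.
miner = SimpleDrain()
for log_line in [
    "connection to db-1 timed out",
    "connection to db-2 timed out",
    "user alice logged in",
    "user bob logged in",
]:
    miner.add(log_line)

print(miner.patterns())
# {'connection to <*> timed out': 2, 'user <*> logged in': 2}
```

Comparing the pattern counts from a window before an SLO breach with those from a window during it would surface the templates whose frequency changed, which is the kind of context the segment describes handing to an on-call engineer.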

00:00:00 – 00:59:11