Lars Kamp
Patrick DeVivo

Software engineering is often more art than science, making it difficult to measure productivity. There are ways to use data to be more effective as an individual contributor or an engineering leader, but surprisingly, engineering organizations and teams typically are not data-driven.

MergeStat is on a mission to change this with open-source, operational analytics for software engineering organizations. MergeStat started as an experiment to bring together two technologies: SQL and Git repositories. MergeStat provides data integration for your Git repositories, which makes it easier to explore legacy code and to identify code that has not been touched in a while and may deserve new attention.

From there, the use cases evolved. Today, MergeStat is used by organizations that have hundreds or even thousands of repositories. MergeStat is data infrastructure for Git repositories, where anyone can query the history and contents of their code bases.

Behind the scenes, MergeStat syncs data from the tools used to build and ship software into a PostgreSQL instance, because the APIs these tools provide are not always easy to understand or to extract data from. MergeStat takes on much of the routine work of consuming those APIs well, such as handling pagination and respecting rate limits.
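
To make "pagination and respecting rate limits" concrete, here is a minimal Python sketch of that kind of sync loop. It is not MergeStat's implementation: the page shape (`items`, `next_cursor`, `retry_after`) and the `make_fake_api` helper are assumptions invented for illustration, and a real source API would signal throttling in its own way.

```python
import time
from typing import Callable, Iterator, Optional


def fetch_all(fetch_page: Callable[[Optional[str]], dict],
              sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield every item from a cursor-paginated API, pausing when throttled.

    Assumed (hypothetical) page shape:
    {"items": [...], "next_cursor": str | None, "retry_after": float | None}
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        retry_after = page.get("retry_after")
        if retry_after is not None:
            # The API asked us to back off: wait, then retry the same cursor.
            sleep(retry_after)
            continue
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return  # No more pages.


def make_fake_api() -> Callable[[Optional[str]], dict]:
    """Hypothetical in-memory API with two pages that throttles once."""
    pages = {None: (["pr-1", "pr-2"], "c1"), "c1": (["pr-3"], None)}
    state = {"throttled": False}

    def fetch_page(cursor: Optional[str]) -> dict:
        if cursor == "c1" and not state["throttled"]:
            state["throttled"] = True
            return {"items": [], "next_cursor": cursor, "retry_after": 0.5}
        items, next_cursor = pages[cursor]
        return {"items": items, "next_cursor": next_cursor, "retry_after": None}

    return fetch_page
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; `list(fetch_all(make_fake_api(), sleep=lambda s: None))` collects all three items.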

From there, a user can query their data directly in MergeStat, or use other business intelligence tools and dashboards that know how to speak to PostgreSQL. See this example Grafana dashboard for GitHub pull requests.

Patrick DeVivo is Founder and CEO at MergeStat. In this session, we start out with a general overview of MergeStat and how it's used today.

Patrick explains how MergeStat is a general-purpose engine that companies use to craft the queries that fit their organization. We go into a few MergeStat use cases that Patrick sees today:

  • In some cases, the actual data collection is the use case. For example, with audits the action is to deliver the list of pull requests that didn't follow best practices.
  • Understanding the different versions of a programming language in use. If you're a Go shop, a single query aggregates the different Go versions used across all repositories.
  • Find pull requests that have been open for a long time or merged without review.
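
The Go version use case above can be sketched in a few lines. In MergeStat this would be a SQL query against synced repository data; the Python below only illustrates the same aggregation and does not use MergeStat's schema. The repository names and `go.mod` contents are made-up sample data.

```python
import re
from collections import Counter
from typing import Dict, Optional

# Matches the "go 1.21" directive line inside a go.mod file.
GO_DIRECTIVE = re.compile(r"^go[ \t]+(\d+\.\d+(?:\.\d+)?)[ \t]*$", re.MULTILINE)


def go_version(go_mod_text: str) -> Optional[str]:
    """Return the Go version declared in a go.mod file, if any."""
    match = GO_DIRECTIVE.search(go_mod_text)
    return match.group(1) if match else None


def count_go_versions(go_mod_by_repo: Dict[str, str]) -> Counter:
    """Count how many repositories declare each Go version."""
    versions = (go_version(text) for text in go_mod_by_repo.values())
    return Counter(v for v in versions if v is not None)


# Hypothetical sample data: repository name -> go.mod contents.
repos = {
    "api": "module example.com/api\n\ngo 1.21\n",
    "cli": "module example.com/cli\n\ngo 1.19\n",
    "web": "module example.com/web\n\ngo 1.21\n",
    "docs": "module example.com/docs\n",
}

version_counts = count_go_versions(repos)
```

With this sample data, two repositories report Go 1.21, one reports 1.19, and the repository without a `go` directive is skipped.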

Patrick's advice is to use MergeStat in a positive, constructive way that leads to action. Watch this episode to learn more about data integration for the software development lifecycle.

Lars Kamp

In the old world of software engineering, developer productivity was measured by lines of code. However, time has shown that code quantity is a poor measure of productivity. So why do engineering organizations continue to rely on this metric? Because they do not have a "single-pane" view across all the different systems that have data on various activities that actually correlate with productivity.

That's where Faros AI comes in. Faros AI connects the dots between engineering data sources—ticketing, source control, CI/CD, and more—providing visibility and insight into a company's engineering processes.

Vitaly Gordon is the founder and CEO of Faros AI. Vitaly came up with the concept for Faros AI when he was VP of Engineering in the Machine Learning Group at Salesforce. An engineering leader's job is not only about code; it also comes with business responsibilities. That meant interacting with other functions of the business, like sales and marketing.

In those meetings, Vitaly realized that other functions used standardized metrics that measure the performance of their business. Examples are CAC, LTV, or NDR. These functions built data pipelines to acquire the necessary data and compute these metrics. Surprisingly, engineering did not have that same understanding of their processes.

An example of an engineering metrics framework is DORA. DORA is an industry-standard benchmark that correlates deployment frequency, lead time, change failure rate, and time to restoration with actual business outcomes and employee satisfaction. For hyperscalers like Google and Meta, these metrics are so important that they employ thousands of people just to build and report them.
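
As a rough illustration of what computing these four metrics involves, here is a minimal Python sketch. It is not how Faros AI calculates them: the `Deployment` record, its field names, and the choice of medians are assumptions made for this example, and real pipelines have to derive these fields from several tools first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record joining source control, CI/CD, and incident data."""
    committed_at: datetime          # when the change was committed
    deployed_at: datetime           # when it reached production
    failed: bool = False            # did the deployment cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored


def dora_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    restores = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restores) if restores else None,
    }


# Made-up sample data covering two days.
sample = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),
    Deployment(datetime(2024, 1, 1, 10), datetime(2024, 1, 1, 14),
               failed=True, restored_at=datetime(2024, 1, 1, 15)),
    Deployment(datetime(2024, 1, 2, 8), datetime(2024, 1, 2, 14)),
    Deployment(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 17)),
]
metrics = dora_metrics(sample, period_days=2)
```

The arithmetic is simple; the hard part the episode describes is getting commit, deployment, and incident timestamps out of separate tools and into one record like this.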

So, how do you calculate DORA metrics for your business? With data, of course. But it turns out the data to calculate these metrics is locked inside the dozens of engineering tools used to build and deliver software. While those tools have APIs, they are optimized for workflows, not for exporting data. If you're not a hyperscaler with the budget to employ thousands of people, what do you do? You can turn to Faros AI, which does all the heavy lifting of acquiring data and calculating metrics for you.

The lessons learned from the modern data stack (MDS) come in when building data pipelines to connect data from disparate tools. In this episode, we explore the open-source Faros Community Edition and the data stack that powers it.

Lars Kamp
Waldemar Hummer

Waldemar Hummer is Co-Founder and CTO at LocalStack. LocalStack gives you a fully functional local cloud stack so you can develop and test your cloud and serverless apps offline. LocalStack is an open-source project that started at Atlassian, where its initial purpose was to keep developers productive on their daily commutes despite poor internet connectivity.

LocalStack emulates AWS cloud services on your laptop, increasing the number of phases in your infrastructure environment to four: local, test, staging, and production—with LocalStack efficiently covering the local and test phases (including CI builds). LocalStack also integrates with a large set of other cloud tools, such as Terraform, Pulumi, and CDK.

While the commute problem mostly went away with COVID, it became clear that a local development environment has speed, quality, and cost advantages. Local provisioning of resources is faster and can speed up dev feedback cycles by an order of magnitude. Developers can start their work without IAM enforcement, then later introduce security policies and migrate to the cloud. A local environment also reduces the cost of cloud sandbox accounts.

A key requirement for LocalStack to be valuable is parity with cloud provider services, which means replicating services and API responses. LocalStack is built in Python, and Waldemar walks us through LocalStack's process of building out the platform to have 99% parity with AWS.

In this episode, we also cover developer marketing, community building, and how LocalStack amassed over 44,000 stars on GitHub. Waldemar takes us through both a live LocalStack demo and a deep-dive into LocalStack's GitHub repository.

Lars Kamp
Jonathan Bernales

There is a new generation of companies that are building their applications 100% cloud-native, with a pure serverless paradigm. One such company is Ekonoo, a French FinTech startup that enables customers and organizations to efficiently invest in retirement funds.

Jonathan Bernales is a DevOps Engineer at Ekonoo. In this interview, Jonathan walks us through Ekonoo's approach of giving developers the autonomy to build and deploy code along with the responsibility for security and cost.

Holding developers responsible for security and cost is a rather new part of "shift-left." Cost awareness becomes part of the development culture. To keep cloud bills under control, Ekonoo developers are responsible for their individual test accounts and have access to the AWS Billing Console and AWS Cost Explorer.

At Ekonoo, there is no dedicated "production team." Rather, DevOps collaborates with developers to create guidelines and guardrails for architecture, automation, security, and cost. The entire Ekonoo stack runs on AWS using native AWS services such as CloudFormation, Lambda, and Step Functions.

Watch this episode to learn about Ekonoo's transition to a microservices architecture and the lessons learned along the way.

Lars Kamp
Andreas Grabner

Andreas Grabner is a DevOps Activist at Dynatrace, where he has fifteen years of experience helping developers, testers, operations, and XOps folks do their jobs more efficiently.

In this episode, Andreas and I discuss how the shift to cloud-native and more dynamic infrastructure is followed by a change in how developers, architects, and site reliability engineers (SREs) work together.

With the sheer quantity of resources running in cloud-native infrastructure and the monitoring signals produced by each resource, the only way to keep growing without "throwing people at the problem" is to turn to automation.

Andreas makes a noteworthy distinction between DevOps engineers and SREs:

  • DevOps engineers use automation to speed up delivery and get new changes into production.
  • SREs use automation to keep production healthy.

SREs are often former IT operations and system administrators responsible for physical machines, virtual machines (VMs), and Kubernetes clusters. As SREs, they move up the stack and become responsible for everything from the bottom of the stack all the way up to serverless functions and the service itself.

We dive into the differences between SLAs, SLOs, and Google's four golden signals of monitoring—latency, traffic, errors, and saturation. Andreas shares the example of a bank and how they started defining SLOs to measure the growth of their mobile app business versus just defining engineering metrics.

This episode covers "engineering for game days," chaos engineering, and making the unplannable, plannable. Andreas shares his perspective on the general trend to "shift left" and include performance engineering in the development and architecture of cloud-native systems.

Lars Kamp

Dvir Mizrahi is Head of Financial Engineering at Wix, the leader in website creation with 220 million users running e-commerce operations. And with over six thousand employees, Wix ships more than fifty thousand builds each day.

Dvir is also among the original authors of the AWS Cloud Financial Management certification.

In this episode, Dvir covers how Wix shifted from FinOps to Financial Engineering, an engineering-first approach that builds tooling and processes to track financial key performance indicators (KPIs) across its multi-cloud infrastructure. The new approach established a culture of financial responsibility that supports Wix's continued growth.

Wix started in 2006 and initially ran its infrastructure on-premise. Today, Wix runs a multi-cloud environment on Google Cloud Platform (GCP) and Amazon Web Services (AWS). As Wix shifted from on-premise to the cloud, the procurement process of resources changed with it.

In the old world, purchasing additional hardware was a closed and controlled process. But in the cloud, Dvir compares resource procurement to "a supermarket where people can go in, take whatever they want, and leave without passing the registers." A developer could spin up a hundred thousand instances with just the click of a button.

Wix realized the financial risk that comes with liberal permissions to spin up infrastructure and hired Dvir in 2017. FinOps approaches infrastructure governance from a billing perspective and handles workloads already provisioned in the cloud. But at Wix's scale, where there are thousands of engineers, the FinOps approach stops working. "By the time you have a financial incident, it's too late and you didn't govern anything."

Dvir shifted the strategy to proactively preventing waste in the first place, by incorporating financial KPIs into engineering goals. In addition, Dvir built an internal platform called "InfraGod" which collects infrastructure data, integrates with Terraform, and enforces rules at the time of resource provisioning. Taking action at the time resources are provisioned rather than after the fact is "the difference between Finance and Financial Engineering."
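
Enforcing rules at provisioning time can be pictured as a check that runs against a planned set of resources before anything is created. The Python sketch below is purely illustrative and is not how InfraGod works: the rule set (`REQUIRED_TAGS`, `MAX_INSTANCES`), the resource dictionaries, and the function names are all hypothetical, and the input is not Terraform's actual plan format.

```python
from typing import Dict, List

# Hypothetical policy: every resource must carry these tags,
# and no single resource may request more than MAX_INSTANCES instances.
REQUIRED_TAGS = {"owner", "cost-center"}
MAX_INSTANCES = 50


def check_resource(resource: dict) -> List[str]:
    """Return the list of policy violations for one planned resource."""
    violations = []
    missing = REQUIRED_TAGS - set(resource.get("tags", {}))
    if missing:
        violations.append("missing tags: " + ", ".join(sorted(missing)))
    count = resource.get("count", 1)
    if count > MAX_INSTANCES:
        violations.append(f"count {count} exceeds limit {MAX_INSTANCES}")
    return violations


def check_plan(resources: List[dict]) -> Dict[str, List[str]]:
    """Map resource name -> violations; an empty dict means the plan passes."""
    report = {}
    for resource in resources:
        violations = check_resource(resource)
        if violations:
            report[resource["name"]] = violations
    return report


# Made-up plan: one compliant resource, one that should be blocked.
plan = [
    {"name": "web", "count": 3,
     "tags": {"owner": "team-a", "cost-center": "cc-1"}},
    {"name": "batch", "count": 100000, "tags": {"owner": "team-b"}},
]
report = check_plan(plan)
```

A provisioning pipeline would refuse to apply the plan when `report` is non-empty, which is the "before the fact" control the episode contrasts with reviewing the bill afterwards.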

Listen to this episode for a deep dive into the tactics that Dvir uses to run Financial Engineering at Wix, such as data collection, engineering post-mortems, monthly reports, and mandatory resource tagging.

Lars Kamp
Tobi Knaup

Tobi Knaup is the CEO and a co-founder at D2iQ, an enterprise Kubernetes platform. D2iQ combines the best open-source technology from the cloud-native technology landscape into a single Kubernetes solution. Customers can deploy this solution without worrying about the individual pieces they would otherwise need to assemble, maintain, and update.

In this episode, we discuss the shift to cloud-native infrastructure and how we are now seeing a new class of smart cloud-native applications emerge. Smart cloud-native applications include artificial intelligence (AI) components that leverage data from production applications. These new applications enable entirely new use cases in every industry. Examples are autonomous driving in automotive, medical imaging in healthcare, and fraud detection in banking or crypto trading.

To build smart cloud-native applications, companies need to build the infrastructure to train their AI models, put them into production, and build differentiated products. It is an entirely new type of workload, with very dynamic and elastic demand for compute and storage.

It turns out that Kubernetes, with its scheduling and orchestration capabilities, is a perfect fit to support workloads from smart cloud-native applications. Training models requires spinning up large amounts of compute to process data and then scaling back down. By putting model predictions into production, companies can lean on existing code pipeline workflows and monitoring.

This also means that instead of running two separate types of infrastructure, companies can consolidate and run their smart cloud-native applications on the same platforms as their production applications, which generate the data in the first place. The outcome for companies is highly differentiated digital products.

Listen to this episode to learn about Kubernetes, cloud-native architecture, and the changes in organization and workflows that technology leaders need to make to deliver smart cloud-native applications.

Lars Kamp
Jon Edvald

Jon Edvald is the founder and CEO of Garden, an end-to-end cloud delivery platform that accelerates your development, testing, and CI/CD workflows.

In this conversation, Jon covers how the shift from monolithic applications to microservices has taken us from a single codebase to individual deliverables that are getting smaller and smaller. With the introduction of containers, an application now consists of many discrete components—which continue to get even smaller with the arrival of serverless. And where teams previously had to manage five to ten codebases, they are now dealing with hundreds or even thousands. Testing and deploying these different codebases has become a graph problem.

Beyond adopting containers and Kubernetes, the complexity of that graph of system components pushes the boundaries of existing DevOps tool chains. There is overhead for setup of each component in the graph, which becomes unmanageable with existing tools.

Garden solves this issue by factoring out things that are undifferentiated across different teams, allowing them to focus on their own business problems. Garden builds a directed graph of everything that needs to happen to transition from a bunch of Git repositories to a fully built, deployed, and tested system.
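
The idea of a directed graph of build, deploy, and test steps can be sketched with Python's standard-library `graphlib`. This is not Garden's implementation: the action names and dependencies below are invented, and the sketch only shows how a dependency graph yields an execution order in which independent steps can run together.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical action graph: action -> actions that must finish first.
graph: Dict[str, Set[str]] = {
    "build-api": set(),
    "build-web": set(),
    "deploy-api": {"build-api"},
    "deploy-web": {"build-web", "deploy-api"},
    "test-e2e": {"deploy-api", "deploy-web"},
}


def execution_batches(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group actions into batches; actions in one batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unblock dependents
    return batches


batches = execution_batches(graph)
```

Here both builds land in the first batch, followed by the API deployment, the web deployment, and finally the end-to-end test.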

Listen to this episode to learn more about the industrialization of continuous integration (CI), infrastructure as code (with popular tools like Terraform and Pulumi), and how Garden helps developers ship more software faster.

Lars Kamp
Avi Press

Historically, the distribution and usage of open-source projects have been challenging to measure. 📏

Understanding your user base is critical to product planning and development. However, open-source maintainers have resorted to inelegant tactics to gather data on their users—such as scraping GitHub user data, performing reverse IP lookups on website traffic, or simply hoping that users submit support requests.

Scarf aims to solve this problem with its Scarf Gateway and Documentation Insights. The Scarf Gateway provides distribution analytics for open-source software and helps maintainers connect with commercial users. Documentation Insights aid in understanding how users interact with project websites and documentation.

In this episode, Lars chats with Avi Press, Founder and CEO at Scarf. Avi shares how Scarf grew from a hobby project into a venture-funded startup, as well as his thoughts on the future of open-source business models.

Lars Kamp
Dieter Matzion

Companies build in the cloud for growth and speed. 📈

Engineering teams love building new things—so much so that cloud spend commonly becomes a major part of a company's profit and loss statement (P&L).

Cloud vendors have introduced pricing and discounting schemes to incentivize increased consumption and lock in long-term commitments from customers. Management gets involved at this point, but they often lack context and understanding of how cloud procurement works.

Forecasting cloud spend and aligning growth with infrastructure efficiency become important capabilities when you are about to sign a multi-million three-year contract with a cloud vendor. 💰

In this episode, Lars talks with Dieter Matzion, Senior Cloud Governance Engineer at Roku and long-time expert in cloud procurement and cloud financial operations. Before joining Roku, Dieter was an engineer at Google, Netflix, and Intuit, where he established infrastructure efficiency programs that combined cloud operations, analytics, and finance.