This is how to create error rate alerts at the service level (as opposed to the load balancer/ingress level) with Linkerd, Prometheus and Alertmanager.

The plumbing

The rules configmap is connected to the linkerd-prometheus, the linkerd-prometheus is connected to the alertmanager, the alertmanager is connected to the pager duty, the pager duty is connected to my wristwatch…

As the Linkerd helm chart stands in version 2.8.1 (the stable release at the time of writing), in order to get alerting set up you must bring:

  • a rules configmap
  • an alertmanager
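The rules configmap holds the Prometheus alerting rules that linkerd-prometheus will evaluate. As a sketch, here is what a service-level error rate rule could look like, built on the `response_total` metric that Linkerd's proxies export (the configmap name, namespace, threshold, and label values are illustrative assumptions, not prescriptions):

```yaml
# Hypothetical example: a rules configmap with one error-rate alert.
# Adjust names, namespace, and threshold to your setup.
apiVersion: v1
kind: ConfigMap
metadata:
  name: linkerd-prometheus-rules
  namespace: linkerd
data:
  alerting_rules.yml: |
    groups:
    - name: linkerd-error-rates
      rules:
      - alert: HighErrorRate
        # Ratio of failed inbound responses to all inbound responses,
        # per deployment, over the last 5 minutes.
        expr: |
          sum(rate(response_total{direction="inbound", classification="failure"}[5m])) by (deployment)
            /
          sum(rate(response_total{direction="inbound"}[5m])) by (deployment)
            > 0.05
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "High error rate on {{ $labels.deployment }}"
```

The nice part of alerting at this level is that `classification="failure"` is derived by the Linkerd proxy itself, so the same rule covers every meshed service without per-service instrumentation.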

And you will want the following chart values for Linkerd 2.8.x:
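A sketch of those values follows, assuming the 2.8.x chart's `prometheusAlertmanagers` and `prometheusRuleConfigMapMounts` settings (verify the exact keys against your chart version's `values.yaml`; the configmap and Alertmanager service names below are assumptions matching the example above):

```yaml
# Hypothetical helm values for the linkerd2 chart, 2.8.x.
# Point linkerd-prometheus at your Alertmanager...
prometheusAlertmanagers:
  - scheme: http
    static_configs:
      - targets:
          - "alertmanager.linkerd.svc:9093"
# ...and mount your rules configmap into it.
prometheusRuleConfigMapMounts:
  - name: alerting-rules
    subPath: alerting_rules.yml
    configMap: linkerd-prometheus-rules
```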

We have been hearing a lot about service mesh recently. With Transit parting ways with our APM vendor in favour of open source monitoring tools, the time seemed right to jump on board with the “world’s most over-hyped technology”.

While we were already capturing golden signals at the ingress level (check out this dashboard by @n8han that we contributed upstream), the realization that we would be losing service to service metrics prompted us to implement a mesh.

As a GKE shop, we found the managed Istio add-on very appealing, but in the end it…


So at Transit we’ve been migrating our workloads to GKE. Naturally, we’ve been using helm to package them.

Until recently, we had a pretty OK bash script to install desired releases to various clusters, but wanted a more declarative (and readable!) approach.

My buddy (and CNCF ambassador extraordinaire) Archy told me about helmfile, so we checked it out. A few hours of YAML later, our bash script was history and we were using helmfile in production to manage all our releases.

Let’s take a look at what works for us.


So you can declare environments in helmfile, e.g.:
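A minimal sketch of what that looks like (the environment names, file paths, and release are hypothetical, just to show the shape):

```yaml
# helmfile.yaml — illustrative example, not our actual config.
environments:
  staging:
    values:
      - environments/staging.yaml
  production:
    values:
      - environments/production.yaml

releases:
  - name: myapp
    namespace: default
    chart: ./charts/myapp
    values:
      # helmfile templates this per environment at sync time,
      # e.g. `helmfile -e production sync`.
      - values/myapp-{{ .Environment.Name }}.yaml
```

The win over a bash script is that the desired state of every release, per environment, lives in one declarative file.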

Naseem Ullah

DevOps @ Transit
