This is how to create error rate alerts at the service level (as opposed to the load balancer/ingress level) with Linkerd, Prometheus and Alertmanager.
The rules ConfigMap is connected to the linkerd-prometheus, the linkerd-prometheus is connected to the Alertmanager, the Alertmanager is connected to PagerDuty, PagerDuty is connected to my wristwatch…
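The first link in that chain is the alerting rules themselves. Here is a minimal sketch of a service-level error rate rule built on the `response_total` metric exported by Linkerd's proxies (the group name, the 5% threshold, the `for` duration, and the labels/annotations are all illustrative, not a recommendation):

```yaml
# sketch of a rules ConfigMap payload for linkerd-prometheus
groups:
- name: linkerd-service-error-rates
  rules:
  - alert: ServiceHighErrorRate
    # failed inbound responses / all inbound responses, per deployment
    expr: |
      sum(rate(response_total{classification="failure", direction="inbound"}[5m])) by (deployment)
        /
      sum(rate(response_total{direction="inbound"}[5m])) by (deployment)
        > 0.05
    for: 5m
    labels:
      severity: page
    annotations:
      summary: "High error rate on {{ $labels.deployment }}"
```

The `classification` label is how Linkerd's proxy marks a response as a success or failure, which is what lets you alert per service rather than per ingress.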
As the Linkerd Helm chart stands in version 2.8.1 (the stable release at the time of writing), to get alerting set up you must bring:
And you will want the following chart values for Linkerd 2.8.x:
```yaml
# Linkerd 2.8.x
prometheusAlertmanagers:
- scheme: http
- "byo-alertmanager-svc.and-its-namespace" …
```
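On the bring-your-own Alertmanager side, the piece that closes the loop to PagerDuty is the Alertmanager configuration. A minimal sketch (the receiver name is arbitrary and the integration key is a placeholder you get from a PagerDuty service integration):

```yaml
# alertmanager.yml — illustrative; swap in your own routing tree
route:
  receiver: pagerduty
receivers:
- name: pagerduty
  pagerduty_configs:
  - service_key: "<your-pagerduty-integration-key>"
```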
We have been hearing a lot about service mesh recently. With Transit parting ways with our APM vendor in favour of open source monitoring tools, the time seemed right to jump on board with the “world’s most over-hyped technology”.
While we were already capturing golden signals at the ingress level (check out this dashboard by @n8han that we contributed upstream), the realization that we would be losing service-to-service metrics prompted us to implement a mesh.
As a GKE shop, we found the managed Istio add-on very appealing, but in the end it was a mix of this blog post, the amazing Slack community, the performance benchmarking results, these differences, the overall less-is-more ethos and, last but definitely not least, the relative simplicity of operating Linkerd that helped us decide. …
Until recently, we had a pretty OK bash script to install desired releases to various clusters, but wanted a more declarative (and readable!) approach.
My buddy (and CNCF ambassador extraordinaire) Archy told me about helmfile, so we checked it out. A few hours of YAML later, our bash script was history and we were using helmfile in production to manage all our releases.
Let’s take a look at what works for us.
So you can declare environments in
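A minimal `helmfile.yaml` along these lines might look like the following (the environment names, value-file paths, and release layout are illustrative, not our actual setup; the chart repo URL is Linkerd's stable Helm repo):

```yaml
# helmfile.yaml — illustrative sketch
environments:
  staging:
    values:
    - env/staging.yaml
  production:
    values:
    - env/production.yaml

repositories:
- name: linkerd
  url: https://helm.linkerd.io/stable

releases:
- name: linkerd2
  namespace: linkerd
  chart: linkerd/linkerd2
  version: 2.8.1
  values:
  - values/linkerd.yaml.gotmpl
```

With that in place, `helmfile -e staging apply` reconciles the declared releases for one environment, which is the declarative (and readable!) behaviour the bash script never gave us.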