K8s And Harmless Traefik Ingress Logs

While troubleshooting a recent Kubernetes issue, I ran across Traefik logs that showed some errors. After much digging, the errors turned out to be innocuous in my situation. Because of my inexperience with the internals of K8s, though, I didn’t realize this and spent quite some time heading down the wrong path. Since none of the other blogs or forum posts I ran into seemed to mention this, I figured it was a good candidate for a blog post.

At the end of the day, I’ll just say that I’m glad this issue was occurring in my development environment rather than in production!

The Issue

My issue started when a Let’s Encrypt certificate that I was using in Kubernetes expired. Why the cert expired without being automatically renewed is a topic for another time, but suffice it to say that I quickly got it renewed. I was using it to provide TLS to the frontend of the development environment for an internal web application I’m building called DashCentral (which is only relevant because it’ll show up in some logs and YAML). The cert was stored in a Kubernetes Secret and referenced from an Ingress. If anyone is curious, I’m using K3s for Kubernetes in this particular environment with Traefik Proxy as the Ingress Controller. Here’s the relevant YAML:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dashcentral
  labels:
    app: dashcentral
  annotations:
    traefik.ingress.kubernetes.io/redirect-entry-point: https
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: "true"
spec:
  ingressClassName: traefik
  rules:
    - host: dev-dashcentral.some.domain
      http:
        paths:
          - backend:
              service:
                name: dashcentral
                port:
                  number: 8000
            path: /
            pathType: Prefix
  tls:
    - hosts:
        - dev-dashcentral.some.domain
      secretName: le-tls

After I had renewed the certificate, I deleted the le-tls Secret and recreated it. For good measure, I deleted this Ingress and recreated it as well to ensure everything got sorted as quickly as possible. Unfortunately for me, this didn’t have the intended effect. Rather than receiving an error message that the certificate was expired, I began to receive an error message that the cert was self-signed. Looking at the cert details showed that it was the default cert for Traefik.
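For anyone wanting to check from the command line which certificate is actually being served, here is a minimal sketch using openssl. The served_cert function name is made up for illustration, and it assumes the site answers on port 443. Passing -servername matters because the certificate is chosen based on the hostname the client asks for.

```shell
# Print the subject, issuer, and expiry date of the certificate a host serves.
# Hypothetical helper; assumes HTTPS on port 443.
served_cert() {
  host="$1"
  # s_client fetches the served certificate (with SNI set to the host);
  # x509 then prints the fields of interest.
  openssl s_client -connect "${host}:443" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -enddate
}
```

Running served_cert dev-dashcentral.some.domain should make it obvious whether the issuer is Let’s Encrypt or the Traefik default cert.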

I spent some time looking at my Ingress, but everything appeared to be correct, down to the output of k3s kubectl describe ingress dashcentral telling me that the le-tls Secret was terminating TLS for my host.

The Rabbit Hole

Unfortunately, this issue was tricky to search the web for since a lot of posts were about how to use a self-signed cert with Traefik rather than about a real cert not being applied. I eventually realized that I should be able to see whether Traefik itself had any logs. A quick peek in the kube-system namespace showed a pod for Traefik, and dumping its logs showed a number of errors.
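For reference, pulling those logs can look roughly like the sketch below. It assumes the stock K3s layout, where Traefik runs as a Deployment named traefik in kube-system. The traefik_logs name is hypothetical, and on K3s the commands may need to be prefixed with k3s.

```shell
# Dump the most recent Traefik log lines (100 by default).
# Assumes a Deployment named "traefik" in the kube-system namespace.
traefik_logs() {
  kubectl -n kube-system logs deployment/traefik --tail="${1:-100}"
}
```

Calling traefik_logs 20 limits the output to the last 20 lines, which is handy right after recreating a resource.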

The two errors I became fixated on were:

Cannot create service: service not found

And:

Skipping service: no endpoints found

This confused me quite a bit, especially the “service not found” message, since I assumed it meant that the K8s Service my Ingress was targeting wasn’t being found for some reason. The plot only thickened as my various attempts at deleting and recreating the Ingress, the Service, and even the Deployment would sometimes result in new messages like this in the Traefik logs, while other times they wouldn’t appear at all… even though the self-signed certificate continued to be used.

I eventually realized that these particular errors simply reflected things not being immediately available in the cluster as I deleted and recreated resources in an ad-hoc fashion while trying to troubleshoot. They weren’t at all indicative of the problem I was actually trying to fix. In fact, subsequently checking some of my other single-node clusters showed the same logs appearing after the node is rebooted.
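One way to confirm that messages like these are transient is to check that the Service exists and has endpoints once things settle. This is a small sketch with a hypothetical service_backends helper; it assumes the Service lives in the current namespace.

```shell
# Show a Service and the endpoints behind it.
# An empty ENDPOINTS column usually means no ready pod currently
# matches the Service's selector.
service_backends() {
  kubectl get service "$1"
  kubectl get endpoints "$1"
}
```

If service_backends dashcentral lists the Service along with at least one endpoint address, the earlier “service not found” and “no endpoints found” lines were most likely just timing.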

The Fix

How do I know for sure that these weren’t related to the problem? Because the problem was at a much earlier step. When I ran the kubectl command to recreate the Secret storing the certificate, I accidentally pointed it at the cert for dashcentral.some.domain instead of the one for dev-dashcentral.some.domain. The command succeeded rather than failing on missing files because, when the project was in its infancy, I was hosting it at that address on this same node… and the old, very expired certificate was still on the node. I have to admit that I was a little caught off guard that I didn’t receive a certificate mismatch or certificate expiration error, though. Instead, since the certificate didn’t match the host in my Ingress, Traefik simply fell back to using its self-signed default certificate.
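A mistake like this is easy to catch by looking at the certificate that actually ended up inside the Secret. The sketch below is one way to do that and to recreate the Secret afterwards. Both function names are hypothetical, the cert and key paths are whatever your files happen to be, and the -ext option assumes OpenSSL 1.1.1 or newer.

```shell
# Print subject, expiry, and SANs of the certificate stored in a TLS Secret.
secret_cert_info() {
  # The dot in "tls.crt" must be escaped inside the jsonpath expression.
  kubectl get secret "$1" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -subject -enddate -ext subjectAltName
}

# Replace a TLS Secret with one built from the given cert and key files.
# Usage: recreate_tls_secret NAME CERT_FILE KEY_FILE
recreate_tls_secret() {
  kubectl delete secret "$1" --ignore-not-found
  kubectl create secret tls "$1" --cert="$2" --key="$3"
}
```

Running secret_cert_info le-tls before and after recreating the Secret shows whether the names and expiry date match the host in the Ingress.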