Version: Operator 3.2.2

Recommendations for setting up a logging mechanism in Kubernetes

In traditional server environments, application logs are written to a file such as /var/log/app.log. However, when working with Kubernetes, you need to collect logs for multiple transient pods (applications) across multiple nodes in the cluster, making this log collection method less than optimal.

Ways to collect logs in Kubernetes

Basic logging using stdout and stderr

The default Kubernetes logging framework captures the standard output (stdout) and standard error output (stderr) from each container on the node to a log file. You can see the logs of a particular container by running the following commands:

$ kubectl logs <pod-name> -c <container-name> -n <namespace>

For a previously failed container:
$ kubectl logs <pod-name> -c <container-name> -n <namespace> --previous

By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted along with their logs.

Cluster-level logging using node logging agent

With a cluster-level logging setup, you can access the logs even after the pod is deleted. The logging agent is commonly a container that exposes logs or pushes logs to a backend. Because the logging agent must run on every node, the best practice is to run the agent as a DaemonSet.
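
For example (a minimal sketch, assuming the agent is a fluent-bit DaemonSet deployed in a logging namespace; both names are assumptions), you can confirm that the agent is scheduled on every node by comparing the DaemonSet's DESIRED count with the number of nodes:

# Number of nodes in the cluster
$ kubectl get nodes

# DESIRED/READY should match the node count (DaemonSet name and namespace are assumptions)
$ kubectl get daemonset fluent-bit -n logging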

Managing logs on different platforms:

Google Kubernetes Engine (GKE) cluster

For container and system logs, GKE by default deploys a per-node logging agent, fluent-bit, which reads container logs, adds helpful metadata, and then stores them in Cloud Logging. The logging agent checks for container logs in the following sources:

  • Standard output and standard error logs from containerized processes
  • kubelet and container runtime logs
  • Logs for system components, such as VM startup scripts

For events, GKE uses a deployment in the kube-system namespace that automatically collects events and sends them to Cloud Logging. For more details, see Managing GKE logs.
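
Independent of GKE's event export, you can also inspect recent events directly from the cluster, for example:

$ kubectl get events -A --sort-by=.metadata.creationTimestamp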

Use kubectl get pods -n kube-system to ensure the fluent-bit pods are up and running.

Sample output:

% kubectl get pods -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
event-exporter-gke-857959888b-mc44k   2/2     Running   0          8d
fluentbit-gke-6zdgb                   2/2     Running   0          8d
fluentbit-gke-85mc8                   2/2     Running   0          8d
fluentbit-gke-mbgkx                   2/2     Running   0          8d

Read logs

To view logs on Google Cloud Logs Explorer, see Gcloud Logs Explorer.

To fetch logs through the command line, use gcloud logging read.

gcloud logging read 'severity>=DEFAULT AND
resource.type="k8s_container" AND
resource.labels.container_name=<container name> AND
resource.labels.pod_name=<pod name> AND
resource.labels.namespace_name=<namespace name> AND
resource.labels.location="us-west1-a" AND
resource.labels.cluster_name=<cluster name> AND
timestamp>="2023-04-29T11:32:00Z" AND timestamp<="2023-05-29T12:09:00Z"' \
--format=json --order=asc | grep -i textPayload > ~/gcloudlogging.log

This command fetches the textPayload field from all the logs in the given timestamp range for the container, pod, and cluster specified in the command, and writes that information to the gcloudlogging.log file.
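
As an alternative to grep, you can extract the textPayload field from the JSON output with jq (a sketch, assuming jq is installed; the filter placeholder stands for the same filter shown above):

gcloud logging read '<same filter as above>' --format=json --order=asc \
| jq -r '.[].textPayload' > ~/gcloudlogging.log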

Amazon EKS cluster

The Amazon EKS cluster does not come with any per-node logging agent installed. For logging purposes, you can install a per-node agent such as a Fluent Bit DaemonSet to aggregate Kubernetes logs and send them to Amazon CloudWatch Logs.

See the AWS documentation Set up Fluent Bit as a DaemonSet. Also, verify the IAM permissions before setting up Fluent Bit; see Verify prerequisites.
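
A condensed sketch of that flow is shown below. The namespace, ConfigMap keys, and manifest URL follow the AWS quick-start guide; confirm them against the linked documentation before use.

# Namespace used by Fluent Bit for Container Insights (per the AWS quick-start)
$ kubectl create namespace amazon-cloudwatch

# ConfigMap that tells Fluent Bit which cluster and region to report to
$ kubectl create configmap fluent-bit-cluster-info -n amazon-cloudwatch \
--from-literal=cluster.name=<cluster name> \
--from-literal=logs.region=<cluster region> \
--from-literal=http.server=On \
--from-literal=http.port=2020 \
--from-literal=read.head=Off \
--from-literal=read.tail=On

# Apply the Fluent Bit DaemonSet manifest referenced by the AWS documentation
$ kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluent-bit/fluent-bit.yaml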

Read logs

To view logs on AWS CloudWatch, see view log data.

To fetch logs through the command line, use the aws logs filter-log-events command.

This command needs the arguments --log-group-name and --log-stream-names, which can be obtained with the following commands:

% aws logs describe-log-groups
{
    "logGroups": [
        {
            "logGroupName": "/aws/containerinsights/openebs-demo/application",
            "creationTime": 1685007094462,
            "metricFilterCount": 0,
            "arn": "arn:aws:logs:us-east-1:<accountNumber>:log-group:/aws/containerinsights/openebs-demo/application:*",
            "storedBytes": 125735395
        },
        ...
    ]
}

% aws logs describe-log-streams --log-group-name /aws/containerinsights/openebs-demo/application
{
    "logStreams": [
        {
            "logStreamName": "aerospike-init",
            "creationTime": 1685431444031,
            "arn": "arn:aws:logs:us-east-1:<accountNumber>:log-group:/aws/containerinsights/openebs-demo/application:log-stream:aerospike-init",
            "storedBytes": 0
        },
        ...
    ]
}

% aws logs filter-log-events \
--start-time `date -d 2023-04-30T12:32:00Z +%s`000 \
--end-time `date -d 2023-05-30T12:34:40Z +%s`000 \
--log-group-name <application log group name> \
--output json --log-stream-names <log stream names> | jq '.events[].message' > ~/awsevents.log

This command fetches the message field from all the logs in the given timestamp range from the log stream mentioned in the command, and uses that information to populate the awsevents.log file.
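
If you only need entries that match a pattern, filter-log-events also accepts a --filter-pattern argument; for example (the pattern ERROR and the output file name are only illustrations):

% aws logs filter-log-events \
--log-group-name <application log group name> \
--log-stream-names <log stream names> \
--filter-pattern "ERROR" \
--output json | jq '.events[].message' > ~/awserrors.log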

On-Prem or Self-Managed cluster

There are several Kubernetes logging stacks that can be implemented in any kind of cluster, including:

  • EFK (Elasticsearch, Fluentd, and Kibana)
  • ELK (Elasticsearch, Logstash, and Kibana)
  • PLG (Promtail, Loki, and Grafana)

PLG (Promtail, Loki, and Grafana)

The metadata discovery mechanism in the Loki stack is useful in the Kubernetes ecosystem when cost control and long-term log retention are priorities.

The PLG stack comprises the following components:

  • Promtail: Responsible for data ingestion into Loki. Runs on every node of your Kubernetes cluster.
  • Loki: The heart of the PLG stack; a data store optimized for logs.
  • Grafana: Visualizes logs stored in Loki. You can build individual dashboards in Grafana based on application logs and metrics computed from the logs.
Install the PLG stack with Helm
  1. Add the Grafana repository to Helm.
% helm repo add grafana https://grafana.github.io/helm-charts
"grafana" has been added to your repositories
% helm repo update
Hang tight while we grab the latest from your chart repositories...
...
Update Complete. ⎈Happy Helming!

Verify the Grafana repo in Helm:

% helm search repo grafana/
NAME                             CHART VERSION   APP VERSION   DESCRIPTION
grafana/enterprise-logs          2.4.3           v1.5.2        Grafana Enterprise Logs
grafana/enterprise-logs-simple   1.2.1           v1.4.0        DEPRECATED Grafana Enterprise Logs (Simple Scal...
grafana/enterprise-metrics       1.9.0           v1.7.0        DEPRECATED Grafana Enterprise Metrics
grafana/fluent-bit               2.5.0           v2.1.0        Uses fluent-bit Loki go plugin for gathering lo...
grafana/grafana                  6.56.5          9.5.2         The leading tool for querying and visualizing t...
grafana/grafana-agent            0.14.0          v0.33.2       Grafana Agent
grafana/grafana-agent-operator   0.2.15          0.32.1        A Helm chart for Grafana Agent Operator
grafana/loki                     5.5.5           2.8.2         Helm chart for Grafana Loki in simple, scalable...
...
  2. Configure the PLG stack.

Download the values file from grafana/loki-stack and configure it for your use case. In the following example, we customize the values file to deploy only Promtail, Loki, and Grafana.

loki:
  enabled: true
  persistence:
    enabled: true
    storageClassName: ssd
    size: 50Gi
  isDefault: true
  url: http://{{(include "loki.serviceName" .)}}:{{ .Values.loki.service.port }}
  readinessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45
  livenessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45
  datasource:
    jsonData: "{}"
    uid: ""

promtail:
  enabled: true
  config:
    logLevel: info
    serverPort: 3101
    clients:
      - url: http://{{ .Release.Name }}:3100/loki/api/v1/push

grafana:
  enabled: true
  sidecar:
    datasources:
      enabled: true
  image:
    tag: 8.3.5

For Loki, we configure persistence to store our logs on the running Kubernetes cluster with a volume size of 50 GB. The disk itself is provisioned automatically through the available CSI driver. Depending on your Kubernetes setup or managed Kubernetes vendor, you may have to provide a different StorageClass. Use kubectl get storageclass to get a list of available StorageClasses in your cluster.
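
If your cluster has no ssd StorageClass, you can override the value at install time instead of editing the file; for example (assuming a StorageClass named standard exists in your cluster), append a --set flag to the install command used in the next step:

% helm install loki grafana/loki-stack -n loki --create-namespace \
-f ~/loki-stack-values.yml --set loki.persistence.storageClassName=standard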

  3. Deploy the PLG stack with Helm.
% helm install loki grafana/loki-stack -n loki --create-namespace -f ~/loki-stack-values.yml
NAME: loki
LAST DEPLOYED: Thu May 25 19:21:04 2023
NAMESPACE: loki
STATUS: deployed
REVISION: 1
NOTES:
The Loki stack has been deployed to your cluster. Loki can now be added as a datasource in Grafana.

See http://docs.grafana.org/features/datasources/loki/ for more detail.

Verify Loki pods created by the above installation:

% kubectl -n loki get pod
NAME                           READY   STATUS    RESTARTS   AGE
loki-0                         0/1     Running   0          26s
loki-grafana-7db596b95-4jdrf   1/2     Running   0          26s
loki-promtail-2fhdn            1/1     Running   0          27s
loki-promtail-dh7g2            1/1     Running   0          27s
loki-promtail-hjdm8            1/1     Running   0          27s
  4. Access Grafana from your local machine.

Find the Grafana password. By default, Grafana is protected with basic authentication. You can get the password (the username is admin) from the loki-grafana secret in the loki namespace with kubectl:

% kubectl get secret loki-grafana -n loki \
-o template \
--template '{{ index .data "admin-password" }}' | base64 -d; echo
  5. Port-forward from localhost to Grafana.

Knowing the username and password, you can port-forward with kubectl port-forward and access Grafana from your local machine through port 8080:

% kubectl get pod -n loki -l app.kubernetes.io/name=grafana
NAME                           READY   STATUS    RESTARTS   AGE
loki-grafana-7db596b95-4jdrf   2/2     Running   0          97s

% kubectl port-forward -n loki loki-grafana-7db596b95-4jdrf 8080:3000
Forwarding from 127.0.0.1:8080 -> 3000
Forwarding from [::1]:8080 -> 3000
  6. Read logs.

To see the Grafana dashboard, access localhost:8080. Use admin as the username and the password you fetched from the secret. Use LogQL queries to explore the logs on the Grafana dashboard. For more details, see the official LogQL documentation.
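
For example, the following LogQL queries (the namespace, pod, and container names are hypothetical) can be entered in Grafana's Explore view:

# All logs from the "demo" namespace
{namespace="demo"}

# Logs from a single container, filtered to lines containing "error"
{namespace="demo", pod="my-app-0", container="my-app"} |= "error"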

Use the logcli command to fetch logs through the command line.

  7. Port-forward from localhost to the Loki pod.

You can port-forward using kubectl port-forward to access Loki via logcli from your local machine through port 8080:

% kubectl get pod -n loki -l app=loki
NAME     READY   STATUS    RESTARTS   AGE
loki-0   1/1     Running   0          5d20h

% kubectl port-forward -n loki loki-0 8080:3100
Forwarding from 127.0.0.1:8080 -> 3100
Forwarding from [::1]:8080 -> 3100

For logcli to access Loki, export the Loki address and port number:

export LOKI_ADDR=http://localhost:8080

logcli query '{namespace="<namespace name>", pod="<pod name>", container="<container name>"}' --from "2023-05-29T11:32:00Z" --to "2023-05-30T16:12:00Z" > ~/lokilogs.log

This command fetches all the logs in the given timestamp range using the query in the command, and uses that information to populate the lokilogs.log file.
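
By default, logcli returns only a limited number of lines; a line filter and a higher --limit can narrow and extend the result, for example (same placeholder labels as above; the |= "error" filter and output file name are only illustrations):

logcli query '{namespace="<namespace name>", pod="<pod name>"} |= "error"' --limit=5000 --from "2023-05-29T11:32:00Z" --to "2023-05-30T16:12:00Z" > ~/lokierrors.log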