Forward Kubernetes Logs to Elasticsearch (ELK) [Guide]

Fluent bit is an open source, light-weight log processing and forwarding service. Fluent bit allows to collect logs, events or metrics from different sources and process them. These data can then be delivered to different backends such as Elastic search, Splunk, Kafka, Data dog, InfluxDB or New Relic.

Original content from computingforgeeks.com - post 83278

Fluent bit is easy to setup and configure. It gives you full control of what data to collect, parsing the data to provide a structure to the data collected. It allows one to remove unwanted data, filter data and push to an output destination. Therefore, it provides an end to end solution for data collection.

Some wonderful features of fluent bit are:

High Performance
It is super Lightweight and fast, requires less resource and memory
It supports multiple data formats.
The configuration file for Fluent Bit is very easy to understand and modify.
Fluent Bit has built-in TLS/SSL support. Communication with the output destination is secured.
Asynchronous I/O

Fluent Bit is compatible with docker and kubernetes and can therefore be used to aggregate application logs. There are several ways to log in kubernetes. One way is the default stdout logs that are written to a host path”/var/log/containers” on the nodes in a cluster. This method requires a fluent bit DaemonSet to be deployed. A daemon sets deploys a fluent bit container on each node in the cluster.

The second way of logging is the use of a persistent volume. This allows logs to be written and persistent in an internal or external storage such as Cephfs. Fluent bit can be setup as a deployment to read logs from a persistent Volume.

In this Blog, we will look at how to send logs from a Kubernetes Persistent Volume to Elastic search using fluent bit. Once logs are sent to elastic search, we can use kibana to visualize and create dashboards using application logs and metrics.

This image has an empty alt attribute; its file name is Screen%2BShot%2B2019-05-24%2Bat%2B10.41.39%2BAM.png

PREREQUISITES:

First, we need to have a running Kubernetes Cluster. You can use our guides below to setup one if you do not have one yet:

Secondly, we will need an elastic search cluster setup. You can use elasticsearch installation guide if you don’t have one in place yet. In this tutorial, we will setup a sample elastic search environment using stateful sets deployed in the kubernetes environment. We will also need a kibana instance to help us visualize this logs.

Deploy Elasticsearch

Create the manifest file. This deployment assumes that we have a storage class cephfs in our cluster. A persistent volume will be created along side the elastic search stateful set. Modify this configuration as per your needs.

$ vim elasticsearch-ss.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster 
spec:
  serviceName: elasticsearch
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
        resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.seed_hosts
            value: "es-cluster-0.elasticsearch"
          - name: cluster.initial_master_nodes
            value: "es-cluster-0"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: cephfs
      resources:
        requests:
          storage: 5Gi

Apply this configuration

kubectl apply -f elasticsearch-ss.yaml

2. Create an elastic search service

$ vim elasticsearch-svc.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch 
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node

kubectl apply -f elasticsearch.svc

3. Deploy Kibana

$ vim kibana.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.2.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601

---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  selector:
    app: kibana

Apply this configuration:

kubectl apply -f kibana.yaml

4. We then need to configure and ingress route for the kibana service as follows:

$ vim kibana-ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/tls-acme: "true"
    ingress.kubernetes.io/force-ssl-redirect: "true"
  name: kibana

spec:
  rules:
  - host: kibana.computingforgeeks.com
    http:
      paths:
      - backend:
          serviceName: kibana
          servicePort: 5601
        path: /
  tls:
  - hosts:
    - kibana.computingforgeeks.com
    secretName: ingress-secret   // This can be created prior if using custom certs

kubectl apply -f kibana-ingress.yaml

Kibana service should now be accessible via https://kibana.computingforgeeks.com/

Kibana page

Once we have this setup, We can proceed to deploy fluent Bit.

Step 1: Deploy Service Account, Role and Role Binding

Create a deployment file with the following contents:

$ vim  fluent-bit-role.yaml 
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: default

Apply deployment config by running the command below.

kubectl apply -f fluent-bit-role.yaml

Step 2: Deploy a Fluent Bit configMap

This config map allows us to be able to configure our fluent Bit service accordingly. Here, we define the log parsing and routing for Fluent Bit. Change this configuration to match your needs.

$ vim fluentbit-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    k8s-app: fluent-bit
  name: fluent-bit-config

data:
  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               *
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off

  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               *
        Path              /var/log/*.log
        Parser            json
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
  output-elasticsearch.conf: |
    [OUTPUT]
        Name            es
        Match           *
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Replace_Dots    On
        Retry_Limit     False
  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        # http://rubular.com/r/tjUt3Awgg4
        Name cri
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S

kubectl apply -f fluentbit-configmap.yaml

Step 3: Create a Persistent Volume Claim

This is where we will write application logs.

$ vim pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: logs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: cephfs #Change accordingly
  resources:
    requests:
      storage: 5Gi

kubectl apply -f pvc.yaml

Step 4: Deploy a kubernetes deployment using the config map in a file

$ vim fluentbit-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: fluent-bit-logging
  name: fluent-bit
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: fluent-bit-logging
  template:
    metadata:
      annotations:
        prometheus.io/path: /api/v1/metrics/prometheus
        prometheus.io/port: "2020"
        prometheus.io/scrape: "true"
      labels:
        k8s-app: fluent-bit-logging
        kubernetes.io/cluster-service: "true"
        version: v1
    spec:
      containers:
      - env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: elasticsearch
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        image: fluent/fluent-bit:1.5
        imagePullPolicy: Always
        name: fluent-bit
        ports:
        - containerPort: 2020
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/log
          name: varlog
        - mountPath: /fluent-bit/etc/
          name: fluent-bit-config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: fluent-bit
      serviceAccountName: fluent-bit
      volumes:
      - name: varlog
        persistentVolumeClaim:
          claimName: logs-pvc
      - configMap:
          defaultMode: 420
          name: fluent-bit-config
        name: fluent-bit-config

Create objects by running the command below:

kubectl apply -f  fluentbit-deployment.yaml

Step 5: Deploy an application

Let’s test that our fluent bit service works as expected. We will use an test application that writes logs to our persistent volume.

$ vim testpod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /var/log/app.log; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /var/log
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: logs-pvc

Apply with the command:

kubectl apply -f testpod.yaml

Check if the pod is running.

kubectl get pods

You should see the following output:

NAME      READY   STATUS    RESTARTS   AGE
test-pod   1/1     Running   0          107s

Once the pod is running, We can proceed to check if logs are sent to Elastic search.

On Kibana, we will have to create an index as shown below. Click on “Management > Index Patterns> Create index pattern”

index

kibana_timestamp

Once the index has been created. Click on the discover icon to see if our logs are in place:

kibana_logs

See more guides on Kubernetes:

Install Ambassador API Gateway/Ingress Controller on Kubernetes

Deploy Highly Available Kubernetes Cluster on CentOS 7 using Kubespray

Monitor Docker Containers and Kubernetes using Weave Scope

Teleport – Secure Access to Linux Systems and Kubernetes

Forward Kubernetes Logs to Elasticsearch (ELK) using Fluentbit