Friday, November 23, 2018

Running Logstash container under OpenShift

What is the issue?

The main problem with running arbitrary images under OpenShift is that OpenShift starts containers as a random user. This is done for security reasons (isolation of workloads). A user can be given permission to run `privileged` containers, but this is not recommended if it can be avoided.

You can check my earlier blog post about building an SSH container image for OpenShift for more information and a more complicated example.

Logstash official container image

The official Logstash image can be found on Docker Hub and is built from the logstash-docker GitHub project. It is not specifically built to run in OpenShift, but it is still straightforward to run it unmodified. There are only 2 issues:
  • it tries to run as user 1000 and expects to find the Logstash code in that user's home directory
  • some configuration files lack the permissions needed for a random user ID to modify them
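
The second issue can be checked by hand. Under OpenShift the random user ID is a member of the root group (GID 0), so the group permission bits decide whether a file can be modified. Below is a minimal sketch, assuming a POSIX shell. `group_writable` is a hypothetical helper name, and the temporary file only stands in for a config file from the image:

```shell
# Hypothetical helper: succeed if FILE has the group write bit set.
group_writable() {
    # In `ls -l` output such as "-rw-rw-r--", character 6 is the group write bit.
    [ "$(ls -ld "$1" | cut -c6)" = "w" ]
}

# Demonstrate on a temporary stand-in file (not the real image file).
demo_dir=$(mktemp -d)
touch "$demo_dir/logstash.yml"

chmod 644 "$demo_dir/logstash.yml"
if group_writable "$demo_dir/logstash.yml"; then before=yes; else before=no; fi

chmod 664 "$demo_dir/logstash.yml"
if group_writable "$demo_dir/logstash.yml"; then after=yes; else after=no; fi

echo "mode 644: group writable? $before"
echo "mode 664: group writable? $after"
rm -rf "$demo_dir"
```

The same `ls -l` check can be run inside a container against the files under `/usr/share/logstash/config`.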

Getting it running

Depending on what you're trying to do, your approach may differ somewhat. I will give a specific example that mostly retains the original configuration (beats input and stdout output) but adds a `config` file with the Kubernetes audit setup and disables Elasticsearch monitoring, as I don't have an Elasticsearch backend. I hope this provides enough of an example for you to set up your instance the way you want.

Creating configuration

To store our custom configuration files, we will create a config map with the file content.
$ cat logstash-cfgmap.yml
apiVersion: v1
data:
  logstash-wrapper.sh: |-
      set -x -e
      rm -vf "/usr/share/logstash/config/logstash.yml"
      echo "xpack.monitoring.enabled: false" > "/usr/share/logstash/config/logstash.yml"
      exec /usr/local/bin/docker-entrypoint "$@"
  config: |-
    input{
        http{
            #TODO, figure out a way to use kubeconfig file to authenticate to logstash
            #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
            port=>8888
            host=>"0.0.0.0"
        }
    }
    filter{
        split{
            # Webhook audit backend sends several events together with EventList
            # split each event here.
            field=>[items]
            # We only need event subelement, remove others.
            remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
        }
        mutate{
            rename => {items=>event}
        }
    }
    output{
        file{
            # Audit events from different users will be saved into different files.
            path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
        }
    }
kind: ConfigMap
metadata:
  name: logstash
$ oc create -f logstash-cfgmap.yml
configmap/logstash created

With the above config map we have two files.
  • logstash-wrapper.sh - we need this to run some custom commands before delegating back to the image's original entry point: namely, to remove the original `logstash.yml`, which lacks group write permissions, and to disable Elasticsearch monitoring, which is enabled by default. The write permissions are needed in case the image's startup script notices environment variables that must be converted to configuration entries and written into that file. See env2yaml.go and the docker-config docs.
  • config - this file contains the Logstash pipeline configuration and is a copy of what I presently see in the Kubernetes auditing docs.
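
Removing the file works where editing it in place would not, because on Unix deleting a file requires write permission on the containing directory, not on the file itself. This assumes the config directory is writable by the random user, which the wrapper relies on. A minimal local sketch of the same two steps, with a temporary directory standing in for `/usr/share/logstash/config` (the variable names and placeholder content are illustrative only):

```shell
# Temporary stand-in for /usr/share/logstash/config.
config_dir=$(mktemp -d)
settings="$config_dir/logstash.yml"

# Simulate a shipped settings file that nobody may write to.
echo "# placeholder for the original settings" > "$settings"
chmod 444 "$settings"

# Same two steps as logstash-wrapper.sh: remove the file, then recreate it.
rm -f "$settings"
echo "xpack.monitoring.enabled: false" > "$settings"

# Read back what Logstash would see.
result=$(cat "$settings")
echo "$result"
rm -rf "$config_dir"
```
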
Note that at this step you can put the full Logstash configuration inside the config map, including `logstash.yml`, `log4j2.properties`, `pipelines.yml`, etc. The default config from the image can then be ignored.
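
For the full-configuration variant it can be easier to build the config map from files on disk instead of pasting them into YAML. A sketch, assuming the files live in one local directory. `create_logstash_configmap` is a hypothetical helper name, and it is an alternative to the `oc create -f` step above, since both create a config map named `logstash`:

```shell
# Hypothetical helper: build the "logstash" config map from the files in DIR.
# With a directory argument, each file in DIR becomes one key named after the file.
create_logstash_configmap() {
    oc create configmap logstash --from-file="$1"
}
```

It would be called as `create_logstash_configmap ./logstash-conf`, where `./logstash-conf` is an assumed directory holding `logstash.yml`, `pipelines.yml` and the other files.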

Creating deployment config

$ oc run logstash  --image=logstash:6.5.0 --env=LOGSTASH_HOME\=/usr/share/logstash --command=true bash -- /etc/logstash/logstash-wrapper.sh -f /etc/logstash/config
deploymentconfig.apps.openshift.io/logstash created

A few things to explain:
  • we set the LOGSTASH_HOME environment variable to `/usr/share/logstash` because we are running as a random user, so the user's home directory will not work
  • we override the container start command with our wrapper script
    • we add `-f /etc/logstash/config` to point at our custom config
    • if we wanted to put all our configuration in the config map, we could instead set `--path.settings /etc/logstash/`
    • once pull/113 is merged, the custom startup script wrapper will not be needed, but we may still want to provide additional arguments like `-f` and `--path.settings`
Next we need to make sure our custom configuration is mounted under `/etc/logstash`, the path that the start command above refers to.
$ oc set volume --add=true --configmap-name=logstash --mount-path=/etc/logstash dc/logstash
deploymentconfig.apps.openshift.io/logstash volume updated

Finally, because our custom config wants to write under /var/log, we need to mount a volume on that path.
$ oc set volume --add=true --mount-path=/var/log dc/logstash

What we did is create an emptyDir volume that will go away when the pod dies. If you want to persist these logs, a Persistent Volume needs to be used instead.
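
A persistent variant could look like the sketch below, run in place of the emptyDir command. It assumes the cluster can provision storage for a new claim. `mount_persistent_logs`, the claim name and the size are hypothetical choices:

```shell
# Hypothetical helper: create a persistent volume claim and mount it on /var/log.
# CLAIM is the claim name, SIZE its requested capacity (for example 1Gi).
mount_persistent_logs() {
    oc set volume dc/logstash --add=true \
        --type=persistentVolumeClaim \
        --claim-name="$1" --claim-size="$2" \
        --mount-path=/var/log
}
```

For example: `mount_persistent_logs logstash-logs 1Gi`. If the emptyDir volume is already mounted on `/var/log`, it has to be removed or overwritten first.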

Exposing logstash service to the world

First we need to create a service that will allow other project pods and Kubernetes to reach Logstash.
$ oc expose dc logstash --port=8888
service/logstash exposed
Port 8888 is what we set as the HTTP endpoint in `config`. If you expose other ports, you'd have to create one service for each port that you care about.
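
To check the pipeline end to end, a hand-made event can be posted to the service from another pod. The sketch below assumes `curl` is available there. `send_test_event` is a hypothetical helper, and the payload is not a real audit event, only the fields that the `config` pipeline references (`items` and `user.username`):

```shell
# Hypothetical helper: POST a tiny hand-made EventList to a Logstash HTTP input.
# URL is the endpoint; USERNAME is the value the pipeline uses in the output path.
send_test_event() {
    curl -sS "$1" \
        -H 'Content-Type: application/json' \
        -d "{\"items\":[{\"user\":{\"username\":\"$2\"}}]}"
}
```

Calling `send_test_event http://logstash:8888/ alice` from a pod in the same project should, given the pipeline above, produce a file under `/var/log/kube-audit-alice/`.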

We can easily expose HTTP endpoints to the Internet so that we can collect logs from services outside the OpenShift environment. We can also expose non-HTTP endpoints to the Internet with the NodePort service type, but there are more limitations.
$ oc expose service logstash --name=logstash-http-input
route.route.openshift.io/logstash-http-input exposed

Important: Only expose secured endpoints to the Internet! In the above example the endpoint is insecure and no authentication is required. Thus anybody can easily DoS your Logstash service.
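
One partial step in that direction is to let the router terminate TLS by creating an edge route instead of the plain one. This is only a sketch: it encrypts traffic up to the router but adds no authentication, which would still have to come from the http input plugin's own options (see the link in the `config` comments). `create_tls_route` and the route name are hypothetical:

```shell
# Hypothetical helper: expose the logstash service through an edge-terminated TLS route.
# TLS protects the traffic in transit; it does not authenticate clients.
create_tls_route() {
    oc create route edge "$1" --service=logstash
}
```

For example: `create_tls_route logstash-https-input`.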

That's all.