Friday, November 23, 2018

Running Logstash container under OpenShift

What is the issue?

The main problem with running arbitrary images under OpenShift is that OpenShift starts containers as a random user ID. This is done for security reasons (isolation of workloads). A user can be given permission to run `privileged` containers, but this is not recommended if it can be avoided.
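
For reference, granting such extra rights is done via security context constraints; e.g. the `anyuid` SCC lets pods of a given service account run with any UID (`myproject` below is a placeholder):
$ oc adm policy add-scc-to-user anyuid -z default -n myproject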

You can check my earlier blog post about building an SSH container image for OpenShift for more information and a more complicated example.

Logstash official container image

The official Logstash image can be found on Docker Hub and is built from the logstash-docker GitHub project. It is not specifically built to run in OpenShift, but it is still straightforward to run unmodified. There are only two issues:
  • it tries to run as user 1000 and expects to find the Logstash code in that user's home directory
  • some configuration files lack the permissions needed to be modified by a random user ID
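
You can see the second issue for yourself if you have Docker around (assuming the same 6.5.0 tag used later); the output will be something like the following, with no group write bit:
$ docker run --rm --entrypoint ls logstash:6.5.0 -l /usr/share/logstash/config/logstash.yml
-rw-r--r-- 1 logstash logstash ... /usr/share/logstash/config/logstash.yml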

Getting it running

Depending on what you're trying to do, you may approach this somewhat differently. I will give a specific example that mostly retains the original configuration (beats input and stdout output) but adds a `config` file with the Kubernetes audit setup and disables Elasticsearch monitoring, as we don't have an Elasticsearch backend. I hope this provides enough of an example for you to set up your instance the way you desire.

Creating configuration

To store our custom configuration files, we will create a config map with the file content.
$ cat logstash-cfgmap.yml
apiVersion: v1
data:
  logstash-wrapper.sh: |-
      set -x -e
      rm -vf "/usr/share/logstash/config/logstash.yml"
      echo "xpack.monitoring.enabled: false" > "/usr/share/logstash/config/logstash.yml"
      exec /usr/local/bin/docker-entrypoint "$@"
  config: |-
    input{
        http{
            #TODO, figure out a way to use kubeconfig file to authenticate to logstash
            #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
            port=>8888
            host=>"0.0.0.0"
        }
    }
    filter{
        split{
            # Webhook audit backend sends several events together with EventList
            # split each event here.
            field=>[items]
            # We only need event subelement, remove others.
            remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
        }
        mutate{
            rename => {items=>event}
        }
    }
    output{
        file{
            # Audit events from different users will be saved into different files.
            path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
        }
    }
kind: ConfigMap
metadata:
  name: logstash
$ oc create -f logstash-cfgmap.yml
configmap/logstash created

With the above config map we have two files.
  • logstash-wrapper.sh - we need this to run some custom commands before delegating back to the image's original entry point: namely, to replace the original `logstash.yml`, which lacks group write permissions, with one that disables Elasticsearch monitoring (enabled by default). The write permissions matter because the image startup script may detect environment variables that need to be converted to configuration entries and written into that file. See env2yaml.go and the docker-config docs.
  • config - this file contains the Logstash pipeline configuration and is a copy of what I presently see in the Kubernetes auditing docs.
Note that at this step you could create the full Logstash configuration inside the config map, together with `logstash.yml`, `log4j2.properties`, `pipelines.yml`, etc., and then ignore the default config from the image.

Creating deployment config

$ oc run logstash  --image=logstash:6.5.0 --env=LOGSTASH_HOME\=/usr/share/logstash --command=true bash -- /etc/logstash/logstash-wrapper.sh -f /etc/logstash/config
deploymentconfig.apps.openshift.io/logstash created

A few things to explain:
  • we set the LOGSTASH_HOME environment variable to `/usr/share/logstash` because we are running as a random user, thus the user's home directory will not work
  • we override container start command to our wrapper script
    • we add `-f  /etc/logstash/config` to point at our custom config
    • in case we wanted to put all our configuration in the config map, then we can set instead `--path.settings /etc/logstash/`
    • once pull/113 is merged, the custom startup script wrapper will not be needed, but we may still want to provide additional arguments like `-f` and `--path.settings`
Further, we need to make sure our custom configuration is mounted under `/etc/logstash`:
$ oc set volume --add=true --configmap-name=logstash --mount-path=/etc/logstash dc/logstash
deploymentconfig.apps.openshift.io/logstash volume updated

Finally, because our custom config wants to write under /var/log, we need to mount a volume on that path.
$ oc set volume --add=true --mount-path=/var/log dc/logstash

What we did is create an emptyDir volume that will go away when the pod dies. If you want to persist these logs, then a persistent volume needs to be used instead.
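
If you do want persistence and your cluster can provision storage, `oc set volume` can create a persistent volume claim for you; use something like this instead of the emptyDir command above (claim name and size are up to you):
$ oc set volume --add=true --type=persistentVolumeClaim --claim-name=logstash-logs --claim-size=1Gi --mount-path=/var/log dc/logstash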

Exposing logstash service to the world

First we need to create a service that will allow other project pods and Kubernetes to reach Logstash.
$ oc expose dc logstash --port=8888
service/logstash exposed
Port 8888 is what we set as the HTTP input port in `config`. If you expose other ports, you'd have to create one service for each port that you care about.
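
For example, if you kept a beats input on its default port, that would be a second service (port 5044 is an assumption about your pipeline config):
$ oc expose dc logstash --port=5044 --name=logstash-beats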

We can easily expose HTTP endpoints to the greater Internet so that we can collect logs from services external to the OpenShift environment. We can also expose non-HTTP endpoints to the Internet with the node port service type, but that has more limitations.
$ oc expose service logstash --name=logstash-http-input
route.route.openshift.io/logstash-http-input exposed
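
To verify the route works end to end, you can POST a test event through it; the hostname below is a placeholder for whatever `oc get route logstash-http-input` shows on your cluster:
$ curl -v -XPOST http://logstash-http-input-myproject.example.com/ -d '{"test": "event"}'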

Important: Only expose secured endpoints to the Internet! In the above example the endpoint is insecure and no authentication is required. Thus somebody can DoS your Logstash service easily.

That's all.

    Friday, December 29, 2017

    Why am I a necromancer?


    Some forum zealots bully poor souls who answer or correct threads that are 2, 3, or 5 years old. At the same time, the same zealots usually scold users who don't search first and ask only later.

    Now my question is: what is the point of searching five-year-old posts that have never been updated? If we are going to have a canonical source of truth for every question, then we would have to update those. Or, if we consider old threads uninteresting and not worth updating, then why don't we delete them after some time to stop polluting Internet search engine results?

    Personally, it makes most sense to me to keep old threads and, when there is an update, to put it in. If I reached a thread, it had a pretty high search rating, so it is likely other users will hit it too. Why create a new thread and make information harder to reach? Or why delete old posts that might be useful? Even outdated, they often provide the necessary clues to get closer to the desired result.

    My track record so far is some 22 necromancer badges on StackOverflow so I think other people also appreciate my approach. In fact, most of my answers are to old questions that I reached by means of Internet search engines and decided to update.

    Now, there is the dark side: clueless users who put useless comments in old threads, or who don't understand what has already been written (or didn't read it) and ask stupid questions [*]. Unfortunately they can't be avoided, and they spam even new threads. I don't think useless bumping of old threads should be treated the same as useful updates made to old threads.

    In summary:
    • Old thread
      • useful post
        • upvote
        • clap
        • thank
        • like
        • etc.
      • useless comment/stupid question
        • downvote
        • remove post
        • send angry face
        • ban the user
        • remove account
        • report to police
        • etc.
    • Recent thread
      • useful post
        • upvote
        • clap
        • thank
        • like
        • etc.
      • useless comment/stupid question
        • downvote
        • remove post
        • send angry face
        • ban the user
        • remove account
        • report to police
        • etc.
    Happy necromancing.

    [*] I'm not immune to asking stupid questions. I'm somewhat exaggerating; the point is that one shouldn't attack every post to an old thread regardless of its quality.

    Wednesday, December 20, 2017

    Debugging input devices

    Troubles with input devices like mice, touchpads, and keyboards, or even cameras, are hard to debug. Usually one is not sure whether the device is misbehaving or whether the desktop environment or the application is mishandling the events coming from it.

    First check whether the driver used for your device is what you expect. For example, I had my X11 libinput driver removed by `dnf autoremove` and my touchpad taken over by `evdev`, thus not working.

    $ xinput list-props "SynPS/2 Synaptics TouchPad" 
    Device 'SynPS/2 Synaptics TouchPad': 
        Device Enabled (140):    1 
        Coordinate Transformation Matrix (142):    1.000000, 0.000000, 0.000000,  0.000000, 1.000000, 0.000000, 0.000000, 0.000000, 1.000000 
        Device Accel Profile (275):    0 
        Device Accel Constant Deceleration (276):    1.000000 
        Device Accel Adaptive Deceleration (277):    1.000000 
        Device Accel Velocity Scaling (278):    10.000000 
        Device Product ID (262):    2, 7 
        Device Node (263):    "/dev/input/event4" 
        Evdev Axis Inversion (279):    0, 0 
        Evdev Axis Calibration (280):    <no items> 
        Evdev Axes Swap (281):    0 
        Axis Labels (282):    "Abs MT Position X" (302), "Abs MT Position Y"  (303), "Abs MT Pressure" (304), "Abs Tool Width" (301), "None" (0),  "None" (0), "None" (0) 
        Button Labels (283):    "Button Left" (143), "Button Unknown" (265),  "Button Unknown" (265), "Button Wheel Up" (146), "Button Wheel Down" (147) 
        Evdev Scrolling Distance (284):    0, 0, 0 
        Evdev Middle Button Emulation (285):    0 
        Evdev Middle Button Timeout (286):    50 
        Evdev Middle Button Button (287):    2 
        Evdev Third Button Emulation (288):    0 
        Evdev Third Button Emulation Timeout (289):    1000 
        Evdev Third Button Emulation Button (290):    3 
        Evdev Third Button Emulation Threshold (291):    20 
        Evdev Wheel Emulation (292):    0 
        Evdev Wheel Emulation Axes (293):    0, 0, 4, 5 
        Evdev Wheel Emulation Inertia (294):    10 
        Evdev Wheel Emulation Timeout (295):    200 
        Evdev Wheel Emulation Button (296):    4 
        Evdev Drag Lock Buttons (297):    0
    

    Usually you'd expect to see `libinput` (synaptics is now abandoned).

    ...
        libinput Send Events Mode Enabled (266):    0, 0
        libinput Send Events Mode Enabled Default (267):    0, 0
    ...
    

    Fortunately there is a tool that helps you understand what the device is sending to the computer. This works for libinput devices.

    $ sudo dnf install evemu
    

    Then we can see
    $ ls /usr/bin/evemu-*
    /usr/bin/evemu-describe  /usr/bin/evemu-event  /usr/bin/evemu-record
    /usr/bin/evemu-device    /usr/bin/evemu-play
    

    These executable files can be used to inspect, record and replay the events sent by any connected device.
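
    For example, a capture-and-replay session can look like this (the device node is one from the listing below; yours will differ):

    $ sudo evemu-record /dev/input/event8 > mouse.events  # capture events until interrupted
    $ sudo evemu-device mouse.events                      # later: create a virtual clone device
    $ sudo evemu-play /dev/input/eventX < mouse.events    # replay into the node printed by evemu-device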

    $ sudo evemu-record
    Available devices:
    /dev/input/event0: Lid Switch
    /dev/input/event1: Sleep Button
    /dev/input/event2: Power Button
    /dev/input/event3: AT Translated Set 2 keyboard
    /dev/input/event4: SynPS/2 Synaptics TouchPad
    /dev/input/event5: Video Bus
    /dev/input/event6: Video Bus
    /dev/input/event7: TPPS/2 IBM TrackPoint
    /dev/input/event8: Logitech MX Anywhere 2
    /dev/input/event9: ThinkPad Extra Buttons
    /dev/input/event10: HDA Intel PCH Dock Mic
    /dev/input/event11: HDA Intel PCH Mic
    /dev/input/event12: HDA Intel PCH Dock Headphone
    /dev/input/event13: HDA Intel PCH Headphone
    /dev/input/event14: HDA Intel PCH HDMI/DP,pcm=3
    /dev/input/event15: HDA Intel PCH HDMI/DP,pcm=7
    /dev/input/event16: HDA Intel PCH HDMI/DP,pcm=8
    /dev/input/event17: HDA Intel PCH HDMI/DP,pcm=9
    /dev/input/event18: HDA Intel PCH HDMI/DP,pcm=10
    /dev/input/event19: Integrated Camera: Integrated C
    Select the device event number [0-19]: 8 
    # EVEMU 1.3
    # Kernel: 4.14.5-300.fc27.x86_64
    # DMI: dmi:bvnLENOVO:bvrR07ET63W(2.03):bd03/15/2016:svnLENOVO:pn20FXS0BB14:pvrThinkPadT460p:rvnLENOVO:rn20FXS0BB14:rvrNotDefined:cvnLENOVO:ct10:cvrNone:
    # Input device name: "Logitech MX Anywhere 2"
    # Input device ID: bus 0x03 vendor 0x46d product 0x4063 version 0x111
    # Supported events:
    #   Event type 0 (EV_SYN)
    #     Event code 0 (SYN_REPORT)
    #     Event code 1 (SYN_CONFIG)
    ...
    B: 15 00 00 00 00 00 00 00 00
    A: 20 1 652 0 0 0
    ################################
    #      Waiting for events      #
    ################################
    E: 0.000001 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.000001 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +0ms
    E: 0.013561 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.013561 0002 0001 0001 # EV_REL / REL_Y                1
    E: 0.013561 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +13ms
    E: 0.039808 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.039808 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +26ms
    E: 0.063578 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.063578 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +24ms
    E: 0.071790 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.071790 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +8ms
    E: 0.087586 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.087586 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +16ms
    E: 0.111578 0002 0001 0001 # EV_REL / REL_Y                1
    E: 0.111578 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +24ms
    ...
    

    Decoding those is left for another post, or as an exercise for the reader. At the very least, one can capture logs while things are misbehaving and then report bugs to the affected projects with the logs attached. Make sure to read `man evemu-record` to check for common issues preventing event capture.

    -- thanks to Peter Hutterer for pointing me at this tool

    Wednesday, June 1, 2016

    Creating docker images suitable for OpenShift v3 (ssh-git image HowTo)

    Intro

    This is not going to be a detailed guide to creating Docker images. I'll present an example ssh-git image and highlight the more important concerns for running such an image on OpenShift v3. Things are basically covered in the documentation, but I hope to get you started quickly. (update: wow, I thought it was going to be a few lines but it turned into a monster)

    tl;dr; skip to the OpenShift section

    Plain Docker image

    Starting with little Docker experience and no knowledge of OpenShift's requirements, I just went ahead and created a standard SSH server image. Thanks to a nice git feature, one can simply create a local `bare` repo to be served over SSH (to whoever has a matching key in `~/.ssh/authorized_keys`).
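
    For reference, such a repo takes one command to create (the path here matches the clone URL used later):

    # git init --bare /repos/sample.git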

    I looked around but found only a few Ubuntu examples. My favorite distro is Fedora (I'm affiliated, but still), so I thought that a shame and went ahead and created a Fedora-based Dockerfile. In fact, it was pretty much pain-free. Here's my initial version, running OpenSSH as root:

    https://github.com/openshift-qe/ssh-git-docker/blob/master/ssh-git-root/Dockerfile

    The interesting points are:
    • `FROM fedora:latest`
    • `RUN ...` - pretty much standard commands to install ssh and configure a user; I usually also do `restorecon -R ~/.ssh` but inside docker selinux is nil, thus that's skipped.
    • `EXPOSE 22` - so that docker knows which ports are needed
    • `CMD ssh-keygen -A && exec /usr/sbin/sshd -D` - the interesting part here is generating the host keys, as OpenSSH can't work properly otherwise

    Building, running, tagging, pushing

    Building

    # docker build -t docker.io/myaccount/imagename:latest PATH

    Where `latest` can also be another version tag.

    Tagging

    # docker tag docker.io/myaccount/imagename:latest docker.io/myaccount/imagename:1.0

    As the source image you can use a tag or an image hash; it doesn't matter.

    Running

    Launch the container with ports exposed, giving the container a name.
    # docker run -d -P --name ssh-git-server myaccount/imagename:latest
    btw you can try the built image from `aosqe/ssh-git-server:root-20150525`

    Get the exposed port number so you can use it later.
    # docker port ssh-git-server 22
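
    The output will look something like `0.0.0.0:32769`; that host port is what goes into the clone URL below.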

    Put your ssh public key in.
    # docker exec ssh-git-server bash -c 'echo "ssh-rsa ..." > /home/git/.ssh/authorized_keys'

    Clone the sample repo:
    $ git clone ssh://git@localhost:32769/repos/sample.git

    Terminating

    # docker rm ssh-git-server
    # docker rmi <image tag> # to get rid of the image locally
    

    Sharing with others (pushing)

    Then you can push these images to dockerhub:
    # docker login docker.io
    # docker push docker.io/myaccount/imagename:latest
    

    Image where SSHd runs as a regular user

    I knew OpenShift doesn't let you run images as root, so the next step was to create an image where OpenSSH runs as the `git` user. In fact it does allow it, but you have to grant your user extra privileges, and there is really no good reason to do that for an ssh-git server. Also, a future OpenShift Online service would not allow such extra privileges for security reasons. At some point secure root pods will likely be allowed using user namespaces, with some performance penalty.


    That was even less painful thanks to an old post on the cygwin list. Basically, privilege separation needs to be turned off, as it can only work as root, and some paths in `sshd_config` need adjusting using `sed`; finally, a few little `chown`/`chmod` adjustments. And before I forget, the port cannot be 22 (non-root users cannot bind ports below 1024), so I selected 2022.
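
    To give an idea, the adjustments boil down to Dockerfile lines of roughly this shape (illustrative only; the real ones are in the Dockerfile linked below):

    # turn off privilege separation, move off port 22, let the git user manage host keys
    RUN sed -i -e 's/^#\?UsePrivilegeSeparation.*/UsePrivilegeSeparation no/' \
               -e 's/^#\?Port 22$/Port 2022/' \
               /etc/ssh/sshd_config \
     && chown -R git /etc/ssh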

    So the new things are additional `RUN` commands and the `USER git` directive, so the final CMD runs as that user instead of root. Here's the result:

    https://github.com/openshift-qe/ssh-git-docker/blob/master/ssh-git-user/Dockerfile

    You can try:
    # docker run -d -P --name ssh-git-server aosqe/ssh-git-server:git-20150525

    But testing this on OpenShift, I got this strange error message:

    No user exists for uid 1000530000

    I was stuck here for a little while until I figured out that the error is produced not by OpenShift but by the SSH server itself.

    OpenShift ready Image

    What I found out (see the official guidelines in the References section) is that, regardless of the `USER` directive in your Dockerfile, OpenShift will launch the pod as some random UID unless you grant extra privileges to the user or service account launching it. The group will be static though: root.

    Because that random UID will not be present in the passwd file, some programs will fail to start with an error message like the one I saw above. Another issue is that pre-setup of SSH becomes impossible, as some files need permissions 700 for ssh to accept them; obviously, as a random UID, we cannot repair that once the pod starts.

    Here's how I approached it:
    1. move most setup to the container start CMD
    2. make a couple of directories writable by the root group so that step #1 can create the necessary new files (this time with proper owner and permissions)
    3. make `passwd` root-group writable so that we can fix our UID (the official guidelines suggest using nss_wrapper, but I thought it easier to just fix it in place; see the sketch below)
    The end result is otherwise basically the same thing, just with the commands moved around:

    https://github.com/openshift-qe/ssh-git-docker/blob/master/ssh-git-openshift/Dockerfile
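
    The in-place `passwd` fix from step 3 amounts to something like this at container start (a sketch, assuming the `git` entry created at build time):

    # replace the build-time UID of the `git` user with the actual runtime UID
    sed -i -e "s/^git:x:[0-9]*:[0-9]*:/git:x:$(id -u):0:/" /etc/passwd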

    btw, I still have to change the multi-line CMD to a shell script and add it to the image. That would be easier to customize.

    Doing it the OpenShift way

    Since I have an OpenShift Enterprise environment running, I thought I'd use it directly instead of plain docker commands (which would also work fine):
    $ oc new-build --name=git-server --context-dir=ssh-git-openshift https://github.com/openshift-qe/ssh-git-docker

    FYI, you can append `#branch` if you want to build off a non-default branch. The good thing about this approach is that the image can be rebuilt automatically when the base image (fedora:latest) changes and when your code changes. You may need to configure hooks though; see the triggers doc.

    To monitor build:
    $ oc logs -f bc/git-server

    In the log, you will see something like (can be useful later):
    The push refers to a repository [172.30.2.69:5000/a9vk7/git-server]
    Now run the image by:
    $ oc new-app git-server:latest --name=git-server

    You end up with a deployment config called git-server that creates a replication controller `git-server-1`, which keeps one pod called `git-server-...` running from the `git-server` image stream created by the `new-build` command. Also, a service called `git-server` is created; it provides a stable IP to access the pod, and its name can be used as the hostname of the git server in any pod or build within the same project.

    One last detail is to make the service listen on port 22 for nicer git URLs:
    $ oc edit svc git-server # change `port` to 22 from 2022

    Note that services can only be accessed from pods running in the same project or in the 'default' project. To access the service from the Internet, you need to create a nodePort service; because this is not HTTP based, we can't use regular routes. I hope to get to that later.

    To see your pod name and use it, you can do:
    $ oc get pod # see pod name

    Then:
    $ oc rsh git-server-... # configure ssh keys there, create repos, etc.

    Now, once you have your public key in the pod, you can access this server from other pods. For a quick try you can do it from the server pod itself, provided you have the matching private key. While in `rsh`, do:
    $ git clone git@git-server:sample.git

    To push your image to Docker Hub, see how to set the build config output. Or you can manually ssh to the OpenShift node and, as root, do:
    # docker tag 172.30.2.69:5000/a9vk7/git-server docker.io/myaccount/imagename:latest
    # docker login docker.io 
    # docker push docker.io/myaccount/imagename:latest
    
    If you want to run your image off dockerhub, you can do:
    $ oc run git-server --image=aosqe/ssh-git-server-openshift
    $ oc expose dc git-server --port=22 --target-port=2022
    $ oc set probe dc/git-server --readiness --open-tcp=2022
    

    Setting the probe lets your replication controller notice when the pod is dead and spawn a new one.

    Some words about persistent volumes
    The way the images referred to above are built, any changes to public keys and repo data would be lost upon pod restart. To avoid that, persistent volumes need to be used.
    Persistent volumes are chowned at attach time to the current UID of the pod. Provided the OpenShift-ready image does its setup at launch time, that should be easy to support, i.e. mount the volume at `/home/git/`.

    But a few changes will still need to be done:
    • creation of the sample git repo needs to be conditional on it not already existing (see the sketch after this list)
    • `sshd_config` and ssh-keygen should place host keys somewhere in the `git` user's home dir to preserve them between pod restarts
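
    A sketch of the first change (paths are illustrative):

    if [ ! -d /home/git/repos/sample.git ]; then
        git init --bare /home/git/repos/sample.git
    fi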

    Future work

    • make the OpenShift ready image runnable off a persistent volume  
    • add info about making repo accessible from the Internet
    • convert multi-line CMD to a startup script
    This post is based on the initial commit of the Docker files in the repo.

      References

      Friday, May 13, 2016

      quick debugging KVM VM issues

      Seeing a hang, an infinite loop, or a performance issue with a VM on KVM? Here's how to get a trace of it so a Bugzilla report can be meaningful:

      First attach the VM configuration XML. That is obtained by:

      >  sudo virsh dumpxml [vm_name] > some_file

      Cole Robinson wrote on 09/23/2014 04:24 PM:
      > sudo debuginfo-install qemu-system-x86
      >
      > Then on the next hang, grab the pid of the busted VM from ps axwww, and do:
      >
      > sudo pstack $pid
      >
      > Then dump that output in a bug report, along with
      > /var/log/libvirt/qemu/$vmname.log. File it against qemu

      Also interesting might be the system logs from host and guest. On Fedora you can obtain them with a command similar to:

      > sudo journalctl --system --since today

      Tuesday, May 10, 2016

      replicating HTTP Server replies using ncat and socat

      I was looking at an issue where the rest-client Ruby gem raised an error on `#cookies_jar` with one particular server, while it worked fine on a couple of public servers I tried [1].

      I was about to write a simple script to serve as an HTTP server and return the same response as the offending HTTP server, but hey, I thought, there must be an easier way.

      So I just obtained the raw response from the original server, put it into a file, and asked netcat to listen and serve it back on request.

      $ cat > response.raw << "EOF"
      HTTP/1.1 200 OK
      Accept-Ranges: bytes
      Content-Length: 36
      Content-Type: text/html; charset=utf-8
      Last-Modified: Mon, 11 Apr 2016 05:39:53 GMT
      Server: Caddy
      Date: Tue, 10 May 2016 08:10:17 GMT
      Set-Cookie: OPENSHIFT_x7xn3_service-unsecure_SERVERID=c72192d7fe9c33d8dec083448dd4f40f; path=/; HttpOnly
      Cache-control: private
      
      Hello-OpenShift-Path-Test http-8080
      
      EOF
      
      $ nc -l 8080 < response.raw
      ## on another console
      $ curl -v localhost:8080 

      That's the simplest I could get. It will return the same thing regardless of the path and query string you put in your client URL, e.g. this will work the same:

      $ curl -v 'localhost:8080/path?asd=5'

      Now if you want your server to return something multiple times, then you can try

      $ nc -kl 8080 -c 'cat response.raw'

      Another option if your system lacks netcat is the `socat` utility.

      $ socat TCP-LISTEN:8080,fork EXEC:"cat response.raw" 

      If you remove `fork` from the options, it will exit after the first connection is served. But we can also listen over HTTPS:

      $ socat OPENSSL-LISTEN:8080,cert=/path/cert.pem,verify=0 EXEC:"cat response.raw"

      Again, add the `fork` option to keep listening. The above will ignore client certificates; in fact you can create a proper client cert and configure SSL verification, but that's beyond today's topic. FYI, use `socat` version 1.7.3.1+, otherwise you'd be hit by the weak-DH-key issue [2]. As a workaround you could generate a DH key in a file and provide it with the `dhparams` option to socat.
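
      If you need a quick certificate for the HTTPS example, a self-signed one can be generated like this (subject and lifetime are arbitrary; socat's `cert` option accepts a combined key+certificate PEM):

      $ openssl req -x509 -newkey rsa:2048 -nodes -keyout key.pem -out crt.pem -days 30 -subj '/CN=localhost'
      $ cat key.pem crt.pem > cert.pem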

      [1] https://github.com/rest-client/rest-client/issues/487
      [2] https://bugzilla.redhat.com/show_bug.cgi?id=1021946