Monday, April 15, 2019

accessing namespaces of a docker/podman container (nsenter)

There is a nice utility `nsenter` that allows you to switch to the namespaces of another process. It took me considerable time to find it today, so I thought I'd write a short blog post about it.
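In its simplest form (PID 1234 is a placeholder), `nsenter` joins the selected namespaces of a target process and runs a program there, defaulting to a shell:

$ sudo nsenter --target 1234 --mount --uts --ipc --net --pid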

Now I have a Podman container (for Docker, just use the `docker` command instead of `podman` below). I started that container with:

$ sudo podman run -t -a STDIN -a STDOUT -a STDERR --rm=true --entrypoint /bin/bash quay.io/example/image:version

I've been running some tests in it, but it turned out I wanted to increase resource limits without destroying my preparations by exiting the process. So the first thing is to figure out the PID namespace of my container:

$ sudo podman ps --ns
CONTAINER ID  NAMES                PID   CGROUPNS    IPC         MNT         NET         PIDNS       USERNS      UTS
a147a3a5b35f  fervent_stonebraker  1408  4026531835  4026532431  4026532429  4026532360  4026532432  4026531837  4026532430
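Alternatively, the PID is also available directly from `podman inspect` (using the container ID from the listing above), which should print 1408 here:

$ sudo podman inspect --format '{{.State.Pid}}' a147a3a5b35f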

I see the different namespaces, but `nsenter` requires a file name to switch to a PID namespace, so I will use the PID from the output above.

$ sudo nsenter --pid=/proc/1408/ns/pid

The above starts a shell for me in the PID namespace of my container. Now I want to change the limits. It is interesting to note that I target PID 1, as inside the container's PID namespace that is the PID of my bash shell:

$ sudo prlimit --rss=-1 --memlock=33554432 --pid 1
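Incidentally, `prlimit` acts on a PID directly, so the same change could also have been made straight from the host using the host PID from `podman ps` (1408 above), without `nsenter`:

$ sudo prlimit --rss=-1 --memlock=33554432 --pid 1408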

Finally, I verify the limits in my container shell:

bash-4.2$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 23534
max locked memory       (kbytes, -l) 32768
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 16384
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1048576
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

One interesting thing is how `ps` behaves inside the namespace. If I run these two commands:

$ ps -ef
$ sudo nsenter --pid=/proc/1408/ns/pid ps -ef

They will show exactly the same output. That is because I still have the same `/proc` mounted even though my PID namespace changed, and `/proc` is what `ps` looks at.
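If I additionally enter the container's mount namespace (assuming the image ships a `ps` binary, which not every image does), `ps` reads the container's own `/proc` and shows only the container's processes:

$ sudo nsenter --pid=/proc/1408/ns/pid --mount=/proc/1408/ns/mnt ps -ef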

With `nsenter` you can switch to any namespace, not only PID. I hope this was a useful short demonstration of how to do fun things with Linux namespaces.

Some links:
  • https://lwn.net/Articles/531114/ - namespaces overview series

Saturday, January 19, 2019

Install OKD 3.11 with source version of openshift-ansible installer

To install OpenShift with openshift-ansible built from source, one needs to build the openshift-ansible RPMs and install them as a repo on the machine performing the installation. For 3.11, in CI this is done by a YAML job definition (referred to as "the YAML" below).

First clone the openshift-ansible repo.

$ git clone --depth=1 --branch=release-3.11 https://github.com/openshift/openshift-ansible.git

Then build the base image as described in the YAML.

$ cd openshift-ansible
$ BUILDAH_LAYERS=false sudo podman build -f images/installer/Dockerfile -t ocp-ansible --layers=false .

Run the image and prepare for RPM building:

$ sudo podman run -t -a STDIN -a STDOUT -a STDERR --rm=true -u root ocp-ansible /bin/bash
# yum install tito createrepo
# git clone https://github.com/openshift/openshift-ansible.git --depth=1 --branch=release-3.11
# cd openshift-ansible
# git config --add user.email myemail@example.com
# git config --add user.name myname

Build the RPMs as described in the RPM building section of the YAML, with slight modifications.

# tito tag --offline --no-auto-changelog
# tito build --output="_output/local/releases" --rpm --test --offline --quiet
# createrepo _output/local/releases/noarch

Now the RPM repo is under `_output/local/releases/noarch/` inside the container; it needs to end up on a web server or locally on the machine where you will run the installation.
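Since the container was started with `--rm=true`, copy the repo out while it is still running. A sketch, where the container name and the in-container clone path are placeholders (check `sudo podman ps` and the directory you cloned into):

$ sudo podman cp <container>:<clone-dir>/_output/local/releases/noarch ./noarch

Then create a file /etc/yum.repos.d/my-ocp-ansible.repo: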

[tmp-openshift-ansible]
baseurl = <file:// or http:// url of RPM repo>
enabled = 1
gpgcheck = 0
name = Custom built OpenShift Ansible repo
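If you opt for an http:// baseurl, a quick way to serve the repo directory for testing (assuming Python 3 is available on the serving machine; port 8000 is arbitrary) is:

$ cd noarch && python3 -m http.server 8000

The baseurl would then point at http://<server>:8000/.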

Finally, perform the installation as described in the official docs.

$ ansible-playbook ....

Make sure that you see your RPMs in the install log under `List all openshift ansible packages`.

Thursday, January 10, 2019

Building a debug Firefox from the source RPM on Red Hat Enterprise Linux

In short:
  • Create an account on https://access.redhat.com.
  • Get Red Hat Enterprise Linux (RHEL)
    • Download and install RHEL server on a local physical or virtual machine (it is free with a developer subscription).
    • Or spawn a RHEL machine in some cloud service.
    • Important: you will need a large machine. For me 4GB failed [*] and I used a 16GB one. I didn't check what the minimum requirement is.
  • If you installed your own RHEL, then you need to subscribe the machine.
    • subscription-manager register # use your access.redhat.com credentials
    • subscription-manager attach
      • if the above does not work automatically, try the below
      • subscription-manager list --available
      • subscription-manager attach --pool=<whatever you find useful above>
  • sudo yum install yum-utils rpm-build
  • yumdownloader --source firefox
  • rpm -ivh firefox-*.rpm
  • sudo yum-builddep rpmbuild/SPECS/firefox.spec
    • on a vanilla system you will see missing dependencies
    • if you wanted to figure that out by yourself, you'd go to https://access.redhat.com and search for the packages to see what repos they come from (or maybe use some clever yum command that I don't know atm)
  • yum-config-manager --enable rhel-7-server-devtools-rpms rhel-7-server-optional-rpms
    • or edit /etc/yum.repos.d/redhat.repo
  • sudo yum-builddep rpmbuild/SPECS/firefox.spec # this time it will succeed
  • rpmbuild -ba --with=debug_build rpmbuild/SPECS/firefox.spec
  • find the built RPMs at
    • ~/rpmbuild/RPMS/x86_64/firefox-60.4.0-1.el7.x86_64.rpm
    • ~/rpmbuild/RPMS/x86_64/firefox-debuginfo-60.4.0-1.el7.x86_64.rpm
    • ~/rpmbuild/SRPMS/firefox-60.4.0-1.el7.src.rpm
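To actually run the debug build, install the freshly built packages (file names as produced above; this replaces the stock Firefox package):

$ sudo yum localinstall ~/rpmbuild/RPMS/x86_64/firefox-60.4.0-1.el7.x86_64.rpm \
      ~/rpmbuild/RPMS/x86_64/firefox-debuginfo-60.4.0-1.el7.x86_64.rpm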

[*] It is really sad; in the past one could learn to be a developer on a budget machine. Nowadays it seems like even compiling your code takes a beefy one :/

Friday, November 23, 2018

Running Logstash container under OpenShift

What is the issue?

The main problem with running arbitrary images under OpenShift is that OpenShift starts containers as a random user ID. This is done for security reasons (isolation of workloads). A user can be given permission to run `privileged` containers, but this is not recommended if it can be avoided.

You can check my earlier blog post about building an SSH container image for OpenShift for more information and a more complicated example.

Logstash official container image

The official Logstash image can be found on Docker Hub and is built from the logstash-docker GitHub project. It is not specifically built to run in OpenShift, but it is still straightforward to run it unmodified. There are only two issues:
  • it tries to run as user 1000 and expects to find the Logstash code in that user's home directory
  • some configuration files lack the permissions needed for modification by a random user ID

Getting it running

Depending on what you're trying to do, you can approach this in a somewhat different way. I will give a specific example that mostly retains the original configuration (Beats input and stdout output) but adds a `config` file with the Kubernetes audit setup and disables Elasticsearch monitoring, as I don't have an Elasticsearch backend. I hope this provides enough of an example so you can set up your instance the way you desire.

Creating configuration

To store our custom configuration files, we will create a config map with the file content:
$ cat logstash-cfgmap.yml
apiVersion: v1
data:
  logstash-wrapper.sh: |-
      set -x -e
      rm -vf "/usr/share/logstash/config/logstash.yml"
      echo "xpack.monitoring.enabled: false" > "/usr/share/logstash/config/logstash.yml"
      exec /usr/local/bin/docker-entrypoint "$@"
  config: |-
    input{
        http{
            #TODO, figure out a way to use kubeconfig file to authenticate to logstash
            #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
            port=>8888
            host=>"0.0.0.0"
        }
    }
    filter{
        split{
            # Webhook audit backend sends several events together with EventList
            # split each event here.
            field=>[items]
            # We only need event subelement, remove others.
            remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
        }
        mutate{
            rename => {items=>event}
        }
    }
    output{
        file{
            # Audit events from different users will be saved into different files.
            path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
        }
    }
kind: ConfigMap
metadata:
  name: logstash
$ oc create -f logstash-cfgmap.yml
configmap/logstash created

With the above config map we have two files.
  • logstash-wrapper.sh - we need this to run some custom commands before delegating back to the image's original entry point: namely, to replace the original `logstash.yml`, which lacks group write permissions, and to disable Elasticsearch monitoring, which is enabled by default. The write permissions are needed because the image's startup script may detect environment variables that have to be converted to configuration entries and written into that file. See env2yaml.go and the docker-config docs.
  • config - this file contains the Logstash pipeline configuration and is a copy of what I presently see in the Kubernetes auditing docs.
Note that at this step you could create the full Logstash configuration inside the config map, together with `logstash.yml`, `log4j2.properties`, `pipelines.yml`, etc. Then we could ignore the default config from the image entirely.
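If you go that route, the config map can also be created straight from local files instead of hand-writing the YAML (a sketch; the file names are the ones mentioned above and assumed to exist in the current directory):

$ oc create configmap logstash --from-file=logstash-wrapper.sh --from-file=config \
      --from-file=logstash.yml --from-file=log4j2.properties --from-file=pipelines.yml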

Creating deployment config

$ oc run logstash  --image=logstash:6.5.0 --env=LOGSTASH_HOME\=/usr/share/logstash --command=true bash -- /etc/logstash/logstash-wrapper.sh -f /etc/logstash/config
deploymentconfig.apps.openshift.io/logstash created

A few things to explain:
  • we are setting the LOGSTASH_HOME environment variable to `/usr/share/logstash` because we are running as a random user, so the user's home directory will not work
  • we override the container start command with our wrapper script
    • we add `-f /etc/logstash/config` to point at our custom config
    • in case we wanted to put all our configuration in the config map, then we can set instead `--path.settings /etc/logstash/`
    • once pull/113 is merged, the custom startup script wrapper will not be needed, but we may still want to provide additional arguments like `-f` and `--path.settings`
Further, we need to make sure our custom configuration is mounted under `/etc/logstash`:
$ oc set volume --add=true --configmap-name=logstash --mount-path=/etc/logstash dc/logstash
deploymentconfig.apps.openshift.io/logstash volume updated

Finally, because our custom config wants to write under /var/log, we need to mount a volume on that path.
$ oc set volume --add=true --mount-path=/var/log dc/logstash

What we did creates an emptyDir volume that will go away when the pod dies. If you want to persist these logs, a Persistent Volume needs to be used instead.
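A sketch of the persistent variant, which you could run instead of the emptyDir command above (the claim name and size are placeholders, and the cluster must be able to provision the claim):

$ oc set volume dc/logstash --add --mount-path=/var/log \
      --type=persistentVolumeClaim --claim-name=logstash-logs --claim-size=1Gi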

Exposing logstash service to the world

First we need to create a service that will allow other project pods and Kubernetes to reach Logstash.
$ oc expose dc logstash --port=8888
service/logstash exposed
Port 8888 is what we set as the HTTP endpoint in `config`. If you expose other ports, you'd have to create one service for each port you care about.

We can easily expose HTTP endpoints to the wide Internet so that we can collect logs from services external to the OpenShift environment. We can also expose non-HTTP endpoints to the Internet with the NodePort service type, but there are more limitations.
$ oc expose service logstash --name=logstash-http-input
route.route.openshift.io/logstash-http-input exposed
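To sanity-check the route from outside (the hostname is a placeholder; find yours with `oc get route logstash-http-input`), you can POST a minimal payload shaped like what the audit webhook sends:

$ curl -i -X POST http://<route-host>/ -H 'Content-Type: application/json' \
      -d '{"kind":"EventList","items":[{"verb":"get","user":{"username":"tester"}}]}'

With the pipeline above, this should end up in /var/log/kube-audit-tester/audit inside the pod.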

Important: Only expose secured endpoints to the Internet! In the above example the endpoint is insecure and no authentication is required, so somebody could easily DoS your Logstash service.

That's all.

    Friday, December 29, 2017

    Why am I a necromancer?


    Some forum zealots bully poor souls who answer or correct threads that are 2, 3, or 5 years old. At the same time, the same zealots usually scold users who ask without searching first.

    Now my question is: what is the point of searching 5-year-old posts that have never been updated? If we are going to have a canonical source of truth for every question, then we would have to update those. Or, if we consider old threads uninteresting and thus not worth updating, then why don't we delete them after some time to stop polluting Internet search engine results?

    Personally, I find it makes the most sense to keep old threads and, when there is an update, to put it in. If I reached a thread, it had a pretty high search rating, so it is likely other users will hit it too. Why create a new thread and make the information harder to reach? And why delete old posts that might be useful? Even outdated, they often provide the clues needed to get closer to the desired result.

    My track record so far is some 22 necromancer badges on Stack Overflow, so I think other people also appreciate my approach. In fact, most of my answers are to old questions that I reached through Internet search engines and decided to update.

    Now, there is the dark side: clueless users who leave useless comments in old threads, or who don't understand what has already been written (or didn't read it) and ask stupid questions [*]. The thing is that, unfortunately, they can't be avoided, and they spam new threads too. I don't think useless bumping of old threads should be treated the same as useful updates to old threads.

    In summary:
    • Old thread
      • useful post
        • upvote
        • clap
        • thank
        • like
        • etc.
      • useless comment/stupid question
        • downvote
        • remove post
        • send angry face
        • ban the user
        • remove account
        • report to police
        • etc.
    • Recent thread
      • useful post
        • upvote
        • clap
        • thank
        • like
        • etc.
      • useless comment/stupid question
        • downvote
        • remove post
        • send angry face
        • ban the user
        • remove account
        • report to police
        • etc.
    Happy necromancing.

    [*] I'm not immune to asking stupid questions. I'm somewhat exaggerating; the point is that one shouldn't attack every post in an old thread regardless of its quality.

    Wednesday, December 20, 2017

    Debugging input devices

    Trouble with input devices like mice, touchpads, keyboards, or even cameras is hard to debug. Usually one is not sure whether the device is misbehaving, or whether the desktop environment or the application is mishandling the events from the input device.

    First check whether the driver used for your device is what you expect. For example, I had my X11 libinput driver removed by `dnf autoremove`, and my touchpad was taken over by `evdev` and thus stopped working.

    $ xinput list-props "SynPS/2 Synaptics TouchPad" 
    Device 'SynPS/2 Synaptics TouchPad': 
        Device Enabled (140):    1 
        Coordinate Transformation Matrix (142):    1.000000, 0.000000, 0.000000,  0.000000, 1.000000, 0.000000, 0.000000, 0.000000, 1.000000 
        Device Accel Profile (275):    0 
        Device Accel Constant Deceleration (276):    1.000000 
        Device Accel Adaptive Deceleration (277):    1.000000 
        Device Accel Velocity Scaling (278):    10.000000 
        Device Product ID (262):    2, 7 
        Device Node (263):    "/dev/input/event4" 
        Evdev Axis Inversion (279):    0, 0 
        Evdev Axis Calibration (280):    <no items> 
        Evdev Axes Swap (281):    0 
        Axis Labels (282):    "Abs MT Position X" (302), "Abs MT Position Y"  (303), "Abs MT Pressure" (304), "Abs Tool Width" (301), "None" (0),  "None" (0), "None" (0) 
        Button Labels (283):    "Button Left" (143), "Button Unknown" (265),  "Button Unknown" (265), "Button Wheel Up" (146), "Button Wheel Down" (147) 
        Evdev Scrolling Distance (284):    0, 0, 0 
        Evdev Middle Button Emulation (285):    0 
        Evdev Middle Button Timeout (286):    50 
        Evdev Middle Button Button (287):    2 
        Evdev Third Button Emulation (288):    0 
        Evdev Third Button Emulation Timeout (289):    1000 
        Evdev Third Button Emulation Button (290):    3 
        Evdev Third Button Emulation Threshold (291):    20 
        Evdev Wheel Emulation (292):    0 
        Evdev Wheel Emulation Axes (293):    0, 0, 4, 5 
        Evdev Wheel Emulation Inertia (294):    10 
        Evdev Wheel Emulation Timeout (295):    200 
        Evdev Wheel Emulation Button (296):    4 
        Evdev Drag Lock Buttons (297):    0
    

    Usually you'd expect to see `libinput` (synaptics is now abandoned).

    ...
        libinput Send Events Mode Enabled (266):    0, 0
        libinput Send Events Mode Enabled Default (267):    0, 0
    ...
    

    Fortunately, there is a tool that helps to understand what the device is sending to the computer. This works for libinput devices.

    $ sudo dnf install evemu
    

    Then we can see:
    $ ls /usr/bin/evemu-*
    /usr/bin/evemu-describe  /usr/bin/evemu-event  /usr/bin/evemu-record
    /usr/bin/evemu-device    /usr/bin/evemu-play
    

    These executable files can be used to inspect, record and replay the events sent by any connected device.
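    For example, to capture a session from a device into a file and replay it later (the device node here matches my mouse from the session below; `mouse.events` is just a file name I picked):

    $ sudo evemu-record /dev/input/event8 > mouse.events
    $ sudo evemu-play /dev/input/event8 < mouse.events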

    $ sudo evemu-record
    Available devices:
    /dev/input/event0: Lid Switch
    /dev/input/event1: Sleep Button
    /dev/input/event2: Power Button
    /dev/input/event3: AT Translated Set 2 keyboard
    /dev/input/event4: SynPS/2 Synaptics TouchPad
    /dev/input/event5: Video Bus
    /dev/input/event6: Video Bus
    /dev/input/event7: TPPS/2 IBM TrackPoint
    /dev/input/event8: Logitech MX Anywhere 2
    /dev/input/event9: ThinkPad Extra Buttons
    /dev/input/event10: HDA Intel PCH Dock Mic
    /dev/input/event11: HDA Intel PCH Mic
    /dev/input/event12: HDA Intel PCH Dock Headphone
    /dev/input/event13: HDA Intel PCH Headphone
    /dev/input/event14: HDA Intel PCH HDMI/DP,pcm=3
    /dev/input/event15: HDA Intel PCH HDMI/DP,pcm=7
    /dev/input/event16: HDA Intel PCH HDMI/DP,pcm=8
    /dev/input/event17: HDA Intel PCH HDMI/DP,pcm=9
    /dev/input/event18: HDA Intel PCH HDMI/DP,pcm=10
    /dev/input/event19: Integrated Camera: Integrated C
    Select the device event number [0-19]: 8 
    # EVEMU 1.3
    # Kernel: 4.14.5-300.fc27.x86_64
    # DMI: dmi:bvnLENOVO:bvrR07ET63W(2.03):bd03/15/2016:svnLENOVO:pn20FXS0BB14:pvrThinkPadT460p:rvnLENOVO:rn20FXS0BB14:rvrNotDefined:cvnLENOVO:ct10:cvrNone:
    # Input device name: "Logitech MX Anywhere 2"
    # Input device ID: bus 0x03 vendor 0x46d product 0x4063 version 0x111
    # Supported events:
    #   Event type 0 (EV_SYN)
    #     Event code 0 (SYN_REPORT)
    #     Event code 1 (SYN_CONFIG)
    ...
    B: 15 00 00 00 00 00 00 00 00
    A: 20 1 652 0 0 0
    ################################
    #      Waiting for events      #
    ################################
    E: 0.000001 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.000001 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +0ms
    E: 0.013561 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.013561 0002 0001 0001 # EV_REL / REL_Y                1
    E: 0.013561 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +13ms
    E: 0.039808 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.039808 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +26ms
    E: 0.063578 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.063578 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +24ms
    E: 0.071790 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.071790 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +8ms
    E: 0.087586 0002 0000 0001 # EV_REL / REL_X                1
    E: 0.087586 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +16ms
    E: 0.111578 0002 0001 0001 # EV_REL / REL_Y                1
    E: 0.111578 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +24ms
    ...
    

    Decoding those is left for another post, or as an exercise for the reader. At the very least, one can capture logs while things are misbehaving and then report bugs to the affected projects with the logs attached. Make sure to read `man evemu-record` to check for common issues preventing event capture.

    -- thanks to Peter Hutterer for pointing me at this tool