Health Monitoring

The following is adapted from an email from Nitesh. I'm including it here so it doesn't disappear into my mailbox.

  • Log in to gdp-s1.
  • Check which containers are running using docker ps. I didn't see any containers running the gdp-monitoring image (hongdb/mariadb is running, however).
  • Find the most recently running container using docker ps -a. Judging roughly by the exit date, it was a container named clever_bassi.
  • Start the container as it is. I knew this was going to fail, but whatever. I used docker container start clever_bassi.
  • Of course, it failed after a bit and disappeared from the list of running containers. Verify that this was because of an incorrect password. It was. I used docker logs --tail 60 clever_bassi.
  • Commit the now-dead container to the gdp-monitoring image, in case I made some local changes to the running container. This is a two-step process:
    1. First, docker commit clever_bassi. This roughly translates to committing the disk state of the container to an image, and gives you a hash. [sha256:abcd...]
    2. Then, take this commit hash and tag it as the gdp-monitoring image. I used docker tag <abcd> gdp-monitoring, where <abcd> is the first few digits of the hash from step 1. This whole process is quite similar to git.
  • Next, look at the state of affairs inside the image. Get in using docker run -i -t gdp-monitoring /bin/bash. Look at /mnt/ just to get a sense of what is there.
  • Turns out that the image is run with stuff mounted from /var/swarm/gdp/monitoring/ from the host. The actual password is not stored in the image. Good.
  • Update the password in /var/swarm/gdp/monitoring/emailconfig on the host, and run the container again (in interactive mode) using docker run -i -t -v /var/swarm/gdp/monitoring:/mnt gdp-monitoring.
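The steps above can be collected into a short script. This is only a sketch: the container name, image name, and host path are taken from the notes, but the DRY_RUN wrapper and the recover_plan function are hypothetical conveniences added for illustration. By default it only prints the commands. It also passes the image name straight to docker commit, which folds the commit-then-tag pair into one command.

```shell
#!/bin/sh
# Sketch of the recovery steps. DRY_RUN and recover_plan are hypothetical
# helpers, not part of the original procedure.
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

CONTAINER=clever_bassi                  # name found via docker ps -a
IMAGE=gdp-monitoring
HOST_DIR=/var/swarm/gdp/monitoring      # holds emailconfig on the host

recover_plan() {
    run docker ps                                # what is running now
    run docker ps -a                             # include exited containers
    run docker logs --tail 60 "$CONTAINER"       # confirm why it died
    run docker commit "$CONTAINER" "$IMAGE"      # save disk state and tag it
    run docker run -i -t -v "$HOST_DIR:/mnt" "$IMAGE"
}

recover_plan
```

Run it with DRY_RUN=0 only after the password in emailconfig has been updated, since the last command starts the container in the foreground.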

You can also run it in non-interactive mode, but this is good enough for the moment.
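For reference, a non-interactive run might look like the sketch below. It swaps -i -t for -d (detached). The --name value is an assumption added here so the container gets a stable name instead of a random one like clever_bassi, and detached_cmd is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# detached_cmd NAME HOST_DIR IMAGE
# Hypothetical helper: prints the detached-mode command instead of running it.
detached_cmd() {
    # -d detaches from the terminal; -v mounts the host directory at /mnt,
    # the same mount point used in the interactive run.
    printf 'docker run -d --name %s -v %s:/mnt %s\n' "$1" "$2" "$3"
}

detached_cmd gdp-monitoring /var/swarm/gdp/monitoring gdp-monitoring
# Once started, the output can be followed with:
#   docker logs -f gdp-monitoring
```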

Eric's observations and thoughts:

  • /var/swarm/gdp/monitoring on gdp-s1 has a world-readable file called emailconfig containing the password in plaintext. I changed this so it was only readable by user or group gdp.
  • Arguably only the logs belong in /var/swarm/gdp/monitoring. The other files (emailconfig and tests.conf) are configuration and belong in /etc/gdp, not /var. I realize that this makes life more difficult when starting up Docker, but backup schedules for configuration and for variable data are usually different, and there was a reason that /var was created in the first place.
  • It appears the contents of /mnt/ aren't in git (at least not that I can find).
  • It isn't clear why there isn't just a Dockerfile to create this image.
  • It appears there still isn't an automatic startup on reboot.
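Two of these points can be sketched in shell. The permission change below is one way to get "readable only by user or group gdp"; mode 640 is an assumption about what was actually applied. For startup on reboot, Docker's restart policies are one option, though this is a suggestion rather than something already in place, and it only helps if the Docker daemon itself starts at boot. lock_down and restart_cmd are hypothetical helper names.

```shell
#!/bin/sh
# lock_down FILE [GROUP]
# Owner read/write, group read-only, no access for others. If GROUP is
# given, the file's group is changed first (needs suitable privileges).
lock_down() {
    if [ -n "${2:-}" ]; then
        chgrp "$2" "$1" || return 1
    fi
    chmod 640 "$1"
}

# restart_cmd CONTAINER
# Prints the command that would set a restart policy on an existing
# container, so it comes back after a reboot unless stopped by hand.
restart_cmd() {
    printf 'docker update --restart unless-stopped %s\n' "$1"
}

# On gdp-s1 the calls would be something like:
#   lock_down /var/swarm/gdp/monitoring/emailconfig gdp
#   eval "$(restart_cmd clever_bassi)"
restart_cmd clever_bassi
```

The same policy can be set at creation time by adding --restart unless-stopped to docker run.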

I realize this is just grumping ... more to follow.