
Making a Docker Image from Scratch for Fossil-SCM

A project log for Dockerize All the Things

hard drive crash! recover, revive, and re-engineer server using docker contained services this time.

ziggurat29 • 11/18/2020 at 17:59 • 0 Comments

Summary

My next stop in this Odyssey involves getting a little more hands-on with Docker.  Here, I create a bespoke image for an application, and integrate that into my suite of services.  We explore 'multi-stage builds'.

Deets

Another of my services to be restored is my source code management (SCM) system.  Indeed, it was data failures in the SCM that first alerted me to the fact that my server was failing.  How I did data recovery for that is a separate topic, but the short story is that it is a distributed version control system (DVCS), so I was able to recover by digging up some clones on some of my other build systems.

The SCM I am presently using for most of my personal projects is Fossil.  I'm not trying to proselytize for that system here, but I should mention some of its salient features to give some context on what is going to be involved in getting that service back up and running.

Fossil is a DVCS, in the vein of Git and Mercurial, and for the most part the workflow is similar.  What I like about Fossil is that, in addition to the source code control, it also includes a wiki, a bug tracking/ticketing system, technical notes, and a forum (this is new to me).  All of this is provided in a single binary file, and a project repository is similarly self-contained in a single file.  The gory details of the file format are publicly documented.  It was created by the SQLite folks, who use it for the SQLite source, and SQLite's public forum is now hosted from it as well.  It's kind of cool!  If you choose to check it out, know also that it can bi-directionally synchronize with git.  (I have done this, but I'm not going to discuss that here.)

DVCS was an important thing for Linux kernel development, but pretty much everything else I have seen doesn't really leverage the 'distributed' part of it; DVCS systems are still mostly used in a master/slave kind of arrangement.  What seems to me to be the reason they took off so strongly is that they have really, really good diff and merge capabilities relative to prior systems.  Not because prior systems couldn't have had them, but rather they didn't need them as badly, whereas a DVCS just wouldn't be viable at all without really good branch-and-merge.  So I think it's the improvement in branch-and-merge that led to their widespread adoption, more than the 'distributed' part.  (Which, when you think about it, is kind of a hassle:  I've got to commit AND push? lol.)

Anyway, Fossil is less commonly used, so you usually build it from source.  No biggie -- it's really easy to build, and it's just a single binary file you put somewhere in your path, and you're done.

Providing a server for it means starting the program with some command-line switches.  In the past, I set up an xinetd to spawn one on-demand.  Now, I'll just run it in a docker container in the more pedestrian command-line mode.

The protocol Fossil uses is HTTP-based.  This means that I can use nginx to proxy it.  Historically, I opened a separate port (I arbitrarily chose 8086), but now I can use a sub-domain, e.g. fossil.example.com, have nginx proxy that over to the fossil server, and avoid opening another port.

Alas, not so fast for me.  I am using a dynamic DNS service which doesn't support subdomains on the free account, so I'll still have to open that port.  I do have a couple 'spare' domains on GoDaddy, though, so I can at least test out the proxy configuration.

The other benefit of proxying through nginx (when you can) is that you can do it over TLS.  Fossil's built-in web server doesn't do TLS at this time -- you have to reverse proxy for that.

OK!  Time to build!

Building Fossil

I did first build on the host system, because I find the fossil exe handy for inspecting the repositories in an ad-hoc manner, and also for using the test-integrity command from time to time to detect corruption (which shouldn't happen, but the defective SD card is what brought me to this Odyssey).  I won't cover that here.  But I did do the same in the 'alpine:latest' container to develop and test the commands that I will ultimately put in my Dockerfile.

#start an interactive container for development of Dockerfile
docker container run -it --rm alpine:latest sh

and then I tried it out until I wound up with this recipe:

#try out the following to verify we can build and install fossil
mkdir build
cd build
apk add --update alpine-sdk build-base tcl-dev tk openssl-dev
wget https://fossil-scm.org/home/uv/fossil-src-2.12.1.tar.gz
tar zxvf fossil-src-2.12.1.tar.gz
cd fossil-2.12.1
./configure --with-th1-docs --with-th1-hooks --json
make
strip fossil
make install
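If the build went cleanly, a quick sanity check while still inside the container confirms the binary landed on the PATH (fossil installs to /usr/local/bin by default):

```shell
# confirm the freshly-built binary is installed and runs
which fossil      # expect /usr/local/bin/fossil
fossil version    # reports the version and build date
```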

The only trick here was figuring out the different packages to install for Alpine.  On Ubuntu, the main thing is 'build-essential', but here it is 'alpine-sdk' and 'build-base'.  You'll need those for building stuff from source in the future as well.
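For reference, a rough mapping between the two distros' build prerequisites (the Ubuntu package names are from memory; verify against your release):

```shell
# Alpine: toolchain plus the headers this build uses
apk add --update alpine-sdk build-base tcl-dev tk openssl-dev

# Ubuntu/Debian, approximately equivalent
apt-get update && apt-get install -y build-essential tcl-dev tk libssl-dev
```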

The Dockerfile can look like this at this point:

/mnt/datadrive/srv/docker/fossil/Dockerfile

FROM alpine:latest

ARG FOSSIL_VERSION=2.12.1

WORKDIR /build

#note: no trailing whitespace after the backslashes -- that would break the line continuation
RUN set -x && \
  apk add --update --no-cache alpine-sdk build-base tcl-dev tk openssl-dev && \
  wget https://fossil-scm.org/home/uv/fossil-src-${FOSSIL_VERSION}.tar.gz && \
  tar zxvf fossil-src-${FOSSIL_VERSION}.tar.gz && \
  cd /build/fossil-${FOSSIL_VERSION} && \
  ./configure --with-th1-docs --with-th1-hooks --json && \
  make && \
  strip fossil && \
  make install

ENV FOSSIL_PORT=8086
ENV FOSSIL_REPO_LOC=/srv/fossil/repos

EXPOSE ${FOSSIL_PORT}

ENTRYPOINT fossil server --port ${FOSSIL_PORT} --repolist ${FOSSIL_REPO_LOC}

and then build:

docker image build -t fossil-server .

This builds the docker image, and sets up the appropriate stuff to have the fossil server run when the container starts.

You'll notice two directives, 'ARG' and 'ENV'.  They're similar, but subtly different.  'ARG' defines a variable that exists only while the image is being built (and can be overridden with --build-arg); 'ENV' defines an environment variable that exists both at build time and later, inside the running container.

I used these variables here simply so that I could more easily change some parameters as needed in the future.
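Because the ENTRYPOINT above uses the shell form, the ENV values are expanded when the container starts, so they can even be overridden at run time; the ARG can likewise be overridden at build time.  A sketch (the version and port values here are just examples):

```shell
# build-time override of the ARG (selects a different source tarball)
docker image build --build-arg FOSSIL_VERSION=2.12.1 -t fossil-server .

# run-time override of an ENV value: serve on port 9086 instead
docker container run --rm -e FOSSIL_PORT=9086 -p 9086:9086 fossil-server
```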

It takes a while to compile the stuff; about 35-40 minutes.

Afterwards:

bootilicious@rpi3server001:/mnt/datadrive/srv/docker/fossil$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
fossil-server       latest              f63d681b8b16        34 seconds ago      334MB
nginx-certbot       latest              2d7ad5eed815        3 days ago          83.1MB
php                 fpm-alpine          31b8f6ccf74b        12 days ago         71.5MB
nginx               alpine              3acb9f62dd35        2 weeks ago         20.5MB
alpine              latest              2e77e061c27f        2 weeks ago         5.32MB

334 MB! Sweet Jesus!  That's way too much.  Most of it is the intermediate build artifacts.

Multi-stage Builds

This is a common problem, and Docker introduced a concept called 'multi-stage builds' to cope with it.  The gist is that your Dockerfile specifies several build stages, and you copy the desired pieces from one to another.  So, in this case, you do the first build (resulting in the 334 MB image), but then do another stage that plucks out just the desired pieces.

The methodology is fairly straightforward:  each 'FROM' directive terminates the previous build stage and starts a new one.  The build stages are internally numbered starting from 0 and can be referred to that way, but there is a convenience feature whereby you can give them a name and refer to them by that instead.  So in our case, we'll label what we've done so far as 'buildstage', and then add a new stage called 'production' that simply copies in the desired build artifact; that is the stage we'll actually use.

/mnt/datadrive/srv/docker/fossil/Dockerfile

#build stage for creating fossil executable
FROM alpine:latest AS buildstage

ARG FOSSIL_VERSION=2.12.1

WORKDIR /build

RUN apk add --update --no-cache alpine-sdk build-base tcl-dev tk openssl-dev && \
  wget https://fossil-scm.org/home/uv/fossil-src-${FOSSIL_VERSION}.tar.gz && \
  tar zxvf fossil-src-${FOSSIL_VERSION}.tar.gz && \
  cd /build/fossil-${FOSSIL_VERSION} && \
  ./configure --with-th1-docs --with-th1-hooks --json && \
  make && \
  strip fossil && \
  make install

#production stage just has the built fossil executable, and serves the repos
#note, this presumes the repos have been bind-mounted in ${FOSSIL_REPO_LOC}
FROM alpine:latest AS production

ENV FOSSIL_PORT=8086
ENV FOSSIL_REPO_LOC=/srv/fossil/repos

COPY --from=buildstage /usr/local/bin/fossil /usr/local/bin/fossil

EXPOSE ${FOSSIL_PORT}

ENTRYPOINT fossil server --port ${FOSSIL_PORT} --repolist ${FOSSIL_REPO_LOC}

Build as per usual; then 'docker image ls':

bootilicious@rpi3server001:/mnt/datadrive/srv/docker/fossil$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
fossil-server       latest              ef792c58af18        About a minute ago   22.9MB
<none>              <none>              c3e0b8734019        About a minute ago   334MB
nginx-certbot       latest              2d7ad5eed815        3 days ago           83.1MB
php                 fpm-alpine          31b8f6ccf74b        12 days ago          71.5MB
nginx               alpine              3acb9f62dd35        2 weeks ago          20.5MB
alpine              latest              2e77e061c27f        2 weeks ago          5.32MB

So, well, the multi-stage build saved 310 MB.  Hmm, I guess that works!  I don't know how to automatically get rid of the intermediate image labelled as <none>:<none>, but a 'docker image rm c3e0' casts it aside.
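One docker-provided way to dispose of such dangling images in bulk, rather than removing them by ID:

```shell
# remove all dangling images (the ones shown as <none>:<none>)
docker image prune -f
```

Note this removes every dangling image on the host, not just this one, so make sure nothing else depends on them.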

Now that we have our smaller image, we can test it:

#test
docker run --rm -d --name fossil-server \
    --mount 'type=bind,src=/mnt/datadrive/srv/data/fossil/repos,dst=/srv/fossil/repos' \
    -p 8086:8086 \
    fossil-server

We should be able to browse to the host's IP address at port 8086 and see the repository list.  (We can add a repository name to the URL for a specific repository -- this is all Fossil-specific stuff.)
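The same check can be scripted from the host with curl (assuming the container from the run command above is up):

```shell
# fetch the repository list page; -f makes curl fail on HTTP errors
curl -fsS http://localhost:8086/ >/dev/null && echo "fossil server is up"
```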

Now we are ready to integrate it into the 'myservices' collection for systemd.

Systemd

Adding the additional service to the 'myservices' group is relatively straightforward:

/etc/docker/compose/myservices/docker-compose.yml

version: '3'
services:

  #Fossil service
  fossil:
    image: fossil-server
    container_name: fossil-server
    restart: unless-stopped
    tty: true
    #I can proxy this through nginx, however since the DNS provider (noip.com)
    #does not support sub-domains, it is still necessary in my case to continue
    #to provide the port through my firewall for Internet access.  If your DNS
    #does not have this limitation, then you can comment this out.
    ports:
      - "8086:8086"
    volumes:
      - /mnt/datadrive/srv/data/fossil/repos:/srv/fossil/repos
    networks:
      - services-network

  #PHP-FPM service (must be FPM for nginx)
  php:
    image: php:fpm-alpine
    container_name: php
    restart: unless-stopped
    tty: true
    #don't need to specify ports here, because nginx will access from services-network
    #ports:
    #  - "9000:9000"
    volumes:
      - /mnt/datadrive/srv/config/php/www.conf:/usr/local/etc/php-fpm.d/www.conf
      - /mnt/datadrive/srv/data/www:/srv/www
    networks:
      - services-network

  #nginx
  www:
    depends_on:
      - php
    image: nginx-certbot
    container_name: www
    restart: unless-stopped
    tty: true
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /mnt/datadrive/srv/config/nginx/default.conf:/etc/nginx/conf.d/default.conf
      - /mnt/datadrive/srv/data/www:/srv/www
      - /mnt/datadrive/srv/config/certbot/etc/letsencrypt:/etc/letsencrypt
      - /mnt/datadrive/srv/data/nginx/dhparam.pem:/etc/ssl/certs/dhparam.pem
    networks:
      - services-network

#Docker Networks
networks:
  services-network:
    driver: bridge

The additional service is added.  In my case, since my DNS does not support sub-domains, I still need to expose the port directly in order for it to be accessible from the outside world.  However, I also augmented the nginx configuration to proxy.  This works if your DNS allows subdomains, and then you don't need to open additional ports -- you simply prefix your domain with 'fossil'; e.g. 'fossil.example.com'.

Nginx

The nginx configuration gains a new server block that proxies the 'fossil' sub-domain over to the fossil container:

/mnt/datadrive/srv/config/nginx/default.conf

#this overrides the 'default.conf' in the nginx-certbot container

#this is for the example.com domain web serving; we have php enabled here

#this does http-to-https redirect
server {
    listen       80;
    listen  [::]:80;
    server_name example.com;
    return 301 https://$server_name$request_uri;
}
#this does the https version
server {
    listen       443 ssl http2;
    listen  [::]:443 ssl http2;
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    ssl_protocols TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/ssl/certs/dhparam.pem;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
    ssl_ecdh_curve secp384r1;
    ssl_session_timeout  10m;
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off;
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 8.8.8.8 8.8.4.4 valid=300s;
    resolver_timeout 5s;
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";

    #charset koi8-r;
    #access_log  /var/log/nginx/host.access.log  main;

    root /srv/www/vhosts/example.com;
    index  index.html index.htm index.php;

    location / {
        try_files $uri $uri/ /index.php?$query_string; 
    }

    #error_page  404              /404.html;

    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }

    # pass the PHP scripts to FastCGI server listening on (docker network):9000
    #
    location ~ \.php$ {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+\.php)(/.+)$; 
        fastcgi_pass php:9000; 
        fastcgi_index index.php; 
        include fastcgi_params; 
        fastcgi_param REQUEST_URI $request_uri; 
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; 
        fastcgi_param PATH_INFO $fastcgi_path_info; 
    }
}

#this is our proxied fossil server; keep in mind the host name in 'proxy_pass' is 
#the docker hostname on the internal network, which happens also to be the name of 
#the container
server {
    listen       80;
    listen  [::]:80;
    server_name fossil.example.com;

    location / {
        proxy_pass  http://fossil-server:8086/;
        proxy_redirect     off;
        proxy_set_header   Host $host;
    }
}
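Even without the sub-domain in DNS, the proxy block can be exercised from the host by supplying the Host header explicitly (using the placeholder domain from the config above):

```shell
# ask nginx on port 80 for the fossil vhost by name
curl -fsS -H 'Host: fossil.example.com' http://localhost/ >/dev/null && echo "proxy OK"
```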

Well, that was a bit of an Odyssey!  I still have more services to dockerize, though.  I need to do FTP next.  I haven't completely decided if I'm going to dockerize MySQL, or instead migrate my legacy MySQL database to SQLite form and redesign my legacy apps to use that (obviating the need for a server).  Also, my legacy SVN repositories might be migrated to fossil.  There is also VPN, though I think that will be best served by running on the host system anyway.  We'll see!

Next

Tackling FTP.
