Here at Man AHL, we run almost all our trading software and strategies in Docker containers. Docker containers are great, but they bring some debugging challenges with them. For example, we like collecting core dumps, as they help us pinpoint exactly where a program crashed, but what happens to core dumps when they are generated in a container?

Traditionally, our software would run on bare-metal servers without containers, and core dumps would be written directly to disk (to be shipped off somewhere by an rsync-like job run through cron). When Docker containers are introduced, however, the traditional model doesn't quite work, because the core dumps end up being written inside the Docker container itself (because of the mount namespace).
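The tricky part is that kernel.core_pattern is a single, system-wide setting, but for file-based patterns the kernel resolves the path inside the crashing process's mount namespace, i.e. inside the container's filesystem. This is easy to see (the image name below is illustrative):

user@host:~$ docker run --rm my_app:latest cat /proc/sys/kernel/core_pattern
/var/crash/core.%e.%h.%p.%t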

As core dumps are an important tool in identifying problems in our trading models, we sat down and thought of various ways to solve this issue. This blog post will present the new solution we came up with, and the advantages of using this approach.

Traditional style core dumping in Docker containers

Googling for core dumping in Docker yields an obvious solution to the namespace issue. Basically, we can mount an external directory into the Docker container, and when a process dumps core, even though it is written to the directory inside the container, it is visible outside.

For example, we could imagine a scenario like:

user@host:~$ sysctl kernel.core_pattern
kernel.core_pattern = /var/crash/core.%e.%h.%p.%t

# Start a Docker container with a mounted-in core dump directory
user@host:~$ docker run -d -v /var/crash:/var/crash my_app:latest

One problem with this approach is that the invocation of the containers has to change. The core dumps are also written to a local filesystem (or NFS, but we don't run that in production). In general, our servers do not have a spare 100+ GB for core dumps, so we'd end up with half-dumped cores and full filesystems.

Enter the pipes

There is a neat feature in the Linux kernel which lets you dump the core to a program or script (on its stdin) rather than to a file. The handler program is executed in the initial (host) namespaces, outside of the container, which brings several benefits, for example:

  • core dumping can be made completely independent of containers
  • core dumps do not have to be written to a filesystem

Setting core dumps to use pipes is simple:

kernel.core_pattern = |/some/script %e %h %p %t

The coredump.c file in the Linux kernel source shows how piped core dumps are handled differently from file-based ones.
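To make the mechanism concrete, here is a minimal, hypothetical handler (the path and filename scheme are illustrative, not our production script). The kernel passes the expanded %-specifiers as arguments and streams the core image to the script's stdin:

#!/bin/bash
# /some/script - hypothetical minimal core handler.
# The arguments map to the %e %h %p %t specifiers in core_pattern:
exe="$1" host="$2" pid="$3" ts="$4"

# The kernel streams the core image to our stdin; here we simply save it
# on the host, outside any container's mount namespace.
cat > "/var/crash/core.${exe}.${host}.${pid}.${ts}"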


A note on abrtd

Red Hat/CentOS (and maybe others) ship with a piece of software called ABRT, a "set of tools to help users detect and report application crashes".

abrtd also works by setting kernel.core_pattern to a pipe to capture core dumps, but unfortunately it has proven unstable in our environment. We would regularly find abrtd pinned at 100% CPU, which is a bit sad in itself, but when you also run OpenStack and have lots of VMs on each hypervisor, abrtd could easily monopolise several cores on a machine.

Another issue is that even though abrtd has plugins for sending tarballs over FTP, it appears to require local spool space to create them, which we don't have (and don't want to manage).

Our solution

We wanted a solution which was KISS, required no modification to the Docker containers or their invocation, didn't spool enormous files locally, and "just worked, everywhere". To meet these requirements we developed a core handling script which reads the core dump from stdin and sends it to a remote (in-house) FTP server. We use anonymous FTP, with the server configured to disallow downloads, to avoid having to manage credentials.

The script also does a few other things, such as reporting crashes to our Elasticsearch database, but the cleverness all boils down to a simple ncftpput command.

ncftpput -t "$ftp_timeout" -m -c "$remote_host" "$filename"
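
Stripped to its essentials, the handler looks something like the sketch below (the script path, hostname and filename scheme are placeholders, not our production values):

#!/bin/bash
# Hypothetical handler, wired up on the host via:
#   kernel.core_pattern = |/usr/local/bin/core_to_ftp %e %h %p %t
exe="$1" host="$2" pid="$3" ts="$4"

ftp_timeout=60                       # illustrative timeout, in seconds
remote_host="coredumps.example.com"  # placeholder for the in-house FTP server
filename="core.${exe}.${host}.${pid}.${ts}"

# -c uploads from stdin (where the kernel pipes the core image) and
# -m creates the remote directory if needed - nothing is spooled locally.
exec ncftpput -t "$ftp_timeout" -m -c "$remote_host" "$filename"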


We push the script and sysctl settings out using Ansible, which ensures that every machine in our estate has the same core dump settings.
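As a sketch of the Ansible wiring (the role layout and names here are illustrative, not our actual roles):

# roles/coredump/tasks/main.yml (illustrative)
- name: Install the core handler script
  copy:
    src: core_to_ftp
    dest: /usr/local/bin/core_to_ftp
    mode: '0755'

- name: Point kernel.core_pattern at the handler
  sysctl:
    name: kernel.core_pattern
    value: '|/usr/local/bin/core_to_ftp %e %h %p %t'
    state: present
    reload: yes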

For the anonymous FTP server part, we use the excellent vsftpd, which is secure and easy to configure. vsftpd runs on an OpenStack node and is configured to only allow uploads over anonymous FTP (no downloads). The files are uploaded onto a Pure FlashBlade, where developers and support staff can view and analyse the core dumps.
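An upload-only anonymous setup needs just a handful of vsftpd.conf directives; something along these lines (an illustrative fragment, not our exact config):

# vsftpd.conf fragment for upload-only anonymous FTP
anonymous_enable=YES         # anonymous logins only...
local_enable=NO              # ...no local users
write_enable=YES
anon_upload_enable=YES       # anonymous users may upload...
anon_mkdir_write_enable=YES
download_enable=NO           # ...but nobody can download anything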

The solution in action

Here’s an Asciinema cast showing the solution in action. The sleep command is sent to the background, and then kill -SEGV is used to force it to generate a core dump, both inside and outside of a Docker container.
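If you can't play the cast, the host-side half of the demo boils down to something like this (the job number and PID shown are illustrative):

user@host:~$ sleep 100 &
[1] 12345
user@host:~$ kill -SEGV %1
[1]+  Segmentation fault      (core dumped) sleep 100

Running the same commands inside a Docker container produces the same result, with the core arriving at the FTP server either way.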

Show me the code

You can download our full script and example Ansible roles from our GitHub repo.

Come work for us

Found this blog post and the technologies mentioned interesting? Come work for us - check our careers page.
