The "dangerzone" service was a documentation sanitization system based on the Dangerzone project, using Nextcloud as a frontend.
RETIRED
It was retired in 2025 because users had moved to other tools, see TPA-RFC-78.
This documentation is kept for historical reference.
Tutorial
Sanitizing untrusted files in Nextcloud
Say you receive resumes or other untrusted content and you actually need to open those files because that's part of your job. What do you do?
- make a folder in Nextcloud
- upload the untrusted file in the folder
- share the folder with the `dangerzone-bot` user
- after a short delay, the file disappears (gasp! do not worry, it is actually moved to the `dangerzone/processing/` folder!)
- then after another delay, the sanitized files appear in a `safe/` folder and the original files are moved into a `dangerzone/processed/` folder
- if that didn't work, the original files end up in `dangerzone/rejected/` and no new file appears in the `safe/` folder
A few important guidelines:

- files are processed every minute
- do NOT upload files directly in the `safe/` folder
- only the files in `safe/` are sanitized
- the files have basically been converted into harmless images, a bit as if you had opened the files on another computer, taken a screenshot, and copied the files back to your computer
- some files cannot be processed by dangerzone; `.txt` files, in particular, are known to end up in `dangerzone/rejected`
- the bot recreates the directory structure you use in your shared folder, so, for example, you could put your `resume.pdf` file in `Candidate 42/resume.pdf` and the bot will put it in `safe/Candidate 42/resume.pdf` when done (see the example layout after this list)
- files at the top level of the share are processed in one batch: if one of the files fails to process, the entire folder is moved to `dangerzone/rejected`
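For illustration, a shared folder might look roughly like this after a couple of runs (the candidate names here are made up):

```
shared-folder/
├── Candidate 42/resume.pdf          # new upload, not yet processed
├── safe/
│   └── Candidate 41/resume.pdf      # sanitized copy produced by the bot
└── dangerzone/
    ├── processing/                  # files currently being converted
    ├── processed/                   # originals that converted successfully
    └── rejected/                    # originals (or whole batches) that failed
```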
How-to
This section is mostly aimed at service administrators maintaining the service. It will be of little help for regular users.
Pager playbook
Stray files in processing
The service is known to be slightly buggy and can crash midway, leaving files in the `dangerzone/processing` directory (see issue 14). Those files should normally be skipped, but the processing directory can be flushed if no bot is currently running (see below to inspect status).

Those files are not sanitized: they should either be destroyed or moved back to the top-level folder (the parent of `dangerzone`) for re-processing.
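When the bot is confirmed idle, moving the strays back can also be scripted over WebDAV. This is a minimal sketch, not part of the service: it assumes the `webdav3.client` API from the webdavclient library, and the hostname, credentials, and share name are illustrative only.

```python
# Hypothetical cleanup sketch: move unsanitized strays out of
# dangerzone/processing/ back to the top of the share for re-processing.
# Only run this while dangerzone-webdav-processor is NOT running.
from webdav3.client import Client  # python3-webdavclient in Debian

client = Client({
    "webdav_hostname": "https://nextcloud.example.org/remote.php/dav/files/dangerzone-bot/",  # assumed URL
    "webdav_login": "dangerzone-bot",
    "webdav_password": "app-password-token",  # Nextcloud app password
})

SHARE = "some-shared-folder"  # illustrative share name

for name in client.list(f"{SHARE}/dangerzone/processing/"):
    if name.endswith("/"):  # skip the directory entry itself and subfolders
        continue
    client.move(
        remote_path_from=f"{SHARE}/dangerzone/processing/{name}",
        remote_path_to=f"{SHARE}/{name}",
    )
```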
Inspecting service status and logs
The service is installed as `dangerzone-webdav-processor.service`; to look at its status, use systemd:

```
systemctl status dangerzone-webdav-processor
```

To see when the bot will run next:

```
systemctl status dangerzone-webdav-processor.timer
```

To see the logs:

```
journalctl -u dangerzone-webdav-processor
```
Disaster recovery
The service has little to no non-ephemeral data and should be rebuildable from scratch by following the installation procedure.
It depends on the availability of the WebDAV service (Nextcloud).
Reference
This section goes in depth into how the service is set up.
Installation
The service is deployed using the `profile::dangerzone` class in Puppet, and uses data such as the Nextcloud username and access token retrieved from Hiera.

Puppet actually deploys the source code directly from git, using a `Vcsrepo` resource. This means that changes merged to the `main` branch of the dangerzone-webdav-processor git repository are deployed as soon as Puppet runs on the server.
SLA
There are no service level guarantees for this service. During hiring rounds, however, it is expected to process files before hiring committees meet, so HR people may pressure us to keep the service working at those times.
Design
This is built with dangerzone-webdav-processor, a Python script which does this:

- periodically check a Nextcloud (WebDAV) endpoint for new content
- when a file is found, move it to a `dangerzone/processing` folder as an ad-hoc locking mechanism
- download the file locally
- process the file with the `dangerzone-converter` Docker container
- on failure, delete the failed file locally, and move it to a `dangerzone/rejected` folder remotely
- on success, upload the sanitized file to a `safe/` folder, move the original to `dangerzone/processed`

The above is copied verbatim from the processor README file.
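To make that flow concrete, here is a heavily simplified sketch of the loop. It is not the actual processor code: the WebDAV URL, credentials, share name, container invocation, and output file naming are all assumptions made for illustration, and error handling is omitted.

```python
# Simplified, hypothetical sketch of the processing loop described above.
import subprocess
import tempfile
from pathlib import Path

from webdav3.client import Client  # python3-webdavclient in Debian

client = Client({
    "webdav_hostname": "https://nextcloud.example.org/remote.php/dav/files/dangerzone-bot/",  # assumed
    "webdav_login": "dangerzone-bot",
    "webdav_password": "app-password-token",  # Nextcloud app password
})

SHARE = "some-shared-folder"  # illustrative share name

for name in client.list(f"{SHARE}/"):
    if name.endswith("/"):  # skip directories (safe/, dangerzone/, the share itself)
        continue
    # move the file to dangerzone/processing/ as an ad-hoc lock
    client.move(f"{SHARE}/{name}", f"{SHARE}/dangerzone/processing/{name}")

    with tempfile.TemporaryDirectory() as tmp:  # local copy is ephemeral
        local = Path(tmp) / name
        client.download_sync(
            remote_path=f"{SHARE}/dangerzone/processing/{name}",
            local_path=str(local),
        )
        # run the dangerzone-converter container on the downloaded file
        # (container name and flags are assumptions, not the real invocation)
        result = subprocess.run(
            ["docker", "run", "--rm", "-v", f"{tmp}:/work",
             "dangerzone-converter", f"/work/{name}"],
        )
        safe = Path(tmp) / f"{local.stem}-safe.pdf"  # assumed output name

        if result.returncode == 0 and safe.exists():
            client.upload_sync(
                remote_path=f"{SHARE}/safe/{name}",
                local_path=str(safe),
            )
            client.move(f"{SHARE}/dangerzone/processing/{name}",
                        f"{SHARE}/dangerzone/processed/{name}")
        else:
            client.move(f"{SHARE}/dangerzone/processing/{name}",
                        f"{SHARE}/dangerzone/rejected/{name}")
```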
The processor is written in Python 3 and has minimal dependencies
outside of the standard library and the webdavclient Python
library (python3-webdavclient in Debian). It obviously depends on
the dangerzone-converter Docker image, but could probably be
reimplemented without it somewhat easily.
Queues and storage
In that sense, the WebDAV share acts both as a queue and as storage. The
dangerzone server itself (currently dangerzone-01) stores only
temporary copies of the files, and actively attempts to destroy those
on completion (or crash). Files are stored in a temporary directory
and should not survive reboots, at the very least.
Authentication
Authentication is delegated to Nextcloud. Nextcloud users grant
access to the dangerzone-bot through the filesharing interface. The
bot itself authenticates with Nextcloud with an app password token.
Configuration
The WebDAV URL, username, password, and command line parameters are
defined in /etc/default/dangerzone-webdav-processor. Since the
processor is short lived, it does not need to be reloaded to reread
the configuration file.
The timer configuration is in systemd (in
/etc/systemd/system/dangerzone-webdav-processor.timer), which needs
to be reloaded to change the frequency, for example.
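As a rough illustration only (the deployed unit may differ), a timer firing every minute could look like this; after editing it, run `systemctl daemon-reload` so the change takes effect:

```ini
# /etc/systemd/system/dangerzone-webdav-processor.timer (hypothetical contents)
[Unit]
Description=Run the dangerzone WebDAV processor periodically

[Timer]
# matches the "files are processed every minute" behaviour described above
OnCalendar=minutely

[Install]
WantedBy=timers.target
```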
Issues
Issues with the processor code should be filed in the project issue tracker.
If there is an issue with the running service, however, it is probably better to file or search for issues in the team issue tracker.
Maintainer, users, and upstream
The processor was written and is maintained by anarcat. Upstream is maintained by Micah Lee.
Monitoring and testing
There is no monitoring of this service. Unit tests are planned. There is a procedure to set up a local development environment in the README file.
Logs and metrics
Logs of the service are stored in the systemd journal, and may contain personally identifiable information (PII) in the form of file names, which, in the case of hires, often include candidates' names.
There are no metrics for this service, other than the server-level monitoring systems.
Backups
No special provision is made for backing up this server, since it does not keep "authoritative" data and can easily be rebuilt from scratch.
Other documentation
Discussion
The goal of this project is to provide an automated way to sanitize content inside TPA.
Overview
The project was launched as part of issue 40256, which included a short iteration over a possible user story, which has been reused in the Tutorial above (and the project's README file).
Two short security audits were performed after launch (see issue 5) and minor issues were found, some fixed. It is currently assumed that files are somewhat checked by operators for fishy things like weird filenames.
A major flaw with the project is that operators still receive raw, untrusted files instead of having the service receive those files themselves. An improvement over this process would be to offer a web form that would accept uploads directly.
Unit tests and CI should probably be deployed so this project does not become another piece of legacy infrastructure. Merging with upstream would also help: they have been working on improving their command-line interface and are considering rolling out their own web service, which might make the WebDAV processor idea moot.
History
I was involved in the hiring of two new sysadmins at the Tor Project in spring 2021. To avoid untrusted inputs (i.e. random PDF files from the internet) being opened by the hiring committee, we had a tradition of having someone sanitize those in a somewhat secure environment, which was typically some Qubes user doing ... whatever it is Qubes users do.

Then when a new hiring process started, people asked me to do it again. At that stage, I had expected this to happen, so I partially automated this as a pull request against the dangerzone project, which grew totally out of hand. The automation wasn't quite complete though: I still had to upload the files to the sanitizing server, run the script, copy the files back, and upload them into Nextcloud.
But by then people started to think I had magically and fully automated the document sanitization routine (hint: not quite!), so I figured it was important to realize that dream and complete the work so that I didn't have to sit there manually copying files around.
Goals
Those were established after the fact.
Must have
- process files in an isolated environment somehow (previously was done in Qubes)
- automation: TPA should not have to follow all hires
Nice to have
- web interface
- some way to preserve embedded hyperlinks, see issue 16
Non-Goals
- perfect security: there's no way to ensure that
Approvals required
Approved by gaba and vetted (by silence) by current hiring committees.
Proposed Solution
See issue 40256 and the design section above.
Cost
Staff time, one virtual server.
Alternatives considered
Manual Qubes process
Before anarcat got involved, documents were sanitized by other staff using Qubes isolation. It's not exactly clear what that process was, but it was basically one person being added to the hiring email alias and processing the files by hand in Qubes.
The issue with the Qubes workflow is, well, it requires someone to run Qubes, which is not exactly trivial or convenient. The original author of the WebDAV processor, for example, never bothered with Qubes...
Manual Dangerzone process
The partially automated process used by anarcat before full automation was:

- get emails in my regular tor inbox with attachments
- wait a bit to have some accumulate
- save them to my local hard drive, in a `dangerzone` folder
- rsync that to a remote virtual machine
- run a modified version of the `dangerzone-converter` to save files in a "safe" folder (see batch-convert in PR 7)
- rsync the files back to my local computer
- upload the files into some Nextcloud folder
This process was slow and error-prone, requiring a significant number of round trips to get batches of files processed. It would have worked fine if all files came in a single batch, but files actually trickle in over multiple batches, the worst case being that they need to be processed one by one.
Email-based process
An alternative, email-based process was also suggested:
- candidates submit their resumes by email
- the program gets a copy by email
- the program sanitizes the attachment
- the program assigns a unique ID and name for that user (e.g. Candidate 10 Alice Doe)
- the program uploads the sanitized attachment in a Nextcloud folder named after the unique ID
My concern with the email-based approach was that it exposes the sanitization routines to the world, which opens the door to denial-of-service attacks, at the very least. Someone could flood the disk by sending a massive number of resumes, for example. I could also think of ZIP bombs that could have "fun" consequences.

By putting a user between the world and the script, we get some ad-hoc moderation that alleviates those issues, and we also ensure a human-readable, meaningful identity can be attached to each submission (say: "this is Candidate 7 for job posting foo").

The above would also not work with resumes submitted through other platforms (e.g. Indeed.com), unless an operator re-injects the resume, which might make the unique ID creation harder (because the From address will be the operator, not the candidate).