TPA in-person meetup
We held an in-person meetup in Montreal! It was awesome, and here are the notes.
schedule
- 20: people arriving, day off
- 21: at anarcat's
- 22: at the rental apartment
- 23: at ATSE (aka la balise)
- 24: back at the rental
- 25-26: weekend, days off
- 27: rental
- 28: people leaving, day off
actual sessions
Those are notes from sessions that were actually held.
BBB hot take
anarcat presented the facts and the team decided to go with Maadix.
groente and anarcat worked on importing the users and communicating with upstream and tor-internal. The migration was completed some time during the meetup.
Details in tpo/tpa/team#41059.
SOTO ideas
anarcat got enrolled in the "State of the onion" (SOTO) presentation... What should he talk about?
The idea is to present:
- “Chaos management”: upgrades, monitoring, Tails merge.
- Anecdote: preventing outages, invisible work that enables all the rest.
See also the issue around planning that session.
The DNSSEC outage was approved as an example outage.
Roadmapping
Q4
Legend:
- :thumbsup: 2025 Q4
- :star: 2026
- :cloud: ~2030
- crossed out: done
Review from the 2025 roadmap:
- Web things already scheduled this year, postponed to 2025
- Improve websites for mobile (needs discussion / clarification, @gaba will check with @gus / @donuts)
- Create a plan for migrating (and execute?) the gitlab wikis to something else (TPA-RFC-38) :star:
- Improve web review workflows, reuse the donate-review machinery for other websites (new); this can use the new multi-version GitLab pages machinery in Ultimate
- Deploy and adopt new download page and VPN sites :thumbsup:
- Search box on blog
- Improve mirror coordination (e.g. download.torproject.org) especially support for multiple websites, consider the Tails mirror merge, currently scheduled for 2027, possible to squeeze in a 2025 grant, @gaba will check with the fundraising team :star:
- marble on download and support portal :thumbsup:
- Make a plan for SVN, consider keeping it :star:
- NetSuite adoption?
- MinIO in production, moving GitLab artifacts, and collector to object storage, also for network-health team (contact @hiro) (Q1 2025) :star:
- no backups yet
- other than the need of Network Health team, the main reasons to have implemented this were GitLab Runner cache and centralize storage in the organization (including other GitLab artifacts)
- still need to move GitLab artifacts: CI and uploads (images, attachments)
- the Network Team will likely not use object storage for collector anymore
- no container images published by upstream anymore
- upstream slowly pushing to proprietary "AI Store", abandoning FLOSS minio
- upstream removed the web dashboard
- maybe replace with Garage (no dashboard now, but upstream wants to have in the future)
- Prometheus phase B: inhibitions, self-monitoring, merge the two servers, authentication fixes and (new) autonomous delivery
- Make a plan for Q4 to expand the storage capacity of the Prometheus cluster, unblock the monitoring merge for Tails :thumbsup:
- Merge the two servers :star:
- Debian trixie upgrades during freeze :thumbsup: but maybe :star:
- Puppet CI (see also merge with Tails below)
- Development environment for anti-censorship team (contact @meskio), AKA "rdsys containers" (tpo/tpa/team#41769)
- Possibly more hardware resources for apps team (contact @morganava)
- Test network for the Arti release for the network team (contact @ahf)
- Tails 2025 merge roadmap, from the Tails merge timeline
- Puppet repos and server:
- Upgrade Tor's Puppet Server to Puppet 7
- Upgrade and converge Puppet modules
- Implement commit signing
- Puppet server (merge) + EYAML (merge) :thumbsup:
- Bitcoin (retire)
- LimeSurvey (merge)
- Website (merge) :cloud: not a priority, we prefer to finish the puppet merge and start on monitoring
- Monitoring (migrate) :thumbsup: or :star:: make a plan by EOY, perhaps hook node exporter everywhere and evaluate what else is missing for 2026
- shift merge :star: (depends on monitoring)
- Come up with a plan for authentication
Pending discussions:
- How to deal with web planning: we lack capacity to implement proper web development, perhaps other teams which are more familiar with web should get involved (e.g. the apps team builds a browser!). We need to evaluate the cost of past projects vs a hire
2026
We split the 2026 roadmap in "must have", "nice to have" and "won't do":
Must have
- peace in Gaza
- YEC
- tails moving to Prometheus, requires TPA prometheus server merge (because we need the space, mostly)
- shift merge, which requires tails moving to prometheus
- authentication merge phase 1
- completed trixie upgrades
- SVN retirement or migration
- mailman merge (maybe delegate to tails team?)
- MinIO migration / conversion to Garage?
- marble on main, community and blog websites :star:
- donate-neo CAPTCHA fixes
- TPA-RFC-38 wikis, perhaps just for TPA's wiki for starters?
Nice to have
- RFC reform
- firewall merge, requires TPA and Tails to migrate to nftables
- mailboxes
- Tails websites merge
- Tails mirror coordination (postpone to 2027?)
- Tails DNS merge
- Tails TLS merge
- reform deb.tpo, further idea for a roadmap to fix the tor debian package
- merge (MR) the resulting debian/ directory from the generated source package to the upstream tpo/core/tor git repository
- hook package build into that repo's CI
- have CI upload the package to a "proposed updates" suite of some sort on deb.tpo
- archive the multitude of old git repos used for the debian package
- upload a real package to sid, changing maintainership
- wait for testing to upload to backports or upload to fasttrack
Won't do
- backups merge (postponed to 2027)
long term (2030) roadmap
- review the tails merge roadmap
- what's next for tpa?
documentation split
Quick discussion: split documentation between service (administrativia) and software (technicalities)?
Additional idea about this: the switch in the wiki should not be scheduled as a priority task though. we can change as we work on pages...
It is hard to find documentation because the split between the service and howto sections is not very clear: some pages are named after the software (e.g. Git) and others after the kind of service (e.g. backups).
Maybe have separate pages for the service and the software?
It's good to have some commands for the scenarios we need.
Agreements:
- move service pages from howto/ to service/ (gitlab, ganeti, cache, conference, etc) (done!)
- move obsolete pages to an archive section (nagios, trac, openstack, etc)
- make new sections
- merge doc and howto sections
- move to a static site generator
tails replacement servers
- riseup: SPOF, issues with reliability and BGP/RPKI, only accepts 1U, downside to leave is to stop giving that money to riseup
- coloclue: relies on an individual as SPOF as well
missing data on server usage
- possible to host the tails servers in mtl (HIVE, see this note) for 110CAD, which would replace riseup; bandwidth is low (50TB/mth is 150mbps), so this works for the tails servers but not for the TPA web mirrors; only /30 IPv4 though, /64 IPv6
- we could buy a /24 or ask for a donation
- anarcat should talk with graeber again
- we could host tpa / high bw mirrors at coloclue (ams) to get off hetzner and save costs there
- then we can get Supermicro servers from Elco systems, which lavamind was dealing with and who's in Canada; lavamind will put tails folks in touch
- EPYC 5GHz servers should be fine
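The "50TB/mth is 150mbps" figure and the prefix sizes above can be double-checked with a bit of arithmetic. This is only a rough sketch: it assumes decimal terabytes and a 30-day month, the function name is made up for illustration, and the prefixes are documentation ranges (RFC 5737), not actual allocations from any provider.

```python
import ipaddress


def tb_per_month_to_mbps(tb_per_month: float, days: int = 30) -> float:
    """Average rate in Mbit/s for a monthly transfer volume.

    Assumes decimal units (1 TB = 10^12 bytes) and a month of `days` days.
    """
    bits = tb_per_month * 1e12 * 8
    seconds = days * 24 * 3600
    return bits / seconds / 1e6


# 50 TB/month, sustained evenly over the month
rate = tb_per_month_to_mbps(50)
print(f"50 TB/month is about {rate:.0f} Mbit/s on average")

# Address counts for the prefix sizes mentioned in the notes.
# These networks are documentation placeholders, not real assignments.
for prefix in ("192.0.2.0/30", "198.51.100.0/24"):
    net = ipaddress.ip_network(prefix)
    usable = len(list(net.hosts()))
    print(f"{prefix}: {net.num_addresses} addresses, {usable} usable hosts")
```

The result is an average: bursts above that rate would still eat into the same monthly quota. A /30 leaves only two usable host addresses, which is why buying or getting a donated /24 came up.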
team leads and roles
We held a session to talk about the team lead role and roles in general. We evaluated the following as being part of the team lead role:
- meeting facilitation
- architectural / design decisions
- the big picture
- management
- HR
- "founder's syndrome"
- translating business requirements into infrastructure design
- present metrics to operations
- mental load
the following roles are or should be rotated:
- incident lead
- shifts
- security officer
we also identified the team role itself might be ambiguous, in tension between "IT" and "SRE" roles.
the team lead expressed some fatigue about the role, some frustrations were also expressed around communication...
we evaluated a few solutions that could help:
- real / better delegation, so that people feel they have the authority in their tasks
- have training routines where we regularly share knowledge inside the team, perhaps with mandatory graphs
- fuck oracle
- shutting down services
- a new director is coming
- rotating the team lead role entirely
communications
we also had a session about communications, the human side (e.g. not matrix vs IRC), where we felt there were some tensions.
some of the problems that were outlined:
- working alone vs lack of agency
- some proposals (e.g. RFC) take too long to read
solutions include:
- reforming the RFC process, perhaps converting to ADR (Architecture Decision Records), see also this issue
- changeable RFCs
- user stories
- better focus on the process for creating the proposal
- discuss RFCs at meetings
- in-person meetings
- nomic
a few ways the meetings/checkins could be improved:
- start the meeting with a single round table "how are you"
- move office hours to Tuesdays so everyone can attend
wrap up
what went well
- relaxed, informal way
- seemed fun, because we want to do it again (in Brazil next?)
- we did a lot of the objectives we set in this pad and at the beginning of the week
- good latitude on expenses / budget was okay?
- free time to work together
- changing space from day to day
- cycling together
- post-its
what could be improved
- flexibility meant we couldn't plan stuff like babysitters
- would have been nice to quiet things down before the meeting, lots of things happening (BBB switch, onboarding, etc)
- post-its glue
what sucks and can't be improved
- jetlag and long flights
other work performed during the week
While we were meeting, we still had real work to perform. The following are the known things done during the week:
- unblocking each other
- puppet merge work
- trixie upgrades (only 3 tails machines left!)
- web development
- onboarding
- mkdocs wiki conversion simulation
We also ate a fuckload of indian food, poutine, dumplings and maple syrup, and yes, that was work.
other ideas
large scale network diagrams
let's print all the diagrams we have and glue them together and draw the rest!
time not found.
making TPA less white male
at tails we used to have sessions discussing chapters from this book, could be nice to do that with TPA as well
time not found.
long term roadmapping
We wanted to review the Tails merge roadmap and reaffirm the roadmap until 2030, but didn't have time to do so. Postponed to our regular monthly meetings.