Continuwuity Postmortem

On 2025-12-21, a zero-day vulnerability was exploited against the continuwuity Matrix rooms. This affected all of our community rooms: the space, main room, offtopic room, development room, BSD support room, announcements room, git notifications room, and worst of all, our minecraft room. All of our communal rooms were left in tatters with only some damn lucky servers escaping unscathed. We were completely blindsided by this attack, and with zero anticipation, also had zero information on it. Even worse, this attack seemingly affected all servers that weren't Synapse or Dendrite. Quite the dire problem!

I'm going to walk through, step by step, what happened, how it all went down, and what we're doing to ensure disasters like this can't happen again.

The timeline

We'll start with the most interesting part: our response timeline. This is not the first time continuwuity has suffered attacks on its community rooms, but this was by far the worst one yet, as it exploited an unknown security vulnerability, out of the blue.

All times are in UTC. This timeline is mainly an aggregation of the log I wrote while investigating the exploit.

2025-12-21

00:00: I was sleeping, along with most of the rest of the team. The majority of the continuwuity team is based in the United Kingdom, with only a couple of us being located in the United States.

01:23: A malicious user joins our space from their own server. They are banned immediately by our moderation bot, as their server name was on our FreeDNS block list from back when we were getting a lot of spam from FreeDNS domains.

01:26: That same user joins our announcements room, gets banned like before.

01:38: The user joins our offtopic room, but does not get banned. Our offtopic room was protected by a different, third party bot, hosted by myself instead of the project, due to some issues we were having prior. Importantly, the third party bot did not follow exactly the same block lists as the project hosted moderation bot, so the malicious user does not get banned. They sent a message saying "hello everyone", and nothing more.

They proceeded to join our minecraft, BSD support, and dev rooms, however unsuccessfully as those were protected by the normal bot.

01:56: A team member pings me asking why the normal moderation bot isn't in the offtopic room.

01:57: This is when, and I mean this in the most technical way possible, shit hits the fan. Our primary moderation bot announces that it has been removed from all the rooms it was protecting. Interestingly, the alternative moderation bot I was hosting was not yet removed from any rooms.

It was at this point that the attacker had started executing their payload. Hundreds of members were seemingly leaving our rooms out of their own free will, all at once.

02:12 to 02:30: A call for all hands on deck is put out in the moderation room, with both of the US-based team members concluding that there is a security vulnerability being exploited. Both are unable to investigate further but can confirm their servers are seemingly performing actions without their consent or initiation.

At around 02:25, my own moderation bot announces it has been removed from our rooms, indicating our final line of automated protection had fallen.

02:30: A second call for all hands on deck goes out, this time vibrating my phone enough to wake me. After collecting my consciousness and unlocking my phone to my Matrix client, it takes me a second to get my bearings, and you can see as I gradually piece together what is going on:

Screenshot of continuwuity's moderation room

For some important context - we've been planning on upgrading our Matrix rooms for a week or two by this point, and we had already created new rooms in anticipation of our upcoming 0.5.0 release.

02:33: I deduce from available data points that a known exploit in older Matrix room versions is somehow being used against us here, and make the unilateral executive decision to abandon our upgrade plan and instead upgrade our main room now in order to maintain some line of communication with the >1000 members previously in our community, and any concerned bystanders who heard the news.

It is around the same time we abandon the primary moderation bot in favour of the offsite backup moderation bot, hosted by [asgard.chat], which was set up at the same time as the new rooms were. The offsite backup bot was hosted on Synapse, which we had figured did not share whatever vulnerability we did, so was our safest bet.

With an emergency upgrade in progress, I hop out of bed where I was using my laptop, and boot up my PC.

02:38: Given the symptoms of the vulnerability, we assume the worst and decide that this must be a zero-day backdoor. Given the high value of the maintainers' primary servers (a lot of us are trusted custodians and have an elevated power level in many rooms across popular Matrix communities), we all start pulling plugs on our servers and hopping to our backup accounts on other servers.

Given the situation, a team member suggests that we may not be able to trust the confidentiality of our moderation room anymore, and we announce plans to rendezvous in a clean, end-to-end encrypted room, with only known-safe servers.

02:46: I tell a few people who were reasonably concerned in other rooms what was going on and to shut down their own servers, and then I start pulling the plug on my own to prevent further damages. Keep in mind at this point we still assumed it was a backdoor or RCE vulnerability.

I open my terminal (with six tabs, good lord), and visual studio code, and get reading.

02:56: RCE & Backdoor are ruled out, but the real bug itself is misidentified.

03:19: The attacker exploits the bug in our newly upgraded room. We're now back at square 1, because the theories I was running through prior now no longer matched.

03:23: Network forensics reveals that a remote server has been sending some forged events, and they were somehow signed correctly by the servers that were "supposed" to be sending them, alongside the server that was sending me weird requests. Remember that user that kept getting banned earlier?

Running theory is now that there's somehow some signature verification exploit. I started hacking together a tool to see if the signatures were individually valid, suspecting that one of them was completely bogus.

04:13: Both signatures are found to be valid. New leading theory is that there's an exploit that allows a malicious server to somehow convince a normal server to sign a phony event.

04:25: The last of the US team goes to bed, I continue throwing stuff at the wall to see what might stick. This is a shape I've never seen before, and is the first exploit of its kind that I know of, at least in recent Matrix history.

04:41: I connect the dots between the network forensics, the behaviour of the exploit, its limited nature, and cross-reference with the spec, and it finally clicks what the bug is.

Luckily, it's a simple fix. Mortifyingly simple.

04:48: Security fix is implemented, I start working on reverse engineering the attack to reproduce it and verify that it is patched.

05:41: I finish reaching out to all other affected Matrix server implementations with information, reproduction steps, and a patch file. We're planning on doing a coordinated release to reduce the likelihood of opportunistic eyes getting some shifty ideas after seeing the fix commits.

06:27: A full reproduction environment is established, and information was gathered regarding the behaviour of patched and unpatched servers. This is also shared with relevant parties.

07:35: I started writing this post-mortem. The rest of the team is still asleep.

10:00: The rest of the team starts waking up, I begin catching everyone up while preparing to push fixes to our main branch.

10:58: The security fix is pushed to our main branch, other server implementations confirmed (prior) that they will be pushing similar fixes shortly after. Our CI produces build artefacts including functional binaries, but fails to push updated docker images (most of our userbase deploys via docker as far as I know, so this was bad).

11:46: Docker builds are successfully manually pushed and verified to be okay. An announcement is released in our new announcements room. I went to bed shortly after, the rest of the team had enough information to hand off.

11:59: continuwuity.org is updated to the latest version

12:34: Tuwunel pushes a fix for the vulnerability

approx 16:50: An email is sent to the Matrix Foundation's security team to keep them in the loop.

17:52: GHSA-22fw-4jq7-g8r8 is drafted

22:13: Grapevine releases their fix for the vulnerability

22:25: The Tuwunel maintainer opened a merge request to fix Conduit

2025-12-22

00:11: Continuwuity 0.5.0 is tagged & drafted, pending release notes. I woke up again about 30 minutes later and re-joined the team.

00:52: Tuwunel releases 1.4.8 with the security fix

01:12: The GHSA is reviewed and deemed to be correct, but is updated with new information.

02:36: A security announcement is pushed into our announcement checker, ensuring that almost every deployment (that hasn't disabled the update checker) is made aware that they need to update ASAP.

03:51: Continuwuity 0.5.0 is released, first and third party packages all start being released alongside. Announcements start being distributed off-platform for maximum reach.

04:00: GHSA-22fw-4jq7-g8r8 is published. The CVSS4 score deems this a 10/10 critical vulnerability.

07:54: A CVE for GHSA-22fw-4jq7-g8r8 is requested, pending GitHub manual review.

What happened

Now that we've covered the incident response, let's take a look at what actually happened here.

Let's start with some basics: Matrix is a federated protocol where servers send each other "events", which are cryptographically signed by each originating server to prove that an event they claim to have produced is actually authentic. Each receiving server has to verify that each event it receives is correctly signed, otherwise the event is assumed to be forged and is dropped.
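To make the "signed by the originating server" rule concrete, here is a minimal sketch of the presence half of that check. It assumes the event's signatures object has already been parsed into a map; the names server_name_of and has_sender_signature are made up for illustration and are not continuwuity's actual API. A real server must also verify every signature cryptographically against the signing server's published keys, which this sketch deliberately leaves out.

```rust
use std::collections::BTreeMap;

/// Server name -> (key ID -> base64 signature), mirroring an event's `signatures` object.
type Signatures = BTreeMap<String, BTreeMap<String, String>>;

/// Returns the server-name part of a Matrix user ID such as `@nex:timedout.uk`.
/// A localpart cannot contain `:`, so everything after the first colon is the server name.
fn server_name_of(user_id: &str) -> Option<&str> {
    let rest = user_id.strip_prefix('@')?;
    let (_localpart, server) = rest.split_once(':')?;
    if server.is_empty() {
        None
    } else {
        Some(server)
    }
}

/// Returns true if the sender's own server has at least one signature on the event.
/// NOTE: this only checks presence; it does not verify the signature itself.
fn has_sender_signature(sender: &str, signatures: &Signatures) -> bool {
    match server_name_of(sender) {
        Some(server) => signatures
            .get(server)
            .map_or(false, |keys| !keys.is_empty()),
        None => false,
    }
}
```

An event whose sender is @bot:continuwuity.org passes this check only if continuwuity.org appears in its signatures, which is exactly the property the attacker had to satisfy.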

In order to join, request to join, or reject an invite to a remote room, a Matrix server has to perform several steps:

First it asks a remote server that is in the room to give it a "membership template" (via the make_join, make_knock, and make_leave endpoints)
The remote server fills in details like authentication events, depth, and previous events (since the requesting server probably doesn't know them), and returns something like this:

{
  "event": {
    "auth_events": [
      "$wqhhLFdgwFo87fzN6KarwZ8RwYSFmKMgeI1gj9kYO8M",  // the room's current m.room.power_levels event
      "$jJCmVrOeCC7WwQSMkE-mB5w48COMMLv66yNbTCZcJcE",  // the room's current m.room.join_rules event
      "$rgxE45vpPK7oU7Tstsa_sQguUDXgeKs0wpEhY0UZ9Sg",  // the room's m.room.create event
      "$xewDcZWTivyELNeFDtTHAizhgYgD80ASz-yorgYoHNY"   // the `state_key`'s current m.room.member event (may not be present if the user has never joined before)
    ],
    "content": {
      "membership": "join"
    },
    "depth": 16,
    "hashes": {
      "sha256": "ksRXZLZ+anLuBQZMPWQdmwN4pw2lUBIr1hFTUrGmCCY"
    },
    "origin": "nexy7574.co.uk",
    "origin_server_ts": 1766382148884,
    "prev_events": [
      "$IGTxGY-fybBL72yxk5Q8HaXIpg5A79DwScvexmb-xEM"  // an event sent prior to this join attempt
    ],
    "room_id": "!vVp06n8ePtn4mqBHl6:nexy7574.co.uk",
    "sender": "@nex:hammerhead.nexy7574.co.uk",
    "signatures": {
      "nexy7574.co.uk": {
        "ed25519:efn3fIVR": "3oX6eW1SH05Dg39QH9ZWjGqhq1M6M6+8d0P06zj1NjVxd1YbdI4X0WFUP4LQUT01S+yB3cjrk3a+mt1o+nEzDQ"
      }
    },
    "state_key": "@nex:hammerhead.nexy7574.co.uk",
    "type": "m.room.member",
    "unsigned": {}
  },
  "room_version": "11"
}

Then, the requesting server takes this event, fills in information into content (like the user's profile information)¹, and signs the event.
The requesting server then sends a request to the appropriate send_xyz endpoint to ask the remote server to push the newly signed and filled in membership event into the room graph.

The Matrix specification has a nice ASCII art diagram displaying how this dance works for joining rooms: https://spec.matrix.org/v1.17/server-server-api/#joining-rooms. It also has a nice overview of how event signing works.

So, knowing now that each homeserver needs to sign events to verify their authenticity, and if they can't produce a valid membership event locally, they need to ask a remote server, what do you think happened here?

Even if at first you might've just thought "I guess they just wanted to leave", look at the event timestamps. They're all set to 01:57, despite a ban being issued by ginger at 02:01.
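If you want to check those timestamps yourself, origin_server_ts is milliseconds since the Unix epoch, so the UTC time of day falls out of some simple arithmetic. This is a small sketch with a made-up helper name (utc_time_of_day), not something from our tooling:

```rust
/// Converts a Matrix `origin_server_ts` (milliseconds since the Unix epoch)
/// into the UTC time of day as (hours, minutes, seconds).
fn utc_time_of_day(origin_server_ts_ms: u64) -> (u64, u64, u64) {
    // Drop the milliseconds, then keep only the seconds elapsed since UTC midnight.
    let secs_into_day = (origin_server_ts_ms / 1000) % 86_400;
    (
        secs_into_day / 3600,
        (secs_into_day % 3600) / 60,
        secs_into_day % 60,
    )
}
```

Feeding it 1766282233558 (the origin_server_ts of one of the forged leave events in this post) gives (1, 57, 13), i.e. 01:57:13 UTC.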

Naturally, seeing several hundred accounts leaving our room, including bot accounts which cannot do that by their own will, I and the other team members assumed this was some sort of remote code execution or backdoor vulnerability, especially considering it seemingly did not affect Synapse or Dendrite servers.

Let's take a look at the JSON representation of some of these leave events:

{
  "auth_events": [
    "$7bkKK_Z-cGQ6Ae4HXWGBwXyZi3YjC6rIcQzGfVyl3Eo",
    "$v42zspQgnv1YTcpkDQ6byZ4v5czi3_nGTptPZxuDP6Q",
    "$AGa8ptZQsYcsqCmb4b460z-7zTZIUvWA9Ye__KwTItQ"
  ],
  "content": {
    "membership": "leave"
  },
  "depth": 9007199254740991,
  "event_id": "$NkMHcj1RUoI_h31w9exv9L8igz5HEh2YDufmmKhasgU",
  "hashes": {
    "sha256": "D/VVCVUu3BelP2rTzURfsL7VU5f+D4NZR9WwO8GiDf4"
  },
  "origin": "matrixgg.jumpingcrab.com",
  "origin_server_ts": 1766282233558,
  "prev_events": [
    "$0O5uc1FaHgqHU8dxvvjeZmI3bA8_N_nPTZvQ6Lq8L5c"
  ],
  "room_id": "!offtopic-2:continuwuity.org",
  "sender": "@bot:continuwuity.org",
  "signatures": {
    "continuwuity.org": {
      "ed25519:PwHlNsFu": "oq54Vb+ZwGo+Ls9kftJe//Dv9saOb2KA8mdaX7a7J16jhZ3gV6hzZLQ4MaSK1sTY7FB31U5WbJSDvX355kwkBQ"
    },
    "matrixgg.jumpingcrab.com": {
      "ed25519:1eFngaJT": "XeWWHLWA64F735xMnEHQeDH3z1GmILtZ/gZs4Ke3TlYDSsxlAxV0AQiToi2qzToglkWpr22RCDSYO4IYKyzxDQ"
    }
  },
  "state_key": "@bot:continuwuity.org",
  "type": "m.room.member",
  "unsigned": {
    "prev_content": {
      "displayname": "Botty",
      "membership": "join"
    },
    "prev_sender": "@bot:continuwuity.org",
    "replaces_state": "$v42zspQgnv1YTcpkDQ6byZ4v5czi3_nGTptPZxuDP6Q"
  }
}

@bot:continuwuity.org is our community bot which hosts some useful utilities. It is a maubot plugin instance, you can see its source code at https://forgejo.ellis.link/continuwuation/continuwuitybot. If you do look, you can see that it has no functionality that can trigger itself into leaving a room.

So that's suspicious. What else do you notice about this event? That's right, the event is signed by both continuwuity.org and matrixgg.jumpingcrab.com. This is weird, because continuwuity.org has several other users in the room, but most importantly, it created the room. There's no reason that continuwuity.org would need to ask an external server to help it leave the room, since it already knows all of the information it would need to create its own leave event. So, where did that extra signature come from?
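That observation can be turned into a rough triage heuristic for combing through room history: on a self-initiated leave, list any signing server that isn't the sender's own. This is only a sketch under assumptions. The MemberEvent struct and suspicious_cosigners function are hypothetical, and an extra signature is a reason to look closer rather than proof of forgery (joins and invites legitimately carry a second signature).

```rust
/// The handful of fields this heuristic needs from an `m.room.member` event.
/// `signers` holds the server names that appear as keys of `signatures`.
struct MemberEvent<'a> {
    sender: &'a str,
    state_key: &'a str,
    membership: &'a str,
    signers: Vec<&'a str>,
}

/// Returns the server-name part of a user ID such as `@bot:continuwuity.org`.
fn server_name_of(user_id: &str) -> Option<&str> {
    let rest = user_id.strip_prefix('@')?;
    let (_localpart, server) = rest.split_once(':')?;
    if server.is_empty() {
        None
    } else {
        Some(server)
    }
}

/// For a self-initiated leave (sender == state_key), returns every signer that is
/// not the sender's own server. Other kinds of event are not inspected.
fn suspicious_cosigners<'a>(ev: &MemberEvent<'a>) -> Vec<&'a str> {
    if ev.membership != "leave" || ev.sender != ev.state_key {
        return Vec::new();
    }
    let own = server_name_of(ev.sender);
    ev.signers
        .iter()
        .copied()
        .filter(|s| Some(*s) != own)
        .collect()
}
```

Run against the leave event for @bot:continuwuity.org, this returns just matrixgg.jumpingcrab.com.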

The attacker also made my own account on my own server, nexy7574.co.uk, leave the room. I log web requests made to my server, so if this was a backdoor/RCE², there would definitely be a trace of it there. I only expose access to my server over port 443 (HTTPS) and port 8448, and both ports lead to my caddy web server, which can only process HTTP requests.

After looking through my logs, I saw this interesting chunk of requests:

01:57:22 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!rooms:continuwuity.org/$Cf-aLuIbzkKBzjGkpxnEAnPDTrcH6WpzQ-_cLG2r9Jo (1.0KiB): 1.0KiB in 0.05s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!rooms:continuwuity.org/$ZwcpBmGOao9Zjd9dtqvSFrjy-TE_z27le6fBcoR9tNc (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!rooms:continuwuity.org/$_F72gCr8av7lODY-V5LQlN2NIIpT3fsoOq32ngec0ME (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!rooms:continuwuity.org/$ocA25AeLs7mPZG8Fbx5VFd-_Tm-8yy5VrJRn2NJoElY (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!main-2.1:continuwuity.org/$aaBxyffwTKdpKaYG2f96a43KHXd3CnyG9DCFl_l9Sfg (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!main-2.1:continuwuity.org/$1t6Sx1qVuiIMun5oOHFgWvl7D0A7OGTEP5AwBfCI43g (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!main-2.1:continuwuity.org/$qbb5qFshE5oQKlnKHhZoRTWwjoS6PFG6VBMMpTuLPEQ (0.9KiB): 1.0KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:23 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!main-2.1:continuwuity.org/$KIE361miTe5n3hJOGhc-22lkyHZB0_1Dd1g-kHk-pBo (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:24 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!dev-1:continuwuity.org/$M3Uw310Dql0BQCmYT2zy0Ix6h0A0MTBafyhc50vvRRE (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:24 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!dev-1:continuwuity.org/$qMeulcpBg6jRMOnre0CzhyqKMJnxt6Zf0sXhXWyvEFo (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:27 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!releases:continuwuity.org/$5VF8OMH5cIuOekngqneSBIWwW6MKv9m-i6TFdcTi-YE (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:27 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!14rYHptQ8ypeYmZTo8:chat.kiefte.eu/$_DNTLCaDArqAFCrj5DiEWhUBpEMA0i4oWrYhpUV2_DY (1.0KiB): 1.1KiB in 0.05s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:27 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!6hZP6cpftNoSqle0Pr:nexy7574.co.uk/$z87nqNdsQMB07LN4xkzx5xMuJkQm7IFuQdSUtd7S42c (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:27 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!6hZP6cpftNoSqle0Pr:nexy7574.co.uk/$m7Bq_IgJI1MNlkmivtp4CDNE4QYF_apEPq76hDdsu3g (1.0KiB): 1.0KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:27 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!minecraft-1:continuwuity.org/$_JC8fLtJI4Ctc6T05k_PRBlhE8rxHsGm2xi1bll9Mwg (1.0KiB): 1.1KiB in 0.05s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:28 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!offtopic-2:continuwuity.org/$2tq2eJZK1dNqkGraDg1rvn1QJKqIuhfiogx2_BDaghc (1.1KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:28 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!offtopic-2:continuwuity.org/$r4blc9-92zYmijv_WEVpVDbkkV2YE5weP3UUfgHhXaM (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
01:57:28 200 HTTP/2.0 PUT    /_matrix/federation/v2/invite/!offtopic-2:continuwuity.org/$7EW7EHIyyU5JaSD0P64a1ibyiPB4od1b_rjzGXogAuU (1.0KiB): 1.1KiB in 0.04s from matrixgg.jumpingcrab.com (continuwuity/0.5.0-rc.8.1 (cf8d8e4)) @ 86.107.168.187
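For anyone wanting to sift their own access logs for the same pattern, here is a rough sketch that pulls the room ID and event ID out of invite request paths and counts requests per room. It assumes log lines shaped like mine above (whitespace-separated, with the unencoded request path as one token); parse_invite_path and invites_per_room are names I've made up for this example, and a proxy that logs percent-encoded paths would need decoding first.

```rust
use std::collections::BTreeMap;

/// Splits `/_matrix/federation/v2/invite/{roomId}/{eventId}` into (room_id, event_id).
/// Returns None for any other path.
fn parse_invite_path(path: &str) -> Option<(&str, &str)> {
    let rest = path.strip_prefix("/_matrix/federation/v2/invite/")?;
    let (room_id, event_id) = rest.split_once('/')?;
    if room_id.starts_with('!') && event_id.starts_with('$') {
        Some((room_id, event_id))
    } else {
        None
    }
}

/// Counts invite requests per room ID across a batch of log lines.
fn invites_per_room<'a>(lines: &[&'a str]) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for line in lines.iter().copied() {
        // Take the first whitespace-separated token that parses as an invite path.
        if let Some((room_id, _event_id)) = line.split_whitespace().find_map(parse_invite_path) {
            *counts.entry(room_id).or_insert(0) += 1;
        }
    }
    counts
}
```

A burst of invites for rooms your users are already joined to, all from one origin within a few seconds, is the shape to look for.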

This immediately flagged to me that something was wrong with the /invite/ federation endpoint. Let's look at the event JSON for that first event ID:

{
  "auth_events": [
    "$EK4oX8FilMB2H0kqTJSujPLaUXYSv3vTfNHcrfIJ1Ak",
    "$k0RqJRSiXFEpG8TP9WdcYFgtIB4wG6EkqU8Pl_PCZ5E",
    "$Wc-7yt-EP-YVNGdFGlp24-XbnpXaPdD5RZTThKQoYes"
  ],
  "content": {
    "membership": "leave"
  },
  "depth": 1536,
  "event_id": "$Cf-aLuIbzkKBzjGkpxnEAnPDTrcH6WpzQ-_cLG2r9Jo",
  "hashes": {
    "sha256": "53jBBaYUsCKG+7gD9hw8aEI3YoHALMVf62EQ+fbhCGY"
  },
  "origin": "matrixgg.jumpingcrab.com",
  "origin_server_ts": 1766282241472,
  "prev_events": [
    "$woMux__fVu5-NKg0XPLn_NXoFTA6uRlCo_sWrjXR1NQ"
  ],
  "room_id": "!rooms:continuwuity.org",
  "sender": "@star:nexy7574.co.uk",
  "signatures": {
    "matrixgg.jumpingcrab.com": {
      "ed25519:1eFngaJT": "dtO16Y8pQ5JeWZL8Bu2PsPCIOf0RiQpoC1cGhiIDH5IiaQ55z7pDtAeSZvSNy7+k/eOJRqQapN2FZoudKcdoDQ"
    },
    "nexy7574.co.uk": {
      "ed25519:efn3fIVR": "24Wg+LTwF/g6LHe3LIJJEg5rcHOV71OXDCgz0rmWa2C5/UvW1smwWZadk6b7826y5NHFCl3OeZ2+R3hZigCrDg"
    }
  },
  "state_key": "@star:nexy7574.co.uk",
  "type": "m.room.member",
  "unsigned": {
    "prev_content": {
      "avatar_url": "mxc://nexy7574.co.uk/TT8aMbj6bWoUu0u8Xcc8K49gvGsVAtst",
      "displayname": "star",
      "membership": "join"
    },
    "prev_sender": "@star:nexy7574.co.uk",
    "replaces_state": "$k0RqJRSiXFEpG8TP9WdcYFgtIB4wG6EkqU8Pl_PCZ5E"
  }
}

Sure enough, that's one of our suspiciously signed events. Even funnier, @star:nexy7574.co.uk is a user who deactivated their account on my server a few months ago, so they couldn't possibly have initiated this leave event themself.

At this point I'm certain that there's something wrong with the invite endpoint, so I go to read the specification for it, and the first paragraph gives you a huge hint as to what might be going on here:

Invites a remote user to a room. Once the event has been signed by both the inviting homeserver and the invited homeserver, it can be sent to all of the servers in the room by the inviting homeserver.

Remember how make_join (et al) has a send_join (et al) component, the former one makes a template event, and the latter sends a formatted event? Well, invite appears to be a hybrid that has the inverse behaviour: the server that wishes to send the membership first creates it locally, then gets it signed by the remote server, then sends it itself.

This means that a malicious server can theoretically send anything to the invite endpoint, and then get a signed & valid event in response. Even better, the invite endpoint expects that the state_key (user being invited) is originating from itself.

Okay, so something in the invite endpoint is obviously allowing either a forged signature, or is tricking the remote server into producing a valid signature for an invalid event.

Let's take a look at the code that was added to fix the vulnerability, and then I'll confirm what was actually fixed:

From https://forgejo.ellis.link/continuwuation/continuwuity/compare/48a6a475ce6ad68e7dca7d1f1bcc632e9a069c60..7fa4fa98628593c1a963f5aa8dbc3657d604b047:

@@ -60,6 +61,46 @@ pub(crate) async fn create_invite_route(
     let mut signed_event = utils::to_canonical_object(&body.event)
         .map_err(|_| err!(Request(InvalidParam("Invite event is invalid."))))?;

+    // Ensure this is a membership event
+    if signed_event
+        .get("type")
+        .expect("event must have a type")
+        .as_str()
+        .expect("type must be a string")
+        != "m.room.member"
+    {
+        return Err!(Request(BadJson(
+            "Not allowed to send non-membership event to invite endpoint."
+        )));
+    }
+
+    let content: RoomMemberEventContent = serde_json::from_value(
+        signed_event
+            .get("content")
+            .ok_or_else(|| err!(Request(BadJson("Event missing content property"))))?
+            .clone()
+            .into(),
+    )
+    .map_err(|e| err!(Request(BadJson(warn!("Event content is empty or invalid: {e}")))))?;
+
+    // Ensure this is an invite membership event
+    if content.membership != MembershipState::Invite {
+        return Err!(Request(BadJson(
+            "Not allowed to send a non-invite membership event to invite endpoint."
+        )));
+    }
+
+    // Ensure the sending user isn't a lying bozo
+    let sender_server = signed_event
+        .get("sender")
+        .try_into()
+        .map(UserId::server_name)
+        .map_err(|e| err!(Request(InvalidParam("Invalid sender property: {e}"))))?;
+    if sender_server != body.origin() {
+        return Err!(Request(Forbidden("Sender's server does not match the origin server.",)));
+    }
+
+    // Ensure the target user belongs to this server
     let recipient_user: OwnedUserId = signed_event
         .get("state_key")
         .try_into()

Obviously, the problem is a lack of validation here. If you look at the code for send_join, you can see that it has some validation on the event that the remote server is asking to send that the invite handler does not have (until the above commits).

So that's the bug: a malicious remote server can send any³ event to the invite endpoint, and receive an event signed by the vulnerable server in response, which it can then federate to every other server in a room. As you can see in the example event JSON bodies presented above, it looks like the attacker was crafting leave events, setting the sender and state key to the user that they want to kick (so that the leave would look like it was self-initiated), and then getting a valid signature from the sender's server, so that every other server would see the event and the signature, and go "yeah that checks out" because it does. The sender's vulnerable server has effectively stamped this forged event with a "yep, this is authentic". Other servers have no way to disprove this, so have to accept it as normal.
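Stripped of the request plumbing, the checks in the fix boil down to something like the following standalone sketch. To be clear, this is not the actual continuwuity code: InviteRequest, InviteError, and validate_invite are hypothetical names, and the fields are assumed to have already been pulled out of the request body, with origin being the authenticated name of the requesting server.

```rust
#[derive(Debug, PartialEq)]
enum InviteError {
    NotMembership,
    NotInvite,
    MalformedUserId,
    SenderNotFromOrigin,
    TargetNotLocal,
}

/// The fields of an incoming invite request that the checks care about.
struct InviteRequest<'a> {
    origin: &'a str, // authenticated name of the requesting server
    event_type: &'a str,
    membership: &'a str,
    sender: &'a str,
    state_key: &'a str,
}

/// Returns the server-name part of a user ID such as `@nex:timedout.uk`.
fn server_name_of(user_id: &str) -> Option<&str> {
    let rest = user_id.strip_prefix('@')?;
    let (_localpart, server) = rest.split_once(':')?;
    if server.is_empty() {
        None
    } else {
        Some(server)
    }
}

/// Refuses to sign anything that is not a genuine invite, sent by a user on the
/// requesting server, for a user on our own server.
fn validate_invite(req: &InviteRequest<'_>, our_server: &str) -> Result<(), InviteError> {
    if req.event_type != "m.room.member" {
        return Err(InviteError::NotMembership);
    }
    if req.membership != "invite" {
        return Err(InviteError::NotInvite);
    }
    let sender_server = server_name_of(req.sender).ok_or(InviteError::MalformedUserId)?;
    if sender_server != req.origin {
        return Err(InviteError::SenderNotFromOrigin);
    }
    let target_server = server_name_of(req.state_key).ok_or(InviteError::MalformedUserId)?;
    if target_server != our_server {
        return Err(InviteError::TargetNotLocal);
    }
    Ok(())
}
```

A forged leave with sender and state_key both set to a local user now fails at the membership check, and even a forged "invite" claiming a local sender fails because the sender's server doesn't match the origin.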

Reproducing the exploit

By this point we now know what the exploit was, how to fix it, and roughly how it worked.

In order to test my discovery (after fixing it), I brought up an unpatched server of mine, and utilised the federation cockpit I implemented into my own homeserver implementation. There are no docs so I won't go into detail, but it allows me to get my own federating homeserver to create, sign, and send arbitrary federation requests and events, basically by proxy.

So, let's get to work on reproducing this!

First, let's create a room. For the sake of simplicity, I'll do this from my unpatched server, so as to reduce the amount of faff with prerequisites. I'll just make a standard unencrypted public room, and give my account power level 9001 (which is what my client does by default).

Our exploitable room has now been created with the ID !XRqZN3zQD2dtHc8I4k:timedout.uk. My account @nex:timedout.uk has been joined to the room, and it has been set up with the following important events:

$1OC-XnwPEOZ5bz--qUahdbz-y04lfWdV8gNTLNHWOkA - the room's m.room.create event
$L2H1N6O4b73wR3vzT20DFR5RLB6KeaumRcYUGmH5Yag - the m.room.member event for @nex:timedout.uk
$XuAstr_j-eH9diSqF8J6T3FGDUct5lOqXjBeh_1ZuOk - the power levels for the room
$Pue0EI--tLBmjptjaCCrj3D3fxBpCBmveZY6GgPbPz4 - the join rules, which set the room to public.

Now, I'll join the room from another account on one of my other servers. Note that this can be an account on the same server, but I'm already logged in to another server in another window.

@nex:nexy7574.co.uk has now joined the room, with the relevant membership event having the ID $11x_JW1n9UjggjhekD35aEfFzOdz51tTS-G2pdvnDvk.

I'll switch over to my insomnia client now, and demonstrate how a malicious server could forge events. I assume that the attacker simply modified their continuwuity server, but I already have a server that gives me complete control, so this wouldn't be how they did it.

Let's first join our attacking server to the new room. Since it's public, we don't need to get an invite, and can just join. We start off by sending a make_join request from hammerhead.nexy7574.co.uk (our attacking server) to timedout.uk, to get a join template:

And then fill out that template with our own information, generate the content hash, and sign it:

The event ID must also be calculated by the attacking server now in order to proceed to the next step, but I used an administration command in continuwuity to do that, since ironically hammerhead doesn't have a utility that does that yet. The calculated event ID for the finalised membership event is $RnuwufqoQbDSx82MhgZJ_NrFc6CtsoqnYnJhAOzOO7E

And now ask timedout.uk to send that finalised event into the room timeline, and return the room state to us:

As you can see, the request was successful, and timedout.uk has now sent the finalised membership event into the room, but most importantly, it returned the room's current state to us. The room's state contains all events in the room that represent the room's state, i.e. all of the important events above, the room's name, and any members in the room. It does this so that other servers can authenticate new events and whatnot, since they will all need to reference relevant events from this state.

This also shows up in the room timeline as "Super Evil Attacker" (@attacker:hammerhead.nexy7574.co.uk) having joined the room. Note that this is a valid event because @attacker:hammerhead.nexy7574.co.uk is the sender of the join event, and hammerhead.nexy7574.co.uk signed the event itself before asking timedout.uk to send it off.

So, now we've got this event:

{
  "auth_events": [
    "$Pue0EI--tLBmjptjaCCrj3D3fxBpCBmveZY6GgPbPz4",
    "$1OC-XnwPEOZ5bz--qUahdbz-y04lfWdV8gNTLNHWOkA",
    "$XuAstr_j-eH9diSqF8J6T3FGDUct5lOqXjBeh_1ZuOk"
  ],
  "content": {
    "displayname": "Super Evil Attacker",
    "membership": "join"
  },
  "depth": 9,
  "event_id": "$tW_LYudsZNiKeA-xkK8BPptuTt_9geWHmAn67AlOWKk",
  "hashes": {
    "sha256": "xKeEqYz+8hUSQdFVrFv/itbiVKo8HTmsj+P+7z1heGs"
  },
  "origin": "timedout.uk",
  "origin_server_ts": 1766386379125,
  "prev_events": [
    "$11x_JW1n9UjggjhekD35aEfFzOdz51tTS-G2pdvnDvk"
  ],
  "room_id": "!XRqZN3zQD2dtHc8I4k:timedout.uk",
  "sender": "@attacker:hammerhead.nexy7574.co.uk",
  "signatures": {
    "hammerhead.nexy7574.co.uk": {
      "ed25519:QqJAuQ": "FeiIrk3fz6qGGtDd821PgCvsJcRPzAHkfDp5F0kml96e+zNWAYR83wecS0fiRagE7RRg7vk6muCyz6jl7+NMCQ"
    },
    "timedout.uk": {
      "ed25519:d5KG2RdS": "noRDGYaLy/p+7c9JO/ojWXqyGTP5TpeJXeeK/2pQ5byVjXwOKQKyMCmZEtoaMOggvJtJ9OHL6v8aMvaTk5HvDA"
    }
  },
  "state_key": "@attacker:hammerhead.nexy7574.co.uk",
  "type": "m.room.member",
  "unsigned": {}
}

Notice the two signatures? One is from timedout.uk (the server that supplied the join template), and the other is from hammerhead.nexy7574.co.uk, the server the sender belongs to.

This event is actually completely irrelevant now. What matters is that we now have all of the important state events I mentioned earlier.

Now that we have the membership of every user in the room, let's start forging a leave event. Since @nex:timedout.uk is in the room's m.room.power_levels event with a high power level, it is probably a juicy target to take out. Let's craft a "leave" event for that account, so that it can no longer participate in the room.

We'll start by taking the template we were given above, and modifying it so that sender and state_key are both @nex:timedout.uk. This indicates that the event should be sent by @nex:timedout.uk, and it targets that same account, meaning this will be a leave, not a kick.

Let's also change "membership": "join" to "membership": "leave" in the content. We also need to replace the auth_events with those which would be valid for a leave event. In a v11 room (and older versions), this is the m.room.create event, m.room.power_levels event, and the m.room.member event for the user outlined in state_key. So here, that would be $1OC-XnwPEOZ5bz--qUahdbz-y04lfWdV8gNTLNHWOkA, $XuAstr_j-eH9diSqF8J6T3FGDUct5lOqXjBeh_1ZuOk, and $L2H1N6O4b73wR3vzT20DFR5RLB6KeaumRcYUGmH5Yag respectively. We also need to remove the signatures from timedout.uk so that it can sign the event again.
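The auth event selection for membership events can be sketched as a small function. This is a simplified illustration for room versions up to v11, assuming no third-party invites and no restricted joins (both add further entries); auth_state_keys_for_member is a hypothetical name, and it returns the (event type, state key) pairs whose current state events should be listed in auth_events rather than the event IDs themselves.

```rust
/// Returns the (event type, state key) pairs whose current state events a
/// membership event should reference in `auth_events`.
/// Simplified: ignores third-party invites and restricted (authorised-via) joins.
fn auth_state_keys_for_member(
    sender: &str,
    target: &str,
    membership: &str,
) -> Vec<(&'static str, String)> {
    let mut keys = vec![
        ("m.room.create", String::new()),
        ("m.room.power_levels", String::new()),
        // The sender's own current membership event.
        ("m.room.member", sender.to_string()),
    ];
    // The target's membership is only a separate entry when it differs from the sender,
    // e.g. for kicks, bans, and invites.
    if target != sender {
        keys.push(("m.room.member", target.to_string()));
    }
    // Joins, invites, and knocks are additionally authorised against the join rules.
    if membership == "join" || membership == "invite" || membership == "knock" {
        keys.push(("m.room.join_rules", String::new()));
    }
    keys
}
```

For a self-leave this yields exactly the three entries listed above, while a join also pulls in m.room.join_rules, matching the four auth events in the make_join template earlier in this post.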

Here's what our finally crafted event will look like:

{
  "auth_events": [
    "$1OC-XnwPEOZ5bz--qUahdbz-y04lfWdV8gNTLNHWOkA",
    "$XuAstr_j-eH9diSqF8J6T3FGDUct5lOqXjBeh_1ZuOk",
    "$L2H1N6O4b73wR3vzT20DFR5RLB6KeaumRcYUGmH5Yag"
  ],
  "content": {
    "membership": "leave"
  },
  "depth": 10,
  "origin": "hammerhead.nexy7574.co.uk",
  "origin_server_ts": 1766386379126,
  "prev_events": [
    "$tW_LYudsZNiKeA-xkK8BPptuTt_9geWHmAn67AlOWKk"
  ],
  "room_id": "!XRqZN3zQD2dtHc8I4k:timedout.uk",
  "sender": "@nex:timedout.uk",
  "state_key": "@nex:timedout.uk",
  "type": "m.room.member"
}
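The choice of auth_events described above can be sketched as a small function. This is a simplified reading of the auth event selection rules for room versions up to 11, not code from any homeserver: the function name is hypothetical, and it deliberately leaves out knocks, third-party invites and restricted joins. It returns the (type, state_key) pairs whose current state events would be referenced.

```python
def auth_event_keys(event: dict) -> list[tuple[str, str]]:
    """(type, state_key) pairs an event should cite in auth_events.

    Simplified sketch for room versions up to 11. Knocks, third-party
    invites and restricted joins are intentionally not handled here.
    """
    if event["type"] == "m.room.create":
        return []  # the create event has no auth events

    keys = [
        ("m.room.create", ""),
        ("m.room.power_levels", ""),
        ("m.room.member", event["sender"]),
    ]
    if event["type"] == "m.room.member":
        target = ("m.room.member", event["state_key"])
        if target not in keys:  # sender and target differ for kicks and invites
            keys.append(target)
        if event["content"].get("membership") in ("join", "invite"):
            keys.append(("m.room.join_rules", ""))
    return keys


leave = {
    "type": "m.room.member",
    "sender": "@nex:timedout.uk",
    "state_key": "@nex:timedout.uk",
    "content": {"membership": "leave"},
}
print(auth_event_keys(leave))
```

For the leave event above, sender and state_key are the same user, so only three entries come back: the create event, the power levels, and that user's own membership, matching the three event IDs in the crafted event.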

You may also notice that depth was incremented by 1, origin_server_ts was incremented slightly, and prev_events was updated to contain @attacker:hammerhead.nexy7574.co.uk's membership event. This is because if the event doesn't look new enough, it will end up being "soft-failed", and our attack won't work.

Now, let's get that hashed and signed by our own server:

Now we'll take that event, and ask timedout.uk to sign it by giving it to its /_matrix/federation/v2/invite/{room_id}/{event_id} endpoint. This exploits the lack of validation in that endpoint's handler code, allowing this invalid event to pass through uncontested, ultimately tricking the server into signing it with its own signing key.

Furthermore, due to the lack of validation, the event ID we give to this endpoint doesn't even matter. The attacker presumably only got it right because doing so was simple enough that it wasn't worth skipping.

Now that we have a signed copy of this event from the server that is supposedly authoring it, let's federate that event out to every other server in the room, to make it look as if @nex:timedout.uk just left the room!

As you can see, both timedout.uk and nexy7574.co.uk accepted this event, because the signature for timedout.uk is present & valid.

If we now look at the room from the point of view of nexy7574.co.uk, the other server in the room, we see that @nex:timedout.uk seemingly left the room:

timeline of the now exploited room from the point of view of another server

Now, repeat the process for every user in the room, but replace the m.room.member auth event with a different one each time before sending it to the relevant server, and you've suddenly got a very fast way to make a room rather vacant.

Against a patched server¶

Just for the sake of it, this is what happens when you try the same attack against a now-patched server:

As you can see, the patched server now performs validation on the incoming event, and refuses to process it because the membership in content is set to leave, but the invite endpoint requires that it is set to invite.
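As a rough illustration of the kind of check involved, here is a sketch of invite validation. It is not the actual continuwuity patch: only the membership check is taken from the description above, while the function names and the remaining checks (event type, local target, sender belonging to the requesting server) are assumptions about what a sensible handler might also enforce.

```python
from typing import Optional


def server_of(user_id: object) -> Optional[str]:
    """Server name of a Matrix user ID, or None if it isn't shaped like one."""
    if not isinstance(user_id, str) or not user_id.startswith("@"):
        return None
    _, sep, server = user_id.partition(":")
    return server if sep and server else None


def check_invite(event: dict, origin: str, local_server: str) -> None:
    """Raise ValueError unless the event looks like a real invite from origin."""
    if event.get("type") != "m.room.member":
        raise ValueError("not a membership event")
    content = event.get("content")
    if not isinstance(content, dict) or content.get("membership") != "invite":
        raise ValueError("membership must be 'invite'")
    if server_of(event.get("state_key")) != local_server:
        raise ValueError("invited user is not local to this server")
    if server_of(event.get("sender")) != origin:
        raise ValueError("sender does not belong to the requesting server")


forged_leave = {
    "type": "m.room.member",
    "sender": "@nex:timedout.uk",
    "state_key": "@nex:timedout.uk",
    "content": {"membership": "leave"},
}
try:
    check_invite(forged_leave, "hammerhead.nexy7574.co.uk", "timedout.uk")
except ValueError as err:
    print("rejected:", err)
```

With these checks the forged leave is refused before the server ever gets as far as signing it.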

Why only memberships? Why not take over power levels?¶

Good question. /invite/ wasn't totally without validation. It required that the incoming event had a state_key set (which restricts the exploit to state events), that the state_key was a valid user ID (which restricts the exploit to m.room.member events³), and that the user ID in state_key was local to the server being exploited. This means the exploit is effectively limited to forcing users to leave, or at worst forcing members with an elevated power level to ban or kick other users on the same homeserver. That would have had a limited effect, so the attacker probably didn't bother, and decided that making everyone leave was destructive enough.
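Those three checks can be written down as a short predicate to show what they do and don't rule out. This is a sketch built purely from the description above, not from the vulnerable handler's source, and the function name is hypothetical.

```python
def passes_prepatch_checks(event: dict, local_server: str) -> bool:
    """The three checks described above: state_key set, user ID, local user."""
    state_key = event.get("state_key")
    if not isinstance(state_key, str):
        return False  # not a state event
    if not state_key.startswith("@") or ":" not in state_key:
        return False  # state_key is not shaped like a user ID
    return state_key.split(":", 1)[1] == local_server


forged_leave = {
    "type": "m.room.member",
    "state_key": "@nex:timedout.uk",
    "content": {"membership": "leave"},
}
power_levels = {
    "type": "m.room.power_levels",
    "state_key": "",
    "content": {"users": {"@attacker:hammerhead.nexy7574.co.uk": 100}},
}
remote_leave = {
    "type": "m.room.member",
    "state_key": "@someone:nexy7574.co.uk",
    "content": {"membership": "leave"},
}

print(passes_prepatch_checks(forged_leave, "timedout.uk"))
print(passes_prepatch_checks(power_levels, "timedout.uk"))
print(passes_prepatch_checks(remote_leave, "timedout.uk"))
```

Only the first event gets through. A real power levels event has an empty state_key and so fails the user ID check, and a membership event for a user on another server fails the locality check, which is why each victim's own server had to be asked to sign their leave.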

How could anyone defend against this without the patch?¶

If you ran a vulnerable homeserver, the only thing you could do was prevent the /_matrix/federation/v2/invite endpoint from being used. That was usually done either by turning off the server, or by blocking the endpoint in the applicable reverse proxy.

If you ran a room that had vulnerable servers in it, there wasn't much you could do. After upgrading our room, I made sure to watch and check each new join to make sure it didn't match the attacker's fingerprint, and when it did (they tried twice), very quickly added them to the room's access control list deny block. Note that because of how the attack worked, even just banning them wouldn't have been effective - only telling other servers not to process requests originating from them would have.
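For readers unfamiliar with server ACLs, the deny block lives in the room's m.room.server_acl state event, whose content carries allow and deny lists of server name globs. The sketch below shows roughly how a server might evaluate one. It is simplified: IP literal handling and port stripping are left out, the function names are hypothetical, and attacker.example is a stand-in rather than the real attacker's server.

```python
import re


def glob_matches(glob: str, server_name: str) -> bool:
    """Match a server name against a glob where * is any run and ? is one char."""
    pattern = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in glob
    )
    return re.fullmatch(pattern, server_name) is not None


def is_server_allowed(server_name: str, acl: dict) -> bool:
    """Deny entries win; otherwise the server must match an allow entry."""
    if any(glob_matches(g, server_name) for g in acl.get("deny", [])):
        return False
    return any(glob_matches(g, server_name) for g in acl.get("allow", []))


# Content of a hypothetical m.room.server_acl event.
acl = {
    "allow": ["*"],
    "deny": ["attacker.example", "*.attacker.example"],
}

print(is_server_allowed("timedout.uk", acl))
print(is_server_allowed("attacker.example", acl))
print(is_server_allowed("spam.attacker.example", acl))
```

Unlike a ban, which only changes one user's membership, an ACL entry asks every participating server to refuse federation requests from the listed server for that room, which is what actually stopped the forged events from being accepted.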

What next?¶

Well, first of all, I'm going to go update the server I used to demo this exploit on.

And then, I'm going to take some time to give all of our federation code a really good shake. We've known for a while that there are some suspicious-looking calls in several places, but never really looked into them. However, after a few close calls and then a zero-day, it's clear that "it's probably fine" simply isn't good enough. With 0.5.0 finally released, we're able to properly do alpha/beta releases, and also backport critical fixes to previous versions without having to increment an RC, which also introduced new features and so on.

We're also going to look at making a disaster recovery plan and "business" continuity plan so that we are more prepared for avengers-level threats like this in the future. This attack was especially effective as it was silent to the servers it exploited, and was only apparent if you happened to look at one of the rooms that was being attacked and inspected the resulting event sources. The result was that we lost most of our communication capability with our community, and it took a long time to properly distribute widespread word of the vulnerability. Setting up trusted external platforms to stay in contact with people will also be looked into, although luckily my fediverse account had quite a large reach already, and my announcement regarding the vulnerability got nearly 100 boosts and more interaction than anything I've ever posted, so that's probably a good start.

At the end of the day, we only managed to investigate, remediate, prevent, and recover from this attack because of the keen eyes of the US team members, my quick thinking and all-nighter, the fact we coincidentally had replacement rooms ready to go, the dedication of the rest of the team that joined us later, and the amazing people working on other homeserver software for Matrix. If the attack had happened against literally any other community, it would likely have had a prolonged impact and been significantly more effective, so we're lucky it hit us first.

Closing notes¶

This was a wild ride. If you find security vulnerabilities in continuwuity (even if you can't verify them!), please follow our security policy at https://continuwuity.org/security.html. It's not advertised in our security policy, but personally I am happy to pay out (what I can) for moderate-high severity vulnerabilities, and would much rather part ways with some of my money than have a community I've helped build just be evaporated.

Same goes for any other software I help maintain, but always follow a security policy if present, otherwise just DM me and we can go from there.

Huge thanks to everyone who helped us through this, and also to the community that stuck with us through this unprecedented and scary event. You're all stars.

Credits¶

Big thanks to the following people in particular:

@ginger and @aranjedeath for originally flagging the attack as it happened and confirming some details about its nature.
@jade for reviewing the vulnerability report, creating the GHSA, contacting the Matrix Foundation's security team, helping me with manually publishing the fixed docker images, and ultimately finalising the 0.5.0 release.
@tom for providing the infrastructure that allowed us to get fixed releases distributed so quickly.
Olivia Lee for coordinating a fix with the Grapevine homeserver implementation, and Charles Hall for reviewing said fix.
Jason Volk for bouncing ideas around during the investigation, coordinating the fix for Tuwunel, and opening a pull request to fix Conduit.
Timo Kösters and Matthias Ahouansou for responding to the security report for Conduit.
Tulir Asokan, Sky, and Cat for providing asgard.chat and hosting our external backup moderation bot that we used until we got everything under control.

The vulnerability could not have been found, fixed, and cured without all of the above people!

¹ There's a lot more validation the server should do when sanitising a given membership template.
² Obviously, it is possible that a reverse shell or something that is initiated by the compromised deployment would not be detected this way. But I double checked my logs at this time, just in case.
³ Having a state_key being a user ID has a special meaning in Matrix, so usually it only affects the m.room.member event. Sending an event like m.room.power_levels with a non-empty state key will simply result in it being ignored. It'll still end up in the room state, but it won't have any functional effect.