Issues with connecting to the app

Updates

Postmortem
May 20, 2026 at 3:21 AM
Fluxer is now running a fully distributed Erlang cluster of 16 gateway instances spread across four physical machines. This should minimise the impact of individual node failures going forward. Thanks for flying Fluxer.
Resolved
May 20, 2026 at 3:19 AM
They said it couldn't be done.
Update
May 20, 2026 at 3:09 AM
We implemented a fix and are currently monitoring the result.
Update
May 20, 2026 at 3:03 AM
We identified issues with reconnecting to communities that would yield inconsistent state depending on which node in the cluster you're connected to. We're currently channeling our inner Joe Armstrong and the powers of the BEAM to rectify the situation as quickly as possible.
Update
May 20, 2026 at 2:58 AM
Recovery is progressing. There is light at the end of the tunnel!
Update
May 20, 2026 at 2:52 AM
The waves are doing their thing, and we'll soon be ready to reconnect people to communities in waves too. Did I mention that the gateway is now running at 16 replicas across 4 physical nodes?
Update
May 20, 2026 at 2:44 AM
We're now attempting to let everyone back in again in ~~waves~~.
Update
May 20, 2026 at 2:37 AM
We're changing strategy to prevent a thundering herd. We'll first let the cluster settle down, and once load returns to normal, we'll disconnect individual gateway sockets in waves at runtime instead of rolling the cluster.
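For readers curious what disconnecting sockets in waves can look like, here is a minimal Python sketch. It is not Fluxer's gateway code (which runs on the BEAM), and every name in it (`plan_waves`, `disconnect_in_waves`, `wave_size`, `pause_seconds`) is a hypothetical chosen for illustration. The idea is simply to shuffle the connected sessions, split them into small batches, and pause between batches so that reconnects arrive spread out rather than all at once.

```python
import random
import time
from typing import Callable, List, Optional, Sequence


def plan_waves(
    session_ids: Sequence[str], wave_size: int, seed: Optional[int] = None
) -> List[List[str]]:
    """Shuffle the sessions and split them into waves of at most wave_size."""
    if wave_size <= 0:
        raise ValueError("wave_size must be positive")
    rng = random.Random(seed)
    shuffled = list(session_ids)
    rng.shuffle(shuffled)  # avoid always hitting the same sessions first
    return [shuffled[i:i + wave_size] for i in range(0, len(shuffled), wave_size)]


def disconnect_in_waves(
    session_ids: Sequence[str],
    wave_size: int,
    pause_seconds: float,
    disconnect: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    seed: Optional[int] = None,
) -> int:
    """Disconnect sessions one wave at a time, pausing between waves.

    `disconnect` is a caller-supplied callback (an assumption here) that
    closes one session's socket. Returns the number of waves processed.
    """
    waves = plan_waves(session_ids, wave_size, seed)
    for index, wave in enumerate(waves):
        for session_id in wave:
            disconnect(session_id)
        # Give reconnecting clients time to settle before the next wave.
        if index < len(waves) - 1:
            sleep(pause_seconds)
    return len(waves)
```

In a sketch like this, `wave_size` and `pause_seconds` are the knobs that trade recovery time against reconnect load. Suitable values depend on how many reconnects the cluster can absorb at once.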
Update
May 20, 2026 at 2:32 AM
We are now rolling the cluster to force clients to reconnect and recognise the new session rollout.
Update
May 20, 2026 at 2:27 AM
We have removed the taint on the Kubernetes worker node previously reserved for the single gateway replica, enabled clustering in the gateway deployment, and scaled it out to 16 replicas across all four worker nodes. We have also rescheduled all stateless workloads in the cluster to balance things out, and we are monitoring the result.
Update
May 20, 2026 at 2:22 AM
We're taking this opportunity to roll out our new gateway clustering system.
Monitoring
May 20, 2026 at 2:07 AM
We implemented a fix and are currently monitoring the result.
