While GitHub obsess over shoving AI into everything, the rest of the platform is genuinely crumbling and its security flaws are being abused to cause massive damage. Last week Aqua Security was breached and a few repositories it owns were infected. The threat actors abused widespread use of mutable references in GitHub Actions, which the community has been screaming about for years, to infect potentially thousands of CI runs. They also abused an issue GitHub has acknowledged but refused to fix that allows smuggling malicious Action references into workflows that look harmless.
GHA can’t even be called Swiss cheese anymore; it’s so much worse than that. Major overhauls are needed. The best we’ve got is Immutable Releases, which are opt-in on a per-repository basis.
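To make the mutable-reference point concrete, here is a minimal sketch of a check that flags `uses:` lines whose ref is a tag or branch rather than a full 40-character commit SHA. The function name, the regexes, and the `some-org` / `another-org` actions are made up for illustration, and the line-based matching is an assumption that will miss unusual YAML layouts. It also does nothing about the second issue mentioned above, references that merely look harmless.

```python
import re

# Captures the target of a `uses:` line, e.g. "actions/checkout@v4".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
# A full-length commit SHA: exactly 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def find_mutable_refs(workflow_text):
    """Return (line_number, target) pairs whose ref is not a full commit SHA."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        target = match.group(1)
        # Local actions (./path) and docker:// images are out of scope here.
        if target.startswith("./") or target.startswith("docker://"):
            continue
        _, _, ref = target.partition("@")
        if not FULL_SHA_RE.match(ref):
            findings.append((number, target))
    return findings


# Hypothetical workflow; the org and action names below are invented.
EXAMPLE = """\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: some-org/some-action@0123456789abcdef0123456789abcdef01234567
      - name: scan
        uses: another-org/scanner@main
"""

for line_number, target in find_mutable_refs(EXAMPLE):
    print(f"line {line_number}: {target} is not pinned to a commit SHA")
```

Run against the example, this reports the `@v4` and `@main` references and stays quiet about the one pinned to a 40-character hash.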
I worry that CI just got overcomplicated by default when providers started rocking up with templated YAML and various abstractions over it to add dynamic behaviour, dependencies, and so on.
Perhaps mixing the CI with the CD made that worse, because deployment and delivery usually have complexities of their own. Back in the day you'd probably use Jenkins for the delivery piece and the E2E nightlies, and something more lightweight for running your tests and linters.
For that part I feel like all you need, really, is to be able to run a suite of well-structured shell scripts. Maybe if you're in git you follow its hooks convention to execute scripts in a directory named after the repo event or something. Forget about creating reusable 'actions' which depend on running untrusted code.
Provide some baked-in utilities to help with reporting status, caching, saving junit files and what have you.
The only thing that remains is setting up a base image with all your tooling in it. Docker does that, and is probably the only bit where you'd have to accept relying on untrusted third parties, unless you can scan them and store your own cached version of it.
I make it sound simpler than it is but for some reason we accepted distributed YAML-based balls of mud for the system that is critical to deploying our code, that has unsupervised access to almost everything. And people are now hooking AI agents into it.
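For what it's worth, the "directory named after the repo event" idea can be sketched in a few lines. This is only an illustration under assumptions: the layout (`<ci_root>/<event>/*.sh`), the function names, and the stop-at-first-failure policy are all invented here, and it leaves out the status reporting, caching and junit handling mentioned above.

```python
import subprocess
from pathlib import Path


def scripts_for_event(ci_root, event):
    """Return the *.sh files in <ci_root>/<event>/, sorted by file name."""
    event_dir = Path(ci_root) / event
    if not event_dir.is_dir():
        return []
    return sorted(path for path in event_dir.glob("*.sh") if path.is_file())


def run_event(ci_root, event):
    """Run each script with sh, in order; return the first non-zero exit code, else 0."""
    for script in scripts_for_event(ci_root, event):
        print(f"--> {script.name}", flush=True)
        completed = subprocess.run(["sh", str(script)], check=False)
        if completed.returncode != 0:
            return completed.returncode
    return 0
```

With a hypothetical `ci/push/10-lint.sh` and `ci/push/20-test.sh` in the repo, `run_event("ci", "push")` would run them in name order and return the exit code of the first one that fails. An event with no directory is simply a no-op.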
From GitHub's CTO in 2025, when they announced they were moving everything to Azure instead of letting GitHub's infrastructure remain independent:
> For us, availability is job #1, and this migration ensures GitHub remains the fast, reliable platform developers depend on
That went about as well as everyone thought back then.
Does anyone else remember back in ~2014-2015 sometime, when half the community was screaming at GitHub to "please be faster at adding more features"? I wish we could get back to platforms (or OSes for that matter) focusing on reliability and stability. Seems those days are long gone.
GitHub have not really got much better at adding new features either though :(
They added the service unavailable feature.
> I wish we could get back to platforms (or OSes for that matter) focusing on reliability and stability
That's only a valid sentiment if you only use the big players. Both of those have medium/smaller competitors that have shown (for decades) that they are extremely boring, therefore stable.
Try convincing the CTO that this panoply of smaller players will be around in 5 years or is worth the effort of migrating to.
I'm at a much smaller outfit now so we have more freedom but I'd dread to think the arguments I would've had at the 4000+ employee companies I was at before.
I think stability and reliability have vastly improved over the last few years in general (not necessarily talking about gh specifically).
It's just that everybody is using 100 tools and dependencies which themselves depend on 50 others to be working.
Perhaps when they switch over fully to Azure they'll forget to disable IPv6 access. One can dream
Just to add a little bit of nuance to this (not because I'm trying to defend GitHub; they definitely need to up their reliability): the 90% uptime figure represents every single service that GitHub offers being online at the same time 90% of the time. You don't need every single service to be online in order to use GitHub. For example, I don't use Copilot myself, and it's seen 96.47% uptime, the worst of the services which are tracked.
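A small sketch of why an "everything online at once" figure sits below every per-service figure: the combined downtime is the union of all the individual outage windows. The outage data here is entirely made up (hours within a 1000-hour window), and the function names are hypothetical. It only illustrates the arithmetic, not how any status page actually computes its numbers.

```python
def merged_downtime(intervals):
    """Total length of the union of (start, end) outage intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this outage: close the previous merged window.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps the current window: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def uptime_percent(intervals, period):
    """Uptime over `period`, as a percentage rounded to two decimals."""
    return round(100.0 * (1.0 - merged_downtime(intervals) / period), 2)


# Invented outages, in hours, over a 1000-hour window.
PERIOD = 1000
OUTAGES = {
    "git": [(100, 110)],
    "actions": [(105, 125), (400, 420)],
    "copilot": [(600, 640)],
}

per_service = {name: uptime_percent(spans, PERIOD) for name, spans in OUTAGES.items()}
all_spans = [span for spans in OUTAGES.values() for span in spans]
combined = uptime_percent(all_spans, PERIOD)

print(per_service)
print("all services up at once:", combined)
```

In this toy data no single service drops below 96%, yet the combined figure comes out at 91.5%, because outages that don't overlap all count against it.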
It's time to look for a decentralized Non-Hub alternative.
I’m surprised it’s even as high as three nines. At one point in 2025 it was below 90%, not even a single nine[0] (which, to be fair, includes Copilot, which has the worst availability).
People on lobsters a month ago were congratulating GitHub on achieving a single nine of uptime.[1]
I make jokes about putting all our eggs in one basket under the guise of “nobody got fired for buying x; but there are sure a lot of unemployed people”, but I think there’s an insidious conversation that always used to erupt:
“Hey, take it easy on them, it’s super hard to do ops at this scale”.
Which lands hard on my ears when the normal argument in favour of centralising everything is that “you can’t hope to run things as well as they do, since there are economies of scale”.
These two things can’t be true simultaneously. This is the evidence.
[0]: https://mrshu.github.io/github-statuses/
[1]: https://lobste.rs/s/00edzp/missing_github_status_page#c_3cxe...
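To put the "nines" in this thread into wall-clock terms, here is a rough conversion from an uptime percentage to downtime per year. It assumes a 365-day year, and the function name is made up. The three percentages are the ones quoted in the comments above.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(uptime_pct):
    """Hours of downtime per year implied by an uptime percentage."""
    return (100.0 - uptime_pct) / 100.0 * HOURS_PER_YEAR


for pct in (90.0, 96.47, 99.9):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.1f} hours/year (about {hours / 24:.1f} days)")
```

A single nine works out to about 36.5 days of downtime a year, while three nines is under nine hours.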
I'm amazed Microslop let us keep GitHub this long. Probably because they're training AI on it? To have a direct line to developers? I don't see why else they would've bothered with something that was so anti everything they stood for
“Microsoft Tentacle” - Now there’s a name for a new product line.
Three nines is more than enough
Cheap, fast, and good. I see which two they chose.
I wonder how much of this is down to the massive amount of new repos and commits (of good or bad quality!) from the coding agents. I believe that the App Store is struggling to keep up with (mostly manual tbf) app reviews now, with sharp increases in review times.
I find it hard to believe that an Azure migration would be that detrimental to performance, especially with no doubt "unlimited credit" to play with?
You can provision Linux machines easily on Azure and... that's all you need? Or is the thinking that without bare metal NVMe MySQL it can't cope (which is a bit of a different problem tbf).
Maybe they need to improve release strategy with Copilot AI Review =)
I'm surprised GitHub got by acting fairly independently inside Microsoft for so long. I'm also surprised GitHub employees expected that to last
The real problem today IMO is that Microsoft waited so long to drop the charade that they now feel like they have to rip the band-aid off. From what I've heard the transition hasn't gone very smoothly at all, and they've mostly been given tight deadlines with little to no help from Microsoft counterparts.
If this were a place for memes, then I'd share that swimming pool meme with Microsoft holding up copilot while GitHub is drowning.
Then Azure DevOps (formerly known as Visual Studio Team System) dead on the ocean floor.
Although given how badly GitHub seems to be doing, perhaps it's better to be ignored.
It operated with an independent CEO for a long while.
When I saw his interview: https://thenewstack.io/github-ceo-on-why-well-still-need-hum... I thought "oh, there is some semblance of sanity at Microsoft".
This was after seeing those ridiculous PRs where Microsoft engineers patiently deconstructed AI slop PRs they were forced to deal with on the open source repos they maintained.
When he was gone a few months later and GitHub was folded into Microsoft's org chart, the writing was firmly on the wall.
He was never truly independent though. The org structure was such that the GitHub CEO reported up through a Microsoft VP and Satya. He was never really a CEO after the acquisition, it was in name only.
Also of note is that the Microsoft org chart always showed GitHub in that structure, while the org chart available to GitHub stopped at their CEO. It's not that they were finally rolled into Microsoft's org chart so much as they lifted the veil and stopped pretending.
https://news.ycombinator.com/item?id=47315878
see also: https://thenewstack.io/github-will-prioritize-migrating-to-a...
A migration like this is such a monumental undertaking that the only sensible way to do it is probably not to do it at all. I fully expect even worse reliability over the next few years before it gets better.
I wonder if they are still running on a single MySQL machine
The article mentions some concerns related to migrating their MySQL clusters off bare metal.
Recently, part of my CI on Actions has started failing (the workflows had worked for months) with [0]:
2026-02-27T10:11:51.1425380Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2026-02-27T10:11:56.2331271Z ##[error]The operation was canceled.
I had to disable the workflows.
GitHub support response has been
“We recommend reviewing the specific job step this occurs at to identify any areas where you can lessen parallel operations and CPU/memory consumption at one time.”
That, plus various other issues, has made me start thinking about alternatives, which would never have occurred to me a year ago.
[0] https://github.com/Barre/ZeroFS/actions/runs/22480743922/job...
We've jumped ship to self-hosted Jenkins. Woodpecker CI looks cool but Jenkins seemed like a safer bet for us. It's been well worth the effort and it's simplified and sped up our CI massively.
Once we got the email that they were going to charge for self-hosted runners that was the final nail in the coffin for us. They walked it back but we've lost faith entirely in the platform and vision.
Ever since Microsoft's acquisition of GitHub 8 years ago, GitHub has completely enshittified and become so unreliable that even self-hosting a Git repository or running self-hosted actions yourself would give far better uptime than GitHub.
This sounded crazy in 2020 when I said that in [0]. Now it doesn't in 2026 and many have realized how unreliable GitHub has become.
If there was a prediction market on the next time GitHub would have at least one major outage per week, you would be making a lot of money since it appears that AI chatbots such as Tay.ai, Zoe and Copilot are somewhat in charge of wrecking the platform.
Any other platform wouldn't tolerate such outages.
[0] https://news.ycombinator.com/item?id=22867803