My boss asked me to set up a WordPress for a product landing page.
I naturally won't do this; it's no more than a couple of weeks ago that some SQL injection landed in the search query function of this monstrosity.
WordPress always was and always will be terrible.
So I set up the landing page with a Hugo static site, and I've been vibe-coding a WordPress-like dashboard that operates on git repositories containing Hugo sites.
I call it WorbPress (not released yet), and I'm sure that's what my boss told me to install, or I might've misheard.
And yes, it's written in Rust (with Axum and Alpine.js), because why not?
I feel like not choosing WordPress was a great choice, but I'm not sure about the rest of the comment. A simple HTML file might make for a good landing page, though.
Why not use headless WordPress?
Why is the AI only able to reach 17%?
Surely it can just keep iterating until it implements the full test suite?
Money probably. This is a cash burn project.
Is it astonishing you got to 17% with some vibe code? Sure.
But most of the stuff I’ve vibe coded this year has been astonishing by 2025’s standards.
If you got 100% I’d be genuinely blown away.
The article doesn't go into how they managed the AI context when implementing things, but I would not be surprised if, had it been done in a methodical way, 80-90% of the tests could have passed.
Does anyone know why we write code anymore? Why not pass through to an LLM that generates the page on the fly (SSR)?
Is it cost?
Yes: cost, speed, and reliability.
But all of those things are improving at shocking speeds, so I think we’re on a path where code is losing value quickly.
Yeah, I agree. It will be like serverless but for code: codeless.
It’s a disconcerting future.
Standards vary.
I'll preface my comment by saying: this might not be the best solution given your project's goal of iteratively looping through and improving on the tests each round; using deps would make that process longer and more complicated, since you'd potentially have to work with another project.
.....however.....
Mago, a static analyzer for PHP, is written in Rust and might be useful for gaining some "free" performance uplift: https://github.com/carthage-software/mago. iirc it splits out a fair bit of its internals so they can be used by other projects (citation needed).
Interesting read. Given what the process is producing it's probably quite cost-effective?
Use AI to make WordPress secure and not suck as much.
Even an AGI can't accomplish the impossible.
Maybe the takeaway is that 20% is about all the LLM can muster.
> Maybe the takeaway is that 20% is about all the LLM can muster
At this point there's a long list of projects that have used LLMs to rewrite a system in Rust including:
With the exception of Bun, these projects were done pre-Fable too, so I bet Fable will make these types of rewrites even easier.
I'm not sure about the other three, but Bun's rewrite from Zig to Rust was a bit of a joke. `unsafe`s in the thousands, a quarter-million lines of diff, and merged inside a week with no significant public discourse (at least, not much that was responded to by the author).
Author here.
To be upfront about what this is: I'm not a Rust developer or a PHP internals person. This is an experiment in whether the "point the AI at the original project's test suite" methodology (the way Bun was driven against real-world suites) holds up when the human can't review the code. The oracle is php-src's own .phpt corpus, ~22k tests I didn't write. Current honest score: 3,844 passing (17.4%), with a realistic ceiling around 40-45%, since the rest test C extensions (GD, curl, intl, etc.) that are out of scope.
"Renders WordPress" means: fresh install completes into SQLite, the front page renders with real posts, a real theme and /wp-admin/ renders without issues. The REST API is untested, and it's currently ~55x slower than PHP on the front page (a bytecode VM is in progress, micro-benchmarks are already at 1-3x of PHP 8.5).
The scoreboard auto-generates into the repo after every run, whether the number went up or down.
Happy to answer anything.
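For readers who haven't seen a .phpt file: each test is a plain-text file split into `--SECTION--` blocks (`--TEST--`, `--FILE--`, `--EXPECT--`, and others). Below is a minimal sketch of how a pass rate against such a corpus could be computed. It is not the project's actual harness: the function names and the `run` callable (a stand-in for whatever executes the PHP source and returns its output) are hypothetical, and it only handles exact-match `--EXPECT--` sections.

```python
import re
from typing import Callable, Dict, Iterable, List

# Section headers in a .phpt file look like "--FILE--" on a line of their own.
SECTION = re.compile(r"^--([A-Z_]+)--\s*$")


def parse_phpt(text: str) -> Dict[str, str]:
    """Split the text of one .phpt file into its --SECTION-- blocks."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = SECTION.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


def passes(test: Dict[str, str], run: Callable[[str], str]) -> bool:
    """Exact-match check only. Tests without FILE and EXPECT sections
    (e.g. EXPECTF or EXPECTREGEX tests) count as failures in this sketch."""
    if "FILE" not in test or "EXPECT" not in test:
        return False
    return run(test["FILE"]).strip() == test["EXPECT"].strip()


def scoreboard(tests: Iterable[str], run: Callable[[str], str]) -> str:
    """Return a one-line summary such as '1 / 2 passing (50.0%)'."""
    parsed = [parse_phpt(t) for t in tests]
    passed = sum(passes(t, run) for t in parsed)
    total = len(parsed)
    pct = 100.0 * passed / total if total else 0.0
    return f"{passed:,} / {total:,} passing ({pct:.1f}%)"


if __name__ == "__main__":
    sample = "--TEST--\nhello\n--FILE--\n<?php echo 'hi';\n--EXPECT--\nhi\n"
    failing = sample.replace("\nhi\n", "\nbye\n")

    # Hypothetical interpreter stand-in: always prints "hi".
    def fake_run(source: str) -> str:
        return "hi"

    print(scoreboard([sample, failing], fake_run))  # 1 / 2 passing (50.0%)
```

A real harness would also need to handle `--SKIPIF--`, `--EXPECTF--` and friends, which is part of why the raw percentage depends heavily on which sections and extensions are in scope.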
This is a pretty cool experiment. Thanks for sharing!
Compare with FrankenPHP?
Will you answer questions yourself, or will you simply pass on what your LLM of choice writes for you?
Edit: On further inspection, the blog design, the blog build, the blog articles and even the anecdotes used in the articles are entirely Claude generated.
Stop being so lazy. Get Claude to do something interesting and use your own intellect to assess and challenge the work in your write up. Or the other way around. Inject some amount of human work, at least. Otherwise, what's the point in sharing?
But it will be at least 17% correct!
Why stop at 17%? Come back when you are at 100%; otherwise it's just another project.