I've found that setting good guardrails, and running in a sandbox so that the agent doesn't keep asking tedious permission questions, makes things go a LOT smoother.
Generally, I spend anywhere between 15 mins and an hour setting things up (depending on how well the project is set up for AI work), and then set the agent going, coming back in a half-hour to an hour to check its progress. Generally, the tooling keeps it honest (for golang, forbidigo is AWESOME). 80% of the questions the agent asks me require a lot of thought. 20% of what it does needs correction.
The other thing to remember with LLMs is that they are NOT human, and won't react in a human way. So you'll see strikes of "brilliance" followed by the absolutely bizarre. But good guardrails keep that to a minimum.
> sandbox so that the agent doesn't keep asking tedious permission questions
> 80% of the questions the agent asks me require a lot of thought. 20% of what it does needs correction.
I've found even the permissions questions give me veto power over fruitless lines of exploration, especially in planning mode. For instance, it wants to use tools I don't have installed to access information that I have made available elsewhere? I get a chance to override this decision by declining the permissions check and redirecting it. Feels tedious, but helps me understand what information sources are influencing it. I head off a lot of bugs this way.
I never let it go into planning mode, other than to output a plan file that I can audit before giving it the go-ahead to implement. After that I don't want to be bothered, so --dangerously-skip-permissions keeps all but real questions out of the loop, and I can do something else while it works rather than babysit.
AI should be assisting us, instead it's doing the job and it's us being an assistant to it. This is a monumental shift that people seem to be missing in how knowledge work is changing and it's going beyond mere coding.
Guardrails, prompts, whatever, it's us helping it do the job, not the other way around.
Opus 4.6 was the last genuinely good assistant LLM, but since then it's quite clear that the training/reinforcement is focused on "given prompt -> do task" so its behavior is more and more about doing it itself, not helping you. If you try to use it as an assistant it just sucks and is perma wired into finding the solution. Many times I want it to help me investigate, and its answer will still be focused on the fix, not answering my questions.
4.7 first, 4.8 later and Fable are absolute disasters as assistants.
Fable in particular is so "intelligent" that it will push with very strong and intelligent takes even if it is completely wrong.
Wow... Our experiences have been very different, then. I've found each upgrade of Opus to be a noticeable improvement in its complex reasoning and delegation capabilities over its predecessor.
To me, this feels in many ways like a technical manager or team lead's job, where I guide the process along using my knowledge and experience, and then let the agent fill in the rest (to the best of its ability).
The agent can't really learn from its mistakes (at least, not without consuming precious context), so I apply a blameless postmortem process, updating the guardrails whenever it goes astray in the same way more than once.
And really, I'd rather be contemplating the more difficult and interesting questions of architecture, environment, ergonomics and market fit, so it suits me fine.
> AI should be assisting us, instead it's doing the job and it's us being an assistant to it.
If you're a manager and you ask a report to do something and they come back with a question, does that mean you're now their assistant?
I give agents the tasks, I answer their questions, I make choices about the tradeoffs in their plan, I supervise their implementation, I review their output, I have them walk me through things. In what way is this not delegating to them and managing their work, just like a more junior employee?
I think this is just a misunderstanding of how most technology has always worked?
Consider what is happening in most construction sites. The heavy work is absolutely from the technology on site. But without people there to oversee it and keep it working, it would fail.
And that is almost certainly true at any industrial site. Indeed, look up videos of high tech looms. A large portion of the technology added to them is there so that the operators can locate the fault and fix it.
The problem (okay, one of the problems) with renting other people's models is, as you mentioned, that they can and will change out the model without notifying you ahead of time, and you don't always get to control which model you use. (They might decide to retire it, and you won't be able to get it back if they do).
Which is why (well, part of why) I think the long-term trend will be towards self-hosting models. Right now the frontier models are far enough ahead of the self-hosted ones that there are lots of people willing to pay by the token to rent someone else's model, because they get more value for money from that than from self-hosting models.
But the frontier companies won't be able to keep up their current levels of expenditure forever. At some point the investors are going to say "Hey, so, um, when am I going to see some return on my investment?" and then the current subsidized subscriptions (including the one my employer uses) are going to go away, much like what happened with Copilot this month.
And then the locally-hosted models are going to suddenly look like a more attractive picture. Because where you might have been willing to spend $100/month/employee to rent time on models in someone else's data center, you might suddenly balk at spending $500/month/employee. You might say "Hey, you know what? A $50,000 up-front capital investment is only, what, one month's worth of subscriptions for our 100 employees? Yeah, okay, I'll approve the hardware purchase. Get that self-hosted model set up and then we'll cancel the subscription and switch over."
Not everyone is going to do that. But once the locally-hosted models are good enough, the first few people who do so and report success are going to start a snowball effect. And it will likely be driven by money first, but it will also have a side effect that people will slowly discover: you can better predict the model you're using. It will continue to work the same way next year that it is working this year; or if it doesn't, it's because you chose to install the new version.
And when that happens (I'm saying "when", not "if" because although it might take some time, I think it's inevitable in the long run), the frontier-model rental companies are going to struggle to stay afloat. Except for the ones who saw this coming and transitioned to a non-subscription income source somehow (maybe by selling licenses to self-host their frontier models for $$BIGNUM), or who have some other revenue stream besides renting out models.
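A rough sketch of the break-even arithmetic in the hardware-versus-subscription comparison above. The dollar figures and headcount come from the comment; the function name is made up for illustration, and the calculation assumes the hardware has no running costs (power, maintenance, staff time), which is a simplification.

```python
def months_to_break_even(hardware_cost: float, fee_per_seat: float, seats: int) -> float:
    """Months of subscription spend that add up to the up-front hardware cost.

    Simplifying assumption: self-hosting has no ongoing costs at all.
    """
    monthly_spend = fee_per_seat * seats
    return hardware_cost / monthly_spend


# Figures from the comment: $50,000 of hardware, 100 employees.
at_500 = months_to_break_even(50_000, 500, 100)  # subscription at $500/month/employee
at_100 = months_to_break_even(50_000, 100, 100)  # subscription at $100/month/employee

print(f"at $500/seat: {at_500} months, at $100/seat: {at_100} months")
```

Under those assumptions the hardware pays for itself in one month at $500/seat and in five months at $100/seat, which is why the price increase is what changes the picture.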
Well... as a human software engineer, I've been the one with very strong, intelligent, completely wrong takes. The question is, are the LLMs improving faster than you can improve a junior dev? And is their ceiling as high?
Your experience pretty much mirrors my own. I hate to be the 'they're holding it wrong' guy but there's certainly a lot of people out there that have no real idea how to effectively leverage AI.
That’s a problem with the tool not the people. AI is marketed literally as writing one sentence and having some app perfectly output. Just check any of the landing pages for Claude Code or Codex or GitHub Copilot…
i've seen a number of articles claiming things like "devs self report they're +x% more productive with AI, but actually they're -y% LESS efficient!". and i think that this is the explanation for why.
as a boss (or researcher) i'm going to measure productivity based on amount of output per hour that i'm paying you; as a worker, i'm going to measure productivity based on amount of output relative to the amount of effort i'm putting in.
so what may be happening is that bosses see that output is at 80% (productivity down!) but workers see that they can give that 80% output with 40% effort (productivity up!).
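To make the two yardsticks concrete, here is a minimal sketch using the 80% output / 40% effort figures from the comment. The function names are invented for illustration, and it assumes paid hours stay the same before and after AI.

```python
def output_per_paid_hour(output: float, paid_hours: float) -> float:
    """The boss's metric: output relative to hours paid for."""
    return output / paid_hours


def output_per_effort(output: float, effort: float) -> float:
    """The worker's metric: output relative to effort spent."""
    return output / effort


# Baseline is normalized to 1.0 output, 1.0 paid hours, 1.0 effort.
boss_view = output_per_paid_hour(0.8, 1.0)  # same hours, 80% of the output
worker_view = output_per_effort(0.8, 0.4)   # 80% of the output for 40% of the effort

print(f"boss sees {boss_view:.0%} of baseline, worker sees {worker_view:.0%}")
```

Same situation, two opposite conclusions: the boss's number drops below baseline while the worker's doubles.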
Not sure among devs, but I do know that in other positions in typical corporate bureaucracy, people have a propensity to not report their own automations or productivity gains upward, because the reward structure isn't there.
Early on in my days as a sysadmin, I automated a ton of my role when the rest of the team was still doing ClickOps. The reward for doing so was more work and expectations without the additional pay increase to justify my newfound productivity. That happens all over the workforce, and so people will just keep it to themselves. I learned my lesson at that first job real fast: if I'm able to have the same, or greater, output in half the time, I keep that to myself so I can use the automation to free up my own time instead of having it filled by the company.
I wonder how much of that is happening now with AI in non-technical roles.
> so what may be happening is that bosses see that output is at 80% (productivity down!)
If an initiative produces only 80% of the previous results and you’re paying large token bills on top of the same wages, the AI is going to get cut off.
> i've seen a number of articles claiming things like "devs self report they're +x% more productive with AI, but actually they're -y% LESS efficient!".
Are you thinking of the old METR evals? Their more recent evals showed an actual performance improvement.
The old report is still circulated as bait for AI skeptics.
> so what may be happening is that bosses see that output is at 80% (productivity down!) but workers see that they can give that 80% output with 40% effort (productivity up!).
So why is it that the bosses are the ones that are so enthusiastic about adoption?
My challenge has been trying to manage my higher-level context. I've gotten a pretty good setup where I have project-level orchestrator agents that can spin up workers to implement tasks with minimal oversight, and the resulting work is usually quite good (especially after I give it the mandatory "make the comments less verbose" refining, etc.). But that means I'm doing even more context-switching. I've gotten to the point where I have a half-dozen draft PRs that just need my review before I tag my colleagues, and trying to dig up the context from all of those tasks can be paralyzing.
This kind of reminds me of an article that I saw on HN ages back, there's like a subset of office workers who automated their Excel jobs, and just show up to work, read books, and do literally anything, while Excel does their work for them, and they collect their paycheck.
I just started using Claude Code for my work as a sysadmin. For my work, it's great. I don't need to wrestle with MySQL joins, Claude gets even the most complex ones right WAY faster than I would. Same with new Terraform stuff. Things that would have taken me a day are cut to less than an hour.
So for my work, it's made me much better at my job. Much faster and more accurate.
I can write a simple query before Claude finishes reading, querying the semantic layer, checking my files, then writes a query that I have to approve, reads the results, hides them (ctrl+o usually works), and gives me a summary.
We’ve reached this inflection point where it’s faster for me to do most tasks again.
I’m sure fast mode costing more money plays a role.
For me, AI can sometimes create a false sense of productivity. It's similar to how in the past, people would spend time creating the perfect setup with notion templates, pomodoro timers and productivity tools, or tweaking their environment for maximum productivity, instead of actually doing productive work.
But now it's happening at the company level: "We're going to add a chatbot to increase productivity! Now MCP tools! Then agentic workflows! We’ll add skills, and now productivity will go up! Maybe loops will do it?"
I don't see a lot of talk about how AI development breaks the old feedback loop of write code, watch it run, change it, repeat. I really hate sitting around waiting for the agent to get done planning, reading the plan, then waiting for the agent to get done coding. It's those 5-10 minute windows when it's working that really sap my patience and suck all the fun out of our jobs. Writing code by hand is just more fun.
This is something that I don't see discussed a lot in these conversations, but it's true for a ton of folks.
I didn't end up with a career in tech because I wanted to tell a bot to do the fun part of my job for me, leaving me only with the boring tedious parts. I didn't sign up to be a full-time code reviewer, and I certainly never wanted to be a manager, let alone a manager of bots.
It also can't help but spark feelings of "Why am I getting paid 6 figures for this??" and that makes me nervous for the future.
I imagine the engineers and assemblers in factories pre-assembly line felt the same when things started getting automated there. There's an element of craftsmanship that gets taken away as the product moves from being artisanal, hand crafted to mass produced.
I wonder if it's too late for me to pivot to hardware.
Yeah it's hard to deny just the raw throughput from the AI. Like it really is doing work in hours that would take me days.
But those times when I had to drop down into a repl and play around with the output of a method. Or try different ways of doing what anyone else would think is boring, like array manipulation - that's a lot of what I actually LIKE to do.
A big part of me just hopes I can hang in there for another... decade, or two. Then I can retire! Maybe.
I don't know what they're complaining about. AI has freed us from the drudgery of craftsmanship, letting us focus on the important stuff—managerial and administrative work!
(There's a reason why I call it the MBA's stone. It transmutes all knowledge work into a problem of management.)
Understanding what is going on with AI productivity is … frustrating to say the least.
The best I can say is that genAI gives a self-reported 20% efficiency boost, and for a very (very) small group of people, it’s maybe a 2-3x boost. (And if you are at a frontier lab, you go fly into the big bucket of exceptions)
At this point, for most use cases, AI productivity is either the equivalent of giving people 3D printers, and seeing little benefit, or signing up for an outsourcing service, just without the development of human capital anywhere.
I think it depends on how you measure the boost. If you are talking about generating a first draft then yes, the boost is there. If you’re talking about completing the project in all well tested and architected aspects, then overall there really isn’t a boost.
6 hours of debugging and docs reading is not equal to 6 hours of prompt fiddling. The return of value beyond the few fixes applied will be almost nil from the fiddling.
Yeah, Amazon warehouses are just the same. Humans are only used for tasks beyond the comprehension or physical ability of a machine at that point in time.
The problem is, we haven't had the debate on a societal level about whether we want to go the Star Trek route (aka, we give our darn best to automate everything so that humans have the time to do whatever they want) or the realcommunism route (we ward off automation so that we have jobs for people).
The result of that debate not having been made is the third possible outcome - rabid capitalism automates everything as soon as it is profitable and lays off the humans, focusing on getting higher margins out of less people if need be; the best example for that IMHO is Disneyland or Vegas going on ridiculous nickel-and-diming tours. In the end however, there will be no one left any more who has employment and we'll be in for quite the riots.
I couldn't care less about bot sitting (haven’t we always written our own automation?), but it’s botsitting the unverified slop that people send you that fuels frustration. I thought I worked with competent people who respected me.
Our product lead/manager recently sent me an AI generated PRD (complete with a Claude Code spec!) to build a core feature which we have had for over 2 years (and is a highly used core feature by our customers).
I just can't imagine tanking my trust with my coworkers by doing something like that.
So we're now in this world where everyone is instantly 10x more productive at turning their thoughts into code. Now, think about the coworkers you've had that are middling to mediocre. Do you want them to have a tool that makes them 10x more productive?
That's what I wonder about, what happens to all those folks.
Your coworkers haven't changed. What changed is that people can hand off work they never had to think through themselves. So you don't know what they checked and you don't know what you need to. You just have to read the whole thing.
This really hit home for me:
> In some cases, workers are also being asked to automate the parts of their jobs they enjoy most, Hinds said on the podcast, pointing to customer-service employees who enjoy building relationships but are increasingly expected to supervise AI agents instead.
> "That's what gives you joy and meaning at work," she said. "That is very dangerous."
What's a 20% productivity gain if I constantly feel deflated by work that used to energize me? That's going to give back the productivity gain and more, while also decreasing my quality of life.
This is an important point. My light-bulb moment was when I talked to a product owner in a previous job, and I expressed surprise around an expensive planned change, because it didn't seem that valuable to our customers.
He said, "Almost half of what we do is not that valuable to our customers, but it's valuable to him, and her, and him", pointing through the conference-room window at my fellow programmers, "and that's why we do it. If we only did things that were very valuable to our customers, we wouldn't have nearly as many good engineers on the team as we do."
When I was given a semi-ultimatum "use AI or get fired" kind of thing for writing code I had a brief bout of depression/sadness. Whereas my friend doesn't care/says "I get paid to not work". I have gotten past it, now I'm just like, I'll do what I need to do to get paid since unfortunately I'm in a lot of debt so I need this job. I learned to code in 2013 so I like typing the code myself but now it seems like a waste of time. I still write my own code for myself/hardware hobby.
FWIW, I was just like you but then completely gave in and found enjoyment in the act of simply ideating and shipping. The gap now between idea and implementation is so small. At first I was depressed but now I'm in the acceptance phase of grief. We aren't going back, for better or for worse.
Heh, my employer kept pushing us to use Copilot. And over the last months the cli has actually gotten halfway decent... So I did start using it. Albeit sparingly because the token allotment was always pretty low.
Then they announced that they had removed the limit, so further requests would just cost them extra. That's when I started using it as I did for my personal projects I pay subscriptions for...
Then Copilot increased their pricing. Announced in April I think? But took effect this month. This Monday they announced that the limits are back in effect. So I guess I'll be going back to hand coding next week, as my tokens are about to run out ಥ ‿ ಥ
Corporate is always so silly. I mean I know how it happens: everyone just wants to get their bonus, so different management roles try to coerce the employees to do whatever best serves their bottom line - rarely related to whatever is good for the corporation... But it's always silly to live through it.
> What's a 20% productivity gain
Where did the 20% number come from? I’d argue it’s way more than that (or variable, i.e. dependent on who’s using it/how it’s being used/what it’s being used on).
Having said that, the number, to me, doesn’t even matter. You could replace that with 200%, and it’d be just as true.
> customer-service employees who enjoy building relationships but are increasingly expected to supervise AI agents instead.
It sucks for the employees, otoh it might be the only way we're going to beat Baumol's Cost Disease.
In the past few decades productivity has exploded, but service employees have largely failed to increase productivity in any way because it's harder to automate these tasks.
It's the reason the costs of things like education and healthcare are downright extortionate, the reason you're paying back your college well into your fifties, the reason you don't call an ambulance for someone in the US because you don't want to ruin their life financially.
We may have to trade the personal fulfillment in these jobs for the broader affordable access to these services.
Education and healthcare are both ridiculously overpriced in the US for reasons that have little to do with service costs. Questionable financial systems behind these services are much more to blame.
> It's the reason the costs of things like education and healthcare are downright extortionate, the reason you're paying back your college well into your fifties, the reason you don't call an ambulance for someone in the US because you don't want to ruin their life financially.
You might wanna think again on that line of reasoning, because plenty of other countries have the same dynamics with respect to service employees, but they don't suffer the very US-only problem of ridiculous education and healthcare costs where calling an ambulance can ruin someone's life.
As a former first responder, I'm interested in hearing more about how AI-powered ambulance services would work. (related question: will the 911 dispatcher be AI?)
I don't think first responders are ever going to be at risk.
Administrators, on the other hand, are a massive part of the costs in the health sector (IIRC the Obama administration chickened out on truly reforming healthcare exactly because the number of administrators that would be made redundant would tank the economy). A significant amount of administrative work can be automated.
The cynic in me has learned one is measurable and can go on a slide deck, the other is vague and hard to measure.
The vast majority of jobs are not fulfilling or enjoyable, because there were way more job seekers than jobs.
Programming was one of the ones which was, because there were fewer programmers than openings. Now that's flipping, thus naturally, the enjoyment is going to be sucked out of it.
It's like if your career switched from solving puzzles to filling out TPS reports.
For me it feels less like filling out reports, and more like mentoring an intern who can search for stuff really quickly but forgets everything at the end of the day due to anterograde amnesia.
Except the intern is trapped inside an iron lung and must communicate entirely by text. And also has zero real creativity or self-motivation.
Don't worry. They'll find some freak that actually enjoys it and is even willing to be paid less!
Most people don't have the luxury of finding joy and meaning in their work. You aren't hired to have fun, you're hired to create value and wealth for your employer. Just do what literally everyone else does and grind through it until you get a pension and hope it's enough to let you die with a bit of dignity.
You pay per token, even on subscription models the limit is tokens.
If I was valued at 1 trillion dollars, and I was in the hole enough to sink a couple small countries' GDP, maybe I would slowly start to optimize to maximize token usage.
I want to sell tokens, how do I sell more tokens? Not by doing the same work in less tokens, that's for sure.
This is like if you pay me by the hour and then excitedly tell me that you keep paying 10k a month and it's great. I will most certainly not work faster, in this hypothetical, if you tell me you love spending money because it gives you a dopamine rush. I would probably spend a couple more hours REALLY thinking about the task, maybe writing some docs nobody will read, maybe considering multiple options, doing benchmarks, doing research, and then later maybe I'll do the actual task as well.
I'm not saying these AI companies are scamming us, but the incentives are there and extremely clear. The only thing currently holding it back is that there is some vague kind of competition.
6 hours a week is low, unless it's the average spread across industries. I think I spend more time in Claude Code via the CLI than in any other app I have on my laptop.
Like others said, the frustration is when it gets something so wrong you just think "wow, how'd you mess that up?" but when it gets it right it's kind of nice. I also don't like that I basically tell Claude what to do, and then either go do busy work or waste time on the internet.
I kind of enjoy exploring black boxes, seeing how different inputs map to differences in outputs. It's kind of like hacking. The problem is, they keep altering the box.
The box is stochastic by design, and has an untraceable amount of complexity between its context and output by nature.
It may be fun to look at inputs and outputs, but it's not hackable and trying to map one into the other is more like astrology than a science.
It's copromancy. Picking through the clanker's doings in an attempt to predict the future.
Thanks, you taught me a new word today! https://en.wikipedia.org/wiki/Scatomancy
No but you see, I have a system! /s
(I spent too long by the horse racing track)
Welcome to the slot machines!
I spend at least 6 hours a week arguing with bots owned by other teams, as I’m unable to reach a human before I bypass their bot. 10k person company, clients are paying for my time.
I would be tempted to send my own bot to do that drudgery
It may be that they’re protecting their time.
Right. Somewhere there’s a dashboard which lists those 6 hours as time saved.
Just build a bot to bypass their bot.
Corpo bullshittery is the best kind of work. Get paid without actually ever doing anything. It's heaven.
Being alienated from the outcome of your labor is far from my idea of heaven.
Not if you enjoy making things and take pride in your work.
That's some odd image of heaven.
I've found that setting good guardrails, and running in a sandbox so that the agent doesn't keep asking tedious permission questions, makes things go a LOT smoother.
Generally, I spend anywhere between 15 mins and an hour setting things up (depending on how well the project is set up for AI work), and then set the agent going, coming back in a half-hour to an hour to check its progress. Generally, the tooling keeps it honest (for golang, forbidigo is AWESOME). 80% of the questions the agent asks me require a lot of thought. 20% of what it does needs correction.
The other thing to remember with LLMs is that they are NOT human, and won't react in a human way. So you'll see strokes of "brilliance" followed by the absolutely bizarre. But good guardrails keep that to a minimum.
How often are you going into new projects and spending up to an hour on set up?
> sandbox so that the agent doesn't keep asking tedious permission questions
> 80% of the questions the agent asks me require a lot of thought. 20% of what it does needs correction.
I've found even the permissions questions give me veto power over fruitless lines of exploration, especially in planning mode. For instance, it wants to use tools I don't have installed to access information that I have made available elsewhere? I get a chance to override this decision by declining the permissions check and redirecting it. Feels tedious, but helps me understand what information sources are influencing it. I head off a lot of bugs this way.
I never let it go into planning mode, other than to output a plan file that I can audit before giving it the go-ahead to implement. After that I don't want to be bothered, so --dangerously-skip-permissions keeps all but real questions out of the loop, and I can do something else while it works rather than babysit.
It doesn't change the premise.
AI should be assisting us, instead it's doing the job and it's us being an assistant to it. This is a monumental shift that people seem to be missing in how knowledge work is changing and it's going beyond mere coding.
Guardrails, prompts, whatever, it's us helping it do the job, not the other way around.
Opus 4.6 was the last genuinely good assistant LLM, but since then it's quite clear that the training/reinforcement is focused on "given prompt -> do task", so its behavior is more and more about doing it itself, not helping you. If you try to use it as an assistant it just sucks and is perma-wired into finding the solution. Many times I want it to help me investigate, and its answer will still be focused on the fix, not on answering my questions.
4.7 first, 4.8 later and Fable are absolute disasters as assistants.
Fable in particular is so "intelligent" that it will push with very strong and intelligent takes even if it is completely wrong.
I have never disliked our job more.
Wow... Our experiences have been very different, then. I've found each upgrade of Opus to be a noticeable improvement in its complex reasoning and delegation capabilities over its predecessor.
To me, this feels in many ways like a technical manager or team lead's job, where I guide the process along using my knowledge and experience, and then let the agent fill in the rest (to the best of its ability).
The agent can't really learn from its mistakes (at least, not without consuming precious context), so I apply a blameless postmortem process, updating the guardrails whenever it goes astray in the same way more than once.
And really, I'd rather be contemplating the more difficult and interesting questions of architecture, environment, ergonomics and market fit, so it suits me fine.
Same here. The power upgrade going to Fable in particular is quite impressive.
> AI should be assisting us, instead it's doing the job and it's us being an assistant to it.
If you're a manager and you ask a report to do something and they come back with a question, does that mean you're now their assistant?
I give agents the tasks, I answer their questions, I make choices about the tradeoffs in their plan, I supervise their implementation, I review their output, I have them walk me through things. In what way is this not delegating to them and managing their work, just like a more junior employee?
I think this is just a misunderstanding of how most technology has always worked?
Consider what is happening in most construction sites. The heavy work is absolutely from the technology on site. But without people there to oversee it and keep it working, it would fail.
And that is almost certainly true at any industrial site. Indeed, look up videos of high tech looms. A large portion of the technology added to them is there so that the operators can locate the fault and fix it.
The problem (okay, one of the problems) with renting other people's models is, as you mentioned, that they can and will change out the model without notifying you ahead of time, and you don't always get to control which model you use. (They might decide to retire it, and you won't be able to get it back if they do).
Which is why (well, part of why) I think the long-term trend will be towards self-hosting models. Right now the frontier models are far enough ahead of the self-hosted ones that there are lots of people willing to pay by the token to rent someone else's model, because they get more value for money from that than from self-hosting models.
But the frontier companies won't be able to keep up their current levels of expenditure forever. At some point the investors are going to say "Hey, so, um, when am I going to see some return on my investment?" and then the current subsidized subscriptions (including the one my employer uses) are going to go away, much like what happened with Copilot this month.
And then the locally-hosted models are going to suddenly look like a more attractive picture. Because where you might have been willing to spend $100/month/employee to rent time on models in someone else's data center, you might suddenly balk at spending $500/month/employee. You might say "Hey, you know what? A $50,000 up-front capital investment is only, what, one month's worth of subscriptions for our 100 employees? Yeah, okay, I'll approve the hardware purchase. Get that self-hosted model set up and then we'll cancel the subscription and switch over."
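To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. The figures ($50,000 of hardware, 100 seats, $100 vs. $500 per seat per month) are the ones from the paragraph above; the function name is made up for illustration, and the calculation deliberately ignores power, maintenance, and staff time, all of which would push the break-even point further out.

```python
def breakeven_months(hardware_cost, seats, monthly_per_seat):
    """Months of subscription fees that add up to the up-front hardware cost.

    Simplifying assumption: self-hosting has no running costs at all.
    """
    monthly_bill = seats * monthly_per_seat
    return hardware_cost / monthly_bill


# At the hypothetical post-subsidy price, the hardware pays for itself in a month.
print(breakeven_months(50_000, 100, 500))  # 1.0

# At today's hypothetical price, it takes five months of subscriptions.
print(breakeven_months(50_000, 100, 100))  # 5.0
```

Either way the payback period is short on paper, which is the point being made: the decision hinges on whether the self-hosted model is good enough, not on the hardware bill.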
Not everyone is going to do that. But once the locally-hosted models are good enough, the first few people who do so and report success are going to start a snowball effect. And it will likely be driven by money first, but it will also have the effect, that people will slowly discover, of meaning that you can better predict the model you're using. It will continue to work the same way next year that it is working this year; or if it doesn't, it's because you chose to install the new version.
And when that happens (I'm saying "when", not "if" because although it might take some time, I think it's inevitable in the long run), the frontier-model rental companies are going to struggle to stay afloat. Except for the ones who saw this coming and transitioned to a non-subscription income source somehow (maybe by selling licenses to self-host their frontier models for $$BIGNUM), or who have some other revenue stream besides renting out models.
That sounds weirdly gendered even though there's no reason it should be.
Are you getting LLMsplained? :)
Well... as a human software engineer, I've been the one with very strong, intelligent, completely wrong takes. The question is, are the LLMs improving faster than you can improve a junior dev? And is their ceiling as high?
Your experience pretty much mirrors my own. I hate to be the 'they're holding it wrong' guy but there's certainly a lot of people out there that have no real idea how to effectively leverage AI.
That’s a problem with the tool, not the people. AI is marketed literally as writing one sentence and getting a perfect app as output. Just check any of the landing pages for Claude Code or Codex or GitHub Copilot…
i've seen a number of articles claiming things like "devs self report they're +x% more productive with AI, but actually they're -y% LESS efficient!". and i think that this is the explanation for why.
as a boss (or researcher) i'm going to measure productivity based on amount of output per hour that i'm paying you; as a worker, i'm going to measure productivity based on amount of output relative to the amount of effort i'm putting in.
so what may be happening is that bosses see that output is at 80% (productivity down!) but workers see that they can give that 80% output with 40% effort (productivity up!).
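A tiny sketch of those two yardsticks, assuming purely for illustration that the pre-AI baseline is 1.0 unit of output for 1.0 paid hour and 1.0 unit of effort. The function names and the baseline are hypothetical; only the 80% and 40% figures come from the comment above.

```python
def boss_productivity(output, paid_hours):
    # The boss's yardstick: output per hour paid for.
    return output / paid_hours


def worker_productivity(output, effort):
    # The worker's yardstick: output per unit of effort spent.
    return output / effort


baseline = 1.0  # 1.0 output / 1.0 hour, and 1.0 output / 1.0 effort

boss_now = boss_productivity(0.8, 1.0)      # same paid hours, 80% of the output
worker_now = worker_productivity(0.8, 0.4)  # same output, 40% of the effort

print(f"boss sees   {boss_now / baseline - 1:+.0%}")    # boss sees   -20%
print(f"worker sees {worker_now / baseline - 1:+.0%}")  # worker sees +100%
```

Same situation, two honest measurements, opposite signs.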
Not sure among devs, but I do know that in other positions in typical corporate bureaucracy, people have a propensity to not report their own automations or productivity gains upward, because the reward structure isn't there.
Early on in my days as a sysadmin, I automated a ton of my role when the rest of the team was still doing ClickOps. The reward for doing so was more work and expectations without the additional pay increase to justify my new found productivity. That happens all over the workforce, and so people will just keep it to themselves. I learned my lesson at that first job real fast that if I'm able to have the same, or greater output, for half the time, I keep that to myself so I can use the automation to free up my own time instead of have it filled by the company.
I wonder how much of that is happening now with AI in non-technical roles.
https://www.youtube.com/watch?v=OwfNjGxa_D4
> so what may be happening is that bosses see that output is at 80% (productivity down!)
If an initiative produces only 80% of the previous results and you’re paying large token bills on top of the same wages, the AI is going to get cut off.
> i've seen a number of articles claiming things like "devs self report they're +x% more productive with AI, but actually they're -y% LESS efficient!".
Are you thinking of the old METR evals? Their more recent evals showed an actual performance improvement.
The old report is still circulated as bait for AI skeptics.
> so what may be happening is that bosses see that output is at 80% (productivity down!) but workers see that they can give that 80% output with 40% effort (productivity up!).
So why is it that the bosses are the ones that are so enthusiastic about adoption?
My challenge has been trying to manage my higher-level context. I've gotten a pretty good setup where I have project-level orchestrator agents that can spin up workers to implement tasks with minimal oversight, and the resulting work is usually quite good (especially after I give it the mandatory "make the comments less verbose" refining, etc.). But that means I'm doing even more context-switching. I've gotten to the point where I have a half-dozen draft PRs that just need my review before I tag my colleagues, and trying to dig up the context from all of those tasks can be paralyzing.
My favourite personal experience is how they disabled yolo mode in Claude Code at my workplace
This kind of reminds me of an article that I saw on HN ages back, there's like a subset of office workers who automated their Excel jobs, and just show up to work, read books, and do literally anything, while Excel does their work for them, and they collect their paycheck.
Bot-sitting is the new long compilation times.
I just started using Claude Code for my work as a sysadmin. For my work, it's great. I don't need to wrestle with MySQL joins, Claude gets even the most complex ones right WAY faster than I would. Same with new Terraform stuff. Things that would have taken me a day are cut to less than an hour.
So for my work, it's made me much better at my job. Much faster and more accurate.
I don’t know.
I can write a simple query before Claude finishes reading, querying the semantic layer, checking my files, then writes a query that I have to approve, reads the results, hides them (ctrl+o usually works), and gives me a summary.
We’ve reached this inflection point where it’s faster for me to do most tasks again.
I’m sure fast mode costing more money plays a role.
It is surprising! I would have thought it is at least 6 hours per day.
For me, AI can sometimes create a false sense of productivity. It's similar to how in the past, people would spend time creating the perfect setup with Notion templates, pomodoro timers and productivity tools, or tweaking their environment for maximum productivity, instead of actually doing productive work.
But now it's happening at the company level: "We're going to add a chatbot to increase productivity! Now MCP tools! Then agentic workflows! We’ll add skills, and now productivity will go up! Maybe loops will do it?"
I don't see a lot of talk about how AI development breaks the old feedback loop of write code, watch it run, change it, repeat. I really hate sitting around waiting for the agent to get done planning, reading the plan, then waiting for the agent to get done coding. It's those 5-10 minute windows when it's working that really sap my patience and suck all the fun out of our jobs. Writing code by hand is just more fun.
> Writing code by hand is just more fun.
This is something that I don't see discussed a lot in these conversations, but it's true for a ton of folks.
I didn't end up with a career in tech because I wanted to tell a bot to do the fun part of my job for me, leaving me only with the boring tedious parts. I didn't sign up to be a full time code reviewer, and I certainly never wanted to be a manager, let alone a manager of bots.
It also can't help but spark feelings of "Why am I getting paid 6 figures for this??" and that makes me nervous for the future.
I imagine the engineers and assemblers in factories pre-assembly line felt the same when things started getting automated there. There's an element of craftsmanship that gets taken away as the product moves from being artisanal, hand crafted to mass produced.
I wonder if it's too late for me to pivot to hardware.
Yeah it's hard to deny just the raw throughput from the AI. Like it really is doing work in hours that would take me days.
But those times when I had to drop down into a repl and play around with the output of a method. Or try different ways of doing what anyone else would think is boring, like array manipulation - that's a lot of what I actually LIKE to do.
A big part of me just hopes I can hang in there for another... decade, or two. Then I can retire! Maybe.
You can still write code by hand. Just do that, you will run into tasks that are too boring, those you can do with an LLM.
I don't mind the workflow since I'll spawn new agent sessions in new terminal tabs until my attention is saturated by round-robin'ing through them.
It's actually kinda pleasant, especially when I consider all the tickets I'm not excited about doing. It's prob worth focusing on that aspect of it.
I don't know what they're complaining about. AI has freed us from the drudgery of craftsmanship, letting us focus on the important stuff—managerial and administrative work!
(There's a reason why I call it the MBA's stone. It transmutes all knowledge work into a problem of management.)
Understanding what is going on with AI productivity is … frustrating to say the least.
The best I can say is that genAI is a self-reported 20% efficiency boost, and for a very (very) small group of people, it’s maybe a 2-3x boost. (And if you are at a frontier lab, you go into the big bucket of exceptions.)
At this point, for most use cases, AI productivity is either the equivalent of giving people 3D printers, and seeing little benefit, or signing up for an outsourcing service, just without the development of human capital anywhere.
I think it depends on how you measure the boost. If you are talking about generating a first draft then yes, the boost is there. If you’re talking about completing the project in all well tested and architected aspects, then overall there really isn’t a boost.
6 hours of debugging and docs reading is not equal to 6 hours of prompt fiddling. The return of value beyond the few fixes applied will be almost nil from the fiddling.
And if management decides we don't need those 6 hours of human work, will everyone still be complaining?
Isn’t this just the new type of work? Human in the loop of automated processes?
Welcome to the factory!
Like Chaplin in Modern Times, we will tighten screws until we lose our minds.
Yeah, Amazon warehouses are just the same. Humans are only used for tasks beyond the comprehension or physical ability of a machine at that point in time.
The problem is, we haven't had the debate on a societal level about whether we want to go the Star Trek route (aka, we give our darn best to automate everything so that humans have the time to do whatever they want) or the realcommunism route (we ward off automation so that we have jobs for people).
The result of that debate not having been had is the third possible outcome - rabid capitalism automates everything as soon as it is profitable and lays off the humans, focusing on getting higher margins out of fewer people if need be; the best example of that IMHO is Disneyland or Vegas going on ridiculous nickel-and-diming tours. In the end, however, there will be no one left who has employment and we'll be in for quite the riots.
It takes years to adapt fully to new tools, and it takes years for the toolmakers to figure out what the tools need to do
This is all normal. It’s also well worth the time spent learning
'Botsitting' -- that word is going into my 2026 lexicon! :-)
“the incredible ground-level utility that many of us on HN celebrate every day through undeniable, massive productivity gains”
I’ve been told before.
I'm yet to be invited to the celebrations.
Just 6 hours, lol!
I could care less about bot sitting (haven’t we always written our own automation?), but it’s botsitting the unverified slop that people send you that fuels frustration. I thought I worked with competent people who respected me
Our product lead/manager recently sent me an AI generated PRD (complete with a Claude Code spec!) to build a core feature which we have had for over 2 years (and is a highly used core feature by our customers).
I just can't imagine tanking my trust with my coworkers by doing something like that.
Maybe this is the AI layoff wave we'll see. Sorting out incompetent team members.
the ones who spend all day telling the bosses how great AI is?
So we're now in this world where everyone is instantly 10x more productive at turning their thoughts into code. Now, think about the coworkers you've had that are middling to mediocre. Do you want them to have a tool that makes them 10x more productive?
That's what I wonder about, what happens to all those folks.
It's not a lack of respect for you; it's a lack of respect for the work itself. That lack is being rewarded and encouraged.
Managers will be sure to tell you how much they respect you. Ask them if they respect the work and you'll get a blank stare.
Your coworkers haven't changed. What changed is that people can hand off work they never had to think through themselves. So you don't know what they checked and you don't know what you need to. You just have to read the whole thing.
*couldn’t care less