Astro - Hacker News

81 comments

CompoundEyes 4 minutes ago

At work we have unlimited use of models from Anthropic and OpenAI (for now). My coworker, a Claude Code Opus 4.6 diehard, stopped by my desk today to say he finally installed Codex to try 5.5, and his feedback was basically “it just works and does what I ask and it doesn’t disconnect and it’s just so very matter of fact.” “Yeah, I’ve been telling you this since like gpt-5, man!” “I know, I know…” I have not spent much time with the recent Sonnet and Opus models, but last summer I used Sonnet 4 all day, every day for 3 months (no handwritten code) to build a large Playwright suite, and my experience was that using Claude Code and those models becomes more about using Claude Code than doing things with it. Codex CLI with the gpt-5 family is ambient and reliable. It’s not orange, and there is no little sprite guy, no emojis, whimsy, or humor. But I do things with it and they land working on the first edit. I can also keep the same session for days, and the context never seems to be an issue. Maybe Claude 6 will be earth-shattering and I’ll use that. It’s not Coke or Pepsi loyalty; I just want to get stuff done.
aliljet an hour ago

This is a tough moment. Claude is simultaneously becoming substantially more expensive, substantially less reliable (single 9 of reliability), and substantially less performant. It's really hard to justify the cost of a subscription over there right now.
- giancarlostoro an hour ago
  
  There was another thread where some people pointed out that Amazon will give you access to Claude with better uptime for the same price (per million tokens up / down). The downside is that it does not have the native ability to browse the web, but maybe that's a hidden blessing, since it's less likely to read some random website that has prompt injection embedded in it.
  For coding it's fine. I haven't experimented too much with Amazon Bedrock myself, but I just might soon, to check for any limitations.
  - 2001zhaozhao 23 minutes ago
    
    Maybe the best play is to set up a routing system locally so that when claude.ai is down it automatically switches to Amazon billing and switches back when it comes back up
  - Atotalnoob 17 minutes ago
    
    I’m pretty sure it has the ability to browse the web.
    It can use playwright, web fetch, etc…
    I use bedrock at work and Claude subscription at home. They are pretty much exactly the same in my experience
    Or do you mean the Claude in chrome plugin? Bedrock doesn’t have that, but in my experience it doesn’t work that well.
    Neither does the Claude managed agents or ultra plan.
  - willsmith72 an hour ago
    
    But that's just paying per use right, not with the subscription which is way better value
- edmundsauto an hour ago
  
  YMMV. I would still be very happy with Claude if it hard failed on 20% of tasks. You can always come back to it.
  I say this as someone working for a tech company who does not have to foot the bill (in the >$1k per month bracket)
  I also experienced and accept the 1990s levels of unreliability, which is my “internet generation”. My first access was lifting a handset and placing it on a speaker/mic cradle.
  Programmers these days are fucking spoiled. If it’s $220 worth of value for $200 - I get it. But I’m getting $100k of value for $10k and so I’ll put up with some shit.
  - willsmith72 an hour ago
    
    > If it’s $220 worth of value for $200 - I get it.
    Wrong comparison. If a competitor gives you $230 of value for $200, of course you shouldn't pick the $220 one
  - bbeonx an hour ago
    
    lol i love this post. not 100% sure i agree, but not 100% i don't. but great post, 10/10
  - wahnfrieden an hour ago
    
    or just use codex...
- datadrivenangel an hour ago
  
  From an economics perspective, it makes sense to make it more expensive if you're having trouble keeping up with demand for a service. It'll be tough getting used to because it was so nice and cheap
  - a_victorp an hour ago
    
    On the other hand, it was somewhat expected that we would have a correction for the prices. Hopefully after this correction things will be more stable and we won't have to worry too much about future price increases
- smugma an hour ago
  
  We used to describe our startup as having 5 8’s of uptime
- pkulak an hour ago
  
  Not to mention substantially less open. I've been using an OpenAI subscription in Pi Agent for a couple weeks now and it's great. And from what I can tell, 5.5 is a heck of a model.
- chillfox an hour ago
  
  Interestingly, yeah, I can see that this would really cut into your subscription usage with the 5 hour rate limit windows...
  I am an API user, and while it being down is super annoying, it isn't really as big of a hit to my overall usage as I can just prepare a bunch of stuff to run in parallel when it does come back up.
- Avicebron an hour ago
  
  I'm either extremely lucky or Dario ran the direct fiber to my house because I have never had it go down in any meaningful way..
  Is this just the API and I'm too much of a luddite to actually use the API?
  - 2ndorderthought an hour ago
    
    Dude dario definitely ran the fiber straight to your place personally. Everything is fine and this is such a good thing.
    
    - Avicebron 44 minutes ago
      
      brother man, if everyone thinks claude sucks, then people won't be on claude. QED. It will work.
- elfly an hour ago
  
  Don't say single nine, it sounds ugly and bad.
  Say five eights of reliability. Maybe six.
  - inetknght an hour ago
    
    We're talking about Claude, not GitHub...
- OccamsMirror an hour ago
  
  Plus, they've dumbed down their models to the point where the value just isn't there like it was. If I have to go in and clean up after it, or constantly wrestle with it through prompts, what's the point? Just spending $200 a month to be frustrated at a machine.
- Frannky an hour ago
  
  It's lazy, does not take ownership and responsibility, wants to defer work, and I have to force it to check reality. It likes to guess and assume it's correct and I am wrong. Agents.md is not helping at all. It's in full enshittification phase, yay!
- 2ndorderthought an hour ago
  
  Single nine has good vibes bro. It means when the service is up the results are better. I read about it in a blog. The model hallucinates way less. Even less than grok
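A minimal sketch of the local routing idea suggested earlier in this thread (fall back to Amazon Bedrock billing when claude.ai is down, switch back when it recovers). The `Provider` type, the `is_up` probes, and the provider names are hypothetical placeholders, not real Anthropic or AWS APIs; a real setup would need actual health checks and credentials.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Provider:
    name: str                   # hypothetical label, e.g. "claude-subscription"
    is_up: Callable[[], bool]   # hypothetical health probe supplied by the caller


def choose_provider(providers: Sequence[Provider]) -> Provider:
    """Return the first provider, in priority order, whose probe succeeds."""
    for provider in providers:
        try:
            if provider.is_up():
                return provider
        except Exception:
            # A probe that raises is treated the same as "down".
            continue
    raise RuntimeError("no provider available")
```

Because the probes run on every call, the router returns to the preferred provider as soon as its probe succeeds again, with no separate "switch back" step.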
jumploops an hour ago

Between GitHub and Claude, it seems Eternal December[0][1] is upon us.
[0]I say December, because that's around the time the models got good enough that non-AI folks started to notice.
[1]https://en.wikipedia.org/wiki/Eternal_September
- dmix an hour ago
  
  GitHub is a long-running business with a mature software stack, running into scaling issues while they move to Azure and become Microsoft-ified. Claude is a new company in a new market with an extremely fast-growing userbase, running relatively novel AI infrastructure with a business model they are still figuring out.
  I don't really blame Anthropic here.
  - jumploops 16 minutes ago
    
    Not trying to argue with you, GitHub (the core product) seems to have been in maintenance mode since the acquisition.
    I couldn’t find any public data on GitHub, but Google Trends shows a sharp increase starting in December.
    That could be in part due to people complaining about the outages, but more people than ever are writing code with AI.
    Hence the parallel to Eternal September – code volume is up, quality is down, and programming is never going to return to how it was (difficult for “normal” people to interface with).
- zackify an hour ago
  
  I use the openai team plan whenever it's down, because it's down so much lol
ossa-ma an hour ago

I built a hangout space to chill out in and chat to others while Claude is down (which is happening wayyy more often): https://clawdpenguin.com
There's a live Claude status board in the corner so you know when it's time to get back to work.
- thisisauserid 20 minutes ago
  
  Because you can't work without it?
  Yikes.
sroussey an hour ago

Oh great. My Max account has been borked for days, and now they will never get to it with everything else burning down.
https://github.com/anthropics/claude-code/issues/54497
philipbjorge an hour ago

So happy to have diversified my model providers this past couple of weeks. GPT-5.5 has had no trouble slotting into Opus workloads. Will be fun to try out more of the models as time goes on to build some resiliency into my engineering workflows :).
- fooster an hour ago
  
  I found GPT 5.4 terrible. I just tested 5.5, and compared with opus it's still not great.
  - philipbjorge an hour ago
    
    What I found was that I *strongly* preferred Claude Code with its defaults. Codex was almost unusable to me -- It would spit out a 4-5 page plan where it kept repeating itself, where Claude would give me a crisp 1-2 pager I could actually review.
    *But* I don't work with the defaults -- I work with my own prompt framework based off of superpowers.
    Given sufficient prompt scaffolding, I've found the models relatively interchangeable -- _I might_ be getting some of this for free by basing my own system off of superpowers which is used across various harnesses -- In other words achieving this kind of portability may be a lot harder than it looks and I'm benefiting from other people's work.
  - wahnfrieden an hour ago
    
    In what harness?
hendler an hour ago

With the TPU deal with Google and their relationship with Amazon, they will have access to compute coming online.
I worked with 4.6 and found some improvements for better planning and sustained use, but I agree with some posters that 4.7 is slower and overthinks.
What I expect is for frontier models to get bigger and more expensive (especially fast mode like on Cerebras), and for most of us to get much smaller distillations in the more generous subscription tiers.
bottlepalm an hour ago

Anybody else double fist Codex/Claude? They both code, solve problems, and find bugs in unique ways. I find using both is more useful than using either alone. I have them code review each other's work, it's great.
minimaxir 2 hours ago

Odd time for Claude to go down since it's not peak work hours.
- neuronexmachina an hour ago
  
  Maybe they target certain types of infra rollouts for non-peak hours?
- AyyEye 13 minutes ago
  
  Almost like "Claude is only down because it's too busy" is cope and the reality is it's vibe coded trash.
- rvz an hour ago
  
  "But humans do it too as they just cool off and check out for the rest of the day."
  It's fine for Claude to be unavailable when there is no work at these hours. However, the problem is Claude gave no notice.
  At this rate, Claude being unavailable every day is no better than a human on a 9 - 5 working day job.
  - yakbarber an hour ago
    
    no it's not, it's always work hours somewhere
ahmadyan an hour ago

They are about to lose the second 9
99.02 % uptime
- datadrivenangel an hour ago
  
  98.68
  Ouch.
  - llbbdd an hour ago
    
    Seems like a healthy human temperature to me. Maybe AGI has finally arrived.
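For reference, the "nines" arithmetic behind the figures above can be sketched as follows. The count of nines is taken here as -log10 of the unavailable fraction; the 30-day window used for the downtime estimate is an assumption, since the thread does not say what period the status page figures cover.

```python
import math


def nines(uptime_percent: float) -> float:
    """Number of nines of availability, as -log10 of the unavailable fraction."""
    unavailable = 1 - uptime_percent / 100
    return -math.log10(unavailable)


def downtime_hours(uptime_percent: float, period_hours: float = 24 * 30) -> float:
    """Expected downtime in hours over a period (default: an assumed 30 days)."""
    return (1 - uptime_percent / 100) * period_hours


for pct in (99.02, 98.68):
    print(f"{pct}% uptime: {nines(pct):.2f} nines, "
          f"{downtime_hours(pct):.1f} h down per 30 days")
```

Under these assumptions, 99.02% sits just above two nines and 98.68% falls below it, which is the "losing the second 9" point made above.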
boldi 2 hours ago

Yup, major outage on all platforms.
https://status.claude.com/
rishabhaiover an hour ago

This is insane. I have to move to Codex now.
furyofantares an hour ago

I don't really mind hopping between claude/codex/glm/kimi, except I don't know a good way to resume a session across agent harnesses.
Normally I'd just have it write out what it's doing to a file, if I need to transfer context, but if it goes down mid-session that's a no-go.
I think people have built tools for this, and of course you could reasonably vibe one yourself, but I don't really trust something like that to work reliably or in an ongoing manner.
Maybe it should just be a skill.
- alasano an hour ago
  
  switch to Pi.dev or any other multi model harness, you can switch between models every message if you feel like it.
  Anthropic have blocked usage of your subscription however with third party harnesses.
  - furyofantares 38 minutes ago
    
    > Anthropic have blocked usage of your subscription however with third party harnesses.
    This is the main reason I use different harnesses, but I also expect (could be wrong) codex is better with the codex harness (due to training on its specific tools) than with other harnesses. I use opencode for everything that's not claude/codex.
- philipbjorge an hour ago
  
  You might search for a concept like `/handoff` that's in ampcode. I'm sure someone's built a skill for just this.
  - OccamsMirror an hour ago
    
    That's not going to work if the service is down, however.
    
    - philipbjorge an hour ago
      
      Ahh good point -- I've handled this by switching my harness to `pi` but recognize that may not be for everyone and doesn't directly address OP's question.
- serf an hour ago
  
  self-hosted honcho (or other memory systems) and an API-agnostic harness gets you most of the way there.
- datadrivenangel an hour ago
  
  kilocode allows you to switch between models mid session!
  - furyofantares an hour ago
    
    Yeah - it's switching agents (harnesses) that's the hard part!
avaer an hour ago

The models are already commoditized; if this affects you, you should probably fix your stack.
Still, it's pretty crazy that Claude is down to 1 nine.
- 2ndorderthought an hour ago
  
  Every day I am so much happier that I decided to go fully local for my needs.
galoisscobi an hour ago

But Boris declared coding is solved. How is this possible? Can’t they prompt Mythos to give them better uptime?
- rvz an hour ago
  
  When Claude is making "0 mistakes", all of his work is 100% done by Claude, therefore "coding is solved!" and we have more time to go on podcasts to tell everyone about it.
  However, when there is an incident it is immediately "human error", not Claude.
  > Can’t they prompt Mythos to give them better uptime?
  Anthropic is currently "vibe coding" the situation right now.
Sabinus an hour ago

I'm feeling a bit sorry for Anthropic. This last month must be very tough on them.
- throwatdem12311 an hour ago
  
  Oh no won’t someone please think about the trillion dollar corporation!
  - kaycey2022 an hour ago
    
    Aren't they just on the hook for a trillion dollars?
rvz an hour ago

So Claude decided to take the rest of the day off without notice or giving a scheduled time off?
Many such cases with humans (given that we continue to compare LLMs to humans these days which you cannot)
- andai an hour ago
  
  Introducing: Claude AWOL
lovvtide an hour ago

Working for me now
maplethorpe an hour ago

Humans are unavailable from time to time also.
winfredJa an hour ago

How can someone run a business on top of their APIs?
blurbleblurble an hour ago

More fuel to get 27b running
o10449366 an hour ago

I've been on the $200 plan for 3 months, but this will be my last month. I got great use out of 4.5 for a while, but 4.6 felt like a half step back (conflated with all the random hidden config changes during it), and 4.7 is genuinely terrible.
It's impossible to tell these days whether 4.7 is stuck because it's thinking and Anthropic suppressed all output (seriously, 4.7 will just start making changes without explaining any reasoning - how is that an upgrade?) or because the underlying infrastructure is having issues.
4.5 -> 4.7 feels like going from working with a coach-able, junior engineer that does well with clear guidance to working with a cocky mid-level that will spend too long on pointless tangents and make confidently incorrect changes without any discussion.
- chillfox an hour ago
  
  4.6 has been excellent when used through the API. But I am right with you on 4.7, so I have been sticking to 4.6.
johndory80 2 hours ago

Code isn’t working on my app. Chats work fine.
- boldi an hour ago
  
  I can't even sign in, however.
  - 2ndorderthought an hour ago
    
    Is signing in actually necessary though? Think of the investors guys.
mrcwinn an hour ago

For what it’s worth, I moved to Codex GPT-5.5 Xhigh Fast in the desktop app and it’s been fantastic.
- ttul an hour ago
  
  I've been a Codex devotee since around last August. I don't know why everyone is so bonkers about Claude Code. It's not the only belle at the ball. Codex is rock solid.
- gardnr an hour ago
  
  Here is one source that agrees: https://artificialanalysis.ai/models/capabilities/coding
6Az4Mj4D 2 hours ago

Hoping to get another reset of limits.
neoecos an hour ago

Now... I have to cook dinner....
- amarant an hour ago
  
  Hurry! It'll be back up any minute now!
peebee67 an hour ago

I personally broke it by admonishing it for fucking up its last revision to my project.
emersoftware 2 hours ago

wtf, every day the same shit