Astro - Hacker News

46 comments

t43562 2 hours ago

I've always wondered at the motivations of the various string routines in C - every one of them seems to have some huge caveat which makes them useless.
After years I now think it's essential to have a library which records at least how much memory is allocated to a string along with the pointer.
Something like this: https://github.com/msteinert/bstring
- lesuorac an hour ago
  
  It's from a time before computer viruses, no?
  But also all of this book-keeping takes up extra time and space which is a trade-off easily made nowadays.
  - rini17 36 minutes ago
    
    Yes, in the old times if you crashed a program or whole computer with invalid input, it was your fault.
    Viruses did exist, and these were considered users' fault too.
- formerly_proven an hour ago
  
  strncpy is fairly easy, that's a special-purpose function for copying a C string into a fixed-width string, like typically used in old C applications for on-disk formats. E.g. you might have a char username[20] field which can contain up to 20 characters, with unused characters filled with NULs. That's what strncpy is for. The destination argument should always be a fixed-size char array.
  A couple years ago we got a new manual page courtesy of Alejandro Colomar just about this: https://man.archlinux.org/man/string_copying.7.en
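  A minimal sketch of the fixed-width use case described above, assuming a hypothetical `struct record` and helper `set_username` (neither comes from any real code base). It relies only on documented `strncpy` behaviour: short names are padded with NULs up to the field width, and a name of 20 or more characters fills the field with no terminator at all.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical on-disk record with a fixed-width, NUL-padded name field. */
struct record {
    char username[20];
};

/* Fill the field. strncpy writes exactly sizeof r->username bytes:
   the name, then NUL padding. If the name has 20 or more characters,
   only the first 20 are stored and no terminator is written. */
void set_username(struct record *r, const char *name)
{
    assert(r != NULL && name != NULL);
    strncpy(r->username, name, sizeof r->username);
}
```

  The field is a fixed-width byte array, not a C string, so it should not be handed to functions such as `strlen` or `printf("%s")`.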
  - dundarious 16 minutes ago
    
    Yes, these were also common in several wire formats I had to use for market data/entry.
    You would think char symbol[20] would be inefficient for such performance sensitive software, but for the vast majority of exchanges, their technical competencies were not there to properly replace these readable symbol/IDs with a compact/opaque integer ID like a u32. Several exchanges tried and they had numerous issues with IDs not being "properly" unique across symbol types, or time (intra-day or shortly before the open restarts were a common nightmare), etc. A char symbol[20] and strncpy was a dream by comparison.
  - Cyph0n an hour ago
    
    strncpy doesn’t handle overlapping buffers (undefined behavior). Better to use strncpy_s (if you can) as it is safer overall. See: https://en.cppreference.com/w/c/string/byte/strncpy.html.
    As an aside, this is part of the reason why there are so many C successor languages: you can end up with undefined behavior if you don’t always carefully read the docs.
    
    
    Asooka 36 minutes ago
    
    Back when strncpy was written there was no undefined behaviour (as the compiler interprets it today). The result would depend on the implementation and might differ between invocations, but it was never the "this will not happen" footgun of today. The modern interpretation of undefined behaviour in C is a big blemish on the otherwise excellent standards committee, committed (hah) in the name of extremely dubious performance claims. If "undefined" meaning "left to the implementation" was good enough when CPU frequency was measured in MHz and nobody had more than one, surely it is good enough today too.
    Also I'm not sure what you mean with C successor languages not having undefined behaviour, as both Rust and Zig inherit it wholesale from LLVM. At least last I checked that was the case, correct me if I am wrong. Go, Java and C# all have sane behaviour, but those are much higher level.
  - dingi 5 minutes ago
    
    Isn't strlcpy the safer solution these days?
  - ufo an hour ago
    
    A big footgun with strncpy is that the output string may not be null terminated.
    
    
    kccqzy an hour ago
    
    Yeah but fixed width strings don’t need null termination. You know exactly how long the string is. No need to find that null byte.
    
    
    ninkendo an hour ago
    
    Until you pass them as a `char *` by accident and it eventually makes its way to some code that does expect null termination.
    There’s languages where you can be quite confident your string will never need null termination… but C is not one of them.
    
    
    kccqzy 5 minutes ago
    
    You don’t do that by accident. Fixed-width strings are thoroughly outdated and unusual. Your mental model of them is very different from regular C strings.
    
    Sharlin an hour ago
    
    Good luck though remembering not to pass one to any function that does expect to find a null terminator.
    
    
    kevin_thibedeau 32 minutes ago
    
    Ignore the prefix and always treat strncpy() as a special binary data operation for an era where shaving bytes on storage was important. It's for copying into a struct with array fields or direct to an encoded block of memory. In that context you will never be dependent on the presence of NUL. The only safe usage with strings is to check for NUL on every use or wrap it. At that point you may as well switch to a new function with better semantics.
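    One way the "wrap it" option might look, as a sketch only: a hypothetical helper `field_to_cstr` that turns a fixed-width field, which may have no terminator, into a regular C string. The name and the 0 / -1 return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Copy a fixed-width field (possibly without a terminator) into dst as a
   NUL-terminated C string. Returns 0 on success, or -1 if dst cannot hold
   the field contents plus the terminator; dst is not modified on failure. */
int field_to_cstr(char *dst, size_t dsize, const char *field, size_t width)
{
    assert(dst != NULL && field != NULL);

    /* The logical length ends at the first NUL, or at the field width. */
    const char *nul = memchr(field, '\0', width);
    size_t len = nul ? (size_t)(nul - field) : width;

    if (len >= dsize)
        return -1;

    memcpy(dst, field, len);
    dst[len] = '\0';
    return 0;
}
```

    With a wrapper like this at the boundary, the rest of the code only ever sees terminated strings.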
    
    andrepd an hour ago
    
    Seriously. We have type systems and compilers that help us to not forget these things. It's not the 70s anymore!
zahlman 5 minutes ago

> To make sure that the size checks cannot be separated from the copy itself we introduced a string copy replacement function the other day that takes the target buffer, target size, source buffer and source string length as arguments and only if the copy can be made and the null terminator also fits there, the operation is done.
... And if the copy can't be made, apparently the destination is truncated as long as there's space (i.e., a null terminator is written at element 0). And it returns void.
I'm really not sold on that being the best way to handle the case where copying is impossible. I'd think that's an error case that should be signaled with a non-zero return, leaving the destination buffer alone. Sure, that's not supposed to happen (hence the DEBUGASSERT macro), but still. It might even be easier to design around that possibility rather than making it the caller's responsibility to check first.
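For illustration, a sketch of the alternative suggested here: same four arguments as described in the quote, but with a non-zero return on failure and the destination left alone. `copy_or_fail` is a hypothetical name and this is not curl's `curlx_strcopy`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant: copy slen bytes of src plus a terminator into dest.
   Returns 0 on success. Returns -1 and leaves dest untouched if the string
   and its terminator do not fit in dsize bytes. */
int copy_or_fail(char *dest, size_t dsize, const char *src, size_t slen)
{
    assert(dest != NULL && src != NULL);

    if (slen >= dsize)
        return -1;

    memcpy(dest, src, slen);
    dest[slen] = '\0';
    return 0;
}
```

The trade-off is that every caller now has a return value it can forget to check, which may be part of why a void function plus an assertion was preferred.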
swinglock 2 hours ago

I'm surprised curlx_strcopy doesn't return success. Sure you could check if dest[0] != '\0' if you care to, but that's not only clumsy to write but also error-prone, and so checking for success is not encouraged.
- jutter an hour ago
  
  This is especially bizarre given that he explains above that "it is rare that copying a partial string is the right choice" and that the previous solution returned an error...
  So now it silently fails and sets dest to an empty string without even partially copying anything!?
- AlexeyBrin 2 hours ago
  I guess the idea is that if the code does not crash at this line:
```
    DEBUGASSERT(slen < dsize);
```
  it means it succeeded. Although some compilers will remove the assertions in release builds.
  I would have preferred an explicit error code though.
  - swinglock 17 minutes ago
    
    assert() is always only compiled if NDEBUG is not defined. I hope DEBUGASSERT is just that too because it really sounds like it, even more so than assert does.
    But regardless of whether the assert is compiled or not, its presence strongly signals that "in a C program strcpy should only be used when we have full control of both" is true for this new function as well.
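    A small sketch of the standard `assert()` behaviour mentioned above. It says nothing about how curl defines DEBUGASSERT; the function names are made up for the demonstration. With NDEBUG defined when `<assert.h>` is included, `assert(expr)` expands to `((void)0)` and `expr` is never evaluated. Including the header again re-reads the current NDEBUG state.

```c
static int evaluations = 0;

/* Side effect lets us observe whether the assert expression ran. */
int count_and_pass(void)
{
    evaluations++;
    return 1;
}

#define NDEBUG            /* what a release build usually passes as -DNDEBUG */
#include <assert.h>
void release_style(void)
{
    assert(count_and_pass());   /* compiled out: expression not evaluated */
}

#undef NDEBUG
#include <assert.h>       /* re-including picks up the new NDEBUG state */
void debug_style(void)
{
    assert(count_and_pass());   /* active: expression evaluated and checked */
}
```

    So any size check that lives only inside an assertion disappears in an NDEBUG build, and the caller remains responsible for the precondition.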
Scubabear68 2 hours ago

From the article:
> It has been proven numerous times already that strcpy in source code is like a honey pot for generating hallucinated vulnerability claims
This closing thought in the article really stood out to me. Why even bother to run AI checking on C code if the AI flags strcpy() as a problem without caveat?
- CGamesPlay 2 hours ago
  
  It's not quite as black and white as the article implies. The hallucinated vulnerability reports don't flag it "without caveat", they invent a convoluted proof of vulnerability with a logical error somewhere along the way, and then this is what gets submitted as the vulnerability report. That's why it's so agitating for the maintainers: it requires reading a "proof" and finding the contradiction.
- Sharlin an hour ago
  
  Because these people who run AI checks on OSS code and submit bogus bug reports either assume that AIs don't make mistakes, or just don't care if the report is legit or not, because there's little to no personal cost to them even if it isn't.
- saagarjha 2 hours ago
  
  Because people are stupid and use AI for things it is not good at.
  - Tempest1981 2 hours ago
    
    > people are stupid
    people overestimate AI
    
    
    lesuorac an hour ago
    
    It's weird though, because looking through the HackerOne reports in the slop wiki page there aren't actually reproduction steps. It's basically always just a line of code and an explanation of how a function can be misused but not a "make a webserver that has this hardcoded response".
    So like why doesn't the person iterate with the AI until they understand the bug (and then ultimately discover it doesn't exist)? Like have any of these bug reports actually paid out? It seems like people should quickly give up from a lack of rewards.
    
    
    amenhotep 39 minutes ago
    
    As long as the number of people newly being convinced that AI generated bounty demands are a good way to make money equals or exceeds the number of people realising it isn't and giving up, the problem remains.
    Not helped, I imagine, that once you realise it doesn't work, an easy pivot is to start convincing new people that it'll work if they pay you money for a course on it.
pama 3 hours ago

Congrats on the completion of this effort! C/C++ can be memory safe but it takes some effort.
IMHO the timeline figure could benefit from larger fonts on mobile. Most plotting libraries have horrible font size defaults. I wonder why no library picked the other extreme end: I have never seen too large an axis label yet.
- Tempest1981 2 hours ago
  
  Yes, the graph font-sizes seem intended for printing them on a single sheet of paper, vs squeezed into a single column in a blog.
- saagarjha 2 hours ago
  
  Removing strcpy from your code does not make it memory safe.
  - kjjfnkeknrn 35 minutes ago
    
    Removing strcpy from your code does make it a little memory safer.
loeg 2 hours ago

A weird Annex-K like API. The destination buffer size includes space for the trailing nul, but the source size only includes non-nul string bytes.
I don't really think this adds anything over forcing callers to use memcpy directly, instead of strcpy.
stabbles 2 hours ago

Apart from Daniel Stenberg's frequent complaints about AI slop, he also writes [1]
> A new breed of AI-powered high quality code analyzers, primarily ZeroPath and Aisle Research, started pouring in bug reports to us with potential defects. We have fixed several hundred bugs as a direct result of those reports – so far.
[1] https://daniel.haxx.se/blog/2025/12/23/a-curl-2025-review/
- molf 2 hours ago
  
  That's very interesting! It links to:
  https://daniel.haxx.se/blog/2025/10/10/a-new-breed-of-analyz...
  and its HN discussion:
  https://news.ycombinator.com/item?id=45449348
- p2detar 2 hours ago
  
  So? Those are automated analysis tools and by "slop" he seems to refer to careless reports crafted using AI, solely for collecting bounties:
  https://gist.github.com/bagder/07f7581f6e3d78ef37dfbfc81fd1d...
snvzz 2 hours ago

The AI chatbot vulnerability reports part sure is sad to read.
Why is this even a thing and isn't opt-in?
I dread the idea of starting to get notifications from them in my own projects.
- trollbridge 2 hours ago
  Making a strcpy honeypot doesn’t sound like a bad idea…
```
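  #include <stdlib.h>
  #include <string.h>

  /* Bait only: every copy below stays in bounds. */
  void nobody_calls_me(const char *stuff) {
          char *a, *b;
          const size_t c = 1024;

          a = calloc(c, 1);  /* calloc takes (count, size); zero-filled */
          if (!a) return;
          b = malloc(c);
          if (!b) {
                  free(a);
                  return;
          }
          strncpy(a, stuff, c - 1);  /* a[c - 1] is still NUL from calloc */
          strcpy(b, a);              /* strlen(a) <= c - 1, so it fits */
          strcpy(a, b);              /* same length copied back */
          free(a);
          free(b);
  }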
```
  Some clever obfuscation would make this even more effective.
- easterncalculus 21 minutes ago
  
  It's a symptom of complete failure of this industry that maintainers are even remotely thinking about, much less implementing changes in their work to stave off harassment over false security impact from bots.
- Y_Y 2 hours ago
  
  Because humans generate and relay the slop-reports in the hopes of being helpful
  - captn3m0 2 hours ago
    
    s/being helpful/making money.
TZubiri an hour ago

LMAO
After all this time the initial AI Slop report was right:
https://hackerone.com/reports/2298307
- lesuorac an hour ago
  
  ?
  Nonce and websockets don't appear at all in the blog post. The only thing the AI slop got right is that by removing strcpy curl will get fewer issues [submitted about it].
senthil_rajasek 3 hours ago

Title is:
No strcpy either
@dang
- Snild 2 hours ago
  
  I don't see a problem with that, but for the record, the title on the site is lower-case for me (both browser tab title, and the header when in reader mode).
  - 1f60c 2 hours ago
    
    I think the submission originally had a typo ("strpy", with no C)
    
    
    Snild an hour ago
    
    Ah.