Astro - Hacker News

36 comments

stinos an hour ago

> Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??
Fair point (although to be honest: 'complexify' feels a bit of an exaggeration here to me), but the answer to this why is simple: document and express intent clearly. The compiler gave you an error first such that you're forced to consider what you're doing. Any seasoned C++ developer seeing this knows what this reinterpret_cast means.
> Wow. With std::span the complexity-meter bumps in the red zone and goes even higher!
Same remark: yes, it's a bit more text to read, but again: to me (and many others I'm guessing) this clearly expresses intent. I also do not find it particularly hard to read. I mean, it's C++, you're likely going to encounter templates at one point or another, except in super specific software perhaps. But no-one also ever argued the C++ learning curve was easy, and trying to make it easier by refusing to use features which were added for good reasons and instead going back to constructs which are the very source of those reasons seems a bit backwards.
> As a nice addition, if you use SAL annotations, the function could be decorated a bit to help code analyzers detecting memory bugs
Some might also say it complexifies and uglifies the code. And in any case makes it non-portable on top of that.
- VulgarExigency 38 minutes ago
  
  It seems unlikely that this is the case, as the author appears to be experienced, but the post reads like the author has never had to maintain a "simple" and "beautiful" function that was mangled into incomprehensibility over the years, and where if a more expressive type signature had been written from the start, it would have restricted the damage caused over time.
- thomasmg 37 minutes ago
  
  I don't have a strong opinion what is better in this case, but my view is:
  > document and express intent clearly
  Arguably, the void* does that as well?
  > Any seasoned C++ developer seeing this knows what this reinterpret_cast means.
  Same for void*?
  > it's a bit more text to read
  If you have to call it many times, this adds up.
  > Some might also say it complexifies and uglifies the code
  I think the point is that it adds security, which the other options don't. And, it doesn't add complexity on the caller, but only at one place: the implementation.
  > makes it non-portable on top of that.
  This can be solved.
- repelsteeltje 40 minutes ago
  
  +1
  And SAL annotations aren't even C++ proper.
gignico 2 hours ago

> It seems that some people are really losing the taste for good readable code.
It seems that some people never had taste for good reliable code. Use `void *` and now any error whatsoever is direct undefined behavior. Moreover `std::span` clearly says that you are *not* taking ownership of the memory (even though the language does not check it of course), while `void *` does not.
I understand that people can have many things to say about C++, and I do as well, but `std::span` should have been there decades ago and is such a life saver in these situations. A truly zero-cost abstraction which effectively saves you from a lot of troubles.
- trumpdong an hour ago
  
  There's lots of UB in C-family execution models. Some of which is not actually UB because the implementation defines it - e.g. aligned DWORD-sized memory access is atomic on Windows because Microsoft said it is.
  By choosing to use this language you choose to navigate the UB. Otherwise you'd be writing in Go, or Python.
  It is possible to write reliable code despite the presence of UB in a language just like it's possible to drive to work every day for 20 years despite most of the directions you can point the car leading to an immediate crash. That's a needle with a much thinner eye than UB in C, and most people manage it. Mainly it means being very careful about lifetime and ownership. The Linux kernel manages it 99% of the time simply by being careful about lifetime and ownership, and that's a project with a huge number of contributors who don't intimately know each other's modules. In the Linux kernel you can't just say "new whatever" - you must have a plan for a lifetime of that whatever, and other people will review it.
  I agree with you about std::span.
  - pjc50 18 minutes ago
    
    > Some of which is not actually UB because the implementation defines it
    No - if something is UB in the spec, it's UB. The implementation will do something, sure, but what it does is not fixed and may even change based on compiler version and optimization level.
    > DWORD-sized memory access is atomic on Windows because Microsoft said it is
    Well, Intel said it is. Mind you I don't think there are any 32-bit native architectures where aligned dword access isn't atomic. Unaligned, on the other hand ...
    
    simiones 6 minutes ago
    
    > No - if something is UB in the spec, it's UB.
    A compiler is still free to ignore the spec and declare that something is not UB. However, this is very much compiler based, not platform based. Windows might guarantee that aligned DWORD-sized memory accesses are atomic, but that doesn't mean Clang when compiling for Windows would respect this - but MSVC might.
  - repelsteeltje 30 minutes ago
    
    There is a difference between UB in C, and something being undefined in some version of Microsoft C on Windows.
    Much of C's UB is specifically, intentionally left undefined in the standard to express that code relying on some specific way it is handled is not proper, portable C. Indeed, the DWORD-sized memory access being atomic doesn't apply to MS Windows prior to version 3.0 running on a 80286.
    It's UB because the ISO C spec says it's UB.
  - arcticbull an hour ago
    
    Yeah but also, quick question:
    struct S { char c; int i; }; struct S a = {0}; struct S b = {0}; memcmp(&a, &b, sizeof(a)) == ...
    If you answered 0, you'd be wrong, the answer is undefined, thanks to padding, initialization and alignment rules. Padding bytes are undefined, and not guaranteed to be initialized to zero even if the variable is declared static (where the members would be zeroed).
    This is why the compiler is angry at the post writer, and why the reinterpret_cast is needed. Ideally if they wanted to do something with the data, they'd unbox the structure.
    That's why it's not a good idea to use void* to pass arbitrary data interchangeable with bytes. It's a location, it makes no representation as to what's there and how to interact with it. Let alone who owns it.
    std::span solves two problems here. One is the ownership problem. The other is that span<T> is a T[]. void* is god only knows.
    The post asserts:
    > The code is very clear and straightforward: you pass a pointer to the custom data structure, and its size in bytes. That’s it. Simple and clear.
    This is unfortunately entirely false in C thanks to the aforementioned alignment/padding UB (and of course inner pointers). This is addressed with std::span. You'd still have to reinterpret_cast your structure to get the UB.
    > Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??
    tl;dr: because it doesn't. It just kinda looks like it does if you squint, and it's going to lead to the gnarliest bugs in the world.
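    A minimal sketch of the padding point above, assuming a typical ABI where `int` is 4-byte aligned. The helper names `PaddingBytes` and `SameValue` are made up for illustration. The struct is larger than the sum of its members, and comparing member by member never reads the padding bytes at all:

```cpp
#include <cassert>
#include <cstddef>

struct S {
    char c;
    int i;
};

// Bytes in S that belong to no member. The count is implementation-defined;
// on common 64-bit ABIs it is 3.
constexpr std::size_t PaddingBytes() {
    return sizeof(S) - (sizeof(char) + sizeof(int));
}

// Compares member values only, so padding bytes are never inspected.
constexpr bool SameValue(const S& a, const S& b) {
    return a.c == b.c && a.i == b.i;
}

// Usage example (hypothetical helper).
inline void ExamplePadding() {
    S a{'x', 7};
    S b{'x', 7};
    assert(SameValue(a, b));  // holds whatever the padding contains
    assert(sizeof(S) >= sizeof(char) + sizeof(int));
}
```

    In C++20, a defaulted `operator==` on the struct would give the same member-wise comparison without the hand-written helper.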
- delta_p_delta_x an hour ago
  
  > A truly zero-cost abstraction
  Sadly the MSVC ABI makes std::span and std::string_view a pessimisation:
  https://github.com/tringi/win64_abi_call_overhead_benchmark
  https://godbolt.org/z/7baaox7re
  - usrnm 17 minutes ago
    
    Sounds like a compiler bug to me. It is a valid reason to avoid them in some rare cases right now, but it doesn't make the feature itself bad
- spacechild1 an hour ago
  
  > but `std::span` should have been there decades ago
  Absolutely! I now use it consistently in all new projects where I can afford to mandate C++20. I guess nobody bothered to make a proposal before...
  - pjmlp 43 minutes ago
    
    They did in C, from one of the language authors even, and it was not accepted.
    https://www.nokia.com/bell-labs/about/dennis-m-ritchie/varar...
    By the way, Extended Pascal, Mesa/Cedar and Modula-2 all have them, under the name of open arrays.
    Basically it took Go, C# and others for C++ to finally get its span.
    C probably never will.
- pjmlp an hour ago
  
  That is quite common among C developer culture, play loose and brace for impact.
- locknitpicker an hour ago
  
  > I understand that people can have many things to say about C++, and I do as well, but `std::span` should have been there decades ago (...)
  Decades is kind of a stretch. C++11 introduced smart pointers, and finally getting C++0x out of the door was already a major victory. Given the history of C++, it would be unrealistic to introduce something like std::span before C++17.
  Meantime, some organizations are still struggling to migrate to something like C++14.
  - pjmlp 37 minutes ago
    
    It could have been there since the beginning, given that open arrays (aka spans) already existed in other languages, and there was even a failed proposal from Dennis Ritchie regarding C.
    The C++ span proposal came from Microsoft,
    https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p01...
delta_p_delta_x an hour ago

The blogger and the blog say:
> BTW: As a nice addition, if you use SAL annotations
> Windows C++ Programming
Not everyone will see the irony, but the Windows user-mode application and library suite and the kernel now very heavily rely on the safety mechanisms of C++ that the author calls 'complex', 'uglif[ied]', and has 'los[t] the taste for good readable code'. I'm of course referring to the Windows Implementation Library: https://github.com/microsoft/wil This is explicitly an effort from MS WinDev to make Windows C++ code safer. User-mode applications writing native Windows code can and absolutely should use it, too.
Any time I see `void*` in C++ I ring-fence it as a C-ism and make sure I `reinterpret_cast`. For me, a bag of bytes is `std::span<std::byte>`. void* is a memory location with no provenance, no ownership, no size information, nothing. Do I even know if it is this program's memory, or some shared memory construct, or maybe even a pointer into GPU memory? No for all.
C likes to play fast and loose and its proponents call it 'beautiful and simple', I call it a segfault/use-after-free/double-free waiting to happen.
- pjmlp 33 minutes ago
  
  It goes even further: their beloved C code is compiled by compilers written in C++, including the standard library, exposing C++ implementations as extern "C" functions.
  It is a pity that Microsoft backtracked on their C support.
  WWDC is happening this week, one set of announcements at State of the Union was how Apple replaced a few C, Objective-C and C++ components, including at OS level with Swift.
  - repelsteeltje 22 minutes ago
    
    Interesting. I know nothing about Apple, but maybe you can explain how idiomatic Swift handles Blobs and how that interfaces with C or C++ around void ptrs, std::spans etc.?
    
    pjmlp 12 minutes ago
    
    Those are unsafe buffers, and have specific primitives to handle them, Swift also has span, and interoperability with Objective-C and C++ code.
voidUpdate 2 hours ago

> "An interesting question you may ask in C++ is: “How would you declare a function that takes a blob of memory as input?”"
> "Now, suppose that you want to pass to this function a custom structure, like this:"
You would create another function that actually works based off that structure, rather than using your first function which operates on a set of bytes in memory. That way it's readable, like they want, and type-safe
- trumpdong an hour ago
  
  I find this to be a snarky non-answer. You really think everyone should write their own memcpy for every POD type they want to memcpy?
  - mfost an hour ago
    
    There's no need: there's std::copy already.
    Or maybe the idea was to create a typesafe template wrapper around the generic function which is also very common and really nice. No need to create one wrapper per type, a single template should work.
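    A rough sketch of what such a single template wrapper could look like (C++17). `SumBytesRaw` is a hypothetical stand-in for the blog post's generic function, and `SumBytes` is an invented name for the wrapper; neither comes from the post:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the generic (pointer, size) function: it sums the
// bytes so the example has something observable to return.
inline unsigned SumBytesRaw(const void* data, std::size_t numBytes) {
    const auto* bytes = static_cast<const unsigned char*>(data);
    unsigned sum = 0;
    for (std::size_t i = 0; i < numBytes; ++i) {
        sum += bytes[i];
    }
    return sum;
}

// One type-safe wrapper for every type: the size is deduced from T, and
// pointers or non-trivially-copyable types are rejected at compile time.
template <typename T>
unsigned SumBytes(const T& object) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "only trivially copyable types may be viewed as raw bytes");
    static_assert(!std::is_pointer_v<T>,
                  "pass the object itself, not a pointer to it");
    return SumBytesRaw(&object, sizeof(T));
}

// Usage example (hypothetical helper).
inline void ExampleWrapper() {
    const unsigned char header[4] = {1, 2, 3, 4};
    assert(SumBytes(header) == 10u);  // the size comes from the array type
}
```

    The caller can no longer pass a mismatched size, while the implementation stays in one non-template function.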
arcadialeak an hour ago

char* is an exception to strict aliasing rules of C++ precisely to facilitate the author's use case. You would still need a reinterpret_cast to make it work, but it's actually good because it makes the intent clearer, and the cast would have still happened either way to read the raw bytes.
- quietbritishjim an hour ago
  That was my first instinct too, but nothing the author said indicates they actually need non-strict aliasing. If the function had been:
```
   void DoSomething(void* src, void* dst, size_t numBytes);
```
  ... then it would be a different matter since maybe you want to allow src and dst to alias. Although, even then, they're still allowed to alias so long as the function accesses them both through char*, so the function signature can still use void*.
  (Going deeper, non-strict aliasing applies to any pointers of the same type passed to a function. So if src and dst were both cast to float* inside the function, and if they really are both of that type (technically "an object of type float exists at the pointed-to location") then they can still alias. The char* exception is the only case that you can access a memory location through two different types of pointer and they can still alias.)
  It's interesting the author mentions uint8_t. It's certainly more explicit than char, but it doesn't have the same aliasing guarantee (very strictly speaking - in practice it's almost always an alias for unsigned char or char, which does).
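  As a small illustration of the point that the signature can keep void* while the body reads bytes through a character type, here is a sketch. `CountZeroBytes` is an invented example function, not something from the post:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The signature keeps void*; the body inspects the bytes through
// unsigned char*, which the aliasing rules allow for any object type.
inline std::size_t CountZeroBytes(const void* data, std::size_t numBytes) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < numBytes; ++i) {
        if (bytes[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}

// Usage example (hypothetical helper).
inline void ExampleAliasing() {
    const std::uint32_t value = 0x00FF00FFu;
    // Two of the four bytes are zero whatever the byte order.
    assert(CountZeroBytes(&value, sizeof value) == 2);
}
```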
  - myrmidon 6 minutes ago
    
    This is actually pretty annoying in embedded programming in C, because you'd often really prefer to use a uint8_t buffer[] for serialization functions (e.g. to write arbitrary data on some bus etc.) over char*, but you'd actually lose the aliasing permissiveness that you need (if you are strictly sticking to the standard-- this is often ignored in practice).
akkaygin 27 minutes ago

> In fact, std::span is a class template, and somebody would suggest to make the function that processes the generic memory blob a function template! Really? Something like this??
Yes.
delegate an hour ago

It depends on what your function does with that memory. If the fn expects any kind of structure at that address, you and your callers are on your own, compiler can't help if the caller passes the wrong thing. Worse, accessing that memory might not immediately crash, but lead to strange side effects in your program.
Dynamic languages can handle this with reflection, but with void* you can only pray nobody makes the mistake..
arka2147483647 an hour ago

The best part of void* is that it is very terse. Both in definitions, and in access.
All cpp alternatives are more wordy.
I wonder how this conversation would go if there was an equally terse, but also typesafe, cpp alternative.
api 12 minutes ago

Real programmers use uintptr_t for pointers.
themafia 2 hours ago

I'm not a fan of C++ precisely because of template noise but what you gain with span, in that the pointer and the length are joined together, seems to outweigh the complaints on style.
Isn't there a way to make this an alias anyways?
squirrellous an hour ago

One could argue the reinterpret_cast makes the intent more explicit which is a good thing.
That said I don’t have much against the use of void* or even char* here. If it works in C, it works in C++ just fine. std::span is not the right tool for this.
adev_ an hour ago

This post is, honestly speaking, a bag of garbage and ill advice:
> Some good old habit from C can still be positively used in C++, like the void* pointer and the size parameters.
That's garbage.
There is a clear interest in passing both size AND pointer in a single parameter like `std::span<std::byte>`: it binds both values together and guarantees that you do not mess up the size of your buffer.
Pass "data" and "size" parameters through a chain of 5 function calls and there is a non-null probability that you passed "other_size" instead of "size" somewhere. This pattern happens everywhere in old C codebase and has been the source of countless security vulnerabilities and random buffer overflows for decades.
All modern languages (including freaking minimalist Golang) have now a "slice/span" concept built in.
It is not just to annoy programmers (and allow them to complain about 'complexity' in blog posts) but because it is a major improvement in terms of memory safety and in terms of reducing user errors.
> It seems that some people are really losing the taste for good readable code.
If 'span<std::byte>' or 'span<char>' are unreadable for you, the problem is not span, the problem is you.
These are concepts that have existed for decades in almost all modern programming languages.
Even in conservative C++, it exists since 2014 in the GSL, in Qt and in boost.
And the interface is no different from vector...no excuse here... It is itself the most basic data-structure in C++.
> Why should people complexify and uglify their C++ code with the uint8_t pointer (or std::byte), when void* works just fine??
Sure. Let's extend the logic: I also propose to replace all typed arguments with a void* pointer.
Because after all: 'It will just work fine', right?
Type-safety and clear interfaces are overrated, we could all use only bytes and remove interfaces altogether to get a closer experience of Fortran 77.
/irony
> Or maybe something even more complicated, like this? > template <typename T, std::size_t N> void DoSomething(std::span<T, N> data)
First, that is nonsense.
If you want to pass a mutable buffer of bytes, the correct signature is:
``void DoSomething(std::span<std::byte> data)``
There is no need for template signature here. You are making things up.
Second, there is also no need for the N parameter
``span<Type,N>`` is only used when enforcing a buffer with its size known at compile time is desirable. It can be for vectorization (e.g. the buffer is a multiple of the SIMD line) or to make it explicit in the interface (e.g. for a block cipher).
> states that the pointer points to input read-only memory (_In_reads_)
You do that by using `std::span<const std::byte>` in any C++ codebase.
The fact he brags about that as "an advantage" for separate parameter passing just shows how little is known here.
> My Pluralsight Courses
The kind of C++ code proposed in this blog post would be refused outright in any PR in almost any serious organization with a proper review process.
So bragging about it on a blog while proposing some C++ teaching is audacious to say the least.
> To finish on that.
The sad thing is that there would be very valid criticism on `std::span<std::byte>`:
- Span does not do boundary check on access by default. Which is a bad design decision in 2026.
- It has an impact on compilation time due to the header inclusion
- std::byte is annoying to work with because it is a hack around an enum instead of a proper C++ builtin type.
But the blog post misses all these points entirely and sticks to complaining about 'Old C being better' the same way your family Grand-Uncle still brags about 'lead gasoline being better' for his 70s Pontiac.
_the_inflator an hour ago

I think that the author is right in everything he says and yes, there is beauty in it.
However, the antithesis is also correct that there exist better solutions to solve the issues.
Both premises hold true.
I have an extensive assembler coding background on 6510, M68000, and i486. I had a very hard time accepting that something could be solved faster and more stable in a higher order language while the downside is more memory, more CPU etc.
More and more it turns out that programming languages are something accidentally read by machines and written by humans, even though this premise got destroyed lately by AI.
However, what I love about C++ is, that it has a basic canon of commands that can be used to build nearly everything while looking extremely ugly and hard to grasp if you don't read very slowly and accurately - so it is a very error prone and dangerous thing that rightfully got substituted by better constructs that allow for better distinctions as well as usage.
I could do everything in assembler (Hey Python users: you know that in the end everything ends up as machine code, don't you?) but it takes 100x times longer and is constantly reinventing the wheel.
Have you ever started to get into the intricacies of bit signs? No? Well, you should definitely, and to this day it gave me a lasting impression when I started wrapping my head around it, when I was 10 to 11 years old hacking my way into the world of assembler programming on C128.
You don't want to take every concept into consideration. You don't want to take interoperability into consideration. All the time!
You want to focus on the problem to solve, not the implications of the implementations all the time.
I am having such a blast very often using Python since it just works without much cognitive distraction about which language construct to use in order to get the machine doing what you want. It is so capable, enable it, to simply ensure within boundaries that the compiler uses the best decision given the context, which is up to analysis.
That's why I stopped using C++ or more precisely stopped any attempts at trying to be smart or fancy. I got to re-read and maintain the code months to years later and history showed, I don't marvel at how magic the line works and how brutally smart I was at the time, but simply hate myself for obscuring something in a line that could be well understood if I had used 10 lines, while the compiler gives a damn anyway.
C++ is still necessary but every discussion to this day is about the point you made: every digit counts - and also which position, context etc. You got to be very prolific in order to put into a line what other put into 10.
Is it worth it? No.
In the early days it was the correct decision. Memory was scarce, CPUs were slow, and the language was small compared to today.
The last time I felt comfortable with an "assembler kind of feeling" was with JavaScript before ES6. Peak jQuery level, with the coolest concept only JavaScript has: Function.prototype.toString()
John Resig will have his place in my programming heroes olymp, who revealed this secret for me, and it opened my eyes for the beauty of higher order languages.
I admire C++, but so do I Python.
But I hope I won't have to ever use C++ again.
- pjc50 17 minutes ago
  
  One of these days I want to do a "typesafe macro assembler" that actually is the language people think that C is.