
The proposal is just syntactic sugar for a size argument. It doesn't add or solve anything, really.


In my experience with language design, a little bit of syntactic sugar can have transformative results.

C's function prototypes, syntactic sugar added circa 1990, were transformative for C programming.


> In my experience with language design, a little bit of syntactic sugar can have transformative results.

I agree 100%. This also reminds me of this article:

https://nibblestew.blogspot.com/2020/03/its-not-what-program...

HN discussion: https://news.ycombinator.com/item?id=22696229


I totally agree that C shepherds you into pointers. I also think C shepherds you into writing everything from scratch. Most of all I think it's a self-perpetuating cycle of "there's no system for X (e.g. sized arrays, packaging system, classes), so everyone makes their own, and now everyone's code feels slightly incompatible with everyone else's."


C could have gained the safety of function prototypes without them. Note that the function bodies and call sites had the type information. It could have been propagated through the compiler and assembler to the linker, which would then check for compatibility.

In some ways it would have worked much better. Header files can easily be wrong. The object files being linked are what really matter.

So, suppose we decided to implement this today, on a GNU toolchain. At the call site, we'd determine the parameter types based on the conventional promotions. This info gets put into the assembly output with an assembler directive. The assembler sees that, then encodes it in an ELF section: perhaps a new section called ".calltype", an extension to the symbol table, or something carried in DWARF. Similar information is produced for the function body. The linker comes along, compares the two, and accepts or rejects as appropriate.


This would require storing type information in a symbol, would it not? Either via mangling or some other method.


Yes. I proposed a method without mangling, but I suppose there isn't any reason why C couldn't use mangled names.

It also isn't a requirement that C++ use mangled names. Other ways of carrying the type information are possible. I like the idea of a reference to DWARF debug info, which C++ is already using to support stack unwinding for exceptions.


> It also isn't a requirement that C++ use mangled names.

Overloading requires the type information to be part of the symbol "name" (i.e., whatever is used for symbol lookup and linking), whether that is a mangled string or a more complex data structure.


> In my experience with language design, a little bit of syntactic sugar can have transformative results.

Promises/async functions in JS and C# do absolutely nothing that you couldn't do without them. But they've had a structural effect on the average developer's ability to write scalable code.


I wonder if a better idea (in principle) would be to have some kind of hardware implementation, sort of like a finer-grained memory segmentation.


CHERI is an attempt. DARPA paid to have it for RISC-V and ARMv8. It was originally for MIPS.

https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

It is fundamentally like the 80286 segments, but with all sorts of usability troubles solved. The 80286 segments were impractical because there were a small number available and because the OS couldn't safely hand over direct control. Every little segment adjustment required calling the OS.


x86 has a BOUND instruction which generates an exception if an index is out of bounds. It didn't make it into x64.


This is likely coming in ARM soon. (And I'm hopeful it's not perpetually soon™, or "it came but nobody used it" as has happened on x86.)


Going forward Android will require hardware metadata extensions on ARM devices, as of Android 11.


You bring this up often and I ask you to modify your wording about every time I see it. Android will not require these extensions, no hardware ships with it yet. Android says they will support it.


And I give you the official wording of Google every time.

> Google is committed to supporting MTE throughout the Android software stack. We are working with select Arm System On Chip (SoC) partners to test MTE support and look forward to wider deployment of MTE in the Android software and hardware ecosystem. Based on the current data points, MTE provides tremendous benefits at acceptable performance costs. We are considering MTE as a possible foundational requirement for certain tiers of Android devices.

https://security.googleblog.com/2019/08/adopting-arm-memory-...

> Starting in Android 11, for 64-bit processes, all heap allocations have an implementation defined tag set in the top byte of the pointer on devices with kernel support for ARM Top-byte Ignore (TBI). Any application that modifies this tag is terminated when the tag is checked during deallocation. This is necessary for future hardware with ARM Memory Tagging Extension (MTE) support.

>....

> This will disable the Pointer Tagging feature for your application. Please note that this does not address the underlying code health problem. This escape hatch will disappear in future versions of Android, because issues of this nature will be incompatible with MTE

https://source.android.com/devices/tech/debug/tagged-pointer...

So unless you have other official feedback from Google management, I will keep repeating myself.


I am not sure if I have made myself clear, because I have no issues with the Google documents on this and I believe they are very clear: this feature is optional! Optional optional optional, only on hardware that supports it will Google implement these things because they literally cannot use it otherwise. Your wording has always implied that this is a requirement to run Android 11 and it is not, and that is what I am asking you to change. Like, what’s wrong with being accurate and saying “Google is implementing support for this in Android 11”? “This feature may be used to classify Android devices”?


Well, it could also solve the problem of sizeof(array) not working inside the function.

More specifically, at the moment, it evaluates to the size of the pointer itself, which is useless. On the other hand:

  static void
  foo(int a[..])
  {
    for (size_t i = 0; i < (sizeof a / sizeof a[0]); i++)
    {
      // ...
    }
  }
... would be very useful, as it's the same syntax you can already use inside the function where the array is declared, which makes refactoring code into separate functions easier, as you don't have to replace instances of sizeof with your new size_t parameter name.
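For contrast, here's a minimal sketch of what you have to write today, without the hypothetical a[..] syntax: thread the size through by hand as a separate parameter (the names here are invented for illustration), because sizeof on the parameter no longer tells you anything about the array.

  #include <stdio.h>
  #include <stddef.h>

  /* Today's idiom: the size must be passed explicitly, and the caller
     must compute it where the array is still an array. */
  static void
  foo(int *a, size_t n)
  {
    for (size_t i = 0; i < n; i++)
      printf("%d\n", a[i]);
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3 };
    foo(a, sizeof a / sizeof a[0]);  /* count only valid here, in main */
    return 0;
  }

Note that nothing ties the n parameter to the array it describes; getting the refactoring wrong is exactly the class of bug the sugared syntax would eliminate.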

The only thing I'd like to see is compatibility with the static keyword; so that you can declare it as a sized-array but still indicate a compile-time minimum number of array elements. At the moment, in C99, this does not compile without serious diagnostics which would immediately highlight the problem:

  #include <stdio.h>

  static void
  foo(int a[static 4])
  {
    for (size_t i = 0; i < 4; i++)
      printf("%d\n", a[i]);
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3 };
    foo(a);     // Passing an array with 3 elements to a function that requires at least 4 elements
    foo(NULL);  // Passing no array to a function that requires an array with at least 4 elements
    return 0;
  }



  demo.c:14:3: warning: array argument is too small; contains 3 elements, callee requires at least 4 [-Warray-bounds]
  demo.c:15:3: warning: null passed to a callee that requires a non-null argument [-Wnonnull]


It is not just for the size argument. The array becomes an abstract data type whose bounds are consulted when the array is indexed. That's the key.

Yes, it's not a lot of effort to manually add a size_t argument. But it is far too tedious and error-prone to expect a programmer to add all the bounds checks. Being able to effortlessly tell the compiler "please check for me" is the huge win.

The second huge win is that the array is type-checked. So if you pass it to another function, the compiler enforces that it must again be passed with the size included. You don't get that by manually adding a size argument.
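A rough sketch of what this amounts to can be hand-rolled today with a struct (all names here are invented; the proposal would have the compiler generate and enforce this invisibly): the length travels with the pointer, indexing is checked, and a callee can pass the pair on without the caller re-supplying the size.

  #include <stdio.h>
  #include <stdlib.h>
  #include <stddef.h>

  /* A hand-rolled "fat array": pointer plus length. */
  struct int_span { int *p; size_t len; };

  static int
  span_get(struct int_span s, size_t i)
  {
    if (i >= s.len) {  /* the bounds check the compiler would emit */
      fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, s.len);
      abort();
    }
    return s.p[i];
  }

  /* The span carries its length, so it can be passed along
     without a separate size argument -- the type-checked win. */
  static int
  sum(struct int_span s)
  {
    int total = 0;
    for (size_t i = 0; i < s.len; i++)
      total += span_get(s, i);
    return total;
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3, 4 };
    struct int_span s = { a, sizeof a / sizeof a[0] };
    printf("%d\n", sum(s));  /* prints 10 */
    return 0;
  }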


It allows automatic bounds checking; I don't need to point out how many bugs that could fix.

If you're worried about performance, test it and turn it off.
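One way to get "test it and turn it off" today is assert()-based checking, which compiles away entirely when built with -DNDEBUG (the helper name here is invented for illustration):

  #include <assert.h>
  #include <stdio.h>
  #include <stddef.h>

  /* Bounds check active in debug builds, removed by -DNDEBUG. */
  static int
  checked_get(const int *a, size_t len, size_t i)
  {
    assert(i < len);
    return a[i];
  }

  int
  main(void)
  {
    int a[] = { 10, 20, 30 };
    printf("%d\n", checked_get(a, 3, 2));  /* prints 30 */
    return 0;
  }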


Bounds checking is a solved problem in C. The challenge is proving that your program _cannot_ go out of bounds. That is very much not a solved problem in C.


I'm curious what you'd want the automatic bounds checking to do?

Just terminating the program won't be much better than OOB memory access in many cases.

Continuing but discarding OOB writes/use a dummy for OOB reads could lead to much worse behavior.

An exception (or setting errno since this is C) would need that exception to be handled somewhere in a sensible way in which case you could just as easily add manual bounds checking.
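A sketch of that last option, for concreteness (the names here are invented): a checked read that reports out-of-bounds through the return value and errno, leaving the caller to decide what sensible recovery means.

  #include <errno.h>
  #include <stdio.h>
  #include <stddef.h>

  /* Report, don't terminate: signal OOB via return value and errno. */
  static int
  try_get(const int *a, size_t len, size_t i, int *out)
  {
    if (i >= len) {
      errno = ERANGE;
      return -1;
    }
    *out = a[i];
    return 0;
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3 };
    int v;
    if (try_get(a, 3, 7, &v) != 0)
      printf("out of bounds\n");  /* this path is taken */
    return 0;
  }

As the parent says, if you're writing the recovery handler anyway, this is barely less work than a manual bounds check.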


I may be wrong, but something that you and I recognize as syntactic sugar may not be recognized as such by other, less experienced programmers.

So those programmers might just use the sugared approach and avoid the problem of writing past the end of an array, without ever knowing how tedious and/or difficult debugging such problems can be. They might sort of never even realize that they dodged a bullet simply due to some sugar.

How do you see it?


Agreed. The proposal is also wasteful.

>extern void foo(size_t dim, char *a);

And the like assumes that I have the space to waste a native type on every array. So if I'm using a 10-element array, I need to provision a native 32- or 64-bit value just to hold "10".

In embedded systems this wouldn't fly. At least not in mine; I'm running up against limits all over the place even being careful with bitfields and appropriately sized types.
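The space cost is easy to see with a quick sketch (struct name invented for illustration; exact sizes are platform-dependent, but on a typical 64-bit target this prints 8 and 16):

  #include <stdio.h>
  #include <stddef.h>

  /* A pointer-plus-length pair versus a bare pointer. */
  struct fat_ptr { char *p; size_t len; };

  int
  main(void)
  {
    printf("plain pointer: %zu bytes\n", sizeof(char *));
    printf("fat pointer:   %zu bytes\n", sizeof(struct fat_ptr));
    return 0;
  }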

He’s right, of course, that foo(array[]) is converted to a pointer, but that’s why I think you should always treat an array as a pointer, so YOU know not to rely on any automatic protections.

I get the point; but I just don’t see C making this change.


So... don't use array syntax in the function prototype and definition? The proposal doesn't PROHIBIT passing a pointer, it would just offer an option to pass a fat array.


I think you mean fat pointer. And yeah, that’s nice for people who don’t care that their 4-bit array has 64 bits of native type reserved... I think other people would care.

So, I go back to the idea that it seems unlikely this would ever be an official C change.


I think you and I are talking past one another.

Currently, calling f(a) with the declaration void f(int a[]) passes a pointer to the first element, with no additional overhead.

Under the proposal, calling f(a) with the declaration void f(int a[]) still passes a pointer to the first element, with no additional overhead.

Help me understand why "other people would care"? What is the negative impact on someone who would not use the a[..] functionality?
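This is easy to verify: with an int a[] parameter, the parameter is already just a pointer today, so nothing extra crosses the call for code that doesn't opt in to a[..]. (Printed values are platform-dependent; compilers may warn on taking sizeof of an array parameter, which is exactly the point.)

  #include <stdio.h>

  /* Inside f, `a` is a pointer: sizeof a is the pointer size,
     not the array size. */
  static void
  f(int a[])
  {
    printf("sizeof a in f: %zu\n", sizeof a);
  }

  int
  main(void)
  {
    int a[] = { 1, 2, 3, 4, 5 };
    printf("sizeof a in main: %zu\n", sizeof a);  /* 5 * sizeof(int) */
    f(a);
    return 0;
  }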



