> I do fully understand people who can't get their heads around threads and prefer async
This is a bizarre remark
Async/await isn't "for when you can't get your head around threads", it's a completely orthogonal concept
Case in point: JavaScript has async/await, but everything is single-threaded, there is no parallelism
Async/await is basically just coroutines/generators underneath.
Phrasing async as 'for people who can't get their heads around threads' makes it sound like you're insecure about never having learned how async works, and that instead of sitting down and learning it you would rather compensate
Async is probably a more complex model than threads/fibers for expressing concurrency. It's fine to say that, it's fine to not have learned it if that works for you, but it's silly to put one above the other as if understanding threads makes async/await irrelevant
> The stdlib isn't too bad but last time I checked a lot of crates.io is filled with async functions for stuff that doesn't actually block
Can you provide an example? I haven't found that to be the case last time I used rust, but I don't use rust a great deal anymore
>Case in point: JavaScript has async/await, but everything is single-threaded, there is no parallelism. Async/await is basically just coroutines/generators underneath.
Maybe I just wish Zig didn't call it async and used a different name.
Async-await in JS is sometimes used to swallow exceptions. It's very often used to do 1 thing at a time when N things could be done instead. It serializes the execution a lot when it could be concurrent.
```
if (await is_something_true()) {
  // here is_something_true() can be false
}
```
And above is the most common mistake.
Similar side-effects happen in other languages that have async-await sugar.
It smells as bad as the Zig file interface with intermediate buffers reading/writing to OS buffers until everything is a buffer 10 steps below.
It's fun for small programs but you really have to be very strict to not have it go wrong (performance, correctness).
That being said, I don't understand your `is_something_true` example.
> It's very often used to do 1 thing at a time when N things could be done instead
That's true, but I don't think e.g. fibers fare any better here. I would say that expressing that type of parallel execution is much more convenient with async/await and Promise.all() or whatever alternative, compared to e.g. raw promises or fibers.
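For instance, a minimal sketch (`fetchUser` is a made-up stand-in for any async call):

```
// Hypothetical async call standing in for any I/O-bound work.
async function fetchUser(id: number): Promise<string> {
  return `user-${id}`;
}

async function main() {
  // Serial: the second call doesn't start until the first finishes.
  const a = await fetchUser(1);
  const b = await fetchUser(2);

  // Concurrent: both calls are in flight at the same time.
  const [c, d] = await Promise.all([fetchUser(1), fetchUser(2)]);
  console.log(a, b, c, d);
}
```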
I replied directly to you. There are valid arguments as to why async-await sucks, even after you're deeply familiar with how it works. Even though it's just generators/coroutines beneath, async-await pollutes the code completely if you're not strict about its usage.
`is_something_true` is very simple: the condition is true, yet inside the block, if you were to check again, it can be false, something that can't happen in synchronous code. With async-await it's very easy to get yourself into situations like these, even though the code seems to yell at you that you're in the true branch. The solution is adding a lock (a sketch below), but because async-await is so easy to write, this is rarely caught.
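A minimal sketch of the lock fix (the `Mutex` here is hand-rolled, and `is_something_true` is assumed to be the async check from above):

```
// Assumed to exist elsewhere; the async check from the example above.
declare function is_something_true(): Promise<boolean>;

// A tiny async mutex: acquire() resolves once the lock is free.
class Mutex {
  private tail: Promise<void> = Promise.resolve();
  async acquire(): Promise<() => void> {
    let release!: () => void;
    const next = new Promise<void>(resolve => { release = resolve; });
    const prev = this.tail;
    this.tail = next;
    await prev;
    return release;
  }
}

const lock = new Mutex();

async function checkAndAct() {
  const release = await lock.acquire();
  try {
    if (await is_something_true()) {
      // Other callers that take the same lock can't interleave here,
      // so they can't flip the condition under us.
    }
  } finally {
    release();
  }
}
```

Of course this only protects against other lock-takers, which is exactly the "you have to be very strict" problem.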
My comment was responding only to the person who equated threads and async. My comment only said that async and threading are completely orthogonal, even though they are often conflated
> `is_something_true` is very simple: the condition is true, yet inside the block, if you were to check again, it can be false, something that can't happen in synchronous code
It can happen in synchronous code, but even if it couldn't - why is async/await the problem here? What is your alternative to async/await for expressing concurrency?
Here are the ways it can happen:
1. it can happen with fibers, coroutines, threads, callbacks, promises, any other expression of concurrency (parallel or not!). I don't understand why async/await specifically is to blame here.
2. Even without concurrency, you can mutate state to make the value of is_something_true() change.
3. is_something_true might be a blocking call to some OS resource, file, etc - e.g. the classic `if (file_exists(f)) open(f)` bug.
I am neutral on async/await, but your example isn't a very good argument against it
Seemingly nobody ever has any good arguments against it
> async-await pollutes the code completely if you're not strict about its usage
This is a good thing: if a function is async, then it does something that won't have completed by the time the call returns. I don't understand this argument about 'coloured functions' polluting code. If a function at the bottom of your callstack needs to do something and wait on it, then you need to wait on it in all the functions above.
If the alternative is just 'spin up an OS thread' or 'spin up a fiber' so that the function at the bottom of the callstack can block - that's exactly the same as before, you're just lying to yourself about your code. Guess what - you can achieve the same thing by putting 'await' before every function call
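To illustrate the propagation (a sketch; the endpoint and names are made up):

```
// Real I/O at the bottom of the call stack.
async function fetchRecord(id: number): Promise<string> {
  const res = await fetch(`/records/${id}`); // must suspend here
  return res.text();
}

// Every caller that needs the result waits too; the 'colour'
// propagates because the waiting is real, not because of syntax.
async function render(id: number): Promise<string> {
  return `<p>${await fetchRecord(id)}</p>`;
}
```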
Perhaps you have convinced me that async/await is great after all!
Async and threading are applied to similar problems, that's how I understood OP.
`if (file_exists(f))` is misuse of the interface, a lesson in interface design, and a faulty pattern that's easy to repeat with async-await.
> I don't understand this argument about 'coloured functions' polluting code.
Let's say you have some state that needs to be available to the program. The end game is that you've completely unrolled the loads and computation, because any previous interface sucks: loading the whole state at the beginning serializes your program and makes it slower, and defining `async loadPartial` causes massive async-await pollution at every place where you want to read a part of the state, even if it is already in cache.
Think about it, you can't `await` on part of the state only once and then know it's available in other parts of code to avoid async pollution. When you solve this problem, you realize `await` was just in the way and is completely useless and code looks exactly like a callback or any other more primitive mechanism.
A different example is writing a handler for 1 HTTP request. People do it all the time with async-await, but what do you do when you receive N HTTP requests? Making the code perform well is impossible with the serial code that deals with 1 request, so async-await allowed you to make something very simple in an easy way, but it falls apart completely when you go from 1 to N. A pipelining system won't really care for async-await at all, even though it pipelines IO in addition to compute.
I think "how to express concurrency" is a question I'm not even trying to answer, although I could point to approaches that completely eliminate pollution and force you to write code in that "unrolled" way from start, something like Rx or FRP where time is exactly the unit they're dealing with.
> `if (file_exists(f))` is misuse of the interface, a lesson in interface design, and a faulty pattern that's easy to repeat with async-await.
It's even easier to repeat without async-await, where you don't need to tag the function call with `await`!
> Think about it, you can't `await` on part of the state only once and then know it's available in other parts of code to avoid async pollution. When you solve this problem, you realize `await` was just in the way and is completely useless and code looks exactly like a callback or any other more primitive mechanism.
I don't understand why you can't do this by just bypassing the async/await mechanism when you're sure that the data is already loaded
```
let data = null;

async function getDataOrWait() {
  await data_is_non_null(); // however you do this
  return data;
}

function getData() {
  if (data == null) { throw new Error('data not available yet'); }
  return data;
}
```
You aren't forced into using async/await everywhere all the time. This sounds like 'a misuse of the interface, a lesson in interface design', etc
> I think "how to express concurrency" is a question I'm not even trying to answer
You can't criticise async/await, which is explicitly a way to express concurrency, if you don't even care to answer the question - you're just complaining about a solution that solves a problem that you clearly don't have (if you don't need to express concurrency, then you don't need async/await, correct!)
> point to approaches that completely eliminate pollution and force you to write code in that "unrolled" way from the start, something like Rx or FRP, where time is exactly the unit they're dealing with.
So they don't 'eliminate pollution', they just pollute everything by default all the time (???)
> You aren't forced into using async/await everywhere all the time. This sounds like 'a misuse of the interface, a lesson in interface design', etc
Exactly: async-await does not allow you to create correct interfaces. You cannot write code that partially loads and then has sync access without silly throws sprinkled all over the place, even when you're 100% sure there's no way the error will be raised, or when you want to write code that is 100% correct and will only access the part of the state once it is available.
Your example is obvious code pollution. For correctness' sake I need to handle your throw even though it should never happen, or at least handle the null, to satisfy a type checker.
> So they don't 'eliminate pollution', they just pollute everything by default all the time (???)
That's not the case at all. They just push you immediately in the direction where you'll land anyway when you stop using async-await to enforce correctness and performance. edit: you stop modelling control flow and start thinking of your data dependency/computation graph, where it's very easy and correct to just switch computation/loads to batch mode. The `is_something_true` example is immediately reframed to be correct all of the time (as `file_exists` is now a signal and will fire on true and false)
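Roughly what I mean, as a sketch (a hand-rolled signal type, nothing from a real Rx library):

```
// A minimal push-based signal: subscribers see every value change.
class Signal<T> {
  private subscribers: ((value: T) => void)[] = [];
  constructor(private value: T) {}
  set(next: T) {
    this.value = next;
    for (const fn of this.subscribers) fn(next);
  }
  subscribe(fn: (value: T) => void) {
    fn(this.value); // fire immediately with the current value
    this.subscribers.push(fn);
  }
}

// file_exists is no longer a one-shot question but a stream of
// answers: the code reacts to true and false alike, so there is
// no stale "we checked once" branch to get wrong.
const fileExists = new Signal<boolean>(false);
fileExists.subscribe(exists => {
  if (exists) { /* open it, render, etc. */ }
  else { /* tear down, show a placeholder */ }
});
```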
> You can't criticise async/await, which is explicitly a way to express concurrency, if you don't even care to answer the question - you're just complaining about a solution that solves a problem that you clearly don't have (if you don't need to express concurrency, then you don't need async/await, correct!)
I'm critical of async-await, I'm not comparing it to something else, I can do that but I don't think it is necessary. I've pointed to FRP as a complete solution to the problem, where you're forced to deal with the dataflow from the start, for correctness sake, and can immediately batch, for pipelining/performance sake.
IMO, just like file_exists is a gimmick, and read_bytes(buffer)/write_bytes(buffer) is a leaky abstraction that requires your program to manage unnecessary, wasteful buffers, async-await pushes you into completely useless coding rituals, and your throw is a great example. The way you achieved middle ground is with code pollution: either the full async load at the beginning doesn't perform well, or async-await interleaved with computation pollutes everything into being async and makes the timeline difficult to reason about.
This late in the game, any concurrency solution should avoid pointless rituals.
I have tried a model on my laptop+GPU before, and it was unusable: incredibly slow, with just bad output, for exactly the work you describe.
If you're looking for a cheap practical tool + don't care if it's not local, deepseek's non-reasoning model via openrouter is the most cost efficient by far for the work you describe.
I put 10 dollars in my account about 6 months ago and still haven't gotten through it, after heavy, semi-regular use.
This isn't what people are talking about, you aren't understanding the problem
With RAII you need to leave everything in an initialized state unless you are being very very careful - which is why MaybeUninit is always surrounded by unsafe
```
{
    Foo f;
}
```
f must be initialized here; it cannot be left uninitialized.
```
std::vector<T> my_vector(10000);
```
EVERY element in my_vector must be initialized here; they cannot be left uninitialized, and there is no workaround.
Even if I just want a std::vector<uint8_t> to use as a buffer, I can't - I need to manually malloc with `(uint8_t*)malloc(sizeof(uint8_t) * 10000)` and fill that
So what if the API I'm providing needs a std::vector? Well, I guess I'm eating the cost of initializing 10000 objects, pulling them into cache and thrashing them out, just to do it all again when I memcpy into it
This is just one example of many
another one:
With RAII you need copy construction, operator=, move construction, and move operator=. If you have a generic T, then using `=` on T might allocate a huge amount of memory, free a huge amount of memory, or neither. In C++ it could execute arbitrary code
If you haven't actually used a language without RAII for an extended period of time then you just shouldn't bother commenting. RAII very clearly has its downsides, you should be able to at least reason about the tradeoffs without assuming your terrible strawman argument represents the other side of the coin accurately
You're repeating propaganda from a far-right newspaper headline, written misleadingly to make it sound like Labour have said something recently about VPNs (they haven't)
I don't care where the headline is from. Other places have the same suspicion. There clearly is _some_ concern in Labour that VPNs could be used to bypass the OSA and it doesn't take much imagination to see where this is going.
'Kyle told The Telegraph last week in a warning: "If platforms or sites signpost towards workarounds like VPNs, then that itself is a crime and will be tackled by these codes."'
"In 2022 when the Online Safety Act was being debated in Parliament, Labour explicitly brought up the subject of VPNs with MP Sarah Champion worried that children could use VPNs to access harmful content and bypass the measures of the Safety Act. "
It may not be recent but it is something that Labour MPs have said before in the context of the OSA.
The Labour think tank Labour Together also recently brought up a mandatory government ID called BritCard, ostensibly for government services but to be rolled out elsewhere.
At the same time they've just set up an elite police force to monitor social media.
Labour must know people are rattled by all this; they just published a response to the petition they received.
They're not addressing any concerns though; it's all 'we know best' or shutting down debate with slurs.
In the absence of anything new we just have to take Labour policy on the last things they've said or done.
No, this is how everyone incompetent designs systems
Layers of generic APIs required to be 1000x more complex than would be required if they were just coupled to the layer above
Changing requirements means tunneling data through many layers
Layers are generic, which means either you tightly couple your APIs for the above-layer's use case, or your API will limit the performance of your system
Everyone who thinks they can design systems does it this way, then they end up managing a system that runs 10x slower than it should + complaining about managers changing requirements 'at the last minute'
The point of abstraction is to limit blast radius of requirement changes.
Someone decides to rename a field in an API? You don't need to change your database schema and 100500 microservices on top of it. You just change the DTO object and keep the old name in the other places. Maybe you'll change the old name some day, but you can do it in small steps.
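For example, a sketch of what that looks like (TypeScript for brevity; hypothetical names):

```
// Internal model after the rename.
interface User {
  fullName: string; // renamed from "name"
}

// Wire-level DTO keeps the old field so existing clients don't break.
interface UserDto {
  name: string;
}

function toDto(user: User): UserDto {
  return { name: user.fullName }; // old name preserved at the boundary
}
```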
If your layer repeats another layer, why is it a layer in the first place? The point of layer is to introduce abstraction and redirection. There's cost and there's gain.
Every problem can be solved by introducing another layer of indirection. Except the problem of having too many layers of indirection.
Every layer you create is another public API that someone else can use in some other code. Each time your public API is used in a different place, it gathers different invariants - 'this function should be fast', 'this function should never error', 'this function shouldn't contact the database', etc. More invariants = more stuff broken when you change the layer.
So let's say you have some 'User' ORM entity for a food app. Each user has a favourite food and food preferences. You have a function `List<User> getListOfUsersWithFoodPreferences(FoodPreference preference)` which queries another service for users with a given food preference.
The `User` entity has `String getName()` and `String getFavouriteFood()` methods, cool
Some other team builds some UI on top of that, which takes a list of users and displays their names and their favourite food.
Another team in your org uses the same API call to get a list of users with the same food prefs as you, so they loop over all your food prefs + call the function multiple times.
Amazing, we've layered the system and reused it twice!
Now, the database needs to change, because users can have multiple favourite foods, so the database gets restructured and favourite foods are now more expensive to query - they're not just in the same table row anymore.
As a result, `getListOfUsersWithFoodPreferences` runs a bit slower, because the favourite food query is more expensive.
This is fine for the UI, but the other team using this function to loop over all your food prefs now has their system running 4x slower! They didn't even need the user's favourite food!
If we're lucky that team gets time to investigate the performance regression, and we end up with another function `getListOfUsersWithFoodPreferencesWithoutFavouriteFoods`. Nice.
The onion layer limited the 'blast radius' of the DB change, but only in the API - the performance of the layer changed, and that broke another team.
This is where command/query separation is strongest regardless of onion/layered architecture. Your queries/reads are treated entirely separately from your commands/writes so you're free to include/exclude any of the joined data a particular query doesn't need.
Forgive me for not tying it back to your example explicitly.
Your example was a read. So in that case, since there's no change in state (no need for protection of the data/invariants), there's no danger in having different clients read the User records from the datastore however makes sense for them. They could use the ORM or hit the DB directly or anything, really. So getListOfUsersWithFoodPreferences and getListOfUsersWithFoodPreferencesWithoutFavouriteFoods living together as client-specific methods is absolutely fine. It's only when state changes that you need to bring in the User Entity that has all of the domain rules and restrictions.
The idea is that on Commands (writes) you need your User entity, but on Queries (reads) there's no need to treat the User data as a one-size-fits-all User object that must be hydrated in the same way by all clients.
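A rough sketch of that split (TypeScript for brevity; all names are made up):

```
// Write side: the full entity guards its invariants.
class UserEntity {
  constructor(private name: string, private favouriteFoods: string[]) {}
  addFavouriteFood(food: string) {
    if (this.favouriteFoods.includes(food)) throw new Error('duplicate');
    this.favouriteFoods.push(food);
  }
}

// Read side: each client hydrates only the shape it needs.
interface UserNameRow { name: string }
interface UserWithFoodsRow { name: string; favouriteFoods: string[] }

// Hypothetical query functions; could be the ORM or raw SQL.
declare function queryUserNames(pref: string): Promise<UserNameRow[]>;
declare function queryUsersWithFoods(pref: string): Promise<UserWithFoodsRow[]>;
```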
> So getListOfUsersWithFoodPreferences and getListOfUsersWithFoodPreferencesWithoutFavouriteFoods living together as client-specific methods is absolutely fine
Sorry; my point was that adding this function as a public API 'onion layer' in your code means you're less able to adapt to change. The fact this function returns a `User` entity isn't particularly important - it's the fact when you make a function public, other teams will reuse your function and add invariants you didn't realise existed, so that changing your function in the future will break other teams' code.
> The point of abstraction is to limit blast radius of requirement changes.
No, the point of abstraction is to make things easier to handle.
At least that is the original meaning of the term, before the OOP ideology got its hands on it. A biology textbook talks about organs before it talks about tissues before it talks about cells before it talks about enzymes. That is the meaning of abstraction: Simple interface to a complex implementation.
In OOP-World however, "abstraction", for some reason, denotes something MORE COMPLEX than the things that are abstracted. It's a kind of logic-flow-routing-layer between the actually useful components that implement the actual business logic.
And such middleware is perfectly fine ... as long as it is required. Usually it isn't, which is where YAGNI comes from.
Now, pointless abstractions are bad enough. But things get REALLY bad, when we drag things that should sit together in the same component, kicking and screaming, into yet another abstraction, so we can maybe, someday, but really never going to happen, do something like rename or add a field to a component. Because now we don't even have useful components any more, we have abstractions, which make up components, and seeing where a component starts and ends, becomes a non-trivial task.
In theory this all seems amazing, sure. It's flexible, it's OOP, it is correct according to all kinds of books written by very smart people.
In reality however, these abstractions introduce a cost, and I am not even talking about performance here, I am talking about readability and maintainability. And as it turns out, in the majority of use cases these costs far outweigh any gains from applying this methodology. Again: there is a reason YAGNI became a thing.
As someone who has had the dubious pleasure of bringing several legacy Java services into the 21st century: what following these principles dogmatically usually results in is a huge, bloated, unreadable codebase, where business functionality is nearly impossible to locate, and so are the types that actually represent business objects. Things that could be handled in 2 functions and a struct that are tightly coupled (which is okay, because they represent one unit of business logic anyway) are instead spread out between 24 different types in as many files. And not only does this make the code slow and needlessly hard to maintain, it also makes it brittle. Because when I change the wrong Base-Type, the whole oh-so-very-elegant pile of abstractions suddenly comes crashing down like a house of cards.
When "where does X happen" stops being answerable with a simple `grep` over the codebase, things have taken a wrong turn.
> The point of abstraction is to limit blast radius of requirement changes.
The problem is in many/most? systems there's no way it can possibly do this, because the abstraction that looked like a perfect fit for requirements set 1 can't know what the requirements in set 2 look like. So in my experience what ends up happening with the abstraction thing is people put all sorts of abstractions all over the place that seem like a good idea and when requirements set #2, #3, etc come along you end up having to change all the actual code to meet the requirements and all of the abstraction layers which no longer fit.
To choose a couple of many examples from my personal experience:
- One place I worked had a system the author thought was very elegant, which used virtual functions to do everything. "When we need to extend it we can just add a new set of classes which implement this interface and it will Just Work". Except when the new requirements came in, we needed to dispatch based on the type of two things, not just one. Although you can do this type of thing in Lisp and Haskell, you can't in C++, which is what we were using. So the whole abstraction edifice cost us extra to build in the first place, cost us performance while in use, and cost extra to tear down and rewrite when the actual requirements changed
- One place I worked allowed people to extend the system by implementing a particular java interface to make plugins. Client went nuts developing 300+ of these. When the requirements changed it was clear we needed to change this interface in a way a straight automated refactor just couldn't achieve. Cue me having to rewrite 300+ plugins from InterfaceWhichIsDefinitelyNeverGoingToChangeA format to InterfaceWhichIsHonestlyISwearThisTimeAbsolutelyNeverGoingToChangeB format. I was really happy with all the time this abstraction was saving me while doing so.
Most of the time abstraction doesn't save you time. It may save you cognitive overload by making certain parts of the system simpler to reason about, and that can be a valid reason to do it, but multiple layers is almost never worth it and the idea that you can somehow see the future and know the right abstraction to prevent future pain is delusional unless the problem space is really really well known and understood, which is almost never the case in my experience.
I've experienced the same. It's difficult for frontend and backend to communicate because there's a "translation layer" in between. Shipping a new feature is 100x harder than it needs to be because everything has to be translated between two different paradigms.
This article is disingenuous with its Vec benchmark. Each call to `validate` creates a new Vec, but that means you allocate + free the vec for each validation. Why not store the vec on the validator to reuse the allocation? Why not mention this in the article? I had to dig in the git history to find out whether the vec was getting reallocated. This feels like you had a cool conclusion for your article, 'linked lists faster than vec', but you had to engineer the vec example to be worse. Maybe I'm being cynical.
It would be interesting to see the performance of a `Vec<&str>` where you reuse the vector, but also a `Vec<u8>` where you copy the path bytes directly into the vector and don't bother doing any pointer traversals. The example path sections are all very small - 'inner', 'another', 5 bytes, 7 bytes - less than the length of a pointer! Storing a whole `&str` is 16 bytes per element, and then you have to rebuild it again anyway in the invalid case.
---
This whole article is kinda bad, it's titled 'blazingly fast linked lists' which gives it some authority but the approach is all wrong. Man, be responsible if you're choosing titles like this. Someone's going to read this and assume it's a reasonable approach, but the entire section with Vec is bonkers.
Why are we designing 'blazingly fast' algorithms with rust primitives rather than thinking about where the data needs to go first? Why are we even considering vector clones or other crazy stuff? The thought process behind the naive approach and step 1 is insane to me:
1. i need to track some data that will grow and shrink like a stack, so my solution is to copy around an immutable Vec (???)
2. this is really slow for obvious reasons, how about we: pull in a whole new dependency ('imbl') that attempts to optimize for the general case using complex trees (???????????????)
You also mention:
> In some scenarios, where modifications occur way less often than clones, you can consider using Arc as explained in this video
I understand you're trying to be complete, but 'some scenarios' is doing a lot of work here. An Arc<[T]> approach is literally just the same as the naive approach but with extra atomic refcounts! Why mention it in this context?
You finally get around to mutating the vector + using it like a stack, but then comment:
> However, this approach requires more bookkeeping and somewhat more lifetime annotations, which can increase code complexity.
I have no idea why you mention 'code complexity' here (complexity introduced by rust and its lifetimes), but fail to mention how adding a dependency on 'imbl' is a negative.
> This article is disingenuous with its Vec benchmark. Each call to `validate` creates a new Vec, but that means you allocate + free the vec for each validation. Why not store the vec on the validator to reuse the allocation? Why not mention this in the article? I had to dig in the git history to find out whether the vec was getting reallocated.
The idea comes back to [0] which is similar to one of the steps in the article, and before adding `push` & `pop` I just cloned it to make things work. That's what Rust beginners do.
> This feels like you had a cool conclusion for your article, 'linked lists faster than vec', but you had to engineer the vec example to be worse. Maybe I'm being cynical.
Maybe from today's point in time, I'd think the same.
> It would be interesting to see the performance of a `Vec<&str>` where you reuse the vector, but also a `Vec<u8>` where you copy the path bytes directly into the vector and don't bother doing any pointer traversals. The example path sections are all very small - 'inner', 'another', 5 bytes, 7 bytes - less than the length of a pointer! Storing a whole `&str` is 16 bytes per element, and then you have to rebuild it again anyway in the invalid case.
Yeah, that makes sense to try!
> This whole article is kinda bad, it's titled 'blazingly fast linked lists' which gives it some authority but the approach is all wrong. Man, be responsible if you're choosing titles like this. Someone's going to read this and assume it's a reasonable approach, but the entire section with Vec is bonkers.
> Why are we designing 'blazingly fast' algorithms with rust primitives rather than thinking about where the data needs to go first? Why are we even considering vector clones or other crazy stuff? The thought process behind the naive approach and step 1 is insane to me:
> 1. i need to track some data that will grow and shrink like a stack, so my solution is to copy around an immutable Vec (???)
> 2. this is really slow for obvious reasons, how about we: pull in a whole new dependency ('imbl') that attempts to optimize for the general case using complex trees (???????????????)
That's clickbait-y, though none of the article's ideas aim to be a silver bullet. I mean, there are admittedly dumb ideas in the article, though I won't believe that somebody would come up with a reasonable solution without trying something stupid first. However, I might have used better wording to highlight that, and mentioned that I came up with some of these ideas when I was working on `jsonschema` in the past.
> I understand you're trying to be complete, but 'some scenarios' is doing a lot of work here. An Arc<[T]> approach is literally just the same as the naive approach but with extra atomic refcounts! Why mention it in this context?
If you don't need to mutate the data and need to store it in some other struct, it might be useful, i.e. just to have cheap clones. But dang, that indeed is a whole different story.
> I have no idea why you mention 'code complexity' here (complexity introduced by rust and its lifetimes), but fail to mention how adding a dependency on 'imbl' is a negative.
Fair. Adding `imbl` wasn't a really good idea for this context at all.
Overall I think what you say is kind of fair, but I think that our perspectives on the goals of the article are quite different (which does not disregard the criticism).
You can tell when, because if it uses the allocator it will return an error. So the first line definitely doesn't allocate, and the second definitely does.
That is, unless you explicitly handle OOM conditions inside your construct, e.g. 'crash if you're OOM', which isn't typical in Zig code. All code I interact with will return an allocator error if allocation fails.