nickez's comments | Hacker News

Only if your system/browser is set to dark mode


Sorry all, I didn't realize this was happening. I don't use dark mode at the OS level, just per app. Fixed.


Found an error immediately: "Any lowercase character" doesn't match all Swedish lowercase characters.


Ok. This sounds like an interesting detour. Can you elaborate on that one? I doubt I will ever use that knowledge, but it sounds like it is worth knowing anyway.



The author says “any lowercase character” but they mean “any character between the character ‘a’ and the character ‘z’”, which happens to correspond to the lower case letters in English but doesn’t include ü, õ, ø, etc.


lol really? Why not? Is that true for all encodings? Is it a bug or a feature? What about a simple character set like GSM-7 Swedish?


> but they mean “any character between the character ‘a’ and the character ‘z’”, which happens to correspond to the lower case letters in English

‘Only’ in the most commonly used character encodings. In EBCDIC (https://en.wikipedia.org/wiki/EBCDIC), the [a-z] range includes more than 26 characters.

That’s one of the reasons POSIX has character classes (https://en.wikipedia.org/wiki/Regular_expression#Character_c...). [:lower:] always gets you the lowercase characters in the encoding that the program uses.
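
To make that concrete, here's a minimal Rust sketch using the regex crate (the language and library are my choice, not something from the thread): the literal range [a-z] misses Swedish letters like ö and å, while a Unicode-aware class does not. One caveat: in this particular crate, the POSIX-style [[:lower:]] is ASCII-only, so it's the Unicode class \p{Lowercase} that plays the locale-aware role described above.

    // Sketch assuming the `regex` crate (regex = "1" in Cargo.toml).
    use regex::Regex;

    fn main() {
        // The literal range a-z: only the 26 ASCII lowercase letters.
        let ascii_range = Regex::new(r"^[a-z]+$").unwrap();
        // Unicode-aware lowercase class, playing the role of POSIX [:lower:].
        let unicode_lower = Regex::new(r"^\p{Lowercase}+$").unwrap();

        for word in ["hello", "smörgås"] {
            println!(
                "{word}: [a-z] = {}, \\p{{Lowercase}} = {}",
                ascii_range.is_match(word),
                unicode_lower.is_match(word)
            );
        }
        // hello:   [a-z] = true,  \p{Lowercase} = true
        // smörgås: [a-z] = false, \p{Lowercase} = true
    }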


I would expect [a-z] to mean any lowercase in any language, not lowercase but only a to z. So I’d get bitten by that one.


The letters with diacritics sort lexicographically after 'z', so it does stand to reason they wouldn't appear in that range.


The Swedish alphabet includes characters outside of the a-z range.


It would be semver compatible if they used 0.DATE.0 or 0.0.DATE instead.

From the semver spec:

4) Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable.
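
As a quick illustration, here's a sketch using the semver crate (an assumed choice, and the date-based version numbers are made up): both schemes parse as valid SemVer with major version 0, so per item 4 nothing is promised about stability.

    // Sketch assuming the `semver` crate (semver = "1" in Cargo.toml).
    use semver::Version;

    fn main() {
        // Hypothetical date-based versions in 0.DATE.0 and 0.0.DATE form.
        for v in ["0.20240115.0", "0.0.20240115"] {
            let parsed = Version::parse(v).unwrap();
            // Major version 0: per SemVer item 4, anything may change.
            assert_eq!(parsed.major, 0);
            println!("{parsed} is valid SemVer with major version 0");
        }
    }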


Does the data support the "general rule" then? I think 65 year olds today are much more able than 65 year olds 50 years ago.


Please read the article. His code produces 200x intermediate values. Clickbait title, since it wasn't a leak.


I agree that this is not a memory leak.

However, the semantic distinction between "this uses much more memory than expected" and "this is a memory leak" is a little subtle, and it seems pretty rude to call it clickbait.


No, "memory leak" has a very distinct definition: memory that is still stored but unused and inaccessible. A memory leak can be as small as a single word. In this case, it's just memory usage. There is another term for this scenario, which I don't remember.

This is a case of optimization gone wrong, but nothing is leaked, and every single byte is accounted for.

The title is clickbait, but the article is still interesting to read.


Clickbait (in the context of Rust). In languages with managed memory there are no true memory leaks, so such waste gets called a leak. In lower-level languages, we should be stricter about what we call things.


… Box::leak¹ is a function that exists. That seems like a memory leak, no?

Less tongue-in-cheek, if a program allocates far more memory than expected of it, I'm going to colloquially call that a "memory leak". If I see a Java program whose RSS is doing nothing but "up and to the right" until the VM runs out of memory and dies a sweet sweet page-thrashing death, I'm going to describe that as a "memory leak". Having someone tell me, "well, actually, it's not a leak per se, it's just that the JVM's GC didn't collect all the available garbage prior to running out of memory because …" … I don't care? You're just forcing me to wordsmith the problem description; the problem is still there. The program is still dead, and still exceeding the constraints of the environment it should have been operating in.

The author had some assumptions: that Vec doesn't overallocate by more than 2x, and that collect allocates. One of those did turn out to be false, but I think if I polled Rust programmers, a fair number of them would make the wrong assumption. I would, and TIL from this article that it was wrong, and that collect can reuse the original allocation, despite it not being readily apparent how it knows how to do that with a generic Iterator. (And the article got me to understand that part, too!)

Unlike most clickbait, which lures you in only to let you down, I learned something here. Reading it was worthwhile.

¹https://doc.rust-lang.org/stable/std/boxed/struct.Box.html#m...
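
For reference, here's a minimal sketch of the Box::leak the footnote points at, which leaks deliberately and by design:

    fn main() {
        // Box::leak consumes the Box and hands back a reference that lives
        // for the rest of the program; the heap allocation is never freed.
        let msg: &'static mut String = Box::leak(Box::new(String::from("leaked")));
        msg.push_str(" forever");
        println!("{msg}");
        // No Drop ever runs for the String: an intentional memory leak.
    }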


> "If I see a Java program whose RSS is doing nothing but "up and to the right" until the VM runs out of memory and dies a sweet sweet page thrashing death, I'm going to describe that as a "memory leak"."

By this definition, if a program reads in a file and you point it to a small file then the program does not have a memory leak, but if you point it to a large enough file, then the program does have a memory leak. Whether or not a program has a memory leak doesn't depend on the code of the program, but how you use it. But then on a bigger computer, the program doesn't have a memory leak anymore.

That seems a less useful definition than the parent poster's / the common definition.


… that's really not the idea I'm trying to convey with the comment.

Clearly, if you feed a program a larger file that it is going to read into memory to process, it is then expected that it will consume more resources on account of doing more work. But that is memory being expended on visible, useful work. All of the examples in the comment refer to memory being "allocated" (in the sense of being assigned to the program) but not fulfilling any visibly useful function insofar as the operator/programmer can see: Java's GC being unable to effectively reclaim unused memory prior to killing a machine, or the OP's example of a Vec allocation without seemingly having a purpose (as it is in excess of what is required to allow for amortized appends).


There is an implied steady state in what the program is doing. If it goes right from "loading" to "exit" then you need a more complicated analysis.

When you have that steady state, that definition looking at uncontrolled growth is more useful than trying to dissect whether the memory is truly unreachable or only practically unreachable.


>I don't care? You're just forcing me to wordsmith the problem description

Yes, because if you don't define the problem clearly, the problem won't be solved. Java being inefficient with memory use doesn't mean any memory was leaked.

Memory leaks can be tricky to track down, and if I spent 6 hours looking for a memory leak only to come back and found out you meant it uses more memory than what's efficient I'd be pissed I wasted 6 hours because you wanted to save 5 minutes.


"uses more memory than what's efficient"

There is a hidden memory store using orders of magnitude more RAM than the live data. Why do we need to nitpick exactly how hidden it is? Are you going to be mad if I don't know whether it's literally inaccessible or not?


Because there are legitimate reasons why memory can be allocated. This is like calling your OS cache a memory leak when you open up Task Manager and see you only have 400MB free. A memory leak implies memory that is lost for good and is no longer being kept track of.

Consider it this way - if I had a program that connected to a database and used a connection pool to improve performance, would it be a "connection leak" that 5 connections were opened even though the database was idle?

The framing here is similar: Rust, in an attempt to improve performance, reused large memory allocations. Some applications do this on purpose and call it buffer pools.


Except this memory is not being intentionally used as a cache/buffer.

Rust attempts to keep the vector capacity in a range for that purpose, and failed to do so here.

No matter what, it's a bug. So none of those possible justifications fit, because it's a bug.

For the database analogy, I would call it a connection leak if the number of idle connections greatly exceeded the amount that had ever been simultaneously busy and they weren't actually getting reused.


I disagree that it’s a small semantic difference.

I don’t think it’s clickbait though, I think the author was just misusing terminology.


I said it was subtle, not small. I agree it's a valuable distinction.


A memory leak means it leaks: it's no longer under control. Here the memory is under control; it can be reclaimed by the program.


I did read the article.

> the memory waste from excess capacity should always be at most 2x, but I was seeing over 200x.

So the 200x analysis is his problem?


200x is correct. What's happening is that he makes a vector with tons of capacity and only a few elements, so lots of wasted space. Then he turns that vector into another vector using an operation that used to allocate a new vector (thus releasing the wasted space) but now reuses the previous vector's allocation (retaining the wasted space).

It's definitely a sneaky bug. Not a "memory leak" in the normal sense since the memory will still be freed eventually. I'd call it an unexpected waste of memory.


Rust can re-use an allocation, but if the new item is smaller than the previous it doesn't automatically remove (free) the "wasted" memory left over from the previous allocation. I think this is categorically not a memory leak as the memory was absolutely accounted for and able to be freed (as evidenced by the `shrink_to_fit()`), but I can see how the author was initially confused by this optimization.

The 2x versus 200x confusion, IMO, is that the OP was conflating this with how Vec doubles in size when it needs more space, so they were assuming the memory should have been at most 2x the new size in the worst case. And because in the OP's case the new type size was smaller than the previous one, it seemed like a massive over-allocation.

Imagine you had a `Vec<Vec<u16>>` and, to keep it simple, there were only 2 elements in both the inner and outer Vecs. If we assume Rust doubled each Vec's allocation, that'd be 4x4 "slots" of 2 bytes per slot, or 32 bytes total allocated (in reality it'd be a little different, but to keep it simple let's just assume).

Now imagine you replace that allocation with a `Vec<Vec<u8>>` which even with the same doubling of the allocation size would be a maximum of 4x4 slots of 1 byte per slot (16 bytes total allocation required). Well we already have a 32 byte allocation and we only need 16, so Rust just re-uses it, and now it looks like we have 16 bytes of "waste."

Now the author was expecting at most 16 bytes (remember, 2x the new size) but was seeing 32 bytes because Rust just re-used the allocation and didn't free the "extra" 16 bytes. Further, when they ran `Vec::shrink_to_fit()` it shrunk down to only used space, which in our example would be a total of 4 bytes (2x2 of 1 byte slots actually used).

Meaning the author was comparing an observed 32 byte allocation, to an expectation of at most 16 bytes, and a properly sized allocation of 4 bytes. Factored out to their real world data I can see how they'd see numbers greater than "at most 2x."
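
Here's a minimal sketch of the effect being described (the types and sizes are made up for illustration, and actual capacities depend on the standard library version and allocator, so whether collect reuses the buffer isn't guaranteed):

    fn main() {
        // A vector with far more capacity than live data, standing in for
        // the author's intermediate state (sizes made up for illustration).
        let mut big: Vec<u16> = Vec::with_capacity(1_000_000);
        big.extend([1u16, 2, 3]);
        println!("before:  len={} cap={}", big.len(), big.capacity());

        // collect() can reuse the source allocation in place instead of
        // allocating fresh, so the excess capacity may carry over.
        let mut small: Vec<u16> = big.into_iter().map(|x| x + 1).collect();
        println!("collect: len={} cap={}", small.len(), small.capacity());

        // shrink_to_fit releases the excess, as the author observed.
        small.shrink_to_fit();
        println!("shrink:  len={} cap={}", small.len(), small.capacity());
    }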


Clickbait implies a specific intent to mislead for clicks, whereas I think there’s a completely good-faith disagreement here about the meaning of the word “memory leak.”


The reason GSM, 3G, 4G, SMS and so on succeeded is that everyone could implement them. I guess you had to pay license or patent fees, but they are not walled gardens. Phones from different manufacturers and/or different operators can communicate. I'm surprised that chat protocols are allowed to be monopolies by the regulators. The regulators probably don't understand tech.


It is indeed confounding how something as simple as chat and messaging can be so difficult to standardize. I suppose looking at the shitshow that is email standards (and how difficult it is to ensure valid senders) gives some insight, but yikes.


Because "chat" is pretty much meaningless as a term. The only common thread between chat apps is "bidirectional data transfer between at least two devices." SMS and Discord are both chat but have wildly different completely incompatible semantics, iMessage embeds full iOS apps and a payment network into the chat.

I can't see any world where chat gets standardized that doesn't involve throwing out everything except the most basic sms-style semantics which is basically what RCS is.


You can have different standards for different use cases. So you mean that iMessage and RCS are so different that Android can't use iMessage or Apple couldn't use RCS? We don't need to find one standard to rule them all. But we need to stop anti-competitive behavior like allowing these protocols to be exclusive.


Used as a common denominator for basic communication, yes. Use for the kinds of rich interactions and modalities (like the "server" metaphor) that Apple, Google, and everyone else wants to add to chat, no. And that's where we get lost in "extension hell."


The roundtrip time is never consistent. Light travels at a different speed in fiber depending on the temperature. This is why you calibrate every second.


The SEK used to be pegged. That led to another kind of crisis in the nineties. You are right that we should've adopted the euro, though.


"They have all these useless little coins!!"

Said the swedes back then apparently completely forgetting about the 50 öre coins and the gigantic 5sek coins.


Tax. If your country has capital gains tax, you need to declare every single trade into real USD. When you are trading between crypto pairs you are not realizing any gain, so you don't need to declare it. Until you eventually sell, of course.


If you're trading crypto pairs then that is considered a taxable event.[0] Many people owed taxes in 2017 because of that rule. It's not considered a like-to-like exchange.

[0] https://www.coindesk.com/learn/crypto-capital-gains-and-tax-...


Are you sure? What I've read is that every cryptocurrency transaction realizes the gain/loss, including trading one currency for another. See https://www.irs.gov/individuals/international-taxpayers/freq...


99.999% sure the IRS is not going to accept "but it was technically USDC not USD".

The main reason was that historically, crypto institutions tended to be no-questions-asked blacklisted from any real bank, so those who could get banking made stablecoins. Circle, who runs USDC, got banking access from some favorable VC connections iirc, and Tether .... who knows.


As other comments said, that's wrong, but there are related use cases. If you have a lot of XMR, want to evade tax, and want your assets to be stable, you can't exchange it for real USD because that will be tracked by the tax office. However, just exchanging it for USDT (between your own crypto wallets, not a wallet on an exchange) won't be tracked by your local tax office. Eventually you still need to wash the money or use it for virtual things like NFTs, but it works as a temporary solution.


I'm also a proponent of nuclear, but modern nuclear power plants seem to take 15-20 years to construct and typically go tens of billions of EUR over budget. So one does not simply "build a reactor". Look at Olkiluoto 3 in Finland, Flamanville 3 in France and Hinkley Point C in the UK.

https://en.wikipedia.org/wiki/EPR_(nuclear_reactor)

