
Being fluent in Japanese as a second language, I agree with this. It's Zipf's law and sounds great, but 90% isn't as useful as it sounds. You'll mostly recognize a single common Kanji of compound words consisting of 2-3 characters, or common structural words. It's a far cry from being able to understand content you find in the wild.

Also, "understanding" a Kanji is an ill-defined term. Most Kanji have multiple meanings based on context, and many different readings. So for each Kanji you're not learning a single character, but possibly a lot more. The Kanji that correspond to more abstract concepts, especially, cannot be learned by themselves. They don't have a concrete meaning like "eat" or "drink". You essentially have to memorize all of their word compounds; a single character does not help.

EDIT: I also started out learning Japanese by memorizing the top 2000 Kanji using Spaced Repetition. While it definitely helped, it wasn't nearly as useful as some of these marketing-driven sites want you to believe. Kanji meanings are too complex to be captured like that. Even if you "know" all Kanji in a word you'll likely not understand the word's meaning unless it's something simple and concrete. I think you are much better off memorizing and studying word compounds. Over time you will automatically "pattern match" the Kanji you see often to their abstract concepts.

Example from 1min of browsing a JP text: Take "可能" which is a very common word that usually means "possible". Knowing the two Kanji (tolerant and ability) is not going to help you. It could mean dozens of other things based on those simplified Kanji meanings. This is not an exception; the majority of words are like this. On the other hand, let me give you a bunch of words containing "可": 許可、可決、可能性、不可欠 (permission, approval, possibility, essential) and you start to pattern-match that "可" corresponds to something like "positive possibility", but it's hard to translate.



Let me add that just as it's not enough to focus on learning Kanji, even learning compound words made out of Kanji is not enough. There are many Japanese words that are never or only rarely written with Kanji, and depending on the text there can be many sentences with no Kanji at all. I recognize 4000+ Kanji by knowing Mandarin, so I can somewhat read e.g. software manuals (many Kanji, some English loanwords) but children's books are too advanced for me.

The "777 Kanji for 90% coverage" figure is probably more relevant for illiterate native speakers. Based on a corpus of 167,281 Japanese sentences I had segmented and lemmatized some time ago, you'd need 2,685 words for 90% (up to "故郷", "birthplace", usually written "ふるさと"), 6,564 for 95% (up to "すらり", "slender, smooth"), 14,098 words for 98% (up to "鼻先", "tip of the nose") and 20,657 words for 99% (up to "恐慌", "panic"). Obviously the exact numbers depend a lot on the diversity of the corpus, so don't mistake them for fixed targets to aim for.
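For anyone who wants to run the same kind of count on their own corpus, here is a minimal sketch of the calculation described above. It assumes you already have a flat list of segmented, lemmatized tokens; the function name and the toy frequencies are made up for illustration and are not the numbers from the 167,281-sentence corpus.

```python
from collections import Counter


def words_needed_for_coverage(tokens, targets=(0.90, 0.95, 0.98, 0.99)):
    """Map each coverage target to the number of most-frequent distinct
    words needed to account for that share of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    remaining = sorted(targets)
    needed = {}
    running = 0
    # Walk the vocabulary from most to least frequent, accumulating tokens.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        # One word can push the running total past several targets at once.
        while remaining and running >= remaining[0] * total:
            needed[remaining.pop(0)] = rank
    return needed


# Toy corpus with invented frequencies (hypothetical, for illustration only).
toy_tokens = (
    ["の"] * 500 + ["に"] * 300 + ["する"] * 120
    + ["可能"] * 50 + ["故郷"] * 25 + ["鼻先"] * 5
)
print(words_needed_for_coverage(toy_tokens))
```

The result depends entirely on how the corpus was segmented and lemmatized, so two people running this on different data will get different thresholds.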


I know there are a lot of ideas that there isn't a "right way" to use SRS, and I'm sure that someone would argue that what I am about to say is a No True Scotsman or whatever, but the simple fact of the matter is that you were using SRS wrong. In this instance 'wrong' means 'inefficient', 'suboptimal', and 'in a severely defective fashion'.

You don't just load single words up into it and study them on their own, you load words, compounds, sentences, sentence fragments, radicals, etc. etc. and everything else you can, and the SRS system helps you study and remember it. I can understand SRS not being as useful to you if you were only using it for single characters.


Just to clarify, I did use SRS for all my learning materials, including words and sentences. But I also had decks for single characters (you can find a lot of these online). Over time I found these pure character decks to be the least helpful or nearly useless, so I dropped them. Using SRS for other things like grammar and words/compounds is great.


Afterthought: Kawaii (cute) is actually 可愛い, also containing "可", which literally may mean something like "a thing that can be easily loved", or simpler, cute. But you wouldn't be able to guess that if you just know the Kanji.


I don't know for Japanese as meaning sometimes shifts from Chinese, but in Chinese the standard definition of 可 is "can, may, be able to".

You obviously learn it by itself but as Chinese words are mostly a combination of 2 characters, you immediately also have to learn e.g. 可以 (can, may, be able to), 可能 (maybe), 可爱 (cute) etc.

So someone who's learning characters in order to get 90% coverage (or whatever) would not simply learn characters but learn actual words. Learning characters in isolation would not be that helpful, indeed.

When you don't know a word (i.e. a combination) but you know the individual characters it is much easier to learn the new word either by guessing or checking.

In context, the meaning of 可爱 would be fairly straightforward to guess, for example. Even in English 'lovable' is a synonym of 'cute'.


> the meaning of 可爱 would be fairly straightforward to guess

To be honest this whole thread about 可愛い is more or less bonkers, because it's an ateji. The word's meaning doesn't derive from the characters, the characters got arbitrarily attached to an existing word because they were similar in sound and meaning.

As such, the whole thing is about as meaningful as talking about how easy it is to guess that 珈琲 means "coffee"...


I must say I don't know much about ateji in Japanese.

In this case, though, it does seem that the characters were chosen at least partly because of their actual meaning.

It seems that it is both an ateji and a jukujikun [1] because the word does not come from the characters but the characters do have the correct meaning.

[1]https://en.wiktionary.org/wiki/%E5%8F%AF%E6%84%9B%E3%81%84#J...


The characters of 可愛い do not have the “correct” meaning. Just like in Chinese, 可 means “can”, or “possible”, and 可愛い is a weird exception. (It’s also nearly always written in hiragana.)

Nearly every word involving the kanji 可 in Japanese has something to do with permission: impossible (不可能), possible (可能), permission (許可), approval (可決), etc.

That said, the kanji-centric view of the world is ineffective, and one primarily adopted by beginner learners who have not spent any significant amount of time studying words in Japanese. It is always better to just learn words.


> the characters were chosen at least partly because of their actual meaning

Sure, didn't I say they were in my post?

The point was, in a discussion of how well X predicts Y, it's not very useful to examine a test case where Y came first and X was chosen post-hoc to match it.


> Sure, didn't I say they were in my post?

No, quite the opposite actually ;)


> the characters got arbitrarily attached to an existing word because they were similar in sound and meaning


It does not seem arbitrary in this case because the meaning does match.

I do take your point that using that word in the discussion above, which is about Japanese, was not the best example. On the other hand, it is a good example in Chinese.


(A) In Japanese the meanings don't match that closely. The word didn't originally mean cute, but rather pathetic or pitiable, and evolved over time. More info: http://gogen-allguide.com/ka/kawaii.html

(B) By arbitrary here I mean that there is no linguistic connection. "Arbitrarily chosen because they are similar" => "chosen for no reason other than their similarity".


If the characters were chosen because they fit the meaning, what does that change?


The thread was about how words' characters and meanings relate, and with ateji that relation is highly atypical, is the basic point.


Actually, no. 可 is like the 'be able to' prefix in Japanese/Chinese, so 可愛 can be seen as a compound word because of that; it literally means 'is able to be loved'.

In Chinese, it can be further compounded into many common phrases:

可悲(pathetic, 可+'sad')

可气(annoying/irritating, 可+'anger')

可怜(pitiful, 可+'pity')

可恨(resentful, 可+'hate')

A more suitable example, in my opinion, would be 可(ke)乐(le), which means Coke in Chinese and is a transliteration.


Yes, most of the time 可 means 'able' in '__able'.

But it could also be 'a thing that can be easily loved'. Or simply, 'pleasing'. For example, 可口 means tasty. So, 可爱 could be understood in both ways actually.


Perhaps not the best example. Knowing that 可 is usually read 'ka' in compound words, and that 愛 is read as 'ai', you get to 'ka-ai-i', at which point the meaning of 愛 (love) will likely push you to かわいい (kawaii).


Of course, but you are assuming that you already know the meaning of "kawaii". My point is that if you don't know the word, you can't infer the meaning from the Kanji, except in very simple cases. That's why memorizing Kanji alone really isn't as helpful. Knowing the meanings and readings of 90% of Kanji doesn't help you much in understanding compounds/words.


> but you are assuming that you already know the meaning of "kawaii".

Have you ever been around non-native students of Japanese? This is the one word that is pretty much guaranteed to be known by all with an interest in anything remotely Japanese.

Your point is valid, but kawaii is perhaps not the right example.


Actually, I suspect it's a pretty decent example.

Most non-native students of Japanese will almost always encounter that word in hiragana, katakana, or even Romaji--encountering that as a kanji is actually remarkably rare.

The fact that the kanji for kawaii is an ateji makes it one of those odd ducks.


> Most non-native students of Japanese will almost always encounter that word in hiragana, katakana, or even Romaji--encountering that as a kanji is actually remarkably rare.

Interestingly, it's very common for students of Chinese. The word 可爱 (kě'ài, "cute") is dirt-common, but it isn't native to Chinese -- it originates as a loan from Japanese.

This isn't clear to the Chinese themselves, who use 卡哇伊 (kǎwāyī) if they want to refer to the Japanese word.

There's another modern word for cute, 萌 méng, which is also a loan from Japanese, though I think popular awareness of it as a weird loanword is higher, since its literal meaning ("sprout") is so far removed from the concept "cute".


The counterpoint is that if you are fluent in spoken Japanese you can get by with a fairly minimal number of kanji (and complete understanding of the kana as well, obviously).

Which is the trend with native Japanese people too.


> Even if you "know" all Kanji in a word you'll likely not understand the word's meaning unless it's something simple and concrete.

I don't think anyone passingly familiar with Japanese thinks otherwise. That is, I think you're arguing against a position here that nobody actually holds. The argument for memorizing kanji is that it makes it easier to learn compound words, not that you'll just know them without learning them.


I agree that people with some Japanese knowledge likely already know this. But out there you will find a lot of marketing material/posts targeting absolute beginners promoting some kind of "Top X% Kanji lists", as if they are a huge shortcut and secret to quickly learning Japanese. So I'm just saying they aren't.

If you talk to people who don't know much about Japanese they often believe that memorizing the characters is the difficult part and doing so will help you understand a large fraction of written text. I think it's quite a common misconception.


I feel like a lot of this perception comes from a tendency for people to equate Kanji characters with words. I always explain to others that a character is kind of like a root, e.g. Sub-optimus-al = suboptimal. It would be crazy to say one can learn English by memorizing just a few hundred Latin roots and suffixes, and in fact such a claim is so irrelevant nobody even keeps track of the statistics.


This is why I created an Anki deck that contains both kanji and common words that use those kanji [1]. It even includes statistics on how much each reading is used, so you can estimate if it's worth learning this reading or treat it as an exception.

[1] https://ankiweb.net/shared/info/831167744


I agree, and that's why I find wanikani so useful. After 7-8 years trying to learn kanji and always failing after 300-400, I have been using wanikani for ~ 6 months, and the system of building vocabulary on top of kanji on top of radicals really did it for me. I learnt 600 kanji in 6 months, ~ 2000 words of vocabulary, and I am quite confident I will reach 1000+ EOY.

Even at the beginning, learning the simple kanji, having the vocabulary I knew/heard living in Japan associated to the kanji made it very useful from the start.

For anybody learning kanji with anki, etc. and failing, I recommend giving wanikani a try. I am just a happy user. Disclaimer: I live in Japan, work in a mixed English/Japanese environment and speak simple Japanese w/ my wife every day


There’s some level of proficiency (whatever the percentage is) where each sentence is understood save for one word, and that word because of a single character that you can then look up in a dictionary. To me, that’s a useful level to achieve, because it means that you don’t have to constraint-solve sentences by guessing at the meaning of one character to decode the meanings of others, but instead can just hold onto the complete sentence minus one “hole” in your mind while you go look that “hole” up. Much less mentally-taxing!
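To put rough numbers on that "one hole per sentence" level, here is a back-of-the-envelope sketch. It assumes every token in a sentence is known independently with the same probability, which is certainly not true of real text (unknown words cluster by topic), so treat the output as an illustration of the arithmetic rather than a measurement. The function names are made up.

```python
def p_all_known(coverage, sentence_length):
    """Chance that every token in a sentence is known, assuming each token
    is known independently with probability `coverage`."""
    return coverage ** sentence_length


def p_at_most_one_unknown(coverage, sentence_length):
    """Chance that a sentence has zero or exactly one unknown token,
    under the same independence assumption (binomial with k = 0 or 1)."""
    p, n = coverage, sentence_length
    return p ** n + n * p ** (n - 1) * (1 - p)


# Compare a few coverage levels for a hypothetical 10-token sentence.
for coverage in (0.90, 0.95, 0.98, 0.99):
    print(
        f"{coverage:.0%} coverage: "
        f"all known {p_all_known(coverage, 10):.2f}, "
        f"at most one hole {p_at_most_one_unknown(coverage, 10):.2f}"
    )
```

Under these assumptions, 90% token coverage leaves only about a third of 10-token sentences fully readable, which fits the point made upthread that 90% isn't as useful as it sounds.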


When did you start studying Japanese?



