You know, it's been a long time since this conversation, but reflecting on it, I think it has to do with grapheme clusters not being particularly consistent across operating systems and over time. The article even has an example where the same 5 USVs (Unicode scalar values) count as either 1 or 2 graphemes, depending on which Unicode rules are applied.
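
To make that concrete, here's a rough sketch of how a rule change can flip the count. It is a toy segmenter, not an implementation of the real Unicode segmentation rules. The sample string is my own stand-in (a facepalm ZWJ sequence), since I haven't checked which 5 scalar values the article uses, and `is_extending`, `count_clusters`, and the `join_after_zwj` flag are names I made up for illustration.

```python
# Toy grapheme counter: a simplified illustration, NOT the real UAX #29 rules.
ZWJ = 0x200D  # zero width joiner


def is_extending(cp: int) -> bool:
    """Code points this sketch never breaks before (assumed, simplified set):
    ZWJ, variation selector-16, and the emoji skin-tone modifiers."""
    return cp == ZWJ or cp == 0xFE0F or 0x1F3FB <= cp <= 0x1F3FF


def count_clusters(text: str, join_after_zwj: bool) -> int:
    """Count clusters; join_after_zwj toggles a hypothetical 'newer' rule
    that also keeps whatever follows a ZWJ in the same cluster."""
    count = 0
    prev = None
    for ch in text:
        cp = ord(ch)
        glued = prev is not None and (
            is_extending(cp) or (join_after_zwj and prev == ZWJ)
        )
        if not glued:
            count += 1
        prev = cp
    return count


# Assumed sample: 5 scalar values forming one emoji ZWJ sequence.
sample = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

print(len(sample))                                   # 5 scalar values
print(count_clusters(sample, join_after_zwj=False))  # 2
print(count_clusters(sample, join_after_zwj=True))   # 1
```

Same 5 scalar values both times; only the rule about what happens after a ZWJ differs. A real count would come from whatever segmentation data the OS or library ships, which is why it can drift.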