The whole question is whether CSV is encoding-independent and operates on bytes (we’re addressing AndyKelley’s comment). The answer clearly demonstrated here is no: CSV operates on characters, not bytes. You need to decode the byte stream to Unicode first and let the CSV parser work on the decoded text, so that it splits on U+002C. Splitting on the byte 0x2C in the stream before decoding destroys the data.
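To make the difference concrete, here is a minimal sketch in Python. The encoding (UTF-16LE), the sample string, and the helper names `split_bytes_first` and `decode_then_parse` are all assumptions picked for illustration; they are not from the thread. UTF-16LE is used because the byte 0x2C can occur there inside a character that is not a comma: U+012C is stored as the bytes 2C 01.

```python
import csv
import io

# Hypothetical sample record: U+012C, a comma (U+002C), then "x".
# In UTF-16LE this is the byte sequence 2C 01 2C 00 78 00.
text = "\u012c,x"
raw = text.encode("utf-16-le")


def split_bytes_first(data):
    """Wrong order: split the undecoded byte stream on the byte 0x2C."""
    return data.split(b"\x2c")


def decode_then_parse(data, encoding):
    """Right order: decode to Unicode text, then let the csv module split on U+002C."""
    decoded = data.decode(encoding)
    return list(csv.reader(io.StringIO(decoded, newline="")))


byte_fields = split_bytes_first(raw)
rows = decode_then_parse(raw, "utf-16-le")

print(byte_fields)  # three fragments, none of which is a valid UTF-16LE field
print(rows)         # one row with the two intended fields
```

The byte-level split cuts U+012C in half and leaves the remaining pieces misaligned, so the fragments can no longer be decoded back into the original fields. Decoding first yields the two fields as written.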