Or we could use the actual characters for this purpose - the FS (file separator)...

ilyagr · 2025-04-27T03:14:22 1745723662

I don't think popularizing these ASCII characters would solve the problem in its entirety.

If RS and US were in common use, there would be a need to have a visible representation for them in the terminal, and a way to enter RS on the keyboard. Pretty soon, strings that contain RS would become much more common in the wild.

Then, one day somebody would need to store one of those strings in a table, and there would be no way to do so without escaping.
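To make that concrete, here is a minimal Python sketch. The `encode` and `decode` helpers are hypothetical and not taken from any real format; the sketch assumes a naive scheme that joins fields with US and records with RS and does no escaping at all:

```python
RS = "\x1e"  # ASCII record separator (octal 36)
US = "\x1f"  # ASCII unit separator (octal 37)


def encode(rows):
    """Join fields with US and records with RS, with no escaping (hypothetical helper)."""
    return RS.join(US.join(fields) for fields in rows)


def decode(text):
    """Split on RS, then on US, again with no escape handling (hypothetical helper)."""
    return [record.split(US) for record in text.split(RS)]


rows = [
    ["id", "note"],
    ["1", "plain text"],
    ["2", "contains" + RS + "a separator"],  # a field that itself holds an RS
]

round_tripped = decode(encode(rows))

# The embedded RS is indistinguishable from a real record boundary,
# so the last record is split in two and the table no longer round-trips.
print(len(rows), len(round_tripped))  # 3 4
```

Any fix (escaping, length prefixes, banning the character in values) brings back the kind of rule the separators were meant to avoid.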

I do think that having RS display in the terminal (like a newline followed by some graphic?) and using it would be an improvement over TSV's use of newline for this purpose. But since it's not a perfect solution, I can understand why people are not overly motivated to make this happen. The time for this may have been 40+ years ago, when it would have been feasible to agree on a standard for how to display or type it.

eviks · 2025-04-27T03:53:08 1745725988

> there would be a need to have a visible representation for them in the terminal, and a way to enter RS on the keyboard.

Both already possible, they have official symbols representing them

> Then, one day somebody would need to store one of those strings in a table, and there would be no way to do so without escaping.

Why? But also, yes, escaping also exists, just like in the alternative formats

ilyagr · 2025-04-27T04:23:33 1745727813

> Both already possible, they have official symbols representing them.

I'm not sure what you mean. As an illustration, my terminal does not print anything for them.

    $ printf "qq\36\37text\n"
    qqtext

*Update/Aside:* "My terminal", in this case, was `tmux`. Ghostty, OTOH, prints spaces instead of RS or US.

Unicode does have symbols for every non-printable ASCII character, which you can see as follows with https://github.com/sharkdp/bat (assuming your font has the right characters, which it probably does):

    $ printf "qq\36\37text\n" | bat -A --decorations never
    qq␞␟text␊

Here, `␞` is https://www.compart.com/en/unicode/U+241E, one of the symbols for non-printable characters that Unicode has; different fonts display it differently. See also https://www.compart.com/en/unicode/block/U+2400.
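For what it's worth, the mapping is regular: the Control Pictures block places the picture for control code N at U+2400 + N, with DEL at U+2421. A minimal Python sketch of that substitution follows. The `show_controls` name is made up for illustration, and this is not a claim about how `bat` implements `-A`:

```python
def show_controls(text):
    """Replace ASCII control characters with their Unicode Control Pictures."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            # C0 controls U+0000..U+001F map to U+2400..U+241F.
            out.append(chr(0x2400 + code))
        elif code == 0x7F:
            # DEL has its own picture at U+2421.
            out.append("\u2421")
        else:
            out.append(ch)
    return "".join(out)


# "\x1e" is RS and "\x1f" is US, the same bytes as \36 and \37 in the printf call.
print(show_controls("qq\x1e\x1ftext\n"))  # qq␞␟text␊
```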

Does Unicode have some better representation for it?

eviks · 2025-04-27T04:44:53 1745729093

Yes, I did mean the Unicode symbols, such as ␞ (U+241E), that represent separators. And as your `| bat` example shows, they can also be displayed in the terminal.

If you meant that the default display should always be symbolic, I'm not sure. The newline separator isn't displayed in the terminal as a symbol either, but maybe that's just a matter of extra terminal config.

EvanAnderson · 2025-04-27T01:19:57 1745716797

I did an ETL project for an ERP system that used these separators years ago. It was ridiculously easy because I didn't have to worry about escaping. Parsing was an easy state machine.
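Roughly the shape of such a parser, as a minimal Python sketch rather than the actual project code. The `parse` function is hypothetical, and it assumes the data never contains RS or US inside a field, which is what makes the lack of escaping safe:

```python
RS = "\x1e"  # ASCII record separator
US = "\x1f"  # ASCII unit separator


def parse(text):
    """Character-by-character parser: US ends a field, RS ends a record.

    Assumes field values never contain RS or US, so no escape state is needed.
    """
    records, fields, current = [], [], []
    for ch in text:
        if ch == US:
            fields.append("".join(current))
            current = []
        elif ch == RS:
            fields.append("".join(current))
            current = []
            records.append(fields)
            fields = []
        else:
            current.append(ch)
    # Flush a final record that was not terminated by RS.
    if current or fields:
        fields.append("".join(current))
        records.append(fields)
    return records


# Commas, quotes, and newlines inside fields need no special handling.
data = "a" + US + "b" + RS + "c,d" + US + "line1\nline2" + RS
print(parse(data))  # [['a', 'b'], ['c,d', 'line1\nline2']]
```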

Notepad++ handles the display and entry of these characters fairly easily. I think they're nowhere near as unergonomic as people say they are.

addoo · 2025-04-27T00:51:21 1745715081

I’m pretty sure part of the intent is that it should be easy to write (type) in this format. Separator characters are not that. Depending on the editor, they’re not especially readable either.