CSVs are such a pain if you have freeform text (in which case you have to handle newlines and escaped quotes).
It seems like using FS/GS/RS/US from the C0 control codes would be divine (provided there were implicit support for viewing/editing them in a simple text editor). I get that they're non-printable control characters, and thus not traditionally rendered... but that latter point could have been addressed ages ago as a standard convention in editors (enter VIM et al., which will indeed let you see them).
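To make the appeal concrete, here is a minimal Go sketch (purely illustrative, not anyone's actual file format) of how trivially records split when RS/US are the delimiters, since those bytes never occur in ordinary text:

    package main

    import (
        "fmt"
        "strings"
    )

    const (
        rs = "\x1e" // record separator
        us = "\x1f" // unit separator
    )

    func main() {
        // Freeform text with embedded newlines and quotes needs no escaping,
        // because RS/US never show up in normal prose.
        data := "Alice" + us + "likes \"quotes\"\nand newlines" + rs +
            "Bob" + us + "plain text"

        for _, record := range strings.Split(data, rs) {
            fmt.Printf("%q\n", strings.Split(record, us))
        }
    }

No quoting rules, no escape rules: split on RS for records, split on US for fields.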
Text editors that can put control characters into files, and indeed PC keyboards that have a [Control] key, have been around for decades, though. For example: With WordStar, SideKick, and Microsoft's EDIT, one just typed [Control]+[P] and then the control character. M. Darkphibre mentioned VIM, but really this isn't an idea that was ever limited to just VIM and emacs.
The problem is not the editor; it's the human at the keyboard who sees the "," key and thinks "that'll do". Using an obscure dedicated character is not going to happen.
I guess my point was more a note of curiosity that these control codes have become obscure, when text-format interchange is so prevalent throughout computing history.
I do not want to learn new input methods for every program; I want standardized compose-like functionality that allows me to write both diacritics and backticks (and, as a bonus, the rest of Unicode).
All operating systems have that too. Look up "Input Method Editor" in your favorite OS's documentation. See also dead keys, the compose key, AltGr, etc. If you think you can't edit a file because it has funny characters in it, then you're not trying hard enough.
I know; still, there is no reasonable way* to type both common European diacritics and backticks/tildes on Windows without installing third-party software.
* I find dead keys irritating beyond reason, so I do not count them as an option
There are plenty of cases where a lazy programmer will want to pass the records into something else that either doesn't handle non-printables, or does handle them but renders them in a way the user doesn't like.
TSV also works more reliably than CSV, because most people don't put tab characters in the data in these kinds of records. Tab is even the default field delimiter for cut. But everybody uses CSV, because again it's easier to reason about the above. (shrug)
No matter what delimiters we use, we still have either a data sanitization problem or a data escaping problem. I've worked with a wire protocol that used the ASCII record delimiters, but with binary data, so you also had to use a binary escape (0x1b) and set the eighth bit on the following byte.
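I don't remember the exact framing rules, but a scheme in that spirit looks roughly like this in Go (the delimiter set and escape behaviour below are my reconstruction, not the real protocol): whenever a payload byte collides with a delimiter or with ESC itself, emit ESC followed by that byte with its high bit set, so the escaped form can never be mistaken for a delimiter.

    package main

    import (
        "bytes"
        "fmt"
    )

    const (
        esc = 0x1b // ESC: escape introducer
        fs  = 0x1c // file separator
        gs  = 0x1d // group separator
        rs  = 0x1e // record separator
        us  = 0x1f // unit separator
    )

    // escape replaces any delimiter or ESC byte with ESC followed by
    // the same byte with its high bit set.
    func escape(data []byte) []byte {
        var out bytes.Buffer
        for _, b := range data {
            switch b {
            case esc, fs, gs, rs, us:
                out.WriteByte(esc)
                out.WriteByte(b | 0x80)
            default:
                out.WriteByte(b)
            }
        }
        return out.Bytes()
    }

    // unescape reverses it: a byte following ESC is a literal with its
    // high bit cleared.
    func unescape(data []byte) []byte {
        var out bytes.Buffer
        for i := 0; i < len(data); i++ {
            if data[i] == esc && i+1 < len(data) {
                i++
                out.WriteByte(data[i] &^ 0x80)
                continue
            }
            out.WriteByte(data[i])
        }
        return out.Bytes()
    }

    func main() {
        payload := []byte{'a', rs, 'b', esc, 'c'}
        enc := escape(payload)
        fmt.Printf("% x\n", enc)           // 61 1b 9e 62 1b 9b 63
        fmt.Printf("% x\n", unescape(enc)) // 61 1e 62 1b 63
    }

Either way, the point stands: you pay the sanitization tax or the escaping tax somewhere; the choice of delimiters only decides where.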
I submitted a patch to support decimal commas when parsing timestamps in Go in 2013. I thought this was a slam dunk because, while major users of the decimal dot include the USA, China, India, and Japan, the decimal comma is used by pretty much every other country in the world. Going from ~40% support to ~99.9% support seemed like an obvious win.
Rob Pike politely declined the patch, commenting "I might prefer to leave it unaddressed for the moment. Proper locale treatment is coming (I hope soon) and this seems like the wrong place to start insinuating special locale handling into the standard library."
Three years later another Go team member commented that "Date localization is definitely still in the planning."
We're in year 8 now. The issue is still open. Rob Pike is still hoping.
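The usual workaround in the meantime is to normalize the comma yourself before handing the string to time.Parse. A quick illustrative sketch (the layout string is just an example, not anything from the patch):

    package main

    import (
        "fmt"
        "strings"
        "time"
    )

    // parseCommaTimestamp normalizes an ISO 8601 decimal comma to a dot,
    // since Go's time.Parse only accepts the dot form.
    func parseCommaTimestamp(s string) (time.Time, error) {
        return time.Parse("2006-01-02T15:04:05.000Z07:00",
            strings.Replace(s, ",", ".", 1))
    }

    func main() {
        t, err := parseCommaTimestamp("2013-06-01T12:30:45,500+02:00")
        fmt.Println(t, err)
    }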
№ of countries ≠ № of people though. ⅘ of the 5 most populous countries¹ use decimal points, and together they alone² have ~42½% of the world’s people. Do all decimal point countries together have ≤50%? (Edit: Is it really not “normal” to do things the Anglophone way?)
Wow, taking a look at the related issues, they are really hostile towards fixing this particular pain point. I knew Go had a reputation for being condescending and opinionated, but I had no idea it was this bad.
My guess at the historical intention of *scanf is that it was meant for parsing tab-delimited and fixed-width records from old-school data files.
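That model still shows through in Go's fmt.Sscanf, which borrows the *scanf verbs; an illustrative sketch of both styles:

    package main

    import "fmt"

    func main() {
        // Whitespace-delimited record: verbs skip leading whitespace,
        // and %s stops at the next whitespace.
        var (
            id   int
            name string
            temp float64
        )
        fmt.Sscanf("1042 STATION7 23.5", "%d %s %f", &id, &name, &temp)
        fmt.Println(id, name, temp) // 1042 STATION7 23.5

        // Fixed-width record: width specifiers slice the columns.
        var y, m, d int
        fmt.Sscanf("20130601", "%4d%2d%2d", &y, &m, &d)
        fmt.Println(y, m, d) // 2013 6 1
    }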