yeldarb on March 14, 2019 | on: Show HN: This Question Does Not Exist

It is; the tokenizer isn't reversible (and it adds spaces all over the place). But a lot of these I should be able to add to the regex that converts the output back into a more human-readable format (in the raw output there's a space before every punctuation mark, so I already remove those extraneous spaces from periods, commas, etc.). I just haven't gotten around to adding any heuristics specifically for code, but adding a bit more post-processing is on my to-do list.
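
For anyone curious what that kind of cleanup looks like, here is a minimal sketch in Python. It is not the actual regex used on the site; the function name `detokenize` and the exact punctuation set are assumptions made for illustration. It only handles the case described above, a stray space before a punctuation mark.

```python
import re

# Hypothetical pattern (an assumption, not the site's real regex):
# one or more spaces directly before a common punctuation mark.
_SPACE_BEFORE_PUNCT = re.compile(r" +([.,;:!?])")


def detokenize(text: str) -> str:
    """Drop the extra space a tokenizer left before punctuation marks."""
    # Replace "<spaces><punct>" with just "<punct>".
    return _SPACE_BEFORE_PUNCT.sub(r"\1", text)


print(detokenize("Hello , world . How are you ?"))
# prints: Hello, world. How are you?
```

A rule this blunt also rewrites code, where a space before `:` or `?` can be intentional, which is why code snippets would need their own heuristics.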

yeldarb on March 17, 2019

I updated my regexes last night to clean up some of the tokenizer noise, so much of the formatting in the code snippets should look a bit more natural now.
