
To me the problem seems to be old-school C programmers thinking that they don't need to use a library to parse a standard format. I have only ever seen JSON parsing errors in C and C++ codebases because they are the only languages where developers seem to think it's a good idea to parse JSON using string functions rather than a JSON parser.
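
To make the contrast concrete, here is a minimal sketch in Python rather than C (the failure mode is the same in either language). The helper names `naive_get` and `parsed_get` and the sample document are hypothetical; `json.loads` is the standard library parser.

```python
import json

# Hypothetical sample: the value contains escaped quotes.
DOC = r'{"name": "Al \"Big Al\" Smith", "age": 42}'

def naive_get(text, key):
    """Ad hoc extraction with string functions only."""
    marker = '"' + key + '": "'          # assumes exactly one space after the colon
    start = text.find(marker)
    if start < 0:
        return None
    start += len(marker)
    end = text.find('"', start)          # stops at the first quote, escaped or not
    return text[start:end]

def parsed_get(text, key):
    """The same lookup through a real JSON parser."""
    return json.loads(text)[key]

if __name__ == "__main__":
    print(repr(naive_get(DOC, "name")))            # truncated at the escaped quote
    print(repr(parsed_get(DOC, "name")))           # full, unescaped value
    print(repr(naive_get('{"name":"x"}', "name"))) # whitespace change breaks the marker
```

The string-function version breaks on escaped quotes and on harmless whitespace differences, both of which the parser handles because it follows the grammar instead of guessing at it.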


Some of that is that they don't have systems like maven or npm that would make it really easy to incorporate libraries into the build.

That said, what I like about C is that it is a simple language where you can throw out the standard library and start anew: it makes me think of the old versions of ALGOL that didn't specify mechanisms for I/O, and now we know you can specify that behavior in libraries and leave the language pure.


> Some of that is that they don't have systems like maven or npm that would make it really easy to incorporate libraries into the build.

Definitely! I wish the C and C++ communities would take this problem seriously.

> That said, what I like about C is that it is a simple language where you can throw out the standard library and start anew

This definitely is a nice property. But it's not just C that can do this. You can, for example, do this with Rust and get an excellent package management system (one that can indeed package these alternatives to the stdlib) to boot.


As a C programmer, I can say that most of the JSON libraries fail simple fuzzing tests, or do some other idiotic stuff like trying to handle numbers (and doing it wrong, of course). We have awesome parser generators such as ragel, yet people still handroll parsing code that is both buggy and slow.
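
One way a parser can get numbers wrong (an assumed example, not necessarily the case meant above) is by forcing every number through a double. A small Python sketch with a hypothetical payload whose ID is above 2^53; `parse_int` is a real parameter of `json.loads`, used here only to imitate a double-only parser:

```python
import json

PAYLOAD = '{"id": 9007199254740993}'  # hypothetical value: 2**53 + 1

# Imitate a parser that stores every number as a double.
as_double = json.loads(PAYLOAD, parse_int=float)["id"]

# Python's own parser keeps integers exact by default.
exact = json.loads(PAYLOAD)["id"]

# Alternative: keep the raw digits and leave interpretation to the caller.
raw = json.loads(PAYLOAD, parse_int=str)["id"]

if __name__ == "__main__":
    print(int(as_double), exact, raw)
```

The double silently rounds to the neighbouring even value, so two distinct IDs can collide. Passing the digits through untouched avoids deciding on a numeric type inside the parser.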


> We have awesome parser generators such as ragel, yet people still handroll parsing code that is both buggy and slow.

Wasn't Ragel responsible for the Cloudbleed vulnerability? Wherever possible we really shouldn't be using C at all.


No, the bug was in code that was feeding input to Ragel [1]. Ragel itself was fine. (Ragel can also output to non-C languages.) I've used Ragel myself extensively and it's really rock solid; I've fuzzed the state machines it generates a lot. I've even used it to parse protobuf specifications and generate code from them, and the company still uses that code (and it's really not a lot of code).

1: https://blog.cloudflare.com/incident-report-on-memory-leak-c... (As for Cloudflare, I don't think it's a good idea to let a single company MITM the whole internet.)


It sounds like Ragel still requires the user to "use it correctly", which is really not great.


That's like saying driving a car into a wall at full speed should not kill you.


It's like saying that we ideally shouldn't have cars that can drive full speed into a wall, or at least that these cars shouldn't be the cars that most people use. And indeed modern cars will generally at least attempt to brake before they hit a wall. It seems not too fanciful to think that in another 10 years we will have cars that can reliably prevent themselves from being driven into walls.

Our tech with regard to type and memory safety is more advanced than our car tech: we already have the tech to prevent these errors. And I believe we should be using it wherever possible. Obviously there will be exceptions (the most obvious being legacy codebases), but they should be exceptions.


My Subaru will do its best to turn off the gas and auto-brake when it detects I’m about to drive into a wall.


I used to be the only one where I worked who would use the standard library for XML in Perl rather than just ad hoc regexes.

They'd recycle my scripts when dealing with very large files but wouldn't learn the library themselves.
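
A rough sketch of the difference, in Python rather than Perl and with a made-up sample document. The function names are hypothetical; `xml.etree.ElementTree.iterparse` is the standard library's streaming parser, which is the kind of tool that helps with very large files.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute order and quoting vary, and one record is commented out.
XML = b"""<items>
  <item id="1" name="a"/>
  <item name='b' id='2'/>
  <!-- <item id="3" name="c"/> -->
</items>"""

def ids_by_regex(data):
    # Ad hoc pattern: assumes double quotes and that id is the first attribute.
    return re.findall(rb'<item id="(\d+)"', data)

def ids_by_parser(data):
    ids = []
    # iterparse streams the input, so the whole file need not be held in memory.
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "item":
            ids.append(elem.get("id"))
            elem.clear()  # release the element once it has been handled
    return ids

if __name__ == "__main__":
    print(ids_by_regex(XML))   # misses record 2, picks up the commented-out record 3
    print(ids_by_parser(XML))  # records 1 and 2 only
```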



