> Do you know the precendence of --, > , ++ and = by heart? > Also if you lose a...

bsder · on March 2, 2014

For any junior engineers keeping score at home:

NEVER WRITE THIS CODE! EVER!

This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

Also, how does this code work if what one of those pointers points to are declared volatile? What if both pointers point to volatile things? What happens if one of those pointers crosses through 0 because it rolled over from max?

Thanks. I doubt you could have provided a better example of the precise problem.

Dylan16807 · on March 2, 2014

> This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

The few characters it would take to check length didn't effect this API, it was just part of the decision to use null terminated strings instead of storing lengths.

>Also, how does this code work if what one of those pointers points to are declared volatile? What if both pointers point to volatile things?

I can't imagine any way in which volatile would cause problems. What are you pondering? strcpy doesn't take volatile pointers anyway.

> What happens if one of those pointers crosses through 0 because it rolled over from max?

That's just a very specific type of buffer overflow. strcpy is the wrong function to use in such a scenario.

> Thanks. I doubt you could have provided a better example of the precise problem.

strcpy is for very specific situations. Writing it in a simple way isn't the problem. It's not like it's even theoretically possible for strpy to avoid overflowing. It doesn't have enough information, only two pointers.

In short, this is a bad API, it is not a bad implementation.

bsder · on March 2, 2014

> I can't imagine any way in which volatile would cause problems. What are you pondering?

Termination condition. Which data applies and does it have to be reread after the assignment?

userbinator · on March 2, 2014

It is read once and written once, and the value that was read is the value of the expression.

Note that this "elaborated" version,

    while(*src) {
      char c = *src;
      src++;
      *dst = c;
      dst++;
    }

if the pointers were volatile, would actually read the source value twice.

bsder · on March 2, 2014

The original while loop is functioning more like a do {} while() loop and the c is declared outside.

  char c;
  do {
    c = *src;
    src++;
    *dst = c;
    dst++;
  } while (c != '\0');

But, even here, people reading the code can reason about the volatiles without having to go look up the standard.

And don't even get me started about compiler implementation of volatile.

Dylan16807 · on March 2, 2014

So your conclusion is that the original code is more logical with regard to volatile pointers, even though calling it with one would mean casting away the volatile and basically inviting disaster?

No coding style is immune to problems caused by changing the number of reads to a variable. Volatile is an exceptional thing that must be treated with care and heavy documentation. It has no relevance to the quality of a normal-code strcpy.

userbinator · on March 2, 2014

This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

Or rather, buffer overflows arise because someone did not think of lengths, and somehow this mentality became ingrained in a rather large number of programmers; this code is perfectly fine if the length of the source string is always less than or equal to that of the destination. I think saying "never use X" is almost always a bad idea, since it discourages reasoning, and not thinking about lengths is what lead to this problem in the first place.

On the other hand, "never use gets()" is more appropriate, since whoever came up with that function clearly never thought of lengths...

chetanahuja · on March 2, 2014

"Those are fundamental things that should be second-nature to you if you claim to really know C"

That's a bogus argument. There's an easy way to write code that's readable and maintainable by mixed teams and then there's this macho way of showing off to no particular advantage.

- just remember that this is a strcpy, as shown in K&R: while(dst++ = src++) ;

Or -- just don't. Just write code that doesn't rely on every reader to mentally clear through a dense thicket of precedence and side-effects to understand your strcpy and get your job done in a pragmatic fashion with 4 lines of easily understood C code instead of 1 line that makes your heart glow with momentary pride.

userbinator · on March 2, 2014

The problem here is that the bug is not in the "dense" part - it's more likely that whoever wrote that code thought memset was always used to set blocks of memory to 0, and as for why it wasn't discovered sooner: it wasn't being used for anything else! Someone with that misunderstanding could've just as well have written what you consider more "readable" code, but with the same bug:

    *s = 0;
    s++;

chetanahuja · on March 2, 2014

You're right the bug is not in the dense part, but adjacent to it. My argument was that the extra cognitive overhead of digesting this complexity leads to the original authors and reviewers of the code to miss other obvious bugs.

One could reasonable speculate if a code-reviewer had sent the original code back to be written in a readable fashion, the actual bug in the code would have been caught too.

Dylan16807 · on March 2, 2014

As far as I'm concerned, that while expression is simpler than any possible three-statement for loop could be. Similarly, as long as the line is short, I have to dedicate less thought to tracking a single-line bracketless loop than a multi-line bracketed loop. (But don't use multiple lines without brackets, that makes it too hard to find the start and end and slows me down.)

I can see your point about *p++ being a speed bump, but overall this version seems less complex to me.

JoeAltmaier · on March 2, 2014

Not that dense. Difference is professional vs hobbyist. Don't imagine the professionals are going to write down to the hobbyist.

chetanahuja · on March 2, 2014

Sorry our definitions of what constitutes a "professional" are entirely different. The best teams I've worked in take great care to avoid needless complexity in code. Readability and maintainability take precedence over everything else... (even performance unless proven otherwise by measurement).

Given the kinds of projects I've been part of are already swarming with (externally imposed) system complexity to begin with, an attitude of "I write dense code since I'm a professional" won't get anyone very far in such teams.

JoeAltmaier · on March 2, 2014

I'm sorry if that seems dense to you. That's pretty much beginner C stuff; after years of writing it, its absolutely transparent to an experienced programmer. I'm astonished at the vitriol; every language takes some learning whether it be lambdas or annotations or declarations or whatever; if it seems opaque to you, perhaps some reading is in order.

chetanahuja · on March 2, 2014

"That's pretty much beginner C stuff"

And yet we started off discussing a serious bug in libc discovered in exactly that kind of code. That's the libc in Android!! My argument is that the extra cognitive overhead of digesting needless complexity leads to the original authors and reviewers of the code to miss other obvious bugs.

I'm astonished at the vitriol

I'm sorry if my straightforward and respectful response to your dismissive and patronizing remarks sounds vitriolic to you.

JoeAltmaier · on March 2, 2014

Because somebody is writing the libc code says nothing about their skill; just the opposite because who wants to be in charge of reimplementing tired old library routines? The new guy I would imagine.

This bug is simply a failure to test. Where was the trivial automated unit test for this library routine? Never written I imagine.

Reminds me of another buggy library I had to work with - the Linux c runtime. I was porting to a risc processor oh ten years ago. Something was wrong with the memcpy routine. So I wrote a unit test. A very thorough unit test. Found 12(!) bugs before I was through.

Test was: copy from aligned source + (0..128) to aligned destination + (0..128) length (0..128). Simple triple loop. Anybody could have written it. Nobody ever had.

This is not a C problem, or an Android problem. Its an industry problem. Intel shipped a Pentium processor once upon a time that couldn't divide by 10.

JoeAltmaier · on March 2, 2014

Apology accepted. I didn't mean you; there is plenty of vitriol elsewhere in this thread.

chris_wot · on March 2, 2014

A "professional" would never write that sort of dense C code.

JoeAltmaier · on March 2, 2014

Again, that's normal C code. Maybe you don't like the language; maybe you don't want to learn it. But its absolutely common and you won't get very far without mastering that stuff.

Would we criticize assembler because its full of confusing registers and stuff? Sure go ahead; but it sounds silly.