"The reason caches come in 'lines' that are bigger than the word size of the processor is not an optimization (i.e., they're not deliberately bringing in nearby memory in the hope that it will be useful)."
I do not think that is true. There is a considerable chance that that extra memory will be useful. The simplest example is a cache line that contains program code. The other canonical example is the "for item in array do item = f(item)" loop.
No, there's hardware to do that too (speculative reads), but it's even more complicated. Really the driver of cache line size is simple efficiency. Software has to jump through lots of hoops in practice to try to align accesses to cache lines, and that could be avoided if memory could be efficiently cached in word-sized chunks.
I still doubt that. IMO, cache lines are larger than the largest item a CPU can read because the probability that the extra data will be needed soon is high enough to offset the extra work needed to read that larger cache line and the (few) transistors needed to increase cache line size.
In some sense, large cache lines are just cheap ways to implement speculative reads.
"A significant observation is that, although increasing line sizes can result in a higher hit ratio, it also considerably increases traffic to main memory, thereby degrading the performance."
Again, I doubt that. Larger lines do generate more traffic per miss, but there is a considerable chance the extra data will be useful — cache lines holding program code and the "for item in array do item = f(item)" loop being the canonical cases — so the added traffic is often paid back in hits rather than simply degrading performance.