How does checksumming help if the data is in cache and waiting to be written?
For example:
I write 1 MB of data, but after the write it stays in the buffer cache, so when I compute the checksum I am computing it over the buffer cache, not the disk.
On Linux you have to drop_caches and then re-read the file to get a checksum you can trust. As far as I know there is no per-buffer or per-file drop_caches, and a system-wide drop_caches invalidates the good pages along with the bad ones.
What if the device maintains its own cache as well, in addition to the buffer cache?
How do you know you put good data into the cache in the first place?
There's always going to be a place where errors can creep in. There are no absolute guarantees; it's a numbers game. We've got more and more data, so the chance of corruption increases even if the per-bit probability stays the same. Checksumming reduces the per-bit probability across a whole layer of the stack - the layer where the data lives longest. That's the win.
I was asking this with open(<file>, O_DIRECT|O_RDONLY) in mind;
that bypasses the buffer cache and reads directly from the disk, which at least solves the buffer cache part, I guess. The disk cache is another matter: if we disable it we are fine, at the cost of performance.
I was pointing out that tests can do these kinds of things.
Can someone clarify?