
rsync computes checksums over blocks to see whether the blocks are the same. It transmits these checksums, and where the checksums differ, it sends a delta for those blocks. Note that an insertion or deletion in a file can shift the block boundaries between the two files, a problem known as "stream alignment". This can make your binary delta much larger, because the tool doesn't realize the block really just shifted 16384 bytes over (or whatever), and so it assumes the client doesn't have any of the bytes of that block.
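To make the alignment problem concrete, here is a minimal sketch of a naive fixed-boundary comparison. This is not how rsync itself works; the function names and the 16-byte block size are made up for illustration, and SHA-1 is just a convenient stand-in for a block checksum.

```python
import hashlib
from typing import List

def block_digests(data: bytes, block: int) -> List[bytes]:
    """Digest of each fixed-size block, with boundaries at 0, block, 2*block, ..."""
    return [hashlib.sha1(data[i:i + block]).digest()
            for i in range(0, len(data), block)]

def changed_blocks(old: bytes, new: bytes, block: int) -> List[int]:
    """Indices of blocks in `new` whose digest differs from the block
    at the same index in `old` (or that have no counterpart in `old`)."""
    old_d = block_digests(old, block)
    new_d = block_digests(new, block)
    return [i for i, d in enumerate(new_d)
            if i >= len(old_d) or d != old_d[i]]

old = bytes(range(64))                      # four 16-byte blocks
overwritten = old[:20] + b"\xff" + old[21:]  # one byte changed in place
inserted = b"\xff" + old                     # one byte inserted at the front

in_place = changed_blocks(old, overwritten, 16)  # only the block holding byte 20
shifted = changed_blocks(old, inserted, 16)      # every block, since all boundaries moved
```

An in-place change touches one block, while a single inserted byte makes every fixed-boundary block look new, even though almost all of the data is already on the other side.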

In any case, if you know the files are related, you:

1. Don't need to do any of this. You can simply send the binary delta, which is usually a series of copy/add instructions (e.g. copy offset 16384, length 500 to offset 32768)

2. Can precompute the deltas.

You can actually precompute in any case; it just makes no sense unless you know what the file will be diffed against.
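For anyone who hasn't seen copy/add deltas, here is a rough sketch of applying one. The tuple-based instruction format is hypothetical and chosen for readability; it is not the encoding of any real delta tool. It assumes the output is built sequentially, so each instruction appends at the current end of the output.

```python
from typing import Sequence, Tuple, Union

# Hypothetical instruction shapes:
#   ("copy", source_offset, length)  -> reuse bytes the client already has
#   ("add", literal_bytes)           -> bytes that must actually be sent
Instruction = Union[Tuple[str, int, int], Tuple[str, bytes]]

def apply_delta(source: bytes, delta: Sequence[Instruction]) -> bytes:
    """Rebuild the target file from the old file plus copy/add instructions."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            if offset < 0 or length < 0 or offset + length > len(source):
                raise ValueError("copy range falls outside the source")
            out += source[offset:offset + length]
        elif op[0] == "add":
            out += op[1]
        else:
            raise ValueError("unknown instruction: %r" % (op[0],))
    return bytes(out)

source = b"hello world"
delta = [("copy", 6, 5), ("add", b", "), ("copy", 0, 5)]
target = apply_delta(source, delta)
```

Only the `add` payloads cost real bandwidth; a `copy` is a couple of integers no matter how long the range is, which is why a precomputed delta between known-related files can be so small.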



I always thought rsync detected block moves and that's what made it a worthy PhD thesis.


I thought that too. It'd be interesting to see a comparison of the two software designs, with the actual differences in resource usage (CPU, I/O, bandwidth).

That would be really cool in fact.


Yes, I simplified and I shouldn't have. It does detect them, but there is a minimum size of block move it can detect, due to its signature-matching method.
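A simplified sketch of the idea, not rsync's actual code: one side's file is cut into fixed blocks and indexed by a weak checksum, and the other side slides a rolling checksum over its file at every byte offset. The 4-byte block size is only for the demo, and the direct byte comparison stands in for the strong hash a real implementation would use, since the sender does not hold the old file.

```python
from typing import Dict, List, Tuple

def weak_sum(data: bytes) -> Tuple[int, int]:
    """Two-part weak checksum in the style of rsync's rolling checksum."""
    n = len(data)
    a = sum(data) & 0xFFFF
    b = sum((n - i) * x for i, x in enumerate(data)) & 0xFFFF
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> Tuple[int, int]:
    """Slide the window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) & 0xFFFF
    b = (b - n * out_byte + a) & 0xFFFF
    return a, b

def find_matches(old: bytes, new: bytes, block: int = 4) -> List[Tuple[int, int]]:
    """Return (offset_in_new, offset_in_old) for each whole old block found in new."""
    index: Dict[Tuple[int, int], List[int]] = {}
    for off in range(0, len(old) - block + 1, block):
        index.setdefault(weak_sum(old[off:off + block]), []).append(off)

    matches: List[Tuple[int, int]] = []
    if len(new) < block:
        return matches
    pos = 0
    a, b = weak_sum(new[:block])
    while True:
        hit = None
        for off in index.get((a, b), []):
            # Stand-in for the strong-hash check on weak-checksum candidates.
            if old[off:off + block] == new[pos:pos + block]:
                hit = off
                break
        if hit is not None:
            matches.append((pos, hit))
            pos += block                      # skip past the matched block
            if pos + block > len(new):
                break
            a, b = weak_sum(new[pos:pos + block])
        else:
            if pos + block >= len(new):       # no room to slide further
                break
            a, b = roll(a, b, new[pos], new[pos + block], block)
            pos += 1
    return matches

old = b"AAAABBBBCCCC"
moved = find_matches(old, b"xyBBBBAAAAz")   # whole blocks found at unaligned offsets
too_small = find_matches(old, b"AAxBB")     # fragments shorter than a block
```

Here `moved` picks up both relocated blocks even though they sit at odd offsets, while `too_small` comes back empty: a moved run shorter than one block never fills a window that matches a signature, which is the minimum-size limit mentioned above.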



