Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Their report (linked to from the post) goes into greater detail: http://developer.yahoo.com/blogs/hadoop/Yahoo2009.pdf

I'd love to know why the 500 GB and 100 TB sorts ran at about half the speed of the other two (~0.5 TB/min as opposed to ~1 TB/min).



They doubled the ram before the petabyte sort. Is that what you're talking about?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: