The blog I linked to (about the data analysis we did) talks about the different data sources we used. We had the sampled logs (which you mention), plus statistical data on traffic through individual sites, plus crash logs. A lot more than "2 weeks worth of logs".
"We have the logs of 1% of all requests going through Cloudflare from 8 February 2017 up to 18 February 2017 (when the vulnerability was patched)...Requests prior to 8 February 2017 had already been deleted."
Sorry if it came across as me implying you pulled that from thin air.
One of the most interesting things we had was the core dumps. Randomly (depending on the memory state) the process would crash rather than dump memory into the HTTP response. We had all that data going back over the entire period. That gave us a lot of confidence this hadn't been exploited, because we could see the rate of crashes and could inspect the actual core dumps to see the memory state at the moment each crash happened.
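To make the mechanism concrete, here's a toy simulation (Python; nothing like the actual parser, which was generated C, and the byte values are made up). The end-of-buffer check uses an equality test, so a multi-byte advance can hop straight over the end of the request. What happens next depends entirely on what sits beyond it: readable adjacent memory leaks into the output, while unmapped memory crashes the process and leaves a core dump:

    def parse(memory, request_len):
        i = 0
        while i != request_len:   # bug: equality test, should be i < request_len
            yield memory[i]       # IndexError here plays the role of SIGSEGV
            i += 2                # stands in for a multi-byte token advance

    heap = b"GET /ab c" + b"SESSION=topsecret"  # request + a neighbour's data

    # Even-length request: the cursor lands exactly on request_len and stops.
    print(bytes(parse(heap, 8)))

    # Odd-length request: the cursor hops over request_len, leaks bytes from
    # the adjacent region, then runs off the end of "mapped memory".
    leaked = bytearray()
    try:
        for b in parse(heap, 9):
            leaked.append(b)
    except IndexError:
        print("crashed after leaking:", bytes(leaked))

Whether a given request leaked or crashed came down to the memory layout at that moment, which is why the crashes were effectively a random sample of the bug firing.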
Right, which was one of the reasons we used YARA on all the data we pulled from Google and other caches: so we could extract the leaked data and categorize it. Then we called all the affected customers (I did a lot of those calls personally) and offered to give them the leaked information so they could look for exploitation. The idea being that if they had seen some anomaly involving something we knew was in Google's cache, that would be evidence of exploitation.
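Roughly, that scanning step looks like this (a sketch using yara-python; the rule strings are illustrative placeholders standing in for whatever the real signatures matched, not the actual rules we ran):

    import yara  # pip install yara-python

    # Placeholder patterns for internal data that could show up in a
    # leaked page body (hypothetical, for illustration only).
    rules = yara.compile(source=r"""
    rule leaked_internal_state
    {
        strings:
            $hdr  = "x-internal-"                  // hypothetical header prefix
            $cook = /__cf[a-z]{2,8}=[0-9a-f]{8,}/  // hypothetical cookie shape
        condition:
            any of them
    }
    """)

    def triage(cache_object_path):
        """Scan one recovered cache object; return names of matching rules."""
        with open(cache_object_path, "rb") as f:
            data = f.read()
        return [m.rule for m in rules.match(data=data)]

Anything that matched got extracted and bucketed, and that's what drove the list of customers to call.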