This was already the bottleneck for all of the top solutions. Reducing startup time was the primary goal, and the top solutions optimized heavily for it, down to compiler configuration. One of the primary reasons to compile in a bloom filter instead of a hash set was that it made your binary ~2 MB smaller, which was a significant advantage in startup time.
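To make the size argument concrete, here is a rough sketch of a bloom filter next to the raw data it stands in for. Everything in it is an assumption for illustration: the `BloomFilter` class, the made-up `word0`..`word9999` list, and the choice of 10 bits per entry with 7 hash functions are not taken from any actual solution, and the byte counts say nothing about the ~2 MB figure above.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal bloom filter: a fixed-size bit array, no stored keys."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive all bit positions from two 64-bit hashes (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # force odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # True for every added item; may also be True for items never added.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


# Made-up stand-in for whatever set the binary would otherwise embed.
words = [f"word{i}" for i in range(10_000)]

bloom = BloomFilter(num_bits=10 * len(words), num_hashes=7)
for w in words:
    bloom.add(w)

# Bytes needed just to store the keys (plus one separator each),
# before any hash table overhead, versus bytes for the filter.
raw_bytes = sum(len(w) + 1 for w in words)
filter_bytes = len(bloom.bits)
print(raw_bytes, filter_bytes)
```

The filter's size depends only on the number of entries and the bits-per-entry budget, not on how long the keys are, which is where the smaller binary comes from. The trade-off is that lookups can return false positives, though never false negatives.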
I was #6 using a bloom filter. I don't think anybody properly combined a libc startup optimization with a bloom filter, because the numbers I saw were significantly better than those required to score in the top few.