That's been my experience. RF tends to do quite well out of the box and is very fast to fit. It's less of a pain to cross-validate too, since there are fewer tuning parameters. XGBoost has a huge number of knobs, and its performance ranges from god-awful with bad hyperparameters to somewhat better than RF with good ones. It's a giant PITA with nested cross-validation, etc., though.
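For concreteness, here's a minimal sketch of that asymmetry, assuming scikit-learn and xgboost, with synthetic data standing in for a real problem; the grid is illustrative, not a recommendation:

    # Minimal sketch of the tuning asymmetry, assuming scikit-learn + xgboost.
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from xgboost import XGBRegressor

    X, y = make_regression(n_samples=500, n_features=20, random_state=0)

    # RF: defaults are usually sane, so one honest CV loop is enough.
    rf_scores = cross_val_score(RandomForestRegressor(random_state=0), X, y, cv=5)

    # XGBoost: the knobs matter, so the tuning itself has to live inside the
    # outer CV loop (nested CV) or your performance estimate is optimistic.
    grid = {"max_depth": [3, 6], "learning_rate": [0.03, 0.1], "n_estimators": [200, 500]}
    inner = GridSearchCV(XGBRegressor(random_state=0), grid, cv=3)
    xgb_scores = cross_val_score(inner, X, y, cv=5)  # ~5 * 3 * 8 fits, hence the PITA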
I haven't read their validation strategy in detail, but this seems like the kind of problem where it's not as easy as you'd think -- you need to be very careful about how you stratify your train, dev, and test sets. A random 80/10/10 split would be way too optimistic: your model would just learn to interpolate between geographically proximate locations. You'd probably need to cross-validate across different geographic areas.
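A hedged sketch of what that could look like with scikit-learn's GroupKFold; `region` is a hypothetical column assigning each sample to a geographic block:

    # Hold out whole geographic blocks instead of random rows.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GroupKFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))          # stand-in features
    y = rng.integers(0, 2, size=1000)        # stand-in labels
    region = rng.integers(0, 20, size=1000)  # hypothetical area ids

    # Every fold holds out entire regions, so the model can't score points
    # by interpolating between training samples that sit meters away.
    scores = cross_val_score(RandomForestClassifier(), X, y,
                             cv=GroupKFold(n_splits=5), groups=region)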
This also seems like an application that would benefit from "active learning": given that drilling and testing are expensive, you'd want to choose where to collect new data based on where it would best improve your model's accuracy. A similar-ish ML story comes from Flint, MI [1], though the ending is not so happy.
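As a rough sketch of that loop (not the Flint team's method), per-tree disagreement in a random forest can stand in for model uncertainty; `candidates` is a hypothetical array of untested locations:

    # Uncertainty sampling: drill next where the trees disagree most.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def next_drill_site(model, candidates):
        """Index of the candidate with the highest predictive variance."""
        per_tree = np.stack([t.predict(candidates) for t in model.estimators_])
        return int(per_tree.std(axis=0).argmax())

    # Loop: fit on measured sites, drill where the model is least sure,
    # add the new measurement, refit, repeat until the budget runs out.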
The drilling and active learning part reminded me of this very nice article on Bayesian Optimization from Distill [0].
They explain it in the context of selecting hyperparameters for ML models:
> In this article, we talk about Bayesian Optimization, a suite of techniques often used to tune hyperparameters. More generally, Bayesian Optimization can be used to optimize any black-box function.
But the example at the beginning of the article is mining gold:
> Let us start with the example of gold mining. Our goal is to mine for gold in an unknown land. For now, we assume that the gold is distributed about a line. We want to find the location along this line with the maximum gold while only drilling a few times (as drilling is expensive).
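Their gold example is easy to reproduce as a toy. This sketch uses scikit-learn's GP and a standard expected-improvement acquisition; the `gold` function is invented, standing in for an expensive drill:

    # Toy version of the article's setup: GP surrogate + expected improvement.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def gold(x):  # hypothetical gold distribution along the line
        return np.exp(-(x - 3.2) ** 2) + 0.5 * np.exp(-((x - 7.0) ** 2) / 2)

    line = np.linspace(0, 10, 500).reshape(-1, 1)
    X = np.array([[1.0], [5.0], [9.0]])  # a few initial drills
    y = gold(X).ravel()

    for _ in range(10):  # drilling budget
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)
        mu, sigma = gp.predict(line, return_std=True)
        imp = mu - y.max()                  # improvement over best drill so far
        z = imp / np.maximum(sigma, 1e-9)
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
        x_next = line[ei.argmax()]
        X = np.vstack([X, [x_next]])        # "drill" at the most promising spot
        y = np.append(y, gold(x_next))

    print("best site:", X[y.argmax()].item(), "gold:", y.max())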
> your model would just learn to interpolate between geographically proximate locations
At a particular scale, this is entirely correct: if what I'm looking for is 'large', a measurement 1m away from a known hit is also likely to be a hit.
That particular issue sounds like it should be addressed with more negative samples.
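One way to get those, sketched with made-up names: sample background points at least some minimum distance from every known positive and label them negative. Whether that's geologically valid is an assumption, not a given.

    # Pseudo-negatives sampled at least `min_dist` from every known positive.
    import numpy as np

    def pseudo_negatives(positives, n, bounds, min_dist, seed=0):
        (x0, x1), (y0, y1) = bounds
        rng, out = np.random.default_rng(seed), []
        while len(out) < n:
            p = rng.uniform([x0, y0], [x1, y1])
            if np.linalg.norm(positives - p, axis=1).min() >= min_dist:
                out.append(p)
        return np.array(out)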
[1] https://www.theatlantic.com/technology/archive/2019/01/how-m...