Reminds me of a time some real estate website hotlinked a ton of images from my website.
After I asked them to stop and they ignored me, I added an nginx rewrite rule to send them a bunch of pictures of houses that were on fire.
For some reason they stopped using my website as their image host after that.
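For anyone curious, an nginx rule for that kind of referer-based swap might look roughly like this. This is a minimal sketch using nginx's referer module; the `example.com` domains and the `/hotlink/house-on-fire.jpg` path are placeholders, not the original config:

```nginx
location ~* \.(jpe?g|png|gif)$ {
    # Allow empty/blocked referers (direct visits, privacy proxies)
    # and our own domains; everything else counts as a hotlinker.
    valid_referers none blocked example.com *.example.com;

    if ($invalid_referer) {
        # Serve the burning-house picture instead of the requested image.
        rewrite ^ /hotlink/house-on-fire.jpg last;
    }
}
```

The `valid_referers` / `$invalid_referer` pair comes from `ngx_http_referer_module`; the `none blocked` entries keep ordinary direct visitors from being caught by the trap.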
There was another time a site hotlinked to a JS file. After asking them to stop, I found that they had a contact form with a homebrew captcha which generated the letters image via a URL like http://evilsite.com/cgi-bin/captcha.jpg?q=ansr
A little while later, their captcha form had a hidden input appended with the correct answer value, and the word to solve was changed to a new four-letter word from a dictionary of interesting four-letter words. The form still worked because of the hidden input. I might have changed the name on the "real" input, too.
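Since the captcha URL leaked the answer in its query string, the hotlinked script could solve the form by itself. A sketch of that trick, assuming the image URL shape shown above; the function names and the `captcha_answer` field name are my own inventions, not the original code:

```javascript
// Pull the answer out of a captcha URL like .../captcha.jpg?q=ansr.
// The whole "puzzle" is defeated by reading the query string.
function extractAnswer(captchaSrc) {
  return new URL(captchaSrc).searchParams.get('q');
}

// Running inside the hotlinked script on their page: find the captcha
// image, walk up to its form, and append a hidden input carrying the
// answer so the form submits successfully without any human solving it.
function solveCaptchaForm(doc) {
  const img = doc.querySelector('img[src*="captcha"]');
  if (!img) return;
  const form = img.closest('form');
  const hidden = doc.createElement('input');
  hidden.type = 'hidden';
  hidden.name = 'captcha_answer'; // assumed field name
  hidden.value = extractAnswer(img.src);
  form.appendChild(hidden);
}
```

The DOM half obviously only runs in a browser; the point is that once the answer rides along in the image URL, nothing server-side can tell a scripted submission from a human one.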
Additionally, if they decide to blackhole the fake/honeypot URL, then, since you mentioned they pass along the user agent, you could mix a token into the randomized user-agent string your scraper uses, so that you can duck-type the request on your end and know when to capture the egress IP.
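The token idea above can be sketched as a pair of functions: one builds the randomized user agent with a fixed marker hidden in a version number, and the other is the server-side check. The token value and the UA shape here are placeholder assumptions:

```javascript
// Fixed marker embedded in every UA our scraper sends (assumed value).
const TOKEN = 'zx9q';

// Build a plausible-looking Chrome UA with a randomized major version,
// hiding the token where a build number would normally sit.
function makeUserAgent() {
  const major = 100 + Math.floor(Math.random() * 30);
  return `Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ` +
         `(KHTML, like Gecko) Chrome/${major}.0.${TOKEN}.0 Safari/537.36`;
}

// Server-side: does this forwarded request carry our marker?
// If so, log the source IP as the scraper's egress address.
function isOurScraper(userAgent) {
  return userAgent.includes(`.${TOKEN}.`);
}
```

Because the rest of the string varies per request, the UA doesn't look like a constant worth blocking, while your own server can still pick it out reliably.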
#5 and #6 are key. Don't try to block them directly, just get them delisted. When you've worked out a way to identify which requests belong to the scammer, feed them content that the search engines and their ad partners will penalize them for.