Here is a comparison of the responses to the prompt "I want to create a basic Flight simulator in Bevy and Rust. Help me figure out the core properties I need for take off, in air flight and landing" from Claude Sonnet 3.5 and Qwen2.5-14B-Instruct-Q4_K_M.gguf:
Comparable, I guess, but the Qwen result is definitely a lot worse than Sonnet's. Parts of its example code don't make much sense. Meanwhile, Sonnet seems to take the latest Bevy API into account, and its output mostly makes sense.
https://gist.github.com/victorb/7749e76f7c27674f3ae36d791e20...
AFAIK, there aren't any (micro)benchmark comparisons out yet.