Depends. ROCm is pretty well-supported for example.
Non-NVIDIA backends tend to get less support and new features land slower, or features that are expected to improve performance wind up hurting it instead. That sort of thing.
For basic “token in/token out” workloads without fine tuning, it’s probably fine ??
The Ryzen AI Max 395 128gb is super cool, but not fast for inference. Order of magnitude slower than dedicated GPU but at half the cost. You can run larger models on it but it's slow. Great for local async work. Not great for daily chat or code agent driver.