Polyphonic note detection is largely solved at this point.
But "solved" here means "when not doing the analysis in real time". The realtime solutions are not as good. NN's are not typically great at realtime either, so this may not help very much with this particular goal.
I never claimed or implied anywhere that it does, and I know for a fact TFA doesn't involve note detection.
I'm just disputing the GP's general assertion about NNs that "NN's are not typically great at realtime either", which is quickly disproven by TFA, which uses an NN for realtime audio.