
I'd expected some training to be required for each separately, so we've got a decent collection of voice examples. Some of us mix languages within a sentence, but that's a bad habit anyway, so it would be unsupported. I don't see figuring out the language as a problem, as my proof-of-concept already handles that well enough, with around 90% accuracy. Once it's selected a language, the appropriate model can be used. To make it fast we'd likely just keep the whole thing in RAM. Might need more, however.
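
A rough sketch of that detect-then-dispatch idea, assuming every per-language model fits in memory at once. `LanguageRouter`, the detector and recognizer callables, the fallback language, and the confidence threshold are all hypothetical names made up for illustration, not part of any real speech library:

```python
from typing import Callable, Dict, Tuple

# Hypothetical types: a detector returns (language_code, confidence),
# a recognizer turns raw audio bytes into text.
Detector = Callable[[bytes], Tuple[str, float]]
Recognizer = Callable[[bytes], str]


class LanguageRouter:
    """Keeps one recognizer per language in memory and dispatches to it."""

    def __init__(
        self,
        detect: Detector,
        loaders: Dict[str, Callable[[], Recognizer]],
        fallback: str,
        min_confidence: float = 0.5,  # assumed threshold, tune to taste
    ) -> None:
        if fallback not in loaders:
            raise ValueError("fallback language must have a loader")
        self._detect = detect
        self._fallback = fallback
        self._min_confidence = min_confidence
        # Load every model up front so nothing is loaded per utterance.
        self._models = {lang: load() for lang, load in loaders.items()}

    def transcribe(self, audio: bytes) -> Tuple[str, str]:
        """Return (language_used, text) for one utterance."""
        lang, confidence = self._detect(audio)
        # Unknown language or a shaky guess: use the fallback model.
        if lang not in self._models or confidence < self._min_confidence:
            lang = self._fallback
        return lang, self._models[lang](audio)
```

With a detector that is right about 90% of the time, the fallback is what catches part of the remaining 10%; mixed-language sentences would still go to a single model.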

The issue I see with Talon is that it's currently Mac-only. That would, however, still help one of us who lives on wheels (got a 16" MacBook IIRC and a Mac mini as well). Different set of use-cases, so things would be more relaxed.

I see some hints about a Linux version, however. I've got Windows / Linux VMs on the server but no other Macs. GPUs will be installed soon when I decom some old gaming rigs.

Plenty to think about.



The Talon beta is on Windows/Linux/Mac. I was recommending wav2letter directly instead of Talon specifically because you mentioned thin clients, and I'm not really targeting something like headless Raspberry Pis yet.

I mostly mentioned wav2letter@anywhere because it could handle a bunch of audio streams centrally, so you can stream from 16 Pis to a central box, and it's very accurate.
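
To make the "many thin clients, one central box" shape concrete, here is a minimal asyncio sketch. It is not wav2letter's API: `StreamHub`, `send_audio`, `demo`, and the framing (one line with a client id, then raw audio until the client closes) are assumptions invented for illustration, and the hub only counts bytes where a real setup would feed a recognizer.

```python
import asyncio
from typing import Dict


class StreamHub:
    """Accepts many TCP clients and tallies audio bytes per client id."""

    def __init__(self) -> None:
        self.bytes_received: Dict[str, int] = {}

    async def handle(
        self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter
    ) -> None:
        # Made-up framing: first line is the client id, the rest is audio.
        client_id = (await reader.readline()).decode().strip()
        self.bytes_received.setdefault(client_id, 0)
        while chunk := await reader.read(4096):
            # A real setup would pass `chunk` to the recognizer here.
            self.bytes_received[client_id] += len(chunk)
        writer.close()
        await writer.wait_closed()


async def send_audio(host: str, port: int, client_id: str, audio: bytes) -> None:
    """Stand-in for one thin client streaming its audio to the hub."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(client_id.encode() + b"\n" + audio)
    await writer.drain()
    writer.write_eof()   # signal end of audio
    await reader.read()  # wait until the hub closes its side
    writer.close()
    await writer.wait_closed()


async def demo(n_clients: int = 16) -> Dict[str, int]:
    hub = StreamHub()
    # Port 0 lets the OS pick a free port for this local demo.
    server = await asyncio.start_server(hub.handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        await asyncio.gather(
            *(
                send_audio("127.0.0.1", port, f"pi-{i}", b"\x00" * 1000 * (i + 1))
                for i in range(n_clients)
            )
        )
    return hub.bytes_received
```

Running `asyncio.run(demo())` streams from 16 simulated clients concurrently over loopback; each connection is handled in its own task, so one slow client doesn't block the others.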



