
I can't find references to HMM-based large language models. Small HMM language models generate gibberish very similar to this.

An HMM consists of a state space, a state transition matrix, and an output (emission) probability matrix. A token space of 50k and a state space of something like 60k would have seemed impossible 10-20 years ago; it has only recently become viable.
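
As a rough back-of-the-envelope check on those sizes (my numbers, not anything from a paper), storing the two matrices densely in float32 would look like this:

    n_states, n_tokens = 60_000, 50_000            # state and token space sizes from above
    bytes_per_float = 4                            # float32
    transition_gb = n_states * n_states * bytes_per_float / 1e9   # ~14.4 GB
    emission_gb   = n_states * n_tokens * bytes_per_float / 1e9   # ~12.0 GB
    print(transition_gb, emission_gb)

That's roughly 26 GB of dense parameters, which fits on current hardware but would have been well out of reach a decade or two ago.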

Training one with Baum-Welch on a big enough text dataset would be interesting. It should be much faster than back-propagation with a transformer model.
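
For anyone curious what that training loop looks like, here is a minimal single-sequence Baum-Welch sketch in numpy (my own illustration, using the standard scaled forward-backward passes; a real 60k-state model would need batching over many sequences and something sparser than these dense matrices):

    import numpy as np

    def baum_welch(obs, n_states, n_symbols, n_iter=10, seed=0):
        """Single-sequence Baum-Welch for a categorical-emission HMM."""
        rng = np.random.default_rng(seed)
        # Random row-stochastic initialisation of transition (A) and emission (B)
        A = rng.random((n_states, n_states)); A /= A.sum(1, keepdims=True)
        B = rng.random((n_states, n_symbols)); B /= B.sum(1, keepdims=True)
        pi = np.full(n_states, 1.0 / n_states)
        T = len(obs)

        for _ in range(n_iter):
            # E-step: scaled forward pass
            alpha = np.zeros((T, n_states))
            c = np.zeros(T)                          # per-step scaling factors
            alpha[0] = pi * B[:, obs[0]]
            c[0] = alpha[0].sum(); alpha[0] /= c[0]
            for t in range(1, T):
                alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
                c[t] = alpha[t].sum(); alpha[t] /= c[t]

            # E-step: scaled backward pass
            beta = np.zeros((T, n_states))
            beta[-1] = 1.0
            for t in range(T - 2, -1, -1):
                beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]

            gamma = alpha * beta                     # P(state at t | whole sequence)
            xi = np.zeros((n_states, n_states))      # expected transition counts
            for t in range(T - 1):
                xi += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1] / c[t + 1]) * A

            # M-step: re-estimate parameters from expected counts
            pi = gamma[0]
            A = xi / gamma[:-1].sum(0)[:, None]
            emit = np.zeros((n_symbols, n_states))
            np.add.at(emit, obs, gamma)              # accumulate expected emission counts per token
            B = emit.T / gamma.sum(0)[:, None]

        return pi, A, B

Each iteration is just a couple of matrix-vector products per time step, which is where the "faster than back-propagation" intuition comes from, though the dense A and B above would not scale to 60k states without a sparse or factored representation.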


