
At what point did AI-generated human speech become so remarkably realistic?

I recall that just a couple of years ago, even the best models, such as WaveNet, still had a subtle robotic quality.

Which architectures or models led to this breakthrough? Or is it possible that, as a non-native English speaker, I’m missing some nuances?


