
Isn't r/AskDocs in the corpus ChatGPT was trained on in the first place?


It probably used that more to figure out how to word sentences, but I assumed it relied more on wiki or academic articles to diagnose.


Why would it be more probable and why would you assume "wiki" and "academic articles"?


Not sure why you put those things in quotes; that's kind of strange.

That aside, the training isn't blind; it's guided, and it's likely they use verified correct sources of info to train for some things, like medical diagnoses.


I can help with "verified correct sources": have a look at "Language Models are Few-Shot Learners", section 2.2 [1].

You may also be interested in Appendix A in the same document: "Details of Common Crawl Filtering".

[1] https://arxiv.org/pdf/2005.14165.pdf
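
The gist of that appendix: they train a quality classifier where curated corpora (WebText, Wikipedia, books) are the positive class and raw Common Crawl is the negative class, then keep a crawl document stochastically based on its score. Here's a toy sketch of that idea in Python (my own illustration with made-up example documents, not the paper's actual pipeline; the paper used Spark's tokenizer and HashingTF rather than scikit-learn):

    import numpy as np
    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy stand-ins: the real positives were WebText, Wikipedia, and books;
    # the negatives were unfiltered Common Crawl documents.
    curated_docs = [
        "peer-reviewed study of treatment outcomes",
        "encyclopedia entry on human anatomy",
    ]
    raw_crawl_docs = [
        "click here for free prizes !!!",
        "lorem ipsum repeated boilerplate text",
    ]

    vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
    X = vectorizer.transform(curated_docs + raw_crawl_docs)
    y = np.array([1] * len(curated_docs) + [0] * len(raw_crawl_docs))

    clf = LogisticRegression().fit(X, y)

    def keep(document: str, alpha: float = 9.0) -> bool:
        # Stochastic filter from the paper: keep a document if
        # np.random.pareto(alpha) > 1 - document_score, so higher-scoring
        # documents are more likely to survive, but low scores aren't
        # discarded outright.
        score = clf.predict_proba(vectorizer.transform([document]))[0, 1]
        return np.random.pareto(alpha) > 1 - score

    print(keep("a discussion of differential diagnosis for chest pain"))

So "verified correct sources" isn't quite right: the filtering rewards documents that statistically resemble curated text, rather than checking any source for correctness.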



