It's surprising that it's a prompt-based model and not RLHF.
I am not an LLM guy, but as far as I understand, RLHF did a good job converting a base model into a chat model (instruction-following), and a chat/base model into a thinking model.
Both of these examples are about the nature of the response and the content used to fill it. There are many different ways of filling these that we have yet to see.
Generating an answer step by step and letting users dive into those steps is one such way, and RLHF (or similar techniques) seems a good fit for it.
Prompting feels like a temporary solution, much like how "think step by step" first appeared in prompts.
Also, doing RLHF/post-training to change these structures makes it a moat and makes it expensive; only the AI labs can do it.
The problem is you'd then have to do all the product-specific post training again once the new base model comes out a few months later. I think they'd rather just have general models that are trained to follow instructions well and can adapt to any kind of prompt/response pattern.
As I'm also mildly considering doing a master's because I want to break into AI research, I'm curious what your motivations are, if you'd be open to sharing them.
It can sound significantly better, but there are a couple of hoops you have to jump through, and even then it's decent but not the same as Siri.
You need the user to download "enhanced" or "premium" voices in the Settings app.
(Settings -> Accessibility-> Spoken Content -> Voices -> [Language of choice] -> [Voice of choice] -> Enhanced or Premium)
In the app you have to search for the enhanced or premium voices when doing TTS.
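For the app side, here's a minimal sketch of what selecting a downloaded enhanced/premium voice looks like with Apple's AVSpeechSynthesizer API (the function name and language choice are illustrative; the downloaded voices only show up in `speechVoices()` after the user installs them via the Settings path above):

```swift
import AVFoundation

// List installed voices for a language and prefer the highest quality one.
// Note: .premium requires iOS 16+ / macOS 13+; fall back through qualities.
func bestVoice(for language: String) -> AVSpeechSynthesisVoice? {
    let candidates = AVSpeechSynthesisVoice.speechVoices()
        .filter { $0.language == language }
    return candidates.first { $0.quality == .premium }
        ?? candidates.first { $0.quality == .enhanced }
        ?? candidates.first
}

let utterance = AVSpeechUtterance(string: "Hello from an enhanced voice.")
utterance.voice = bestVoice(for: "en-US")

// In a real app, keep a strong reference to the synthesizer,
// or speech stops when it's deallocated.
let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)
```

If the user hasn't downloaded an enhanced/premium voice, this quietly falls back to the compact default, which is why the in-app quality varies so much between users.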
Yeah, I use a premium voice but was still disappointed when we added the feature to my reader app. I decided to leave it in the app since we'd already built it at that point, but it's kind of a bummer since obviously they could use Siri-level TTS if they wanted to.
Yes, their quality is great but the cost is astronomical — I pay about $8 in Azure TTS bills alone for TTS-ing a 500-page book (what you can scan per month with a $10 subscription), whereas Eleven Labs would be about $100 for the same length. I found Azure to be the best bang-for-the-buck, although I'm on the lookout for more affordable high-quality TTS, which would also let me drop the price point of the app.
Same feeling. It doesn't take you toward the answer. I entered the first movie in today's game (say, Spider-Man), and the answer was Spider-Man 2. When I entered Spider-Man, it should have told me how close I was to the answer.