Hacker News

Just need a way to talk to ChatGPT anytime. Microphone, speaker and permanent connection to ChatGPT. That’s all you need: io

One need is being able to talk to ChatGPT in a whisper or silent voice… so you can do it in public. I don’t think that comes from them, but it will be big when it does. Much easier than brain implants! In an ear device, you need enough paired data of the muscle signals and the sounds together; after that, you can just listen to the muscles…

I assume they want to have their own OS that is, essentially, their models in the cloud.

so, here are my specific predictions

1. Subvocalization-sensing earbuds that detect "silent speech" through jaw/ear canal muscle movements (silently talk to AI anytime)

2. An AI OS laptop — the model is the interface

3. A minimal pocket device where most AI OS happens in the cloud

4. an energy efficient chip that runs powerful local AI, to put in any physical object

5. … like a clip. Something that attaches to clothes.

6. a perfect flat glass tablet like in the movies (I hope not)

7. ambient intelligent awareness through household objects with microphones, sensors, speakers, screens —



The form factor that suggests is an AR headset. Google, Meta, and others have those. They're all flops. Too bulky.

Carmack has said that for VR/AR to get any traction, the headgear needs to come down to swim goggle size, and to go mainstream, it has to come down to eyeglass size. He's probably right. Ive would be the kind of guy to push in that direction.


> Carmack has said that for VR/AR to get any traction, the headgear needs to come down to swim goggle size, and to go mainstream, it has to come down to eyeglass size. He's probably right. Ive would be the kind of guy to push in that direction.

I agree with the first 2 sentences, but not the last. Everyone and their grandmother knows size and bulkiness are big blockers to VR/AR adoption. But the reason we don't have an Apple Vision Pro in an eyeglasses form factor isn't an issue of design, it's an issue of physics.

Meta seems to have decent success with their Ray Bans, which can basically do all the "ask AI" use cases, but true VR/AR fundamentally require much bulkier devices, most of all for battery life.


Tbh I rarely use my Meta glasses for AI because most of the time I don't want to ask out loud in public. So I just get my phone out and ask ChatGPT or Gemini. I think voice as a UI is doomed because of that.


> it's an issue of physics

Engineering, not physics?

I doubt anyone would have believed you could have a phone with AI chips inside it that fit in your pocket 30 years ago.


An aside, but I feel pretty old, because I remember 30 years ago quite well, and no, I don't think people (at least people in tech) would be that surprised by the miniaturization and technological advancement that has occurred. Moore's law had already been churning along for decades in 1995, laptop computers had been out for a while, people (including lots of university students) were browsing the web, and heck, there were even PDAs that could do handwriting recognition in 95.

People were already saying "Isn't it amazing that this computer that you can carry around in your hand is more powerful than a giant room of computers that NASA built to send astronauts into space" in the mid 90s, so while people wouldn't necessarily guess the details, I think they fully expected the technological advancements to continue apace.


Right. So the trick is to get people to put up with carrying the necessary hardware around. Ive made iDweebs cool. Even the wired version.

Apple already tried a version of their headgear where an additional belt-mounted box and cable are needed. This was unpopular but necessary. It's up to Ive to make wearing a utility belt cool.

It just takes marketing.[1]

[1] https://previews.123rf.com/images/pressmaster/pressmaster110...


I can't imagine that Jony Ive built a more advanced AR headset than Meta, Apple and all the others in two years.


It's a technical problem right, not design? Make it smaller, make it sexier!


> 5. … like a clip. Something that attaches to clothes.

I feel like the most natural thing would be basically push-to-talk-to-AI:

1. Some sort of mic + earpiece that you can wear comfortably (e.g. AirPods)

2. A wireless button that you can put on a ring to activate the mic in the most ergonomic way possible

3. Any time you press the button, everything you say gets sent to a running AI chat
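The loop described above is simple enough to sketch. Here's a minimal Python model of the button/mic/chat interaction; `PushToTalkSession` and the string "audio chunks" are illustrative stand-ins, and a real device would stream actual audio to a hosted chat API rather than appending text to a list:

```python
from dataclasses import dataclass, field

@dataclass
class PushToTalkSession:
    """Buffers audio while the ring button is held; flushes the whole
    utterance to the chat on release. The 'chat' here is just a list."""
    transcript: list = field(default_factory=list)
    _buffer: list = field(default_factory=list)
    _held: bool = False

    def button_down(self):
        # Ring button pressed: start capturing.
        self._held = True
        self._buffer = []

    def audio_chunk(self, chunk: str):
        # Mic audio only counts while the button is held.
        if self._held:
            self._buffer.append(chunk)

    def button_up(self):
        # Button released: send the buffered utterance to the AI chat.
        self._held = False
        if self._buffer:
            self.transcript.append(" ".join(self._buffer))
            self._buffer = []

session = PushToTalkSession()
session.audio_chunk("ignored")   # button not held, so this is dropped
session.button_down()
session.audio_chunk("what's on")
session.audio_chunk("my calendar?")
session.button_up()
print(session.transcript)        # ["what's on my calendar?"]
```

The nice property of this design is that the earbuds and ring stay dumb: all the state lives on the phone, which is also where the network call would happen.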


That's genius!

Like a pin. With AI. And it would talk to you like a human, so we could call it the Humane AI Pin.

How has nobody thought of that?


It wasn't connected to your personal phone in any way, it was doomed from the start.


I can't tell if you're being sarcastic or not but:

1. It would be easier to use than a pin because it's connected to your hand. You can press it without anyone knowing that you are pressing it. You can press it with a single hand in a comfortable motion. It doesn't flop around or look weird on your body. It can also just be a ring, because it only needs a short-range connection to your phone.

2. Earbuds give you privacy, volume control, good mic quality, better battery life. Also again they are slightly more subtle than a large pin.

3. You don't make these standalone hardware. You just have them talk to your phone and let it handle networking, camera, and compute as needed.

Not at all comparable to humane AI pin...


I’ve been using #5 for a few weeks now (Limitless.ai pendant, clips to clothes, records and transcribes everything all day)

It sounds cool, and the idea of asking questions about your day seems like it would be cool, but a few weeks later I’m finding myself forgetting to take it with me. The value just isn’t there yet. (And why have a clip-on microphone when everyone already has a microphone in their pocket?)

It’s a cool toy though. Also a creepy toy since it can double as an eavesdropping device.

I have a feeling these AI companies will fall back to selling our data for advertising purposes once these companies realize their core products aren’t valuable enough for consumers to want to pay for the cost of it.


(Co-founder & CEO of Limitless) Thanks for trying it and I hope to win you back with the new features we have in the works!

As for selling data if consumers don’t want to pay: I commit publicly to never doing this. I will shut down the company and return remaining capital to investors if consumers don’t want to pay for what we are building. So far, so good, and we were actually cash flow positive for a few of the last few weeks.


Congrats!

I actually like your Limitless meeting transcription tool and have a subscription for that reason.

I wish your focus were on the software rather than the hardware.

#1 request is simply the ability to export my data so that I can more easily load it into other tools to ask questions against.

You have a treasure trove of all of my meeting transcripts for the past year but I’m really nervous they will be lost forever at some point.


How does that work socially, is everyone just fine with their conversations with you being recorded? or do you just not mention it.


That’s one of the reasons I don’t clip it on and take it with me. I don’t quite feel like justifying or explaining it to everyone.


>Just need a way to talk to ChatGPT anytime. Microphone, speaker and permanent connection to ChatGPT. That’s all you need

So like a smartphone in your pocket connected to an earphone.

The whisper thing is nice. Sounds like a feature for next gen earphones.


> The whisper thing is nice

Amazon Alexa already has this (though you need to whisper loud enough for it to hear), and it replies in a whisper. It works with any earbuds, but is kinda useless until Alexa+ (LLM integration) is more widely available; and it would be nice to have it reply in a normal voice when using earbuds.

Silent speech recognition is already a thing [0], so pairing it up with an LLM would be straightforward.

[0] https://ieeexplore.ieee.org/document/10411110


Don’t bone conduction microphones already exist for this?


What exact use cases do I get from being able to talk to chatGPT when I am out in public? I can think of close to 0 value add to have an AI voice in my head when I'm taking a walk in the park or out to dinner.


People stare at their phones while walking, having dinner, and driving. It's not a big leap to imagine replacing that with subvocal conversations with AI.


Having ongoing conversations with a sentence completion algorithm while out in public sounds extremely depressing tbh.


Already people staring at their phones all day is depressing. But I agree, this sounds even more depressing. But it's not a very big leap from where we stand as society today.


I’m sure people would think many of the things you engage in are extremely depressing.

I know people for instance who could not think of anything more depressing than working with computers for a living. But hey, they do them and I do me. That’s the glory of things.


Having ongoing conversations with a recently-descended primate that needs 8 hours of daily unconsciousness and gets emotionally invested in the opinions of complete strangers sounds extremely depressing tbh.

This comment was written by Claude.


> It's not a big leap to imagine replacing that with subvocal conversations with AI.

There's no such thing as "subvocalised conversation". It's a pervasive sci-fi term that has no bearing on real life.


This is better than staring at your phone how?


presumably because you don't have to stare at your phone


i already do that with my iphone by mapping the action button to start conversation. if this product isn’t replacing the phone, then it needs to do something my phone (or watch, or glasses) doesn’t do.


A friend of mine is constantly asking it questions every time something comes up. She opens her phone, loads the app, hits the mic button, then listens with the phone to her ear. It would work a lot better as some sort of dedicated device.


'every extension is also an amputation' - Marshall McLuhan

(since Neal Stephenson's recent essay brought that quote up)


The interface on the iPhone can still be improved: the latest ones have dedicated Action and Camera buttons. Once you can plug those into a better assistant, without unlocking the phone, it becomes more seamless.

Do the same with the Apple Watch: a hand gesture like cupping your ear, as if listening to something, and you don't even have to take the phone out of your pocket.

There are a lot of ways the iPhone, Apple Watch, and AirPods (case as pendant) could deliver the best UX, but it doesn't matter as long as Siri sux.


Just tell her to buy some earbuds that can trigger the assistant on your phone. Bam, problem solved.


I’m as much of a deep Ai skeptic as anyone but I can definitely think of use cases for while driving or walking, like asking questions about my own schedule or what people have emailed or asked me for in the last hour, or where I can get something specific to eat nearby and so on.

Not sure it’s worth the hype but there are use cases. I do think it’s an interesting contrast with crypto, where there aren’t really.


What I want is for it to surface information to me, not me have to query it.

Where is that AI? For example, if I usually eat between 2-4 PM and I'm in the middle of Times Square, start suggesting places to eat based on my transaction history, or the location history of restaurants I frequent. Something like that would be useful.

If I have to ask, I might as well look at my phone most of the time. It'd likely be faster in most cases.

I don't need something that must be queried to be useful, like asking it to read back my text messages. But I sure would love it if, when my wife messaged me, it were smart enough to play the message unprompted if my headphones are already connected and active.


Not saying it's your exact use case, however: in the "saved info" section of Gemini I have a prompt about the LLM letting me know "what's on its mind", along with some other details. So when I am just "chatting", it has brought up books relevant to our previous discussions/projects, alongside local events (Bay to Breakers, a Roots game Memorial Day weekend, some singles events, etc.) within the first "hello" of our conversation, and brought some news to the foreground that was relevant to me, although I wouldn't necessarily seek out that info. It's been handy for bringing relevant info to me in an actionable amount of time. Plain-Jane Gemini didn't offer those amenities, but I was able to build them out.


How do you propose you would train it to know what you want? Besides a versioned system prompt that you (or the AI) would have to continually adjust.


If it gathers enough data on you, it can theoretically figure that out. Siri "Suggestions" have been around for years, and if you go to the same place frequently on certain days/times (e.g., your workplace, a friend you visit every Saturday, places you often go to lunch) or use the same apps at similar days/times (e.g., pulling up Callsheet in the evening when you're watching TV shows or movies to do TMDB lookups), those suggestions will show up. All of those examples are real ones I've experienced. The quality is certainly variable, but it's decent.

(Of course, it's "non-LLM" AI, which isn't particularly fashionable right now, but if we really want smarter AI agents we need to stop treating all problems as solvable with large language models.)
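For what it's worth, the pattern-matching described above doesn't need much machinery. Here's a toy sketch of a frequency-based suggester, assuming visits are logged as (place, timestamp) pairs; the 4-hour bucketing and the function name are made up for illustration:

```python
from collections import Counter
from datetime import datetime

def suggest_place(visits, now):
    """visits: list of (place, datetime) pairs. Suggest the place most
    often visited in the same (weekday, 4-hour bucket) slot as `now`."""
    slot = (now.weekday(), now.hour // 4)
    counts = Counter(
        place for place, t in visits
        if (t.weekday(), t.hour // 4) == slot
    )
    return counts.most_common(1)[0][0] if counts else None

visits = [
    ("Taco Truck", datetime(2024, 6, 7, 12, 30)),  # Friday lunch
    ("Taco Truck", datetime(2024, 6, 14, 13, 0)),  # Friday lunch
    ("Gym",        datetime(2024, 6, 8, 9, 0)),    # Saturday morning
]
# Another Friday around lunchtime: suggests the taco truck.
print(suggest_place(visits, datetime(2024, 6, 21, 12, 15)))  # Taco Truck
```

This is roughly the shape of "Siri Suggestions"-style features: count what you did in similar time slots before, and surface the most common thing unprompted.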


I don't think this removes the need for initial setup. But there's another source of information: habit observation. If I do something around the same time every day, over and over again, it would be nice if it simply helped me along unless I explicitly interrupt. It should fine-tune itself to observations it makes about my behavior and patterns, as opposed to me interjecting constantly.

The constant need to query for information, rather than having useful information contextually pushed to me, fundamentally limits its utility once the novelty wears off. Below a certain complexity threshold (and this assumes accurate information and trust), it's more work to query for things than it is to simply do them.

From a consumer perspective, that's not great.


If you had a butler or PA who's with you all the time, they would know what genre of food you like based on your restaurant visits and your ravings about what you liked. The imaginary AI would have your location history for restaurant visits, your Instagram feed for pictures of foods/review of restaurants, your chat history to see what you've raved about. It would also have big data from other people who have seemingly similar tastes to you, to recommend you the next place to eat.

Obviously, since we're in late-stage capitalism and everything is designed to extract profit from you, we can't give commercial systems all our private data...


For both use cases, I don't see how it would be any different than what anyone can currently do on their mobile device. And even if they were novel use cases, they are nowhere near solving a need that would cause more than a few hundred people to pay money for a device or service.


I mean, you're both right. Being able to chat and iterate on stuff while I'm driving is both more productive and feels more natural than I expected before I tried it. It wasn't too far removed from a brainstorming session I'd have with anyone on my team, except it was only me and ChatGPT. So there's probably a whole bunch of similar but adjacent use cases that I haven't even thought of yet.

But... I can already do this! My phone + CarPlay and/or my headphones actually works great. I don't see how a new device adds value or is even desirable. Unless you're going down the Google Glass/Meta Rayban path of wanting to capture video and other environmental detail you can't if my phone is in my pocket.


Executive function assistance.

I'll set alerts, an alarm, write on my hand, etc. and still forget that e.g. my kids have a half-day tomorrow… even when medicated.

I'd love to have a little voice in my head periodically reminding me of these things.


when i think of them, i just call 1-800-chat-gpt


Hello and welcome to ChatGPT phone. If you know the query you would like to make, press one now.


On a dedicated device no less…what’s the point?! You have a phone.


Recently, I've been moving back to 'dedicated items' instead of 'everything on the phone'.

It might be an old guy finding a love for vinyl, but having a dedicated camera, a dedicated notebook, a dedicated music player ... makes taking photos, writing down notes, or listening to music somehow more ... meaningful to me. Maybe because I don't get distracted by the other 999 functionalities of the phone while I'm trying to take photos, listen to music, or write something down.


This isn’t a dedicated device for something a phone app replaced though…assuming it even is a pocket or wearable way to interact with chatgpt…it’s going to need some sort of cell/data service and replace an app that already exists with a whole other device.

I also am rolling back in certain areas, like writing instead of phone notes and such, but the idea of a wearable or portable chat bot device makes zero sense to me. It’s an added cost and yet another thing to lug around.

As it turns out though nobody seems to know _why_ they hired Ive or what they intend him to make.


Dedicated devices are often mediocre because they have to be cheap and have crappy hardware and a small battery. Look at the Rabbit R1 or the Humane AI Pin. You already have a high-quality, expensive phone; no need to buy another crappy electronic-waste-since-manufactured device.


... Unless they do bring more convenience. A popular example is smartwatches. People have a perfectly good clock, stopwatch, and sleep surveillance device on their phone, and still pay 300-900 dollars for an Apple Watch.

Why? Because looking at your wrist is much more efficient and fast than getting your phone out of the pocket.

A phone-connected 'AI necklace' to act as a trigger and communication device to an AI interface ( maybe even a small camera ) might be more convenient than fishing your phone out of your pocket as well.


You can participate in more highly intelligent discussions, great for a dinner party or a date or an interview. Everywhere you go you can know it all, many use cases. The people who don’t do it, will be at a severe disadvantage.


So other people sitting at the table with you are supposed to be impressed or interested in you regurgitating words said to you by an AI voice in your head? Honestly if I went on a date and the other person was also having a conversation with an AI chatbot, I'd run as fast away as possible because that is insane.


It’s no different from all the other ways people try to impress each other.


It is vastly different, because you aren't presenting anything novel or interesting. You'd just be parroting a computer program and acting like it's a substitute for personality, wit, and character.

More than anything it would broadcast a fear of opening up and showing who you really are to other people. So instead of risking saying something silly, you replace your sense of identity with a generic chatbot. Super cool.


You don’t think an AI can present anything novel or interesting? You think you just know it all already??


I have a relative who likes to lookup Wikipedia articles that he finds interesting and then read them to rooms full of people.

It's like, I can read Wikipedia myself.

Somehow I don't think anyone is going to be impressed by someone regurgitating chatgpt.


I imagine an LLM that has more realtime capabilities and can respond (or tell you what to say) on the fly would be a more fascinating conversation partner (well, on the surface!) than what you're depicting: a person who'd stop a conversation to ask the phone, and then just read the LLM's response.

I've read that interviews with Stephen Hawking were excruciating because he'd take many minutes to "type" up his response. Of course people stayed engaged because it was Hawking and the answer was from his brain; someone pausing to interact with an LLM would be a bore indeed.


I think you're failing to understand that the human part of conversations is what makes them worthwhile. Otherwise you might as well just be talking by yourself to a circuit board.


Well duh... what I'm portraying is someone faking a human conversation, but who's actually just an output device for an LLM. A bit like pick-up artists going through a routine. What you're portraying is someone reciting Wikipedia, which would obviously be dull. And from that you're extrapolating that someone robotically reciting what an LLM is whispering would also be dull.

What I'm saying is, have a little bit more imagination and imagine someone seemingly in natural conversation, who is actually an LLM. Could they be engaging? IMO quite a bit more engaging compared to someone reading Wikipedia out loud. Would it be artificial? It still is. But would the conversation partner notice? Maybe not for a while. Would I hate it? Of course...


This idea has been explored before.

For example, a man had an AI girlfriend in a movie and she hired someone to keep her in her ear and follow instructions on what to say and do, so that she could be physically intimate with her human by borrowing someone’s body. Stuff like this could be interesting, people acting as surrogates for AI or just using AI to augment their conversation skills.


Why would the other people at the table be interested in what you're saying when they can talk to their ChatGPT too?


Because they get more clout by appearing intelligent in public speaking with others.


How does repeating words whispered in your ear show intelligence?


It doesn’t, it shows something that appears to be intelligence. Although it isn’t surprising that people all in on LLMs might not see any distinction there.


There was a guy at MIT who made a silent headset a few years ago. It didn't use brainwaves but rather measured electrical activity in the facial muscles. Apparently when you think in words, there's a slight activation of the same muscles you use to speak.


You want something like this. A sticker placed on your neck that reads the movement of your neck muscles and infers speech.

https://samueli.ucla.edu/speaking-without-vocal-cords-thanks...




More accurately, they have a patent on one method for achieving #1.


I was thinking about this problem once, and I thought of some sensor to pick up your fingers making typing movements with your hands in any orientation, e.g. hanging at your sides as you walk.


I use mic input with the ChatGPT app in public all the time; if you use a low whisper and hold the phone close, you can be basically inaudible more than 3 feet away, and the speech recognition still does a great job.


The real challenge, though, is going to be trust and usability. Always-on AI devices listening to whispers, watching context… that’s a privacy tightrope


Here is this vengeful looking, four legged, half bear sized wild cat before me, tell it to turn around and look for a squirrel instead!


Would love to chat more about this with you if you'd be interested!


Sure, hit me up at dereklomas@gmail.com


Awesome, will do!


Yeah this is exactly why I use Grok so much and barely use ChatGPT at all. I always have a device with X on it and it's easy to pop open Grok from anywhere on the site no matter what you're doing as the button is already there.


thanks for the list. it's brilliant.

I can't help but wonder though... are we slaves to productivity?

Why do we need this omnipresent help? I'm sure some people do. If you're the CEO of a large company, or a doctor seeing hundreds of patients in a week, etc.

But me? An average middle-aged guy with a 9-to-5 white-collar job at a healthcare company?

I enjoy doing some things that are 'inefficient'. Is that really a problem?


Remember hanging out at the pub 20+ years ago? A discussion on who the fastest was, or something (literally what started the Guinness Book of Records) and it would run for a decent time as people mentioned who they thought it was, stories, hearsay evidence.

Now you just whip out your phone, look it up or ask an AI, get the answer and move on.

The second is more informative in a way, but so boring.

The point wasn’t knowing the fact, it was the discussion!


I try to avoid proving facts in pub conversations with my phone until someone has bet at least a dollar on the outcome


>That’s all you need.

No. They need all the data from your life.

They need to see what you see (camera somewhere), hear what you hear (hello microphones) and probably even more.

My bet is on some sort of tablet. Maybe kind of a book, or kindle, or something like that.


A bodycam rather



