The amount of hallucination I get when trying to write code is amazing. I mean, it can grasp the core concepts of the language and create structure/algorithms, but it often makes up objects/values when I ask questions. Example:
It suggested TextLayoutResult.size, which is an Int value. I asked if it holds width and height, and it claimed it has size.height and size.width, which it does not. I am now writing production code and also evaluating the LLMs that our management thinks will save us a shitload of time.
We will get there someday, but the push from management is not compatible with the current state of the LLMs.
(I use Claude 3.5 Sonnet now, as it is also built into some of the "AI IDEs".)
You're not alone. In my experience, senior executives are enamoured with the possibility of halving headcount. The engineers reporting honestly about the limitations of connecting it to core systems (or using it to generate complex code running on core systems) are at risk of being perceived as blocking progress. So everyone keeps quiet, tries to find a quick and safe use case for the tech to present to management, and makes sure they aren't involved in whichever project will be the big one to fail spectacularly and bring it all crashing down.
What irks me is how LLMs won't just say "no, it won't work" or "it's beyond my capabilities" and instead just give you "solutions" that are wrong.
Codeium, for example, will absolutely bend over backwards to provide you with solutions to requests that can't be satisfied, producing more and more garbage with every attempt. I don't think I've ever seen it just say no.
ChatGPT is marginally better and will sometimes tell you straight up that an algorithm can't be rewritten the way you suggest, because of ... But sometimes it too will produce garbage in its attempts to do something impossible that you asked it to do.
Two notes: I've never had one say no for code-related stuff, but I have had them disagree that something exists all the time. In fact, I just had one deny that the Subaru Brat exists, twice.
Secondly, if an LLM is giving you the runaround, it does not have a solution for the prompt you gave it, and you need another prompt, another model, or another approach to using the model (in cases of vendor lock-in, like OpenAI).
>What irks me is how LLMs won't just say "no, it won't work" or "it's beyond my capabilities" and instead just give you "solutions" that are wrong.
This is one of the clearest ways to demonstrate that an LLM doesn't "know" anything, and isn't "intelligence." Until an LLM can determine whether its own output is based on something or completely made up, it's not intelligent. I find them downright infuriating to use because of this property.
That's an easily solvable problem for programming. Today, ChatGPT has an embedded Python runtime that it can use to verify its own code, and I have seen it try different techniques when the code doesn't give the expected answer. The one time I can remember is with generating a regex.
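For what it's worth, the loop behind that is simple enough to sketch in a few lines of Python. The test cases and candidate patterns below are made up for illustration; in the real flow, each new candidate would come from another round trip to the model with the failure fed back as context.

    import re

    # Hypothetical test cases the user cares about: (input, should_match).
    test_cases = [
        ("2024-01-31", True),
        ("31/01/2024", False),
        ("2024-1-3", False),
    ]

    def passes_all(pattern: str) -> bool:
        """Return True only if the pattern matches exactly the inputs it should."""
        try:
            compiled = re.compile(pattern)
        except re.error:
            return False  # the model produced an invalid regex
        return all(bool(compiled.fullmatch(text)) == expected
                   for text, expected in test_cases)

    # Stand-ins for successive model attempts; in the real flow each one
    # would come from another call to the LLM with the failure fed back.
    candidate_patterns = [
        r"\d+-\d+-\d+",        # too loose: also accepts 2024-1-3
        r"\d{4}-\d{2}-\d{2}",  # passes every check above
    ]

    for attempt, pattern in enumerate(candidate_patterns, start=1):
        if passes_all(pattern):
            print(f"attempt {attempt}: accepted {pattern!r}")
            break
        print(f"attempt {attempt}: rejected {pattern!r}")

The point is just that a failing check gives you something concrete to hand back to the model, instead of trusting its first answer.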
I don't see any reason that an IDE, especially with a statically typed language, can't have an AI integrated that at least will never hallucinate classes/functions that don't exist.
Modern IDEs can already give you real-time errors across large solutions for code that won't compile.
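As a rough sketch of what that integration could look like (the helper name, the snippet, and the feedback step are all hypothetical, and it assumes a JDK with javac on the PATH), a tool wrapped around the model could type-check each generated snippet before showing it to anyone:

    import subprocess
    import tempfile
    from pathlib import Path

    def compiles_cleanly(java_source: str, class_name: str = "Generated") -> bool:
        """Write a generated snippet to a temp dir and let javac type-check it.

        A hallucinated class or method shows up as a 'cannot find symbol'
        error, so the snippet is rejected before a human ever sees it.
        Assumes a JDK (javac) is available on the PATH.
        """
        with tempfile.TemporaryDirectory() as tmp:
            src = Path(tmp) / f"{class_name}.java"
            src.write_text(java_source)
            result = subprocess.run(["javac", str(src)], capture_output=True, text=True)
            if result.returncode != 0:
                # In a real integration, result.stderr would be fed back to the
                # model as context for its next attempt.
                print(result.stderr)
            return result.returncode == 0

    # Example: generated code calling a String method that does not exist.
    snippet = """
    public class Generated {
        public static void main(String[] args) {
            String s = "hello";
            System.out.println(s.lengthInCharacters());  // hallucinated method
        }
    }
    """
    print(compiles_cleanly(snippet))  # False: javac reports 'cannot find symbol'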
Yeah, but it would have to reason about the thing it just hallucinated, or it would have to be hard-prompted somehow. There will be more tooling and code built around LLMs to make them behave like a human than people can imagine. People are trying to solve everything with LLMs alone, but LLMs have zero agency.
This is a good representation of my experience as well.
At the end of the day, this is because it isn't "writing code" in the sense that you or I do. It is a fancy regurgitation engine that will output bits of stuff it's seen before that seem related to your question. LLMs are incredibly good at this, but that is also why you can never trust their output.
Yes, I told Windsurf to copy some code to another folder. And what did it do? It "regenerated" the files in the right folders, but the content was different. Great chaos agent :D