I would pay hundreds of dollars per month for the combination of cursor and clau...

aniviacat · on Sept 27, 2024

Current LLMs fail if what you're coding is not the most common of tasks. And a simple web app is about as basic as it gets.

I've tried using LLMs for some libraries I'm working on, and they failed miserably. Trying to make an LLM implement a trait with a generic type in Rust is a game of luck with very poor chances.

I'm sure LLMs can massively speed up tasks like front-end JavaScript development, simple Python scripts, or writing SQL queries (which have been written a million times before).

But for anything even mildly complex, LLMs are still not suited.

dathinab · on Sept 27, 2024

I don't think if complexity is the right metric.

front-end JS can easily also become very complex

I think a better metric is how close you are to reinventing a wheel for the thousands time. Because that is what LLMs are good at: Helping you write code which nearly the same way has already been written thousands of times.

But that is also something you find in backend code, too.

But that is also something where we as a industry kinda failed to produce good tooling. And worse if you are in the industry it's kinda hard to spot without very carefully taking a hounded (mental) steps back from what you are used to and what biases you might have.

mrybczyn · on Sept 27, 2024

LLM Code Assistants have succeeded at facilitating reusable code. The grail of OOP and many other paradigms.

We should not have an entire industry of 10,000,000 devs reinventing the JS/React/Spring/FastCGi wheel. Im sure those humans can contribute in much better ways to society and progress.

itishappy · on Sept 27, 2024

> LLM Code Assistants have succeeded at facilitating reusable code.

I'd have said the opposite. I think LLMs facilitate disposable code. It might use the same paradigms and patterns, but my bet is that most LLM written code is written specifically for the app under development. Are there LLM written libraries that are eating the world?

dbmikus · on Sept 27, 2024

I believe you're both saying the same thing. LLMs write "re-usable code" at the meta level.

The code itself is not clean and reusable across implementations, but you don't even need that clean packaged library. You just have an LLM regenerate the same code for every project you need it in.

The LLM itself, combined with your prompts, is effectively the reusable code.

Now, this generates a lot of slop, so we also need better AI tools to help humans interpret the code, and better tools to autotest the code to make sure it's working.

I've definitely replaced instances where I'd reach for a utility library, instead just generating the code with AI.

I think we also have an opportunity to merge the old and the new. We can have AI that can find and integrate existing packages, or it could generate code, and after it's tested enough, help extract and package it up as a battle tested library.

itishappy · on Sept 27, 2024

Agreed. But this terrifies me. The goal of reusable code (to my mind) is that with everybody building from the same foundations we can enable more functional and secure software. Library users contributing back (even just bug reports) is the whole point! With LLMs creating everything from scratch, I think we're setting ourselves on a path towards less secure and less maintainable software.

thelastparadise · on Sept 27, 2024

I (20+ years experience programmer) find it leads to a much higher quality output as I can now afford to do all the mundane, time-consuming housekeeping (refactors, more tests, making things testable).

E.g. let's say I'm working on a production thing and features/bugfixes accumulate and some file in the codebase starts to resemble spaghetti. The LLM can help me unfuck that way faster and get to a state of very clean code, across many files at once.

erosivesoul · on Sept 27, 2024

What LLM do you use? I've not gotten a lot of use out of Copilot, except for filling in generic algorithms or setting up boilerplate. Sometimes I use it for documentation but it often overlooks important details, or provides a description so generic as to be pointless. I've heard about Cursor but haven't tried it yet.

dbmikus · on Sept 27, 2024

Cursor is much better than Copilot. Also, change it to use Claude, and then use the Inspector with ctrl-I

KoolKat23 · on Sept 27, 2024

This is the thing it works both ways, it's really good at interpreting existing codebases too.

Could potentially mean just a change in time allocation/priority. As it's easier and faster to locate and potentially resolve issues later, it is less important for code to be consistent and perfectly documented.

Not fool proof and who knows how that could evolve, but just an alternative view. One of these big names in the industry said we'll have AGI when it speaks it's own language. :P.

znpy · on Sept 27, 2024

I had similar experiences:

1. Aasked ChatGPT to write a simple echo server in C but with this twist: use io_uring rather than the classic sendmsg/recvmsg. The code it spat out wouldn't compile, let alone work. It was wrong on many points. It was clearly pieces of who-knows-what cut and pasted together. However after having banged my head on the docs for a while I could clearly determine from which sources the code io_uring code segments were coming. The code barely made any sense and it was completely incorrect both syntactically and semantically.

2. Asked another LLM to write an AWS IAM policy according to some specifications. It hallucinated and used predicates that do not exist at all. I mean, I could have done it myself if I just could have made predicates up.

> But for anything even mildly complex, LLMs are still not suited.

Agreed, and I'm not sure we are any close to them being.

mattgreenrocks · on Sept 27, 2024

Yep. LLMs don’t really reason about code, which turns out to not be a problem for a lot of programming nowadays. I think devs don’t even realize that the substrate they build on requires this sort of reasoning.

This is probably why there’s such a divide when you try to talk about software dev online. One camp believes that it boils down to duct taping as many ready made components together all in pursuit of impact and business value. Another wants to really understand all the moving parts to ensure it doesn’t fall apart.

typedef_struct · on Sept 27, 2024

My test is to take a sized chunk of memory containing a TrueType/OpenType font and output a map of glyphs to curves. Bot is nowhere close.

PaulHoule · on Sept 27, 2024

Roughly LLMs are great at things that involve a series of (near) 1-1 correspondences like “translate 同时采访了一些参与其中的活跃用户 to English” or “How do I move something up 5px in CSS without changing the rest of the layout?” but if the relationship of several parts is complex (those Rust traits or anything involving a fight with the borrow checker) or things have to go in some particular order it hasn’t seen (say US states in order of percent water area) they struggle.

SQL is a good target language because the translation from ideas (or written description) is more or less linear, the SQL engine uses entirely different techniques to turn that query into a set of relational operators which can be rewritten for efficiency and compiled or interpreted. The LLM and the SQL engine make a good team.

infecto · on Sept 27, 2024

I’d bet that about 90% of software engineers today are just rewriting variations of what’s already been done. Most problems can be reduced to similar patterns. Of course, the quality of a model depends on its training data—if a library is new or the language isn’t widely used, the output may struggle. However, this is a challenge people are actively working on, and I believe it’s solvable.

LLMs are definitely suited for tasks of varying complexity, but like any tool, their effectiveness depends on knowing when and how to use them.

ben_w · on Sept 27, 2024

> Current LLMs fail if what you're coding is not the most common of tasks

Succeeding on the most common tasks (which isn't exactly what you said) is identical to "they're useful".

abm53 · on Sept 27, 2024

And I would go further… these “common tasks” cover 80% of the work in even the most demanding engineering or research positions.

layer8 · on Sept 27, 2024

That’s absolutely not my experience. I struggle to find tasks in my day to day work where LLMs are saving me time. One reason is that the systems and domains I work with are hardly represented at all on the internet.

scruple · on Sept 27, 2024

I have the same experience. I'm in gamesdev and we've been encouraged to test out LLM tooling. Most of us at/above the senior level report the same experience: it sucks, it doesn't grasp the broader context of the systems that these problems exist inside of, even when you prompt it as best as you can, and it makes a lot of wild assed, incorrect assumptions about what it doesn't know and which are often hard to detect.

But it's also utterly failed to handle mundane tasks, like porting legacy code from one language and ecosystem to another, which is frankly surprising to me because I'd have assumed it would be perfectly suited for that task.

nicolas_t · on Sept 27, 2024

In my experience, AI for coding is having a rather stupid very junior dev at your beck and call but who can produce the results instantly. It's just often very mediocre and getting it fixed often takes longer than writing it on your own.

ben_w · on Sept 28, 2024

My experience is that it varies a lot by model, dev, and field — I've seen juniors (and indeed people with a decade of experience) keeping thousands of lines of unused code around for reference, or not understanding how optionals work, or leaving the FAQ full of placeholder values in English when the app is only on the German market, and so on. Good LLMs don't make those mistakes.

But the worst LLMs? One of my personal tests is "write Tetris as a web app", and the worst local LLM I've tried, started bad and then half way through switched to "write a toy ML project in python".

abm53 · on Oct 5, 2024

I think this illustrates the biggest failure mode when people start using LLMs: asking it to do too much in one step.

It’s a very useful tool, not magic.

bee_rider · on Sept 27, 2024

> Not once did he need to ask me a question. When I asked him "how long did this take" and expected him to say "a few weeks" (it would have taken me - a far more experienced engineer - 2 months minimum).

> Current LLMs fail if what you're coding is not the most common of tasks. And a simple web app is about as basic as it gets.

These two complexity estimates don’t seem to line up.

fhd2 · on Sept 27, 2024

That's still valuable though: For problem validation. It lowers the table stakes for building any sort of useful software, which all start simple.

Personally, I just use the hell out of Django for that. And since tools like that are already ridiculously productive, I don't see much upside from coding assistants. But by and large, so many of our tools are so surprisingly _bad_ at this, that I expect the LLM hype to have a lasting impact here. Even _if_ the solutions aren't actually LLMs, but just better tools, since we reconfigured how long something _should_ take.

skydhash · on Sept 27, 2024

The problem Django solves is popular, which is why we have so many great frameworks that shorten the implementation time (I use Laravel for that). Just like game engines or GUI libraries, assuming you understand the core concepts of the domain. And if the tool was very popular and the LLMs have loads of data to train on, there may be a small productivity tick by finding common patterns (small because if the patterns are common enough, you ought to find a library/plugin for it).

Bad tools often falls in three categories. Too simple, too complex, or unsuitable. For the last two, you'd better switch but there's the human element of sunken costs.

gambiting · on Sept 27, 2024

I work in video games, I've tried several AI assistants for C++ coding and they are all borderline useless for anything beyond writing some simple for loops. Not enough training data to be useful I bet, but I guess that's where the disparity is - web apps, python....that has tonnes of publicly available code that it can train on. Writing code that manages GPU calls on a PS5? Yeah, good luck with that.

maroonblazer · on Sept 27, 2024

Presumably Sony is sitting on decades worth of code for each of the PlayStation architectures. How long before they're training their own models and making those available to their studios' developers?

skydhash · on Sept 27, 2024

I don't think sony have these codes, more likely the finished build. And all the major studios have game engines for their core product (or they license one). The most difficult part is writing new game mechanics or supporting a new platform.

ilaksh · on Sept 27, 2024

So you are basically saying "it failed on some of my Rust tasks, and those other languages aren't even real programming languages, so it's useless".

I've used LLMs to generate quite a lot of Rust code. It can definitely run into issues sometimes. But it's not really about complexity determining whether it will succeed or not. It's the stability of features or lack thereof and the number of examples in the training dataset.

aniviacat · on Sept 27, 2024

I realize my comment seems dismissive in a manner I didn't intend. I'm sorry for that, I didn't mean to belittle these programming tasks.

What I meant by complexity is not "a task that's difficult for a human to solve" but rather "a task for which the output can't be 90% copied from the training data".

Since frontend development, small scripts and SQL queries tend to be very repetitive, LLMs are useful in these environments.

As other comments in this thread suggested: If you're reinventing the wheel (but this time the wheel is yellow instead of blue), the LLM can help you get there much faster.

But if you're working with something which hasn't been done many times before, LLMs start struggling. A lot.

This doesn't mean LLMs aren't useful. (And I never suggested that.) The most common tasks are, per definition, the most common tasks. Therefore LLMs can help in many areas, and are helpful to a lot of people.

But LLMs are very specialized in that regard, and once you work on a task that doesn't fit this specialization, their usefulness drops, down to being useless.

ilaksh · on Sept 27, 2024

Which model exactly? You understand that every few months we are getting dramatically better models? Did you try the one that came out within the last week or so (o1-preview).

aniviacat · on Sept 27, 2024

I did use o1-preview.

Roark66 · on Sept 27, 2024

I can't understand how anyone can use these tools (copilot especially) to make entire projects from scratch and expand them later. They just lead you down the wrong path 90% of the time.

Personally I much prefer Chatgpt. I give it specific small problems to resolve and some context. At most 100 lines of code. If it gets more the quality goes to shit. In fact copilot feels like chatgpt that was given too much context.

sensanaty · on Sept 27, 2024

I hear it all the time on HN that people are producing entire apps with LLMs, but I just don't believe it.

All of my experiences with LLMs have been that for anything that isn't a braindead-simple for loop is just unworkable garbage that takes more effort to fix than if you just wrote it from scratch to begin with. And then you're immediately met with "You're using it wrong!", "You're using the wrong model!", "You're prompting it wrong!" and my favorite, "Well, it boosts my productivity a ton!".

I sat down with the "AI Guru" as he calls himself at work to see how he works with it and... He doesn't. He'll ask it something, write an insanely comprehensive prompt, and it spits out... Generic trash that looks the same as the output I ask of it when I provide it 2 sentences total, and it doesn't even work properly. But he still stands by it, even though I'm actively watching him just dump everything he just wrote up for the AI and start implementing things himself. I don't know what to call this phenomenon, but it's shocking to me.

Even something that should be in its wheelhouse like producing simple test cases, it often just isn't able to do it to a satisfactory level. I've tried every one of these shitty things available in the market because my employer pays for it (I would never in my life spend money on this crap), and it just never works. I feel like I'm going crazy reading all the hype, but I'm slowly starting to suspect that most of it is just covert shilling by vested persons.

insane_dreamer · on Sept 27, 2024

The other day I decided to write a script (that I needed for a project, but ancillary, not core code) entirely with CoPilot. It wasn't particularly long (maybe 100 lines of python). It worked. But I had to iterate so much with the LLM, repeating instructions, fixing stuff that didn't run, that it took a fair bit longer than if I had just written it myself. And this was a fairly vanilla data science type of script.

shriek · on Sept 27, 2024

Most of the time the entire apps are just a timer app or something simple. Never a complex app with tons of logic in them. And if you're having to write paragraphs of texts to write something complex then might as well just write that in a programming language, I mean isn't that what high-level programming language was built for? (heh). Also, you're not the only one who's had the thought that someone is vested in someway to overhype this.

KoolKat23 · on Sept 27, 2024

You can write the high level structure yourself and let it complete the boilerplate code within the functions, where it's less critical/complicated. Can save you time.

shriek · on Sept 27, 2024

Oh for sure. I use it as smart(ish) autocomplete to avoid typing everything out/looking up in docs everytime but the thought of prompt engineering to make an app is just bizarre to me. It almost feels like it has more friction than actually writing the damn thing yourself.

mattgreenrocks · on Sept 27, 2024

You aren’t the only one that feels this way.

After 20 years of being held accountable for the quality of my code in production, I cannot help but feel a bit gaslit that decision-makers are so elated with these tools despite their flaws that they threaten to take away jobs.

skydhash · on Sept 27, 2024

Here is another example [0]. 95% of the code was taken as it is from the examples of the documentation. If you still need to read the code after it was generated, you may have well read the documentation first.

When they say treat it like an intern, I'm so confused. An intern is there to grow and hopefully replace you as you get promoted or leave. The tasks you assign to him are purposely kept simple for him to learn the craft. The monotonous ones should be done by the computer.

[0]: https://gist.github.com/simonw/97e29b86540fcc627da4984daf5b7...

skywhopper · on Sept 27, 2024

I think to the extent this works for some people it’s as a way to trick their brains into “fixing” something broken rather than having to start from scratch. And for some devs, that really is a more productive mode, so maybe it works in the end.

And that’s fine if the dev realizes what’s going on but when they attribute their own quirks to AI magic, that’s a problem.

Workaccount2 · on Sept 27, 2024

As a non-programmer at a non-programming company:

I use it to write test systems for physical products. We used to contract the work out or just pay someone to manually do the tests. So far it has worked exceptionally well for this.

I think the core issue of the "do LLMs actually suck" is people place different (and often moving) goalposts for whether or not it sucks.

mythrwy · on Sept 27, 2024

I just wrote a fairly sizable app with an LLM. This is the first complete app I've written using it. I did write some of the core logic myself leaving the standard crud functions and UI for the LLM.

I did it in little pieces and started over with fresh context each time the LLM started to get off in the weeds. I'm very happy with the result. The code is clean and well commented, the tests are comprehensive and the app looks nice and performs well.

I could have done all this manually too but it would have taken longer and I probably would have skimped out on some tests and gave up and hacked a few things in out of expedience.

Did the LLM get things wrong on occasion? Yes. Make up api methods that don't exist? Yes. Skip over obvious standard straightforward and simple solutions in favor of some rat's nest convoluted way to achieve the same goal? Yes.

But that is why I'm here. It's a different style of programming (and one that I don't enjoy nearly as much as pounding the keyboard). It's more high level thinking and code review involved and less worrying about implementation detail.

It might not work as well in domains which training data doesn't exist in. Also certainly if someone expects to come in with no knowledge and just paste code without understanding, reading and pushing back, they will have a non working mess pretty shortly. But overall these tools dramatically increase productivity in some domains is my opinion.

ChainOfFools · on Sept 27, 2024

> but I'm slowly starting to suspect that most of it is just covert shilling by vested persons.

It's almost as if the horde of former kleptocurrency bros have found a promising new seam of fool's gold to mine

achempion · on Sept 27, 2024

I have the same observation as well. The hype is getting generated mostly by people who're selling AI courses or AI-related products.

It works well as a smart documentation search where you can ask follow-up questions or when you know what the output should look like if you see it but can't type it directly from the memory.

For code assistants (aka copilot / cursor), it works if you don't care about the code at all and ok with any solution if it's barely working (I'm ok with such code for my emacs configuration).

meiraleal · on Sept 27, 2024

LLMs are great to go from 0 to 2b but you wanted to go to 1 so you remove and modify lots of things, get back to 1 and then go to 2.

Lots of people are terrible at going from 0 to 1 in any project. Me included. LLMs helped me a lot solving this issue. It is so much easier to iterate over something.

kranuckle · on Sept 29, 2024

I think it’s more that if you want to believe it’s magic future tech then it looks like it.

If you aren’t on board then it looks impressive but flawed and not even close to living up to the hype.

flir · on Sept 27, 2024

Just for fun, give it a function you wrote, and ask it if it can make any improvements. I reckon I accept about a third of what it suggests.

mattgreenrocks · on Sept 27, 2024

Not a bad use, though I argue being able to do that critique yourself has a compounding effect over time that is worthwhile.

flir · on Sept 27, 2024

Well... I have to critique the critique, else how do I know which two thirds to reject?

In theory I'm learning from the LLM during this process (much like a real code review). In practice, it's very rare that it teaches me something, it's just more careful than I am. I don't think I'm ever going to be less slap-dash, unfortunately, so it's a useful adjunct for me.

threeseed · on Sept 27, 2024

> 20 year software engineering career is about to change

I have also been developing for 20+ years.

And have heard the exact same thing about IDEs, Search Engines, Stack Overflow, Github etc.

But in my experience at least how fast I code has never been the limiting factor in my project's success. So LLMs are nice and all but isn't going to change the industry all that much.

pluc · on Sept 27, 2024

There will be a whole industry of people who fix what AI has created. I don't know if it will be faster to build the wrong thing and pay to have it fixed or to build the right thing from the get go, but after having seen some shit, like you, I have a little idea.

Workaccount2 · on Sept 27, 2024

That industry will only form if LLMs don't improve from here. But the evidence, both theoretical and empirical, is quite the opposite. In fact one of the core reasons transformers gained so much traction is because they scale so well.

If nothing really changes in 3-5 years, then I'd call it a flop. But the writing is on the wall that "scale = smarts", and what we have today still looks like a foundational stage for LLM's.

namaria · on Sept 27, 2024

> In fact one of the core reasons transformers gained so much traction is because they scale so well.

> If nothing really changes in 3-5 years, then I'd call it a flop

Transformers have been used for what 6 years now? Will you in 6 years say "I'll decide if they don't change the world in another 6 years?"

Workaccount2 · on Sept 27, 2024

If the difference between now and 6 years in the future is the same as the difference between now and 6 years ago, a lot of people here will be eating their hats.

namaria · on Sept 27, 2024

Why? What exactly have we got for the (how many hundred) billions of dollars poured into GPUs running transformers over the past 6 years?

Workaccount2 · on Sept 27, 2024

You don't believe that models 100x better than today (OG transformers were pretty bad) would be fruitful for society?

mattgreenrocks · on Sept 27, 2024

Self-driving cars have been 3-5 years away for what, a decade now?

Workaccount2 · on Sept 27, 2024

I never paid much attention to Elon.

dumbfounder · on Sept 27, 2024

Correction: a whole industry of AI that will fix what AI has created.

vocram · on Sept 27, 2024

Will AI also be on call when things break in production?

tempfile · on Sept 27, 2024

no, the original comment was correct

cml123 · on Sept 27, 2024

yes, but does your colleague even fully understand what was generated? Does he have a good mental map of the organization of the project?

I have a good mental map of the projects I work on because I wrote them myself. When new business problems emerge, I can picture how to solve them using the different components of those applications. If I hadn't actually written the application myself, that expertise would not exist.

Your colleague may have a working application, but I seriously doubt he understands it in the way that is usually needed for maintaining it long term. I am not trying to be pessimistic, but I _really_ worry about these tools crippling an entire generation of programmers.

alonsonic · on Sept 27, 2024

AI assistants are also quite good at helping you create a high level map of a codebase. They are able to traverse the whole project structure and functionality and explain to you how things are organized and what responsibilities are. I just went back to an old project (didn't remember much about it) and used Cursor to make a small bug fix and it helped me get it done in no time. I used it to identify where the issue might be based on logs and then elaborate on potential causes before then suggesting a solution and implementing it. It's the ultimate pair programmer setup.

insane_dreamer · on Sept 27, 2024

> I just went back to an old project (didn't remember much about it) and used Cursor to make a small bug fix and it helped me get it done in no time.

That sounds quite useful. Does Cursor feed your entire project code (traversing all folders and files) into the context?

Anamon · on Sept 28, 2024

Do you ever verify those explanations, though? Because I occasionally try having an LLM summarise an article or document I just read, and it's almost always wrong. I have my doubts that they would fare much better in "understanding" an entire codebase.

My constant suspicion is that most results people are so impressed with were just never validated.

skywhopper · on Sept 27, 2024

I wouldn’t even be so sure the application “works”. All we heard is that it has pretty UI and an API and a database, but does it do something useful and does it do that thing correctly? I wouldn’t be surprised if it totally fails to save data in a restorable way, or to be consistent in its behavior. It certainly doesn’t integrate meaningfully with any existing systems, and as you say, no human has any expertise in how it works, how to maintain it, troubleshoot it, or update it. Worse, the LLM that created it also doesn’t have any of that expertise.

n_ary · on Sept 27, 2024

> I _really_ worry about these tools crippling an entire generation of programmers.

Isn’t that the point? Degrade the user long enough that the competing user is on-par or below the competence of the tool so that you now have an indispensable product and justification of its cost and existence.

P.S. This is what I understood from a lot of AI saints in news who are too busy parroting productivity gains without citing other consequences, such as loss of understanding of the task or expertise to fact-check.

svantana · on Sept 27, 2024

Me too, but a more optimistic view is that this is just a nascent form of higher-level programming languages. Gray-beards may bemoan that us "young" developers (born after 1970) can't write machine code from memory, but it's hardly a practical issue anymore. Analogously, I imagine future software dev to consist mostly of writing specs in natural language.

skydhash · on Sept 27, 2024

No one can write machine code from memory other by writing machine for years and just memorizing them. Just like you can't start writing Python without prior knowledge.

> Analogously, I imagine future software dev to consist mostly of writing specs in natural language.

https://www.commitstrip.com/en/2016/08/25/a-very-comprehensi...?

allochthon · on Sept 27, 2024

> Me too, but a more optimistic view is that this is just a nascent form of higher-level programming languages.

I like this take. I feel like a significant portion of building out a web app (to give an example) is boilerplate. One benefit of (e.g., younger) developers using AI to mock out web apps might be to figure out how to get past that boilerplate to something more concise and productive, which is not necessarily an easy thing to get right.

In other words, perhaps the new AI tools will facilitate an understanding of what can safely be generalized from 30 years of actual code.

mattgreenrocks · on Sept 27, 2024

Web apps require a ton of boilerplate. Almost every successful web framework uses at least one type of metaprogramming, many have more than one (reflection + codegen).

I’d argue web frameworks don’t even help a lot in this regard still. They pile on more concepts to the leaky abstractions of the web. They’re written by people that love the web, and this is a problem because they’re reluctant to hide any of the details just in case you need to get to them.

Coworker argued that webdev fundamentally opposes abstraction, which I think is correct. It certainly explains the mountains of code involved.

cml123 · on Sept 27, 2024

I admit that my own feelings about this are heavily biased, because I _truly_ care about coding as a craft; not just a means to an end. For me, the inclusion of LLMs or AI into the process robs it of so much creativity and essence. No one would argue that a craftsman produces furniture more quickly than Wayfair, but all people would agree that the final product would be better.

It does seem inevitable that some large change will happen to our profession in the years to come. I find it challenging to predict exactly how things will play out.

svantana · on Sept 28, 2024

I suppose the craft/art view of coding will follow the path of chess - machines gradually overtake humans but it's still an artform to be good at, in some sense.

gtvwill · on Sept 27, 2024

I've coded python scripts that let me take csv data from hornresp and convert it to 3d models I can import into sketchup. I did two coding units at uni, so whilst I can read it... I can't write it from scratch to save my life. I can debug and fix scripts gpt gives me. I did the hornresp script in about 40 mins. It would have taken me weeks to learn what it produced.

I'm not a mathematician, hell i did general maths at school. Currently I've been talking through scripting a method to mix dsd audio files natively without converting to tradional pcm. I'm about to use gpt to craft these scripts. There is no way I could have done this myself without years of learning. Now all I have to do is wait half a day so I can use my free gpt o credits to code it for me (I'm broke af so can't afford subs). The productivity gains are insane. I'd pay for this in a heartbeat if I could afford it.

orwin · on Sept 27, 2024

I really believe that the front-end part can be mostly automated (the html/CSS at least), copilot is close imho (microsoft+github, I used both), but really they're useless to do anything else complex without making to much calls, proposing bad data structures, using bad /old code design.

skydhash · on Sept 27, 2024

The frontend part was already automated. We called it Dreamweaver and RAD tools.

epicureanideal · on Sept 27, 2024

Thank you, now I realize where I've had this feeling before!

Working with AI-generated code to add new features feels like working with Dreamweaver-generated code, which was also unpleasant. It's not written the same way a human would write it, isn't written with ease of modification in mind, etc.

JanSt · on Sept 27, 2024

Copilot is pretty bad compared to cursor with sonnet. I have used Copilot for quite a long time so I can tell.

StefanWestfal · on Sept 27, 2024

I am curiouse, how complex was the app? I use cursor too and am very satisfied with it. It seem that is very good at code that must have been written so many times before (think react components, node.js REST api endpoints etc.) but it starts to fall of when moving into specific domains.

And for me that is the best case scenario, it takes away the part we have to code / solve already solved problems again and again so we can focus more on the other parts of software engineering beyond writing code.

rurp · on Sept 27, 2024

Fairly standard greenfield projects seem to be the absolute best scenario for an LLM. It is impressive, but that's not what most professional software development work is, in my experience. Even once I know what specifically to code I spend much more time ensuring that code will be consistent and maintainable with the rest of the project than with just getting it to work. So far I haven't found LLMs to be all that good at that sort of work.

skapadia · on Sept 27, 2024

Did you take a look at the code generated? Was it well designed and amenable to extension / building on top of?

I've been impressed with the ability to generate "throw away" code for testing out an idea or rapidly prototyping something.

nativeit · on Sept 27, 2024

Considering the current state of the industry, and the prevailing corporate climate, are you sure your job is about to get easier, or are you about to experience cuts to both jobs and pay?

insane_dreamer · on Sept 27, 2024

The problem is that it only works for basic stuff for which there is a lot of existing example code out there to work with.

In niche situations it's not helpful at all in writing code that works (or even close). It is helpful as a quick lookup for docs for libs or functions you don't use much, or for gotchas that you might otherwise search StackOverflow for answers to.

It's good for quick-and-dirty code that I need for one-off scripts, testing, and stuff like that which won't make it into production.

apwell23 · on Sept 27, 2024

So what is his plan to fix all the bugs that claude hallucinated in the code ?

JanSt · on Sept 27, 2024

I'm confident you have not used Cursor Composer + Claude 3.5 Sonnet. I'd say the level of bugs is no higher than that of a typical engineer - maybe even lower.

hobs · on Sept 27, 2024

There's no LLM for which that is true or we'd all be fired.

joshuacc · on Sept 27, 2024

In my experience it is true, but only for relatively small pieces of a system at the time. LLMs have to be orchestrated by a knowledgeable human operator to build a complete system any larger than a small library.

ben_w · on Sept 27, 2024

In the long term, sure. Short term, when that happens, we're going to be on Wile E. Cyote physics and keep up until we look down and notice the absence of ground.

dagw · on Sept 27, 2024

If all you bring to the table is the ability to reimplement simple web apps to spec, then sooner or later you probably will be fired.

threeseed · on Sept 27, 2024

It's only as good as its training data.

Step outside of building basic web/CRUD apps and its accuracy drops off substantially.

Also almost every library it uses is old and insecure.

mewpmewp2 · on Sept 27, 2024

Yet most work seems to be CRUD related and most SaaS businesses starting up just really need those things mainly.

whatshisface · on Sept 27, 2024

That last point represents the biggest problem this technology will leave us with. Nobody's going to train LLMs on new libraries or frameworks when writing original code takes an order of magnitude longer than generating code for the 2023 stack.

Workaccount2 · on Sept 27, 2024

With LLM's like gemini, which have massive context windows, you can just drop the full documentation for anything in the context window. It dramatically improves output.

SubiculumCode · on Sept 27, 2024

I use phind which does searches to provide additional context

apwell23 · on Sept 27, 2024

I am confident you didn't understand my comment. I didn't say anything about "level of bugs".

dagw · on Sept 27, 2024

Claude is actually surprisingly good at fixing bugs as well. Feed it a code snippet and either the error message or a brief description of the problem and it will in many cases generate new code that works.

charlie0 · on Sept 27, 2024

Sounds like CRUD boilerplate. Sure, it's great to have AI build this out and it saves a ton of time, but I've yet to see any examples (online or otherwise) or people building complex business rules and feature sets using AI.

The sad part is beginners using the boilerplate code won't get any practice building apps and will completely fail at the complex parts of an app OR try to use AI to build it and it will be terrible code.

skywhopper · on Sept 27, 2024

I hear these stories, and I have to wonder, how useful is the app really? Was it actually built to address a need or was it built to learn the coding tool? Is it secure, maintainable, accessible, deployable, and usable? Or is it just a tweaked demo? Plenty of demo apps have all those features, but would never serve as the basis for something real or meet actual customer needs.

SJC_Hacker · on Sept 27, 2024

Yeah AI can give you a good base if its something thats been done before (which admittedly, 99% of SE projects are), especially in the target language.

Yeah, if you want tic-tac-toe or snake, you can simply ask ChatGPT and it will spit out something reasonable.

But this is not much better than a search engine/framework to be honest.

Asking it to be "creative" or to tweak existing code however ...

JanSt · on Sept 27, 2024

Yes, the value of a single engineer can easily double. Even a junior - and it's much easier for them to ask Claude for help than the senior engineer on the team (low barrier for unblock).