Hacker News | bongodongobob's comments

He means that it is heavily biased toward writing code, not removing, condensing, or refactoring it. It wants to generate more stuff, not less.

Because there are not a lot of high-quality examples of code editing in the training corpora, other than maybe version control diffs.

Because editing/removing code requires the model to output tokens for tool calls to be intercepted by the coding agent.

Responses like the example below are not emergent behavior; they REQUIRE fine-tuning. Period.

  I need to fix this null pointer issue in the auth module.
  <|tool_call|>
  {"id": "call_abc123", "type": "function", "function": {"name": "edit_file", "arguments": "{\"path\": \"src/auth.py\", \"start_line\": 12, \"end_line\": 14, \"replacement\": \"def authenticate(user):\\n    if user is None:\\n        return False\\n    return verify(user.token)\"}"}}
  <|end_tool_call|>
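On the agent side, applying such a call is mechanical: the agent (not the model) parses the JSON arguments and splices the replacement into the file. A minimal sketch, assuming hypothetical `edit_file` semantics (1-based, inclusive line range) matching the example above:

```python
import json

def apply_edit_file(call: dict, source: str) -> str:
    """Apply a hypothetical edit_file tool call to source text.

    The coding agent executes this after intercepting the model's
    tool-call tokens: it decodes the JSON-encoded arguments string
    and replaces the 1-based, inclusive line range.
    """
    args = json.loads(call["function"]["arguments"])
    lines = source.splitlines()
    start, end = args["start_line"], args["end_line"]
    lines[start - 1:end] = args["replacement"].splitlines()
    return "\n".join(lines)

# Toy 14-line file: lines 12-14 hold the buggy function.
source = "\n".join(f"# line {i}" for i in range(1, 12)) + (
    "\ndef authenticate(user):\n    return verify(user.token)\n# end"
)

call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "edit_file",
        "arguments": json.dumps({
            "path": "src/auth.py",
            "start_line": 12,
            "end_line": 14,
            "replacement": (
                "def authenticate(user):\n    if user is None:\n"
                "        return False\n    return verify(user.token)"
            ),
        }),
    },
}
patched = apply_edit_file(call, source)
```

The point stands either way: nothing in ordinary training text looks like this exchange, so the model only emits well-formed calls like it after targeted fine-tuning.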

I'm not disagreeing with any of this. Feels kind of hostile.

I clicked reply on the wrong level. And even then, I assure you I am not being hostile. English is a second language to me.

I don't see why this would be the case.

Have you tried using a base model from Hugging Face? They can't even answer simple questions. You give a raw base model the input

  What is the capital of the United States?
And there's a fucking big chance it will complete it as

  What is the capital of Canada? 
as much as there is a chance it could complete it with an essay about early American republican history, or a sociological essay questioning the idea of capital cities.

Impressive, but not very useful. A good base model will complete your input with things that generally make sense, usually correct, but a lot of the time completely different from what you intended it to generate. They are like a very smart dog: a genius dog that was never trained and most of the time refuses to obey.

So, even a simple behavior like acting as one party in a conversation, as a chatbot does, requires fine-tuning (the result being the *-instruct models you find on Hugging Face). In Machine Learning parlance, this is what we call supervised learning.

But in the case of chatbot behavior, the fine-tuning is not that complex, because we already have a good idea of what conversations look like from our training corpora; we have already encoded a lot of this during the unsupervised learning phase.
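Concretely, that fine-tuning step mostly consists of flattening conversations into a single token stream the base model learns to continue. A minimal sketch, using hypothetical role markers (real instruct models each define their own special tokens, but the shape is the same):

```python
def render_chat(messages, eos="<|end|>"):
    """Flatten a chat into one training string using hypothetical
    role markers. The base model is fine-tuned to continue text
    laid out exactly like this, which is why *-instruct models can
    'answer' instead of free-associating a completion.
    """
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}{eos}\n")
    # During training, loss is typically computed only on the
    # assistant spans; at inference, generation starts right after
    # the trailing assistant header.
    return "".join(parts) + "<|assistant|>\n"

sample = render_chat([
    {"role": "user",
     "content": "What is the capital of the United States?"},
])
```

Because conversation-shaped text already exists in the pre-training corpus, the model only has to learn the template, not the skill, which is why this kind of fine-tuning is comparatively cheap.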

Now, let's think about editing code, not simply generating it. Let's do a simple experiment. Go to your project and issue the following command.

  claude -p --output-format stream-json "your prompt here to do some change in your code" | jq -r 'select(.type == "assistant") | .message.content[]? | select(.type? == "text") | .text'
Pay attention to the incredible number of tool calls the LLM generates in its output. Now, think of this as a whole conversation: does it look even remotely similar to something the model would find in its training corpus?
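For reference, the jq filter above is just event plumbing. A Python sketch of the same extraction, assuming the event shape that stream-json emits (one JSON object per line, assistant messages carrying a `content` list of typed parts):

```python
import json

def assistant_text(stream_lines):
    """Keep only the text parts of assistant events, skipping
    tool_use blocks and other event types -- the Python analogue
    of the jq filter in the command above.
    """
    out = []
    for line in stream_lines:
        event = json.loads(line)
        if event.get("type") != "assistant":
            continue
        for part in event.get("message", {}).get("content", []):
            if part.get("type") == "text":
                out.append(part["text"])
    return out

# Toy stream: a non-assistant event, then a mixed assistant event.
stream = [
    json.dumps({"type": "system", "subtype": "init"}),
    json.dumps({"type": "assistant", "message": {"content": [
        {"type": "tool_use", "name": "edit_file"},
        {"type": "text", "text": "Applied the fix."},
    ]}}),
]
texts = assistant_text(stream)
```

Notice how much of the stream is tool-call machinery rather than prose: that ratio is exactly what the surrounding argument is about.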

Editing existing code, deleting it, or refactoring it is a far more complex operation than just generating a new function or class: it requires the model to read the existing code, plan what needs to be changed or deleted, and generate output with the appropriate tool calls.

Token sequences that simply create new code have lower entropy, and are more probable, than the complex sequences that lead to editing and refactoring existing code.


Thank you for this wonderful answer.

It’s because that’s what most resembles the bulk of the tasks it was being optimized for during pre-training.

It is. Our "security manager" has a dashboard that just literally counts the number of "security policies" we've put in place. Anything that isn't a box to tick is completely ignored as irrelevant. So we are essentially counting how many group policies we can implement and just disregarding the effectiveness of them for mitigating relevant threats and ignoring the added complexity and cost it incurs by making everyone's life more difficult. Systems password management/MFA? Who cares, can't make a graph out of it. It's the dumbest shit I've ever had to deal with.

Lol I asked it how many rooms I have in my house and it got that wrong. LLMs are useless, amirite

It is a targeted advertising machine, that is one of its functions. I also don't think there is anything wrong with that. I don't think the government has any business banning speech either. I also don't believe they want to "save the children".

Bullshit. Just from a document editing perspective, going back to a network share where only one person can edit a doc is not going to fly. I used to have to deal with this as IT/desktop support and it fucking sucked. Docs in the cloud give you better collab capabilities and remove the need for fancy networking, VPNs, international security exclusion groups, domain controller bullshit, and connecting all of the company's offices together. Connect to the Internet, and all your stuff is there no matter where you are. It sounds like you've never had to support the infra for office workers before. This is way better than it used to be. For a small company, sure, do whatever. But the bigger it gets, the harder all that shit becomes and the more work it takes to keep it running.

I had a similar thing happen to me with a huge company as a contractor. I couldn't work for 3 weeks due to a combination of login issues and permissions settings. Couldn't file a ticket and no one was really sure who to call/ask. Finally a director caught wind of it and knew who to talk to.

You need to let it actually benchmark. They are only as good as the tools you give them.

I have a great uncle that moved to Haight Ashbury to chase the whole spiritual open your mind idea. He said it was nothing like the media or nostalgia portrayed it. Lots of homeless drugged out kids who were completely lost. No jobs, panhandling for food and money, no direction, just spaced out druggies. Said it was fairly sad and he left within a year. He is an old hippy type as well, it was not what I was expecting to hear. I remember seeing an interview of George Harrison saying something similar.

George Harrison went to the Haight with his then-wife Pattie Boyd and walked around, eventually finding that people recognized him and followed him around. He played guitar in the park. He wrote a large check to fund the Haight Ashbury Free Clinic.

IIRC he said he had expected some kind of alternate hippie-economy based on genuine values and having ownership of the neighborhood, and was disappointed that he didn't see any evidence of that. Just a bunch of idle people.


Yep, pretty much. Found it - https://youtu.be/_I-ThafU1e4?si=dwZfCpNkDtnz2onb

My uncle had the same description. Disappointed that it was just stoned people and not a lot of real substance.


When was this? It's changed a lot (in both directions) over the years. For example, after Prop 64 legalized weed, the field in GGP by Haight and Stanyan that was previously staffed 24/7 by a morass of weed salespeople and their groupies (maybe 50-300 at any given time) emptied out overnight.

Then there's the fact that even the 18-20yo "Hippie Pilgrim" demo, which has held up pretty well for generations, is secretly stratified by the socioeconomic status of the parents. One's take on it depends on the specific cliques they're exposed to.


This brings back memories. I low key miss the drug market on hippie hill. We used to have the 'nugs' game where you had to try to sell the bums weed before they offered you drugs.

FWIW, the parent's comment matches my dad's sentiments about the city in the 60s/70s, but I wouldn't start a bar fight to defend his honor on this point. I would be genuinely curious to hear you elaborate on the changes. I live around the corner from the Upper Haight and it has always been one of my favorite parts of the city, but it has always had a lot of loafers doing nothing but drugs as long as I can remember.


Rich-kid hippies houseshare, hang out indoors after dark, and don't panhandle or shoplift groceries. They do smoke weed and maybe more, but their safety net is functioning. In their case, this life stage can reasonably be described as a cultural experience. Other than aesthetics, there's not much crossover with poor-kid hippies, because mooching tension is a major bummer.

Before the citywide affordability crisis, I think you were more likely to end up outdoors because you hit bottom than the other way around. The outdoor segment and the weed-dealing segment have always been more visible, though.


1969, he was prob about 30.

I felt the same, having moved to Los Angeles in the '80s and seen the AIDS crisis take its toll on the street life. It was harsh, man. But then, the '90s happened.

Frank Zappa made his (similar) feelings pretty clear.

https://genius.com/The-mothers-of-invention-who-needs-the-pe...


Alan Watts, Gary Snyder, etc. were there, he just didn’t meet them.

The community of people that were actually serious about that stuff was as far as I can tell pretty small and exclusive.


kind of the idea that I got from reading Philip K. Dick novels...

Wouldn't this just be unnecessary compute using AI? Compression or just normal filtering seems far more likely. It just seems like increasing the power bill for no reason.

Video filters aren't a radical new thing. You can apply things like 'slim waist' filters in real time with nothing more than a smartphone's processor.

People in the media business have long found their media sells better if they use photoshop-or-whatever to give their subjects bigger chests, defined waists, clearer skin, fewer wrinkles, less shiny skin, more hair volume.

Traditional manual photoshop tries to be subtle about such changes - but perhaps going from edits 0.5% of people can spot to bigger edits 2% of people can spot pays off in increased sales/engagement/ad revenue from those that don't spot the edits.

And we all know every tech company is telling every department to shoehorn AI into their products anywhere they can.

If I'm a Youtube product manager and adding a mandatory makeup filter doesn't need much compute; increases engagement overall; and gets me a $50k bonus for hitting my use-more-AI goal for the year - a little thing like authenticity might not stop me.


One thing we know for sure is that since ChatGPT humiliated Google, all teams seem to have been given carte blanche to do whatever it takes to make Google the leader again, and who knows what kind of people thrive in that kind of environment. Just today we saw what OpenAI is willing to do to eke out any advantage it can.

Yeah, because it's not complex. It's 1 server. Get back to us when your 100k servers homelab data center that does a million different things has 10 years of uptime.
