
Then why not say "they are just computer programs"?

I think the reason people don't say that is that they want to say "I already understand what they are, and I'm not impressed and it's nothing new". But what the comment you are replying to is saying is that the inner workings are the important innovative stuff.


> Then why not say "they are just computer programs"?

LLMs are probabilistic or non-deterministic computer programs; plenty of people say this. That is not much different from saying "LLMs are probabilistic next-token prediction based on current context".

> I think the reason people don't say that is because they want to say "I already understand what they are, and I'm not impressed and it's nothing new". But what the comment you are replying to is saying is that the inner workings are the important innovative stuff.

But we already know the inner workings. It's transformers, embeddings, and math at a scale that we couldn't do before 2015. We already had multi-layer perceptrons with backpropagation, recurrent neural networks, and Markov chains before this, but the hardware to do this kind of contextual next-token prediction simply didn't exist at those times.

I understand that it feels like there's a lot going on with these chatbots, but half of the illusion of chatbots isn't even the LLM; it's the context management, which is exceptionally mundane compared to the LLM itself. These things are combined with a carefully crafted UX to deliberately convey the impression that you're talking to a human. But in the end, it is just a program, and it's just doing context management and token prediction that happens to align (most of the time) with human expectations because it was designed to do so.
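
To make the "context management plus token prediction" claim concrete, here is a minimal toy sketch of a chat loop. It is purely illustrative: sample_next_token is a hypothetical stand-in for a real model, not any actual library.

    import random

    def sample_next_token(context_tokens):
        # Hypothetical stand-in for the LLM: given everything in the context so far,
        # draw one token from the model's probability distribution over its vocabulary.
        vocabulary = ["Sure", ",", " here", " you", " go", ".", "<END>"]
        return random.choice(vocabulary)

    def chat_turn(history, user_message, max_new_tokens=64):
        # "Context management": the application, not the model, decides what goes
        # into the prompt -- system text, prior turns, and the new user message.
        context = history + ["USER: " + user_message, "ASSISTANT:"]
        reply = []
        for _ in range(max_new_tokens):
            # "Token prediction": the model only ever answers "what token comes next?"
            token = sample_next_token(context + reply)
            if token == "<END>":
                break
            reply.append(token)
        history += ["USER: " + user_message, "ASSISTANT: " + "".join(reply)]
        return "".join(reply)

Everything that feels conversational lives in how the context is assembled and trimmed; the model itself never does anything other than pick the next token.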

The two of you seem to be implying there's something spooky or mysterious happening with LLMs that goes beyond our comprehension of them, but I'm not seeing the components of your argument for this.


> But we already know the inner workings.

Overconfident and wrong.

No one understands how an LLM works. Some people just delude themselves into thinking that they do.

Saying "I know how LLMs work because I read a paper about transformer architecture" is about as delusional as saying "I read a paper about transistors, and now I understand how Ryzen 9800X3D works". Maybe more so.

It takes actual reverse engineering work to figure out how LLMs can do small bits and tiny slivers of what they do. And here you are - claiming that we actually already know everything there is to know about them.


I never claimed we already know everything about LLMs. Knowing "everything about" anything these days is impossible given the complexity of our technology. Even antennas, a centuries-old technology, are something we're still innovating on and don't completely understand in all domains.

But that's a categorically different statement than "no one understands how an LLM works", because we absolutely do.

You're spending a lot of time describing whether we know or don't know LLMs, but you're not talking at all about what it is that you think we do or do not understand. Instead of describing what you think the state of knowledge about LLMs is, can you talk about what it is that you think is unknown or not understood?


I think the person you are responding to is using a strange definition of "know."

I think they mean "do we understand how they process information to produce their outputs" (i.e., do we have an analytical description of the function they are trying to approximate).

You and I mean that we understand the training process that produces their behaviour (and this training process is mainly standard statistical modelling / ML).

In short, both sides are talking past each other.


I agree. The two of us are talking past each other, and I wonder if it's because there's a certain strain of thought around LLMs that believes that epistemological questions and technology that we don't fully understand are somehow unique to computer science problems.

Questions about the nature of knowledge (epistemology and other philosophical/cognitive studies) in humans are still unsolved to this day, and frankly may never be fully understood. I'm not saying this makes LLMs automatically similar to human intelligence, but there are plenty of behaviors, instincts, and knowledge across many kinds of objects that we don't fully understand the origin of. LLMs aren't qualitatively different in this way.

There are many technologies that we used that we didn't fully understand at the time, even iterating and improving on those designs without having a strong theory behind them. Only later did we develop the theoretical frameworks that explain how those things work. Much like we're now researching the underpinnings of how LLMs work to develop more robust theories around them.

I'm genuinely trying to engage in a conversation and understand where this person is coming from and what they think is so unique about this moment and this technology. I understand the technological feat and I think it's a huge step forward, but I don't understand the mysticism that has emerged around it.


> Saying "I know how LLMs work because I read a paper about transformer architecture" is about as delusional as saying "I read a paper about transistors, and now I understand how Ryzen 9800X3D works". Maybe more so.

Which is to say, not delusional at all.

Or else we have to accept that basically hardly anyone "understands" anything. You set an unrealistic standard.

Beginners play abstract board games terribly. We don't say that this means they "don't understand" the game until they become experts; nor do we say that the experts "haven't understood" the game because it isn't strongly solved. Knowing the rules, consistently making legal moves and perhaps having some basic tactical ideas is generally considered sufficient.

Similarly, people who took the SICP course and didn't emerge thoroughly confused can reasonably be said to "understand how to program". They don't have to create MLOC-sized systems to prove it.

> It takes actual reverse engineering work to figure out how LLMs can do small bits and tiny slivers of what they do. And here you are - claiming that we actually already know everything there is to know about them.

No; it's a dismissal of the relevance of doing more detailed analysis, specifically to the question of what "understanding" entails.

The fact that a large pile of "transformers" is capable of producing the results we see now, may be surprising; and we may lack the mental resources needed to trace through a given calculation and ascribe aspects of the result to specific outputs from specific parts of the computation. But that just means it's a massive computation. It doesn't fundamentally change how that computation works, and doesn't negate the "understanding" thereof.


Understanding a transistor is an incredibly small part of understanding how a Ryzen 9800X3D does what it does.

Is it a foundational part? Yes. But if you have it and nothing else, that adds up to knowing almost nothing about how the whole CPU works. And you could come to understand much more than that without ever learning what a "transistor" even is.

Understanding low level foundations does not automatically confer the understanding of high level behaviors! I wish I could make THAT into a nail, and drive it into people's skulls, because I keep seeing people who INSIST on making this mistake over and over and over and over and over again.


My entire point here is that one can, in fact, reasonably claim to "understand" a system without being able to model its high level behaviors. It's not a mistake; it's disagreeing with you about what the word "understand" means.

For the sake of this conversation "understanding" implicitly means "understand enough about it to be unimpressed".

This is what's being challenged: that you can discount LLMs as uninteresting because they are "just" probabilistic inference machines. This completely underestimates just how far you can push the concept.

Your pedantic definition of understand might be technically correct. But that's not what's being discussed.

That is, unless you assign metaphysical properties to the notion of intelligence. But the current consensus is that intelligence can be simulated, at least in principle.


I'm not sure what you mean?

Saying we understand the training process of LLMs does not mean that LLMs are not super impressive. They are shining testaments to the power of statistical modelling / machine learning. Arbitrarily reclassifying them as something else is not useful. It is simply untrue.

There is nothing wrong with being impressed by statistics... You seem to be saying that statistics is uninteresting, and that therefore to say that LLMs are statistics is to dismiss them. I think perhaps you are just implicitly biased against statistics! :p


There is so much complexity in interactions of systems that is easy to miss.

Saying that one can understand a modern CPU by understanding how a transistor works is kinda akin to saying you can understand the operation of a country by understanding a human from it. It's a necessary step, probably, but definitely not sufficient.

It also reminds me of a pet peeve in software development where it's tempting to think you understand the system from the unit tests of each component, while all the interesting stuff happens when different components interact with each other in novel ways.


Even saving can be seen as greed. Someone can focus too much on accumulating for themselves. Both investing and saving can be seen as preparation.

To avoid things becoming evil, you just need to make sure that your interactions with others are cooperative and not zero-sum, and not all investments are zero-sum.


You should start with a business or product that people want to use, or that is technically impressive. Only after that does it make sense to get government or investors involved.


That's interesting. I think the last Venezuelan election showed there are limits to what you can accomplish with peace.


Of course there are limits to everything, but conversely look at what people like Gandhi achieved


I've become increasingly uncomfortable with these sorts of casual invocations of extremely complex and unique geopolitical situations, though. Gandhi existed in a particular moment and context - take the same man and put him up against a different regime, and you would not get the same outcome.

It's like how people talk up peaceful protest by referencing Martin Luther King. He was a major centralizing figure for civil rights, but he did not exist in a vacuum of context either.


Precisely. Liberation movements have various tools at their disposal. But using the same tool in a different context does not guarantee a similar outcome.

On Gandhi in particular, many do not realize that there were parallel movements inside India that did resort to violence. So the context is not as simple as it may seem.


It helped that WWII broke the British. Non-violence needs an audience and a population that i) can feel shame and ii) holds some power to do something about it.


Gandhi's protests were causing turmoil and dissent within the UK. Not to forget the fact that the massive Indian population had gone into civil disobedience as well, making it costlier to rule India. Any more issues, including any harm to Gandhi, would have caused massive problems for the British, both in India and in their own homeland. They had to spend to keep everybody safe and the situation normal. That wouldn't have been the outcome of a violent revolution. Summarizing, Gandhi's peaceful protest cannot be described in simple terms. There are a lot of nuances.

Gandhi's protests are a very valuable source of info on both violent and nonviolent protests. It's easy to talk about an armed or violent revolution. But it's not a decision to be taken lightly. Apparently, both sides of the American Civil War went into it expecting it to somehow end in a few days! You know the carnage that followed. I have no clue why they held that belief. But it supports the fact that people almost always underestimate the cost of a war.

Non-violent protests are more effective at garnering support and mobilizing a huge movement. The human costs are also arguably lower. I don't know if it's practical all the time. But it should be given a big chance if an opportunity exists.


Nothing you say contradicts my comment.

Non-violent resistance can be and has been crushed many times in history.

To win, one needs to wield some kind of power or leverage. Non-violence does not work if your adversary cannot be shamed by a moral high ground. It will achieve zilch in that case.


Maybe up to 2 million people died in the process, mostly in the partition of India and Pakistan, so it was not all peaceful.


There were many millions of violent Indians who helped him achieve that.


I mean, Poland managed to get rid of communist rule through a peaceful process (which doesn't mean people weren't arrested, tortured, intimidated and beaten). There was a desire for free and democratic elections and it happened.


Managers with development skills are almost always better, because they can dive into the details if there's ever a problem.


That’s true; however, the current vibe coding ecosystem is clearly not written with this mindset. You will have a hard time diving into anything if you previously generated 2k LOC/hour, which is absolutely possible. Typing was never the bottleneck; understanding, and knowing that you did something well, was always the real bottleneck. LLMs make this even worse. You can move Jira tickets to done faster with them, but even bad developers can do that many times faster than better ones, because, for example, they mindlessly copy-paste StackOverflow answers where half of the code is absolutely not necessary, but they don’t care, because “it works”… until it doesn’t.


>LLMs make this even worse

Not in my experience

Better documentation, more test cases, and an NLP interface to query the code

Less cognitive load, more complete mental models

>even bad developers can do that many times compared to better ones, because for example they mindlessly copy-paste StackOverflow answers whose half of the code is absolutely not necessary

Maybe LLMs, much like StackOverflow, make good devs better and bad devs worse

Like a force multiplier for good practices and bad practices


The exact people who told me that I’m using LLMs wrongly showed me their code, and it was bad code. So let’s say “more documentation, which is unnecessary most of the time, i.e. noise; more test cases, which don’t necessarily test what they should; and an NLP interface which lies from time to time”, and we agree. LLM-generated code is noisy as hell, for no good reason. Maybe it’s good for you and your type of work. I need to provide way better code than that. I don’t know why we pretend that “good code”, “good documentation”, “good tests” etc. are the same for everybody.


>LLM generated code is noisy as hell, for no good reason

You can direct it to generate code/docs in whatever format or structure you want, prioritising the good practices and avoiding bad practices, and then manually edit as needed

For example with documentation I direct it to:

*Goal:* Any code you generate must lower cognitive load and be backed by accurate, minimal, and maintainable documentation

1. *Different docs answer different questions* — don’t duplicate; *link* instead.

2. *Explain _why_, not just what.* Comments carry rationale, invariants, and tradeoffs.

3. *Accurate or absent.* If you can’t keep a doc truthful, remove it and add a TODO + owner.

4. *Progressive disclosure.* One‑screen summaries first; details behind links/sections.

5. *Examples beat prose.* Provide minimal, runnable examples close to the API.

6. *Consistency > cleverness.* Uniform structure, tone, and placement.

I also give it a note to refuse the prompt if it cannot satisfy these conditions
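
For what it's worth, a rough sketch of how guidelines like these can be wired in as a system prompt. This assumes the OpenAI Python SDK's chat completions interface; the model name and the file mentioned in the user message are just placeholders, and the guideline text is abridged.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    DOC_GUIDELINES = """\
    Any code you generate must lower cognitive load and be backed by accurate,
    minimal, and maintainable documentation.
    1. Different docs answer different questions: don't duplicate, link instead.
    2. Explain why, not just what; comments carry rationale, invariants, tradeoffs.
    3. Accurate or absent: if a doc can't be kept truthful, remove it and add a TODO + owner.
    4. Progressive disclosure: one-screen summaries first, details behind links.
    5. Examples beat prose: minimal, runnable examples close to the API.
    6. Consistency > cleverness.
    Refuse the request if you cannot satisfy these conditions.
    """

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whichever model you actually run
        messages=[
            {"role": "system", "content": DOC_GUIDELINES},
            {"role": "user", "content": "Document the retry logic in fetcher.py."},
        ],
    )
    print(response.choices[0].message.content)

The same text works just as well pasted into whatever per-repo instructions file your coding tool reads; the point is that the guidelines travel with every request instead of being re-typed.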

>I don’t know why we pretend that “good code”, “good documentation”, “good tests” etc are the same for everybody

Of course code, docs, tests are all subjective and maybe even closer to an art than a science

But there's also objectively good habits, and objectively bad habits, and you can steer an LLM pretty well


Maybe because it's "US-based surveillance"

https://xcancel.com/garrytan/status/1963310592615485955


That’s the new YC CEO? Jesus Christ. The techno-totalitarianism has infected the entire Silicon Valley.


Holy shit this is the ycombinator CEO?

If you, the reader, are working on these technologies, I encourage you to take a good look at yourself in the mirror. You are building a surveillance state for the rich to use against yourself, me, and everyone else not part of the ruling class.

You are helping cement the class divide.


JFC what a moron he is…


People like him find themselves waiting in line for the showers, wondering why they'd do this to "one of the good ones".


>Is there famine? In some selected areas, yes, but for the ones with money, this reality never came.

Seems like what Israel is doing disproportionately affects poor people.


This doesn't tell us much. I don't know why you would expect ChatGPT to do original PhD research. It's a general product that will trust already published research. That doesn't mean that GPT-5 can't do PhD research when given the right sources.


This is completely wrong, at least for the criminal organizations in Mexico. They are not a real threat to the state, and they are not similar to terrorists; they don't chant "death to America". What made the gangs in Mexico violent was the combination of bad law enforcement departments and extrajudicial killings.

The best that can come out of this is that Maduro is removed. Otherwise you are just creating more and more hate towards the USA.


War should be conducted by the government, including dashboards for killing people. And then the focus should be on improving representation and accountability in the government. Doing this with private companies avoids accountability, the same way payment networks can regulate merchants, or the FBI outsources spying on Americans to private contractors.


Not sure I follow. It’s a tracking board for assets.


I'm not sure how you don't follow. Is the board used for war? Can bugs in the board cause casualties?


Odd. You said nothing about bugs in your original post. What’s your point? Can you ELI5 why you would want the government to write software?


When software is written with the purpose of killing people, that is very important software. That makes the organizations that write it very important. The more important an organization is, the more people from outside the organization should know what it does, and the more say they should have in it. Private organizations don't meet those requirements; government approximates those requirements better.

Also, I don't know how you can't see the relationship between bugs and accountability.


So the government should make its own guns, tanks, food, planes, fuel, etc.? Not trying to be pestering, but again, I don’t understand your point. Software to me is no different from a plane or a gun. The military does not make those either. It’s a tool that connects data sources to make decisions, and I have yet to see a reason why the military has to make a tool instead of paying for one.


Ideally yes, if they are designed to be used to kill people. You don't want a whole industry that has the incentive to want more dead people just so it can stay alive.


On the contrary, these tools would cause fewer dead people. That’s the whole point and why the military uses them. By using tools that provide higher fidelity on threats, the military becomes more efficient and precise, which leads to fewer casualties and less collateral damage.


Cool, but not related to whether they should be built by the government or private companies. Also, if all wars ended tomorrow, would Palantir's profits increase or decrease?


Ok, so we are just talking about happy ideas that are not only unrealistic but will never happen. Which is ok, but let’s be clear about that up front.

