
Yesterday my wife burst into my office: "You used AI to generate that (podcast) episode summary, we don't sound like that!"

In point of fact, I had not.

After the security reporting issue, the next problem on the list is "trust in other people's writing".


I think one potential downside of using LLMs or exposing yourself to their generated content is that you may subconsciously adopt their quirks over time. Even if you aren't actively using AI for a particular task, prior exposure to their outputs could be biasing your thoughts.

This has additional layers to it as well. For example, I actively avoid using em dash or anything that resembles it right now. If I had no exposure to the drama around AI, I wouldn't even be thinking about this. I am constraining my writing simply to avoid the implication.


I didn't make heavy use of it, but I did sometimes use "It's not X, it's Y" or some closely related variant. I've had to strike that from my writing, because whether or not it makes anyone else cringe, it's making me cringe now. My usage doesn't even match the ones the LLMs favor; my X & Y were typically full clauses with many words rather than the LLMs' short, punchy X & Ys... but still. Close enough. Can't write it anymore.

I'm still using bullet lists sometimes, as they have their place, and I'm hoping LLMs don't totally nuke them.


Exactly, and this is hell for programming.

You don't know whose style the LLM will pick for that particular prompt and project. You might end up with Carmack, or with that buggy, test-failing piece-of-junk project on GitHub.


You can tell it whose style to copy; it's actually decent at following instructions like that.

It's not bad at following my own style. I have longstanding quirks like naming any string that will end up in a DB query with a "q_" in front of the variable name, and shockingly Claude picks up on those and mimics them. Wouldn't trust it to write anything without thorough review, but it's great at syntax.

This isn't shocking: they are very good at repeating patterns in the immediate context; they're just not very good at anything else. Your quirk is part of the immediate pattern.

My first experiment with LLM chat was to ask it to produce text mimicking the style of a distinct, well-known author. It was also quite good at producing hybrid fusions of unique fictional styles, A + B = AB.

I suddenly have the urge to reply to this with a bulleted list where the bullets are emoji.

Isn't the alternative far more likely? These tools were trained on the way people write in certain settings, which includes a lot of curated technical articles like this one, and we're seeing that echoed in their output.

There's no "LLM style". There's "human style mimicked by LLMs". If they default to a specific style, then that's on the human user who chooses to go with it, or, likely, doesn't care. They could just as well make it output text in the style of Shakespeare or a pirate, eschew emojis and bulleted lists, etc.

If you're finding yourself influenced by LLMs—don't be. Here's why:

• It doesn't matter.

• Keep whatever style you had before LLMs.

:tada:


There is no "LLM style".

There is a "default LLM style", which is why I call it that. Or technically, one per LLM, but they seem to have converged pretty hard since they're all convergently evolving in the same environment.

It's trivial to prompt it out of that style. Word about how to do it, and that you should do it, has gotten around in the academic world, where the incentives to not be caught are high. So I don't call it "the LLM style". But if you don't prompt for anything in particular, yes, there is a very very strong "default LLM style".


the default LLM style is corporate voice lol

Out of the mountains of content, one single symbol would provoke the ire of non-ASCII reactionaries.

https://news.ycombinator.com/item?id=44072922

https://news.ycombinator.com/item?id=45766969

https://news.ycombinator.com/item?id=45073287


Already a big problem in art; people go on witch hunts over what they think are signs of AI use.

It's sad, because people who are OK with AI art still enjoy human art just the same. Somehow their visceral hate of AI art managed to ruin human art for themselves as well.


This ultimately will only ever harm human artists accused of it. AI artists can just say "yeah, I did, so what", defusing the criticism.

If there wasn't global-scale theft of art and content or if LLMs could produce something better than an inferior facsimile, I bet there would be less backlash.

But instead we had a 'non-profit' called 'Open'AI that irresponsibly unleashed this technology on the world and lied about its capabilities, with no care for how it would affect the average person.


AI visual output mimics art well enough that it is now more difficult to identify authenticity and humanity, which are important for the human connection audiences want from art.

AI outputs mimicking art rob audiences of the ability to appreciate art on its own in the wild without further markers of authenticity, which steals joy from a whole generation of digital artists who have grown up sharing their creativity with each other.

If you lack the empathy to understand why AI art-like outputs are abhorrent, I hope someone wastes a significant portion of your near future with generated meaningless material presented to you as something that is valuable and was time consuming to make, and you gain nothing from it, so that you can understand the problem for yourself first hand.


I blogged about this fundamental demolition of trust a few months ago.

HN discussed it here https://news.ycombinator.com/item?id=44384610

The responses were a surprisingly mixed bag. What I thought was a very common sense observation had some heavy detractors in those threads.


You're on a forum full of people trying to profit from this tech. In that context the pushback is obvious.

Exposure to AI leads to people writing like AI. Just like when you're hanging out in certain circles, you start to talk like those people. It's human nature.

Can you expand on what you find to be 'bad advice'?

The author uses an LLM to find bugs, then throws away the fix and instead writes the code he would have written anyway. This seems like a rather conservative application of LLMs. Using the 'shooting someone in the foot' analogy, this article is an illustration of professional and responsible firearm handling.


Laymen in cryptography (that's 99% of us, at least) may be encouraged to deploy LLM-generated crypto implementations without understanding the crypto.

If they're inclined to do that, they'll do it with or without LLMs. Raise your juniors right.

Honestly, it read more like attention seeking. He "live coded" his work, by which I believe he means he streamed everything he was doing while working. It just seems so much more like a performance and building a brand than anything else. I guess that's why I'm just a nobody.

S3 is doing quite a lot of sophisticated lifting to qualify as no backend at all.

But yeah - this is pretty neat. Easily seems like the future of static datasets should wind up in something like this. Just data, with some well chosen indices.


I believe all S3 has to do here is respond to HTTP Range requests, which are supported by almost every static server out there; Apache, Nginx, etc. should all support the same trick.
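
For anyone who wants to try it, here's a minimal sketch of the trick in Python (the URL and byte range are hypothetical):

    import requests

    # Ask the static server for just bytes 1024-2047 of the file.
    # Any server supporting Range requests replies 206 Partial Content
    # with exactly that slice, so a client can read one index block
    # without downloading the whole dataset.
    resp = requests.get(
        "https://example.com/dataset.sqlite3",
        headers={"Range": "bytes=1024-2047"},
    )
    assert resp.status_code == 206  # Partial Content
    chunk = resp.content            # 1024 bytes of the file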

100%. I’m with y’all - this is what I would also call a “no-backend” solution and I’m all in on this type of approach for static data sets - this is the future, and could be served with a very simple web server.

I’m just bemused that we all refer to one of the larger, more sophisticated storage systems on the planet, composed of dozens of subsystems and thousands of servers, as “no backend at all.” Kind of a “draw the rest of the owl”.


Still qualifies imo. Everything is static and on a CDN.

Lack of server/dynamic code qualifies as no backend.


Not scared, time limited.

The world is a complicated place, and there is a veritable mountain of things a person could learn about nearly any subject. But sometimes I don't need or want to learn all those things - I just want to get one very specific task done. What I really appreciate is when an expert who has spent the time required to understand the nuances and tradeoffs can say "just do this."

When it comes to technology, 'simple' just means that someone else made a bunch of decisions for me. If I want or need to make those decisions myself, then I need more knobs.


In my comment above I specifically did not expect the user to learn and understand everything, just to have the ability to ignore it. Handbrake has good defaults, and the user would be successful if the only thing they do is open the file and press the green button.

And scared is the word used by the original author in the title. I want to understand that emotion. I don't need someone to tell me we can't learn everything.


It's an interesting risk tradeoff to think about. Is 14k lines of LLM generated code more likely to have an attack in it than 14k lines of transitive library dependencies I get when I add a package to my project?

In the library case, there is a network of people that could (and sometimes do) deliberately inject attacks into the supply chain. On the other hand, those libraries are used and looked at by other people - odds of detection are higher.

With LLM generated code, the initial developer is the only one looking at it. Getting an attack through in the first place seems harder, but detection probability is lower.


No, we would use something similar to S-Expressions [1]. Parsing and generation would be at most a few hundred lines of code in almost any language, easily testable, and relatively extensible.

With the top level encoding solved, we could then go back to arguing about all the specific lower level encodings such as compressed vs uncompressed curve points, etc.

[1] https://datatracker.ietf.org/doc/rfc9804
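
To make the "few hundred lines" claim concrete, here's a toy round-trip in Python (a sketch only; real S-expressions per the RFC use length-prefixed binary atoms and more):

    def parse(text):
        # Tokenize: parens are tokens, everything else is a whitespace-separated atom.
        tokens = text.replace("(", " ( ").replace(")", " ) ").split()

        def read(pos):
            if tokens[pos] == "(":
                items, pos = [], pos + 1
                while tokens[pos] != ")":
                    item, pos = read(pos)
                    items.append(item)
                return items, pos + 1  # skip the closing ")"
            return tokens[pos], pos + 1  # an atom

        expr, end = read(0)
        assert end == len(tokens), "trailing garbage"
        return expr

    def emit(expr):
        if isinstance(expr, list):
            return "(" + " ".join(emit(e) for e in expr) + ")"
        return expr

    # Round trip:
    roundtrip = emit(parse("(sig (curve p256) (point compressed))"))
    assert roundtrip == "(sig (curve p256) (point compressed))"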


* Opens GitHub repo

* Opens Cargo.lock [1] and pnpm-lock.yaml [2]

* Closes Cargo.lock and pnpm-lock.yaml

* Goes to find a Tylenol

At least with open source we can see the sausage getting made...

[1] https://github.com/votingworks/vxsuite/blob/main/Cargo.lock

[2] https://github.com/votingworks/vxsuite/blob/main/pnpm-lock.y...


Even after reading your comment I was not quite ready for that. I am gobsmacked: over 30K lines of lock file! Are we supposed to have trust in that?


To be fair... What I gather from the readme is that this is a monorepo containing 7 sub-projects.


EW. Here, I’ll share some of my Extra Strength Acetaminophen. Those are some cursed lock files.


> * Goes to find a Tylenol

Watch out that you don't catch the autism :) /s

> [1] https://github.com/votingworks/vxsuite/blob/main/Cargo.lock

> [2] https://github.com/votingworks/vxsuite/blob/main/pnpm-lock.y...

These files are actually cursed and I want all drives that contain their data destroyed with acid. But I have a slight feeling other voting software isn't really any better, even though in theory it should be relatively simple software in the grand scheme of things.


Maybe the solution is to have no software at all. Software can't really be audited at scale; human actions can.


It's been a few years since I've slung code with it, but I'm pretty sure IAR had their own compiler (along with its own special occasional bugs). Of the IDEs I've used, it wasn't that bad. But Qt Creator was better. Bringing together IAR's tech and reach with Qt's expertise does make a lot of sense.


It’s useful for someone to be wrong on the Internet.

I’ve learned a lot from watching constructive disagreements between other people. Regardless of whether they’re “right” or not, healthy disagreements sharpen our perspectives.


Cue the joke that the way to get an answer on the internet is to post a wrong one.


Tactical wrongness is an underrated parenting technique too.


Starts reading: "fantastic, this is what we've been needing! But... where is code signing?"

> One problem that WAICT doesn’t solve is that of provenance: where did the code the user is running come from, precisely?

> ...

> The folks at the Freedom of Press Foundation (FPF) have built a solution to this, called WEBCAT. ... Users with the WEBCAT plugin can...

A plugin. Sigh.

Fancy, deep transparency logs that track every asset bundle deployed are good. I like logging - this is very cool. But this is not the first thing we need.

The first thing we need is to be able to host a public signing key somewhere that browsers can fetch, and to automatically verify the signature on the root hash served up in that integrity manifest. Then point a tiny boring transparency log at _that_. That's the thing I really, really care about for non-equivocation. That's the piece that lets me host my site on Cloudflare Pages (or Vercel, or Fly.io, or Joe's Quick and Dirty Hosting) while ensuring the software being run in my client's browser is the software I signed.

This is the pivotal thing. It needs to live in the browser. We can't leave this to a plugin.
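
Concretely, the check I want the browser itself to perform is tiny. A sketch, with PyNaCl standing in for browser-native crypto and every name hypothetical:

    from nacl.signing import VerifyKey
    from nacl.exceptions import BadSignatureError

    def manifest_is_authentic(root_hash: bytes, sig: bytes, pubkey: bytes) -> bool:
        # pubkey: the site owner's Ed25519 key, fetched from a well-known
        # location and pinned by a small transparency log.
        # root_hash: the root hash of the integrity manifest the host served.
        try:
            VerifyKey(pubkey).verify(root_hash, sig)
            return True
        except BadSignatureError:
            # The host served something the owner never signed: don't run it.
            return False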


I'll actually argue the opposite. Transparency is _the_ pivotal thing, and code signing needs to be built on top of it (it definitely should be built into the browser, but I'm just arguing the order of operations rn).

TL;DR you'll either re-invent transparency or end up with huge security holes.

Suppose you have code signing and no transparency. Your site has some way of signaling to the browser to check code signatures under a certain pubkey (or OIDC identity if you're using Sigstore). Suppose now that your site is compromised. What is to prevent an attacker from changing the pubkey and re-signing under the new pubkey? Or just removing the pubkey entirely and signaling no code signing at all?

There are three answers off the top of my head. Lmk if there's one I missed:

1. Websites enroll into a code signing preload list that the browser periodically pulls. Sites in the list are expected to serve valid signatures with respect to the pubkeys in the preload list.

Problem: how do sites unenroll? They can ask to be removed from the preload list. But in the meantime, their site is unusable. So there needs to be a tombstone value recorded somewhere to show that it's been unenrolled. The place it's recorded needs to be publicly auditable; otherwise an attacker will just make a tombstone value and then remove it. (See the sketch after this list.)

So we've reinvented transparency.

2. User browsers remember which sites have code signing after first access.

Problem: This TOFU method offers no guarantees to first-time users. Also, it has the same unenrollment problem as above, so you'd still have to reinvent transparency.

3. Users visually inspect the public key every time they visit the site to make sure it is the one they expect.

Problem: This is famously a usability issue in e2ee apps like Signal and WhatsApp. Users have a noticeable error rate when comparing just one line of a safety number [1; Table 5]. To make any security claim, you'd have to argue that users would be motivated to do this check, and get it right, for every security-sensitive site they access, over a long period of time. This just doesn't seem plausible.

[1] https://arxiv.org/abs/2306.04574
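
For concreteness, option 1's lookup might look like the sketch below (all names hypothetical); the hard part isn't the code, it's making the tombstones publicly auditable:

    # Hypothetical code-signing preload list with tombstones.
    PRELOAD = {
        "example.com": {"pubkey": bytes.fromhex("aa" * 32), "tombstoned": False},
    }

    def expected_pubkey(site: str):
        entry = PRELOAD.get(site)
        if entry is None:
            return None  # never enrolled: browser requires no signature
        if entry["tombstoned"]:
            # Unenrolled. Only safe if the tombstone itself lives
            # somewhere publicly auditable.
            return None
        return entry["pubkey"]  # enrolled: code must verify under this key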


I'll actually argue that you're arguing exactly what I'm arguing :)

My comment near the end is that we absolutely need transparency; it's just that the thing we most need tracked, more than all the code ever run under a URL, is that one signing key. All your points are right: users aren't going to check it. It needs to be automatic, and it needs to be distributed in a way that browsers and site owners can be confident that the code being run is the code the site owner intended to be run.


Gotcha, yeah I agree. Fwiw, with the imagined code signing setup, the pubkey will be committed to in the transparency log, without any extra work. The purpose of the plugin is to give the browser the ability to parse (really fetch, then parse) those extension values into a meaningful policy. Anyways I agree, it'd be best if this part were built into the browser too.

