
The world of literature is increasingly making itself inaccessible to broad audiences by turning this into a zero-sum game.

I wish OpenAI, Anthropic and Gemini would all figure out how to pay royalties to copyright holders anytime their content is used. I see absolutely no reason why they can't do this. It would really take all the steam out of these hardline anti-AI positions.





I think we will be seeing a lot more businesses pop up that cater to people who are unhappy with AI. Especially if you consider the large number of inevitable layoffs, people will begin to resent everything AI. The intelligent machine was never supposed to replace laborers; it was supposed to do your dishes and laundry.

> I think we will be seeing a lot more businesses pop up that cater to people who are unhappy with AI.

How will those potential customers know which businesses are providing an AI-generated product and which are not?

It's only a viable business if there is a way for potential customers to determine the amount of AI in whatever the product is.


My dishes and laundry are done by machines. Dumb ones though.

tbf dishes and laundry are also labourers' jobs.

but i agree i think there will be a small market for “vegan” content.


I'm down for this, but only if the people who are getting paid by OpenAI/etc also turn around and pay any inspiration they've had, any artist they've copied from, etc over their entire life. If we're going to do this, we need to do it in a logically consistent way; anyone who's derived art from pre-existing art needs to pay the pre-existing artist, and I mean ALL of it, for anything derivative.

Good luck with that.


> I'm down for this, but only if the people who are getting paid by OpenAI/etc also turn around and pay any inspiration they've had, any artist they've copied from, etc over their entire life.

Why? Things at scale have different rules (and different laws) from things done individually or for personal reasons.

What is the argument for AI/LLM stuff getting an exemption in this regard?

I don't see why AI/LLMs should get exemptions or special treatment.


If copying someone is bad, and they should be paid for it, that should be universal.

We already have copyright laws; they already prevent people from distributing AI outputs that infringe on intellectual property. If you don't like those laws in the age of AI, get them changed consistently; don't take a broken system and put it on life support.

I find it funny that many people are pro-library and pro-archive, and will pay money to causes that try to do that with endangered culture, but get angry at AI as having somehow stolen something, when they're fulfilling an archival function as well.


What I find funny about your argument is how completely degraded fair use becomes when it touches anything owned by a corporation capable of delaying and running up legal fees. It sure feels like there is a separate set of rules.

> If copying someone is bad, and they should be paid for it, that should be universal.

But we (i.e. society) don't agree that it is; the rules, laws and norms we have hold that some things are bad at scale!

As a society, we've already decided that things at scale are regulated differently than things for personal use. That ship has sailed and it's too late now to argue for laws to apply universally regardless of scale.

I am asking why AI/LLMs should get a blanket exemption in this regard.

I have not seen any good arguments for why we as a society should make a special exemption for AI/LLMs.


"Scale" isn't a justification for regulatory differences, that's a straw man. We take shortcuts at scale because of resource constraints, and sometimes there are more differences than just scale, and we're over simplifying because we're not as smart as we'd like to imagine. If there aren't resource constraints, and we have the cognitive bandwidth to handle something in a consistent way, we really should.

If we were talking algorithms, would you special-case code just because a lot of people hit it, even if load wasn't a problem, or would you try to keep one unified function that works everywhere?
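
To make that concrete, here's a toy sketch in Python (the names and the threshold are invented purely for illustration, not any actual rule):

    # Toy illustration of the analogy: one unified rule applied at any
    # scale vs. the same rule with an arbitrary scale-based carve-out.
    def royalty_unified(copies: int, rate: float) -> float:
        # One rule, identical at any scale.
        return copies * rate

    def royalty_special_cased(copies: int, rate: float,
                              personal_use_limit: int = 1000) -> float:
        # Same rule, but exempt below a hypothetical threshold -- this is
        # the "different rules at scale" position the thread is debating.
        if copies <= personal_use_limit:
            return 0.0
        return copies * rate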


> "Scale" isn't a justification for regulatory differences, that's a straw man.

It's not a strawman - that's literally how things work.

You can hold a bake sale at school with fewer sanitation requirements than a cake store has to satisfy.

You can possess weed for personal use, but will get locked up if you possess a warehouse filled with 200 tons of the stuff.

You can reproduce snippets of copyrighted material, but you can't reproduce the entire thing.

(I can go on and on and on, but you get the idea)

Which laws, regulations or social norms did you have in mind when you thought that scale doesn't matter?

I'm unable to think of even one regulation that applies universally regardless of scale. Which one were you thinking of?


Distribution and possession are fundamentally different. Cops try to bust people who have large amounts for distribution even if they don't have any evidence of it, but that's a different issue.

Corporations are legal persons and can engage in fair use (at least, as the law is written now). Neither corporations nor individuals can redistribute material in non-fair-use applications.

School bake sales are regulated under cottage food laws, which are relaxed under the condition that a "safe" subset of foods is produced. That's why there are no bake sales that sell cured sausage, for instance. Food laws are in some part regulatory capture by big food, but thankfully there hasn't been political will to outlaw independent food production entirely.

You're misinformed about all the examples you cited, you should do more research before stating strong opinions.


> You're misinformed about all the examples you cited, you should do more research before stating strong opinions.

You've literally agreed with what I said[1]:

> School bake sales are regulated under cottage food laws, which are relaxed under the condition that a "safe" subset of foods is produced. That's why there are no bake sales that sell cured sausage, for instance. Food laws are in some part regulatory capture by big food, but thankfully there hasn't been political will to outlaw independent food production entirely.

Scale results in different regulation. You have, with this comment, agreed that it does, yet you are still pressing the point that there should be an exemption for AI/LLMs.

I don't understand your reasoning in pointing out that baking has different regulations depending on scale; I pointed out the same thing - the regulations are not universal.

-------------------

[1] Things I have said:

> Things at scale have different rules (and different laws) from things done individually or for personal reasons.

> As a society, we've already decided that things at scale are regulated differently than things for personal use.

> You can hold a bake sale at school with fewer sanitation requirements than a cake store has to satisfy.


> many people are pro-library and pro-archive, but get angry at AI as having somehow stolen something

Yes! They're angry that there are two standards, an onerous one making life hell for archivists, librarians, artists, enthusiasts, and suddenly a free-for-all when it comes to these AI fad companies hoovering all the data in the world they can get their paws on.

I.e. protecting the interests of capital at the expense of artists and people in the former, and the interests of capital at the expense of artists and people in the latter.


Why stop differentiating between humans and machines?

> anytime their content is used

So ... every time a model is used? Because it has been trained on these works so they have some influence on all its weights?

> I see absolutely no reason why they can't do this

They didn't even pay to access the works in the first place; frankly, the chances of them paying now seem pretty minimal without being forced to by the courts.


I've had this idea kicking around in my head now for a few months that this is an opportunity to update copyright / IP law generally, and use the size and scope of government to do something about both the energy costs of AI and compensation for people whose works are used. At a very rough draft and high level it goes something like this:

Update copyright to an initial 10 year limit, granted at publication without any need to register. This 10 year period also works just like copyright today: the characters, places, everything is protected. After 10 years, your entire work falls into the public domain.

Alternatively, you can register your copyright with the government within the first 3 years. This requires submitting your entire work in a machine-readable specified format for integration into official training sets and models. These data sets and models will be licensed by the government for some fee to interested parties. As a creator with material submitted to this data set, you will receive some portion of those licensing fees, proportional to the quantity and amount of time your material has been in the set, with some caps set to prevent abuse. I imagine this would work something like how broadcast licensing for radio works. You will receive these licensing fees for up to 20 years from the first date of copyright.
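
As a very rough sketch of how that pro-rata payout might be computed (every name, weight, and cap below is my own assumption, not part of the proposal):

    # Hypothetical payout rule: each registered work earns a share of the
    # licensing pool proportional to its size and time in the data set,
    # with a per-work cap to limit abuse.
    def distribute_licensing_fees(pool, works, cap_fraction=0.25):
        # works: list of {"id": ..., "words": ..., "years_in_set": ...}
        weights = {w["id"]: w["words"] * w["years_in_set"] for w in works}
        total = sum(weights.values()) or 1  # guard against an empty set
        return {wid: min(pool * wt / total, pool * cap_fraction)
                for wid, wt in weights.items()}

    distribute_licensing_fees(
        1_000_000.0,
        [{"id": "novel-a", "words": 90_000, "years_in_set": 5},
         {"id": "novel-b", "words": 120_000, "years_in_set": 2}])
    # novel-a's pro-rata share (~652k) hits the 25% cap -> 250_000.0;
    # novel-b stays under the cap -> ~347_826.09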

During the first 10 years, copyright is still enforced on your work for all the same things that would normally be covered. For the 10 years after that, as additional consideration for adding your work to the data sets, you will be granted an additional, weaker copyright term. The details would vary by the work, but for a novel, for example, this might still protect the specific characters and creatures you created, but no longer offer protection on the "universe" you created. If we imagine Star Wars being created under this scheme, while Darth Vader, Luke Skywalker and Leia Organa might still be protected from 1987-1997, The Empire, Tatooine, and Star Destroyers might not be.

What I envision here is that these government data sets would be known good, clean, properly categorized and in the case of models, the training costs have already been paid once. Rather than everyone doing a mad dash to scrape all the world's content, or buy up their own collection of books to be scanned and processed, all of that work could already have been done and it's just a license fee away. Additionally because we're building up an archive of media, we could also license custom data sets. Maybe someone wants to make a model trained on only cartoons, or only mystery novels or what have you. The data is already there, a nominal fee can get you that data, or maybe even have something trained up, and all the people who have contributed to that data are getting something for their work, but we're also not hamstringing our data sets to being decades or more out of date because Disney talked the government into century long copyrights decades ago.


Why go after the AI company? If someone is using the AI-generated content for commercial purposes and it's based off a copyrighted work, they are the ones who should be paying the royalty.

The AI company is really more like a vector search company that brings you relevant content, kind of like Google, but that does not mean the user will use those results for commercial purposes. Google doesn’t pay royalties for displaying your website in search results.


Sure and that's one way to solve royalties.

I suspect that, purely from a logistics standpoint, AI training works better when the model is free to ingest all the content it can, and in exchange for that freedom it pays some small royalty whenever a source is cited.

They'd simply pass that cost onto the customer. For universities or enterprises or law firms, or whatever, they would either rely on pre-existing agreements or pay for blanket access. Whatever terms OpenAI, Anthropic, and Gemini sign with these entities, they can work out the royalties there.

These are all solved problems for every other technology middle man company.


This only makes sense if we have open access to the training data so we can verify whether it's copyrighted or not. Otherwise how am I supposed to know it's replicated someone's IP?

It's not quite the same though; when I search Google I'm generally directed to the source (though the summary box stuff might cross the line a bit).

With AI, copyrighted material is often not obvious to the end user, so I don't think it's fair to go after them.

I think it's non-trivial to make the AI company pay per use, though; they'd need a way to calculate what percent of the response comes from which source. Let them pay at training time with the copyright holder's consent, or just omit the work.
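
For illustration only: if a model could somehow report per-source attribution fractions (no deployed system exposes this reliably), the accounting itself would be simple arithmetic. Everything below is a hypothetical sketch:

    # The hard part is producing the attribution fractions at all; given
    # them, splitting a per-response fee is trivial.
    def per_response_royalties(response_fee, attribution):
        # attribution: source id -> estimated fraction of the response
        assert sum(attribution.values()) <= 1.0  # remainder is unattributed
        return {src: response_fee * frac for src, frac in attribution.items()}

    per_response_royalties(0.02, {"book-x": 0.15, "article-y": 0.05})
    # -> about 0.003 for book-x and 0.001 for article-y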


The AI company isn’t making money off the copyrighted material, they make money off finding the copyrighted material for you.

The end user is 100% to blame for any copyright violations.


This argument didn’t work out for Napster or Google Image Search.

Surely both of them should have some sort of valid license to the work in that case?

> We put an identifying mark on publishers committed to publishing only (human-written) literature

>hardline anti-AI position

Some people are beyond parody.





