Hacker News | yorwba's comments

(I'm not an egyptologist.) The Thesaurus Linguae Aegyptiae has a breakdown of π“…“π“‚‹π“„‹π“π“π“ŠΉπ“Š΅π“π“…“π“‡Ύπ“‚‹π“‡¦π“‚‹π“†‘ / jm.j-rΚΎ-wpw.wt-αΈ₯tp.w-nαΉ―r-m-tꜣ-r-ḏr=f / "overseer of apportionments of the god's offering(s) in the entire land" https://tla.digital/lemma/850281 with 𓇾 / tꜣ / "land" and π“‚‹π“‡₯π“‚‹ / r-ḏr / "entire".

So we have most of π“‡π“‡‹π“­π“‚»π“˜π“‡‹π“‡Ύπ“‚‹π“‡₯π“‚‹π“ˆπ“†‘ / jy.tj-tꜣ-r-ḏr-?? / "welcome land entire ??" except for the π“ˆπ“†‘ at the end where I have no idea whether it's phonetic αΈ₯r=f or a determinative https://en.wiktionary.org/wiki/%F0%93%88%90 or something else.


I think the =f at the end is a possessive, because r ḏr=f is an idiomatic phrase literally meaning β€œto its limit”.

π“‡π“‡‹π“­π“‚»π“˜π“‡‹π“‡Ύπ“‚‹π“‡₯π“‚‹π“ˆπ“†‘

jy.tj t3 r ḏr=f

come [STATIVE] land [VOCATIVE] to limit its

β€œWelcome, entire land”

(I’m not an Egyptologist either.)


Did you forget about the π“ˆ or does it get merged into the preceding word?

I think π“ˆ is the end of the preceding word, ḏr. It is functioning as a logogram, not a phonogram:

https://en.wikipedia.org/wiki/Determinative


What a time to be alive: my iPhone has enough Unicode coverage to display hieroglyphics… imagine thinking two decades ago that you could use disk space on a mobile device like that.

(It doesn’t have the glyph layout chops, though…)


In section 5.7.5, they fine-tune for "11 low-resource languages, with between 5-10 hours of training data and at least 1 hour of validation splits." "CTC fine-tuning takes β‰ˆ1 hour of walltime on 32 GPUs for the 300M scale." If that's too expensive, you also have the option of supplying additional context for the LLM-based model (section 5.5).

As for "very clean data," see section 5.7.4: "Omnilingual + OMSF ASR was intentionally curated to represent naturalistic (i.e., often noisy) audio conditions, diverse speaker identities, and spontaneous, expressive speech."


The Ethnologue link in footnote 7 of the paper has utm_source=chatgpt.com at the end, so I suspect whoever was tasked with listing languages and determining their status thought this wasn't important enough to do themselves and just had ChatGPT give them a list. FWIW, Ethnologue does say that Ghari is "Stable": https://www.ethnologue.com/language/gri/ Meanwhile, Swedish is "Institutional," the highest possible level of vitality: https://www.ethnologue.com/language/swe/

The part you quote is part of the list of conditions for an if-statement, so how could it be a lie?

The issue wasn't whether they said that thing or not; companies say a lot of things that are fundamentally a lie, things to keep up appearances, which are oftentimes not enforced. It's like companies arguing they believe in fair pay while using Chinese sweatshops or whatever.

In this case, for instance, Netflix still has a relationship with their partners that they don't want to damage at this moment, and we are not at the point of AI being able to generate a whole feature-length film indistinguishable from a traditional one. Also, they might be apprehensive about legal risks and copyrightability at this exact moment; big companies' lawyers are usually pretty conservative about taking any "risks," so they probably want to wait for the dust to settle as far as legal precedents and the like.

Anyway, the issue here is:

"Does that statement actually reflect what Netflix truly think and that they actually believe GenAI shouldn't be used to replace or generate new talent performances?"

Because they believe in the sanctity of human authorship or whatever? And the answer is: no, no, hell no, absolutely no. That is a lie.


"Does that statement actually reflect what Netflix truly think and that they actually believe GenAI shouldn't be used to replace or generate new talent performances?"

The if-statement "If you want to do X, you need to get approval." probably does actually reflect what Netflix truly think, but it doesn't mean they believe X shouldn't be done. It means they believe X is risky and they want to be in control of whether X is done or not.

I don't see how you could read the article and come away with the impression that Netflix believe GenAI shouldn't be used to replace or generate new talent performances.


I’m inclined to agree. The goalposts will move once the time is right. I’ve already personally witnessed it happening; a company sells their AI-whatever strictly along the lines of staff augmentation and a force multiplier for employees. Not a year later and the marketing has shifted to cost optimization, efficiency, and better β€œuptime” over real employees.

The truth is that Netflix, Amazon, or honestly any other company would fire 99% of their workforce if it were possible, because they only care about profit – hell, they are companies, that's why they exist. At the same time, brands have to pretend they care about society, people having jobs, the climate, whatever, so they can't simply say: "Yeah, we exist to make money and we totally want to fire you guys as soon as possible." As you said, it's all masked as staff augmentation and other technical mumbo jumbo.

"If you can confidently say "yes" to all the above, socializing the intended use with your Netflix contact may be sufficient. If you answer β€œno” or β€œunsure” to any of these principles, escalate to your Netflix contact for more guidance before proceeding, as written approval may be required."

They do want to save money by cheaply generating content, but it's only cheap if no expensive lawsuits result. Hence the need for clear boundaries and legal review of uses that may be risky from a copyright perspective.


Yeah, that's a fair assessment. The specific mention of "union-covered work" plays to that interpretation as well:

> GenAI is not used to replace or generate new talent performances or union-covered work without consent.


Yeah, I read the "Talent" section and it's very balanced. I can't see much, if anything, to complain about, so thank goodness for SAG-AFTRA. The strike a couple of years ago was well judged.

They also mention reputation / image in there. If I can't tell something is generated by AI (some background image in a small part of a scene), it's just CGI. But if it's the uncanny-valley view of a person/animal/thing that is clearly AI generated, that shows laziness.

Their company website also has shorter descriptions of the scheme for people with varying amounts of background knowledge: https://raveltech.io/

The implementation doesn't appear to be public, but their GitHub strongly suggests that the commercial application they have in mind is privacy-preserving bidding on ads: https://github.com/raveltech


I hadn't done a ton of research on them but I skimmed their paper. That's a really cool find. I'm curious to see how it compares to TikTok's suite of 2PC protocols for advertising measurement and Google's use of TEEs for ad auctions.

The transcript in question may or may not be this one: https://newmitbbs.com/blog/articles/9694

Synthetic data doesn't have to come from an LLM. And that paper only showed that if you train on a random sample from an LLM, the resulting second LLM is a worse model of the distribution that the first LLM was trained on. When people construct synthetic data with LLMs, they typically do not just sample at random, but carefully shape the generation process to match the target task better than the original training distribution.


Even when you do get a sync conflict, Syncthing will rename one of the copies and then you can have KeePassXC merge the two files back into one. So that's still pretty much hassle-free.
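As a sketch of what that recovery looks like: Syncthing renames the losing copy to <name>.sync-conflict-<date>-<time>-<device>.<ext>, and keepassxc-cli has a real `merge` command. The database filename below is hypothetical, and this assumes both copies share the same master password (`--same-credentials`):

```shell
# Hypothetical main database; adjust to your own filename.
db="Passwords.kdbx"

# Pick up a Syncthing conflict copy, if one exists.
conflict="$(ls Passwords.sync-conflict-*.kdbx 2>/dev/null | head -n1)"

if [ -n "$conflict" ]; then
  # Merge the conflict copy back into the main database, then drop it.
  # --same-credentials reuses the first password for the second file.
  keepassxc-cli merge --same-credentials "$db" "$conflict" && rm -- "$conflict"
fi
```

KeePassXC's merge is entry-aware (it keeps the newer version of each entry), which is why this round trip is safe even when both copies changed.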
