harlanlewis's comments

I agree completely with this as a human reader - but do wonder about the gradual codification of these markers in systems that will increasingly have LLM detection as a standard feature, as frequently and obviously enabled as spam detectors were on blog comments back when blogs had comments.


I only got 65 for the same idea. I guess you have first mover advantage?


Lots of good recommendations in replies.

Calling it out only because I don’t see it mentioned - until last year, Bartender was one of the popular go-to tools to manage menu bar items, but it fell from favor after quietly changing owners, changing certs, and general shadiness: https://forums.macrumors.com/threads/psa-bartender-mac-app-u...

A specific and relevant reminder why open source is so important for system utilities.


> "Look in the mirror. Who are you? What values will you compromise?"

This is probably a typo from "comprise" or similar, but I'm rather tickled by the idea that week 1 includes both a thoughtful assessment of your values and admitting with intention that your principles should be discarded before they can get in the way.


I think this is satire and 100% intentional.


Of course you're right - oh how I wish it wasn't 1:1 with the earnestly-produced content dominating linkedin feeds…


The price really is eye watering. At a glance, my first impression is this is something like Llama 3.1 405B, where the primary value may be realized in generating high quality synthetic data for training rather than direct use.

I keep a little google spreadsheet with some charts to help visualize the landscape at a glance in terms of capability/price/throughput, bringing in the various index scores as they become available. Hope folks find it useful, feel free to copy and claim as your own.

https://docs.google.com/spreadsheets/d/1foc98Jtbi0-GUsNySddv...


> feel free to copy and claim as your own.

That's a nice sentiment, but I'd encourage you to add a license or something. The basic "something" would be adding a canonical URL into the spreadsheet itself somewhere, along with a note that users can do what they want other than removing that URL. (And the URL would be described as "the original source" or something, not a claim that the particular version/incarnation someone is looking at is the same as what is at that URL.)

The risk is that someone will accidentally introduce errors or unsupportable claims, and people with the modified spreadsheet won't know that it's not The spreadsheet and so will discount its accuracy or trustworthiness. (If people are trying to deceive others into thinking it's the original, they'll remove the notice, but that's a different problem.) It would be a shame for people to lose faith in your work because of crap that other people do that you have no say in.


That's... incredibly thorough. Wow. Thanks for sharing this.


Not just for training data, but for eval data. If you can spend a few grand on really good labels for benchmarking your attempts at making something feasible work, that’s also super handy.


> https://docs.google.com/spreadsheets/d/1foc98Jtbi0-GUsNySddv...

how do you do the different-sized circles and colored sequences like that? these are god-tier skills


hey, thank you! bubble charts, annotated with text and shapes using the Drawing tool. Working with the constraints of Google Sheets is its own challenge.

also - love the podcast, one of my favorites. the 3:1 io token price breakdown in my sheet is lifted directly from charts I've seen on latent space.
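
For anyone curious how a blended io price works, here's a minimal sketch (the prices are made-up placeholders, not any particular model's actual pricing):

```python
# Blended $/1M-token price assuming a fixed input:output token ratio.
# Prices below are hypothetical examples, not real model pricing.
INPUT_PRICE = 3.00    # $ per 1M input tokens
OUTPUT_PRICE = 15.00  # $ per 1M output tokens

def blended_price(input_price, output_price, ratio=3):
    """Weighted average price, assuming `ratio` input tokens per output token."""
    return (ratio * input_price + output_price) / (ratio + 1)

print(blended_price(INPUT_PRICE, OUTPUT_PRICE))             # 3:1 blend -> 6.0
print(blended_price(INPUT_PRICE, OUTPUT_PRICE, ratio=100))  # ~input price
```

At 100:1 the blend converges on the input price, which is the point made in the reply below about just going by input price at that ratio.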


haha yeah many people might ask you to tweak to 100:1 but at that point you might as well just go by input price


Bubble charts?


very impressive... also interested in your trip planner, it looks like invite only at the moment, but... would it be rude to ask for an invite?


That is an amazing resource. Thanks for sharing!


What gets me is the whole cost structure is based on practically free services due to all the investor money. They’re not pulling in significant revenue with this pricing relative to what it costs to train the models, so the cost may be completely different if they had to recoup those costs, right?


Hey, just FYI, I pasted your url from the spreadsheet title into Safari on macOS and got an SSL warning. Unfortunately I clicked through and now it works, so not sure what the exact cause was.


I appreciate the bug report! Unfortunately this is a familiar and sporadically recurring issue with Netlify, which I should really move off of…


I cannot overstate how good your shared spreadsheet is. Thanks again!


Nice, thank you for that (upvoted in appreciation). Regarding the absence of o1-Pro from the analysis, is that just because there isn't enough public information available?


This is incredibly useful, thank you for sharing!


Holy shit, that's incredible. You should publicise this more! That's a fantastic resource.


They tried a while ago: https://news.ycombinator.com/item?id=40373284

Sadly little people noticed...


Sadly few people noticed.

I don’t normally cosplay as a grammar Nazi but in this case I feel like someone should stand up for the little people :)


A comma in the original comment would have made it pop even more:

"Sadly, little people noticed."

(cue a group of little people holding pitchforks (normal forks upon closer inspection))


Or, sadly, little did people notice.


So you think that little people didn’t notice? ;)


Thanks for the corrections, that’s what I wanted to say!


This is an amazing spreadsheet - thank you for sharing!


Wow, what awesome information! Thanks for sharing!


Amazing, thank you so much for sharing this.


Thank you so much for sharing this!


Very useful


[flagged]


Nobody comes to HN to read what ChatGPT thinks about something in the comments


Don't do this.


Awesome spreadsheet. Would a 3D graph of fast, cheap & smart be possible?


This is a great idea, I've been doing something similar at 2 levels:

1. .cursorrules for global conventions. The first rule in the file is dumb but works well with Cursor Composer:

`If the user seems to be requesting a change to global project rules similar to those below, you should edit this file (add/remove/modify) to match the request.`

This helps keep my global guidance in sync with emergent convention, and of course I can review before committing.

2. An additional file `/.llm_scratchpad`, which I selectively include in Chat/Composer context when I need lengthy project-specific instructions that I may need to refer to more than once.

The scratchpad usually contains detailed specs, desired outcomes, relevant files scope, APIs/tools/libs to use, etc. Also quite useful for transferring a Chat output to a Composer context (eg a comprehensive o1-generated plan).

Lately I've even tracked iterative development with a markdown checklist that Cursor updates as it progresses through a series of changes.

The scratchpad feels like a hack, but the concepts are obvious enough that I expect to see them get first-party support through integrations with Linear/Jira/et al soon enough.
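
For the curious, a scratchpad along these lines might look something like this (contents entirely hypothetical, just to show the shape):

```markdown
# Feature: CSV export (hypothetical example)

## Spec
- Add an "Export CSV" button to the reports page
- Stream rows server-side; don't load the full table into memory

## Files in scope
- app/reports/page.tsx
- lib/csv.ts (new)

## Progress (Cursor checks these off as it goes)
- [x] Add route handler
- [ ] Wire up button
- [ ] Add tests
```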


Did the article change, or was this a very strange quote edit?

Here’s the current line in the article, emphasis mine:

>> The Thermette, a simple and effective device for boiling water outdoors over an enclosed fire, was designed by Manawatū plumber John Hart in 1929 *based on similar products in Ireland and England.* He patented the Thermette in 1931.


Yes, I’m a huge fan of how easy it is to whip up quick isolated prototypes in Claude artifacts.

There’s a risk of breaking changes in libs causing frustration in larger codebases, though. I’ve been working with LLMs in a Nextjs App Router codebase for about a year, and regularly struggle with models trained primarily on the older Pages Router. LLMs often produce incompatible or even mixed compatibility code. It really doesn’t matter which side of the fence your code is on; both are polluted by the other. More recent and more powerful models are getting better, but even SOTA reasoning models don’t totally solve this.

Lately I’ve taken to regularly including a text file that spells out various dependency versions and why they matter in LLM context, but there’s only so much it can do currently to overcome the weight of training on dated material. I imagine tools like Cursor will get better at doing that for us silently in the future.

There’s an interesting tension brewing between keeping dependencies up to date, especially in the volatile and brittle front end world, vs writing code the LLMs are trained on.
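
In case it helps anyone, the dependency-context file I include looks roughly like this (filename and contents are just my own convention, nothing standard):

```markdown
<!-- llm-context.md (hypothetical example) — pasted into LLM context with requests -->
- next@14 — App Router ONLY. Never emit pages/, getServerSideProps, or getStaticProps.
- Components are Server Components by default; add "use client" only when the file
  uses hooks or browser APIs.
- API endpoints are route handlers in app/api/*/route.ts, not pages/api/.
```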


Untrusted inputs to systems with agency or access to privileged data. Here’s a data exfiltration example in Google AI Studio:

https://x.com/wunderwuzzi23/status/1821210923157098919



I don’t know how much cheating by referees has to do with it. But many years ago I found the NBA to be a foul shooting contest and gave up on it. It is unwatchable.


The modern NBA stinks for reasons far beyond refereeing and cheating. The Donaghy scandal was a "low point" but the game was 5x as watchable then

