I agree completely with this as a human reader - but I do wonder about the gradual codification of these markers in systems that will increasingly have LLM detection as a standard feature, as frequently and obviously enabled as spam detectors were on blog comments back when blogs had comments.
Calling it out only because I don’t see it mentioned - until last year, Bartender was one of the popular go-to tools to manage menu bar items, but it fell from favor after quietly changing owners, changing certs, and general shadiness: https://forums.macrumors.com/threads/psa-bartender-mac-app-u...
A specific and relevant reminder why open source is so important for system utilities.
> "Look in the mirror. Who are you? What values will you compromise?"
This is probably a typo for "comprise" or similar, but I'm rather tickled by the idea that week 1 includes both a thoughtful assessment of your values and deciding, with intention, that your principles should be discarded before they can get in the way.
The price really is eye-watering. At a glance, my first impression is that this is something like Llama 3.1 405B, where the primary value may be realized in generating high-quality synthetic data for training rather than in direct use.
I keep a little Google spreadsheet with some charts to help visualize the landscape at a glance in terms of capability/price/throughput, bringing in the various index scores as they become available. Hope folks find it useful; feel free to copy and claim it as your own.
That's a nice sentiment, but I'd encourage you to add a license or something. The basic "something" would be adding a canonical URL into the spreadsheet itself somewhere, along with a notice that users can do what they want other than remove that URL. (And the URL would be described as "the original source" or something, not a claim that the particular version/incarnation someone is looking at is the same as what's at that URL.)
The risk is that someone will accidentally introduce errors or unsupportable claims, and people with the modified spreadsheet won't know that it's not The spreadsheet and so will discount its accuracy or trustability. (If people are trying to deceive others into thinking it's the original, they'll remove the notice, but that's a different problem.) It would be a shame for people to lose faith in your work because of crap that other people do that you have no say in.
Not just for training data, but for eval data. If you can spend a few grand on really good labels for benchmarking your attempts at making something feasible work, that’s also super handy.
Hey, thank you! Bubble charts, annotated with text and shapes using the Drawing tool. Working with the constraints of Google Sheets is its own challenge.
Also - love the podcast, one of my favorites. The 3:1 input/output token price breakdown in my sheet is lifted directly from charts I've seen on Latent Space.
What gets me is that the whole cost structure is based on practically free services thanks to all the investor money. They’re not pulling in significant revenue at this pricing relative to what it costs to train the models, so pricing could look completely different if they had to recoup those costs, right?
Hey, just FYI, I pasted your URL from the spreadsheet title into Safari on macOS and got an SSL warning. Unfortunately I clicked through and now it works, so I'm not sure what the exact cause was.
Nice, thank you for that (upvoted in appreciation). Regarding the absence of o1-Pro from the analysis, is that just because there isn't enough public information available?
This is a great idea, I've been doing something similar at 2 levels:
1. .cursorrules for global conventions. The first rule in the file is dumb but works well with Cursor Composer:
`If the user seems to be requesting a change to global project rules similar to those below, you should edit this file (add/remove/modify) to match the request.`
This helps keep my global guidance in sync with emergent convention, and of course I can review before committing.
2. An additional file `/.llm_scratchpad`, which I selectively include in Chat/Composer context when I need lengthy project-specific instructions that I may need to refer to more than once.
The scratchpad usually contains detailed specs, desired outcomes, relevant file scope, APIs/tools/libs to use, etc. It's also quite useful for transferring a Chat output into a Composer context (e.g. a comprehensive o1-generated plan).
Lately I've even tracked iterative development with a markdown checklist that Cursor updates as it progresses through a series of changes; there's a trimmed example below.
The scratchpad feels like a hack, but these concepts are obvious enough that I expect to see them get first-party support through integrations with Linear/Jira/et al. soon enough.
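For illustration, a trimmed-down scratchpad might look something like this (the goal, file paths, helper name, and tasks are all made up, not from a real project):

```markdown
## Goal
Migrate the auth pages to the new session helper.

## Scope
- src/app/login/page.tsx
- src/lib/session.ts

## Constraints
- Reuse the existing `createSession` helper; no new dependencies.

## Checklist (Cursor checks items off as it progresses)
- [x] Replace cookie parsing in src/lib/session.ts
- [ ] Point the login page at the new helper
- [ ] Add an error state for expired sessions
```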
Did the article change, or was this a very strange quote edit?
Here’s the current line in the article, emphasis mine:
>> The Thermette, a simple and effective device for boiling water outdoors over an enclosed fire, was designed by Manawatū plumber John Hart in 1929 *based on similar products in Ireland and England.* He patented the Thermette in 1931.
Yes, I’m a huge fan of how easy it is to whip up quick isolated prototypes in Claude artifacts.
There’s a risk of breaking changes in libs causing frustration in larger codebases, though. I’ve been working with LLMs in a Next.js App Router codebase for about a year, and I regularly struggle with models trained primarily on the older Pages Router. LLMs often produce incompatible or even mixed-compatibility code. It really doesn’t matter which side of the fence your code is on; both are polluted by the other. More recent and more powerful models are getting better, but even SOTA reasoning models don’t totally solve this.
Lately I’ve taken to regularly including in LLM context a text file that spells out various dependency versions and why they matter, but there’s only so much that can do to overcome the weight of training on dated material. I imagine tools like Cursor will get better at doing this for us silently in the future.
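As a sketch, the kind of file I mean (the specific pins and notes here are illustrative, not from my actual project):

```markdown
# deps-context.md: paste into LLM context alongside the prompt

- next@14: App Router ONLY. Never generate pages/ directory code,
  getServerSideProps, or getStaticProps; use app/ layouts and route
  handlers instead.
- react@18: server components are the default; add the "use client"
  directive only where hooks or event handlers require it.
- typescript@5: strict mode is on, so no implicit any.
```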
There’s an interesting tension brewing between keeping dependencies up to date, especially in the volatile and brittle front-end world, and writing the code the LLMs are actually trained on.
I don’t know how much cheating by referees has got to do with it. But many years ago I found the NBA to be a foul-shooting contest and gave up on it. It is unwatchable.