Hacker News | RSchaeffer's comments

We examine min-p sampling (ICLR 2025 oral) & find significant problems in all 4 lines of evidence: human eval, NLP evals, LLM-as-judge evals, community adoption claims


Best of N was shown to exhibit power (polynomial) law scaling (left), but maths suggest one should expect exponential scaling (center). We show how to resolve this "paradox", then use our insights to design methods for predicting inference-scaling capabilities that can be more sample efficient!
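A rough sketch of why exponential scaling is the natural expectation, under the simplifying assumption (not stated above) that each of the k attempts on a single problem succeeds independently with the same probability p, where 0 < p < 1:

```latex
\Pr[\text{all } k \text{ attempts fail}] \;=\; (1-p)^{k} \;=\; \exp\!\left(-k \,\ln\frac{1}{1-p}\right)
```

Under that assumption, the failure rate on any one problem falls exponentially in k, which is what makes an aggregate power law look surprising.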


Thanks for the advice and links! Do you know of a Render tutorial that involves getting Flask and ReactJS services to communicate with one another? Your 2nd and 3rd links demonstrate each independently. I don't know whether the same challenge will pop up in Next.js.


With React by itself, you need to know the API's URL. You should read about environment variables and how React handles them.

You would have one environment file for local development, and then in production, you'd use Render.com's environment variable settings.

One of these variables would be something like REACT_APP_API_BASE_URL (doesn't matter what you call it, as long as it begins with REACT_APP_).

Then you prepend all of your API calls with that in your React code.

Next.js does not have this setup step because it serves your API and static site from the same Node server by default, so all of your API URLs would just be /my-api-call (because the API and frontend would have the same hostname, whatever it is).
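A minimal sketch of the "prepend the base URL" step, in TypeScript. The helper name `apiUrl` is hypothetical, and the base URL is passed in as an argument so the snippet stands alone; in a Create React App project you would typically pass the value of `process.env.REACT_APP_API_BASE_URL`, which is an assumption about your setup.

```typescript
// Hypothetical helper (not from any library): joins an API base URL and a path.
// In a Create React App project the base would usually come from
// process.env.REACT_APP_API_BASE_URL, set in a local env file for development
// and in Render's environment variable settings for production.
function apiUrl(path: string, baseUrl: string): string {
  // Drop trailing slashes on the base so we never produce "//" in the middle.
  const base = baseUrl.replace(/\/+$/, "");
  // Make sure the path starts with exactly one leading slash.
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `${base}${suffix}`;
}

// Separate frontend and API hosts (the plain React case):
const remote = apiUrl("/my-api-call", "http://localhost:5000/");

// Same-origin case (what Next.js gives you by default): an empty base
// leaves a relative URL that the browser resolves against the current host.
const sameOrigin = apiUrl("my-api-call", "");
```

Passing an empty base for the same-origin case means the calling code does not have to change if the API later moves to its own host.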


Thanks for taking the time to explain :)


Can you share more about the habits you built to break that cycle of scrolling for an hour in the morning?


The first thing I did is move my phone out of the bedroom. This helped break the habit of picking it up right away in the morning, and created some friction. I keep it on my desk on a charger.

The second thing I did was decide on a few things I had to do before picking up my phone for the first time in the morning. This was like a game, where I started with one thing and then built up more, to see how many things I could get done before checking my messages. I started with taking a shower. Then I added walking the dog, then coffee, then breakfast... now I'm up to the point where my entire morning routine has to happen before I pick up my phone.

It's not easy. I'm not perfect at this, and there are days when I go back to bad habits. (In this context, I use "bad habits" to mean "habits that prevent me from achieving the goals I set for myself".) That's why I still block certain websites for myself, to help keep from getting back into those habits.


Quitting smoking isn't a one-time event. Anyone with an addiction will tell you that it's a lifelong struggle.


Why make people search instead of quoting the relevant section?

"The human fasting mimicking diet (FMD) program is a plant-based diet program designed to attain fasting-like effects while providing micronutrient nourishment (vitamins, minerals, etc.) and minimize the burden of fasting. It comprises proprietary vegetable-based soups, energy bars, energy drinks, chip snacks, chamomile flower tea, and a vegetable supplement formula tablet (Table S4). The human FMD diet consists of a 5 day regimen: day 1 of the diet supplies ∼1,090 kcal (10% protein, 56% fat, 34% carbohydrate), days 2–5 are identical in formulation and provide 725 kcal (9% protein, 44% fat, 47% carbohydrate)."

"Subjects in the FMD cohort consumed the provided experimental diet consisting of 3 cycles of 5 continuous days of FMD followed by 25 days of normal food intake."


Then you're being far too generous with your interpretation


Does anyone know how frequently these are offered?


IMHO, YC tends to launch these kinds of programs around summer.

The first iteration of this was launched in July 2015 and it was called the YC Fellowship. Participation was remote, via video chat, and the grant was $12,000.

https://blog.ycombinator.com/yc-fellowship/


People frequently recommend Strang's teaching as an amazing pedagogical approach for engineers and applied mathematicians, but I find I'm frustrated every time I read his books or listen to his lectures. They don't work well for me and I've found much better alternatives


Sharing those alternatives would be a more valuable comment.


But how did their model compare against others? The article only mentions how their interpretable model compared against their own ML attempts


Their model didn't win. IBM's model won, based on actual metrics around useful insights.


The IBM team got $5,000 and the second place/honorable mention NYU got $2,000. So going by prize amounts, the Duke model was still pretty good.

IBM turned the model/paper into a toolkit: https://www.ibm.com/blogs/research/2019/08/ai-explainability... Their model seems to be a variant of decision trees that has a knob controlling how complicated the trees are.

And the evaluation was completely subjective, so there's not any meaning to the Duke people losing besides that the judges didn't like them.


> And the evaluation was completely subjective, so there's not any meaning to the Duke people losing besides that the judges didn't like them.

That's what you get if you use a black box for judging :-)


Reading "subjective" to mean "nonexistent" is a potentially big mistake.

Objectivity is more accurate, sure. The winner of an objective contest is always objectively better against objective criteria. But objective criteria are generally narrow. This works well if one is either (a) seeking fundamental principles, as in physics, or (b) pursuing a definite goal that the narrow objective criteria fully capture.

In this area, we don't exactly know how to define narrow, objective goals & subsequent criteria. We can define goalposts, but not goals. These are guesses at useful markers of success, useful to the larger goal of useful/novel AI.

Subjective goals have their own (massive) problems, but since we can't objectively define the goals of AI research... we need to fall back on human subjectivity to define our subgoals.


All objective criteria are chosen, directly or indirectly, based on subjective criteria.


Also true. Subjectivity is unavoidable as long as we are relevant... or so it seems circa 2020.

