FWIW, I used to use a light and sound machine (Mindplace Procyon) and was able to induce these states with minimal effort. And I had a couple dozen experiences w/ psilocybin in my college years, so I'm well versed in what they should be like.
The goggles w/ binaural beats create some weird sort of state where I don't feel any connection to my environment. After only a couple minutes my body turns to total mush and my brain comes alive with phosphene visuals. By about 15 minutes in, my stomach usually gurgles a bit, not unlike the indigestion that often accompanies psychedelic trips.
Interestingly enough, these machines are marketed as brainwave entrainment, but the literature on that says the visual component doesn't really have much impact. Yet auditory entrainment on its own doesn't seem to do much for me either, or at least nothing convincing beyond placebo.
There is an app for the iPhone called Lumenate that uses the LED flash and it seems to work, though it's not as strong for me as the multi-LED goggles I used to use. Still, it's a great gateway for those who are curious.
I work at Google on these systems every day (caveat: these are my own words, not my employer's). So I can simultaneously tell you that it's smart people really thinking about every facet of the problem, and that I can't tell you much more than that.
However, I can share this, written by my colleagues! You'll find great explanations of accelerator architectures and the considerations made to make things fast.
Edit:
Another great resource to look at is the Unsloth guides. These folks are incredibly good at digging deep into various models and finding optimizations, and they're very good at writing it up. Here's the Gemma 3n guide, and you'll find others as well.
It's a breath of fresh air watching anime adaptations that actually love and respect the original author's work, like Dungeon Meshi.
Meanwhile, Western fantasy adaptations seem to be full of arrogant showrunners who think their vision is better: Wheel of Time, The Witcher, House of the Dragon (to an extent). George R. R. Martin even publicly criticised HBO for this recently.
I am absolutely blown away by the number of people who are saying that this person, who is clearly depressed and likely burnt out, can solve all of his problems by having children. It is the most insane advice I have ever seen on this site.
Jumping to a lifetime commitment as a kneejerk reaction is just so wild to me. Maybe start with a hobby? This guy is working a full time job and a side hustle, but doesn't seem to do anything for himself. What happens if he has kids and realizes he's still unfulfilled? It's not like he can just return them.
The original[1][2] articles are a better read IMO. The link is just a summary of the two with added spelling and grammatical errors that materially impact the meaning.
Surprised no one has mentioned another great and similar resource called Rustlings [0] (yes, very punny name). You are given some files with todo statements, which you'll need to fix so the code compiles and passes all the tests. It's an interactive way to learn, which is what got me through learning Rust a few years ago.
There's a reason pretty much everything that does not require low-latency replies avoids stateful networking - everything from RSS to video streaming prefers stateless polling designs because it is vastly easier for both parties to implement and scale. Meanwhile, I couldn't name a single system in widespread use built around an MQ paradigm in its public interface, except for actual MQ APIs, and many of those (e.g. from AWS) are still built on polling for the reasons just described.
Maybe this is a good time to mention the Web Archives browser extension [1], which offers links to various cache / archive providers for any page you visit, via a toolbar button. There are many such extensions; this is the one I've been using occasionally. Simple but very useful.
I haven't tried since the beginning of the Reddit strike, though; I don't use Google, and I only very occasionally run into Reddit pages. I know ArchiveTeam has asked for help archiving Reddit [2].
Very interesting! I've been reading up on nutrition recently and saw articles talking about this ratio, but they were focused on inflammation, not on body composition or hunger.
tmjdev wrote:
> If it was the ratio being studied, could you instead increase your n3 fat consumption?
Some articles suggest exactly that: "To improve the ratio of omega-3 fats to omega-6 fats, eat more omega-3s, not fewer omega-6s." [1]
If you're looking for sources other than fish, consider chia. Two tablespoons of chia give 4.3 g of omega-3 fat, as well as 10 g of fiber for no net carbs. [2] Don't eat chia seeds dry; [3, 4] mix them into 1.25 cups of water and leave them for at least 30 minutes. I also combine them with oats. (I used to eat 0.5 cups of oats every morning. When trying to lower carbs, I went down to 0.25 cups of oats plus the chia. The omega-3 fat was a nice surprise.)
On the other hand, minimizing processed and restaurant foods is probably a good idea in general. And of course if your goal is to lose weight, removing something from your diet makes more intuitive sense than adding something.
> Short positions by the means of shorting ETFs containing GME are reported as ETF SI (and XRT, by the way, has 700% SI)
XRT only holds 0.71% GME shares, so it seems like an extremely inefficient way of shorting it. The entire XRT short position is around 170k GME shares, or about 0.4% of the float.
> Synthetic short positions (e.g. by the means of options) do not have to be reported and FINRA is only now 'considering' asking to report it
But put options don't get squeezed (call options may cause a gamma squeeze)
> Retail directly registered over 10% of available float (excl. insider shares) as of 3 months ago
What relevance does that have? There's still 90% of the float available to be lent with a short interest of just 15%
I still hold some GME shares left over from last year but fully believe the short squeeze has happened and there's nothing but a collective delusion left.
This TED talk has a great example where the same sound sounds different to you based on what you expect to hear, if you need to prove this to yourself [1]. His thesis is also that the brain is a prediction machine.
>The mind has a basic habit, which is to create things. In fact, when the Buddha describes causality, how experiences come about, he says that the power of creation or sankhara—the mental tendency to put things together—actually comes prior to our sensory experience. It’s because the mind is active, actively putting things together, that it knows things.
>The problem is that most of its actions, most of its creations, come out of ignorance, so the kind of knowledge that comes from those creations can be misleading.
The second paragraph is getting off the topic - or is it?
alterF :: (Functor f, Ord k) => (Maybe a -> f (Maybe a)) -> k -> Map k a -> f (Map k a)
Note that it works for all instances of Functor. Trivial choices allow this to reduce to simple things like insert or lookup:
insert :: Ord k => k -> a -> Map k a -> Map k a
insert key val map = runIdentity (alterF (\_ -> Identity (Just val)) key map)
lookup :: Ord k => k -> Map k a -> Maybe a
lookup key map = getConst (alterF Const key map)
But those aren't really compelling examples because they just reproduce simpler functionality. It starts to pay off when you start compounding requirements, though. What if you need to insert a value and return the previous one, if it existed?
insertAndReturnOld :: Ord k => k -> a -> Map k a -> (Maybe a, Map k a)
insertAndReturnOld key val map = alterF (\old -> (old, Just val)) key map
OK, those are all fine and good, but they're still barely scratching the surface. What if you had a problem where, when you had a value to insert and there was a previous value at the same key, it was ambiguous which you should use? And let's say the structure of the problem provides interdependencies which restrict the options such that you can't just stack up a list of every possibility for each key. So ideally, you'd like a way to model "insert this value, but if something was already present at this key, give me both possible maps back". Turns out alterF can do that!
insertNonDet :: Ord k => k -> a -> Map k a -> [Map k a]
insertNonDet key val map = alterF (maybe [Just val] (\old -> [Just old, Just val])) key map
Fun facts with that one - thanks to using Functor, it only traverses the tree once, whether it returns one or two results. And thanks to Map being a persistent data type, returning two results only takes O(log n) space more than returning one.
(note: this is all typed on my phone without a compiler to verify. There might be simple mistakes in the above, but it's all conceptually sound.)
That's all still just the start. You can insert an IO operation on the old value to calculate the new one, and it still only traverses the data structure once. Or a huge number of other things. Anything that can be made into a Functor can be used to augment the operation alterF does.
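For instance, here's a minimal sketch of that IO case (the name insertWithIO is mine, and I'm assuming the alterF exported by Data.Map in containers):

import Data.Map (Map, alterF)

-- Compute the replacement for the old value in IO (e.g. by asking a
-- database or prompting the user); the map is still traversed only once.
insertWithIO :: Ord k => (Maybe a -> IO a) -> k -> Map k a -> IO (Map k a)
insertWithIO compute key map = alterF (fmap Just . compute) key map

Because IO is just another Functor, alterF itself needed nothing extra.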
And you know the best part of all this? Despite all that freedom, the type of alterF tells you that you can't change the key associated with a value and even that you can't operate on more than one value in the map. It really is nice when simple things give you lots of options, but make it clear what they don't offer.
If anyone is struggling with sleep, I highly recommend episodes 1-3 of the Huberman Lab podcast (https://www.youtube.com/watch?v=H-XfCl-HpRM); it has literally changed my life. Dr. Huberman is a neurobiologist at Stanford and explains the science of sleep rather than just telling you not to look at bright screens at night.
I struggled with sleeping for 20+ years, now I easily wake up early.
The biggest game changer: Sunlight in your eyes within 30 minutes of waking up. Try morning walks for 1 week (without sunglasses) and your life will be changed.
I haven't done much reading on this subject in a while, but I recall from Gary Taubes' Good Calories, Bad Calories that the issue is not that CICO is somehow flawed, but rather that there are meta-metabolic effects (if that makes sense) from the food we eat that end up changing the factors in the CICO equation. In other words, none of the variables are static, and they in fact depend upon the nature of the calories eaten.
The canonical example: because high levels of glucose in the blood can be fatally toxic, the body must spike insulin levels after a (simple) carbohydrate-rich meal in order to get rid of it quickly. One effect of insulin is an increase in fat synthesis and storage of triglycerides in fat cells. A diet rich in simple carbohydrates can result in chronically elevated insulin levels, which means that the body is constantly stashing calories away in fat cells.
The point is that the "caloric equation" that you refer to is a grossly oversimplified model of human metabolism that has led to what might be called Goodhart's Law of dietary advice.
My experience with out-of-memory conditions is that in every single language and environment I've worked in, once an application hits that condition, there is very little hope of continuing reliably.
So if your aim is to build a reliable system, it is much easier to plan to never get there in the first place.
Alternatively, make your application restart after OOM.
I would actually prefer the application to just stop immediately without unwinding anything. It makes it much clearer what possible states the application could have gotten itself into.
Hopefully you have already designed the application to move atomically between known states and have a mechanism to handle any operation getting interrupted.
If you did it right, handling OOM by having the application drop dead should be viewed as just exercising these mechanisms.
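To make that concrete, here is a minimal sketch (mine, not the poster's; a production version would also fsync) of one classic way to get atomic moves between known states: write the new state to a temporary file, then rename it over the old one.

import System.Directory (renameFile)

-- If the process dies at any point (an OOM kill included), the state
-- file on disk is either the complete old version or the complete new
-- one, never a partial write, because a same-filesystem rename is atomic.
saveState :: FilePath -> String -> IO ()
saveState path newState = do
  let tmp = path ++ ".tmp"
  writeFile tmp newState
  renameFile tmp path

A process that drops dead on OOM then recovers by simply reading whatever state file is on disk at startup.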
My biggest take-away from this was FRACTRAN[1], The Bestest Ever™ programming language designed by the (sadly, late) John Conway.
To run a FRACTRAN program, you look up its catalogue number, and repeatedly evaluate a certain simple function on it (which has the same spirit as the 3x + 1 one in the video).
As in 3x+1, all operations are integer operations.
FRACTRAN is Turing-complete, of course, so you can rewrite any program in FRACTRAN, and the paper provides quite a few examples!
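If you want to see what "repeatedly evaluate a simple function" looks like in practice, here is a minimal sketch of the usual fraction-list formulation (my code, not the paper's catalogue-number presentation); the single-fraction adder at the end is in the spirit of the paper's warm-up examples:

-- A FRACTRAN program is a list of fractions, stored as (numerator, denominator).
type Program = [(Integer, Integer)]

-- One step: multiply n by the first fraction that gives an integer result;
-- if no fraction does, the program halts (Nothing).
step :: Program -> Integer -> Maybe Integer
step prog n = case [ (n * p) `div` q | (p, q) <- prog, (n * p) `mod` q == 0 ] of
  []      -> Nothing
  (m : _) -> Just m

-- Run from a starting value, collecting every intermediate state.
run :: Program -> Integer -> [Integer]
run prog n = n : maybe [] (run prog) (step prog n)

-- The one-fraction program [(3, 2)] adds exponents: starting from
-- 2^a * 3^b it halts at 3^(a+b), e.g. last (run [(3, 2)] (2^3 * 3^4)) == 3^7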
Written as a sales pitch, this is both the most hilarious paper I've read in a long time - and one of the most mind-blowing ones.
It's pretty accessible (as far as math/CS papers go), too!
I do 5/3/1 and it does have its tradeoffs, but I've seen gains while on it. Basically the goal is to make you stronger; it was developed by a powerlifter and is heavily influenced by traditional American football training. I've linked a PDF below[1] which explains the whole thing in language anyone can understand.
Core lifts: Bench press, Squat, Deadlift, Overhead Press. Some people choose a different set of lifts.
The 531 thing means week one you do 3 sets of 5 on the core lift, week two you do 3 sets of 3, week three you do a set of 5, a set of 3, and a set of 1. Week four is deload, you do lighter weight for three sets of five. In all cases (except deload) the last set is actually for "AMRAP" i.e. as many reps as possible.
Ultimately, weightlifting is not a modern science; it is an ancient practice akin to meditation or running or martial arts. There is ongoing research to optimize it, but nobody here is going to the league, and for us the most important thing is to show up consistently and track progress. The most impactful thing I ever did for my lifting was to create a spreadsheet I could update from my phone and write down how much I lifted and how many reps every time I went to the gym. I do something like this:
I'm proud to say I reached the end of my google sheet and had to start a new one. I am fortunate I was exposed to weightlifting early in life but after neglecting my training for most of my twenties (I'm 32 now), most of my current gains happened with 531. I hope you will start lifting! The benefits weight training has brought to my life can hardly be overstated.
Oh, and stay away from Planet Fitness; that's not a gym[2]. Their business model is based on appealing to people who don't work out. If you want to work out, go somewhere else.
It's definitely worth reading the full thread if you haven't.
> Did you know you can just fork() the @rustlang compiler from your proc_macro? :D
> This is, of course, a horrible idea.
> Unfortunately rustc is multithreaded and fork() only forks one thread. We'll just assume the other threads weren't doing anything important.
> In today's rustc, this means we're only missing the main thread in the fork. But that one was only going to call exit() after joining our thread anyway, so not very important. It just means the process always exits with a success status. So instead we grep the output for "error".
> All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, potentially impacting performance on shorter texts. We advise adding the rope_scaling configuration only when processing long contexts is required. It is also recommended to modify the factor as needed. For example, if the typical context length for your application is 524,288 tokens, it would be better to set factor as 2.0.
https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking