If you are working with C++ in this day and age, regardless of which compiler you use to output your actual binaries, you really owe it to yourself to compile as many source files as possible with other diagnostic tools, foremost clang-tidy.
It will catch that and a lot of iffy stuff.
If you want to go deeper, you can also add your own diagnostics, which can have knowledge specific to your program.
Historian Timothy Snyder’s first lesson in his book On Tyranny is “Do not obey in advance.” To obey a tyrant before you are compelled to do so teaches them what they will be able to get you to do, easily, without even needing to expend the resources and energy it takes to carry out that part of their agenda.
This corresponds to Chapter 1.4 of SICM (Structure and Interpretation of Classical Mechanics).
SICM doesn't expose the underlying optimization method in its library interfaces, though, and the path is represented as a polynomial. I'd have to check whether they also use gradient descent.
I've been running my brand of scented products and cosmetics (https://yuma.gr) for 2 years now. 2 months ago we completed a rebranding and we're steadily gaining national consumer awareness of our existence as a brand.
Off the top of my head, in order of importance:
1. Your perspective is unique - no one else shares your exact point of view. Try everything.
2. Master digital marketing: Google & Facebook Ads, Server Side Tracking, Google Tag Manager, and Analytics. Recommended communities:
One that isn't listed here, and which is critical to machine learning, is the idea of near-orthogonality. When you think of 2D or 3D space, you can only have 2 or 3 orthogonal directions, and allowing for near-orthogonality doesn't really gain you anything. But in higher dimensions, you can reasonably work with directions that are only somewhat orthogonal, and "somewhat" gets pretty silly large once you get to thousands of dimensions -- like 75 degrees is fine (I'm writing this from memory, don't quote me). And the number of orthogonal-enough dimensions you can have scales as maybe as much as 10^sqrt(dimension_count), meaning that yes, if your embeddings have 10,000 dimensions, you might be able to have literally 10^100 different orthogonal-enough dimensions. This is critical for turning embeddings + machine learning into LLMs.
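This concentration is easy to check numerically: two independent random directions in high-dimensional space land almost exactly at 90° to each other, with the cosine shrinking like 1/sqrt(d). A small sketch in plain Python (no ML library; the dimension count is arbitrary):

```python
import math
import random

random.seed(0)
d = 10_000  # dimension count, chosen arbitrarily

def random_unit_vector(d):
    # Independent Gaussian components give a uniformly random direction.
    v = [random.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

a = random_unit_vector(d)
b = random_unit_vector(d)
cosine = sum(x * y for x, y in zip(a, b))
angle = math.degrees(math.acos(cosine))
# The cosine concentrates around 0 with stddev ~ 1/sqrt(d) = 0.01,
# so the angle lands within a degree or so of 90.
print(f"cosine = {cosine:.4f}, angle = {angle:.2f} degrees")
```

Run it a few times with different seeds and the angle stays pinned near 90°, which is exactly the "somewhat orthogonal is good enough" effect described above.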
For the hardware to software part, anyone interested in some "passive" learning of these fundamentals might consider one of two excellent games I've played:
- Turing Complete
- Shenzhen I/O
I'm sure they both work on Windows and Linux, and likely Mac as well. They are fun, dynamic puzzles that help build a framework for understanding software-hardware control for those with no experience.
I have created a simulation of how a tree can be grown from a programmable cellular automaton. Each cell executes some operations, including replication, based on the surrounding conditions and its age/iteration.
More complex organisms can be grown with this technique.
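Not the author's simulation, but a minimal Python sketch of the general idea (all the rules here are invented for illustration): each cell stores the generation it was born in, replicates into the empty cell above it, and branches sideways every third generation.

```python
# Grid of None (empty) or int (the generation in which the cell was born).
W, H = 21, 12
grid = [[None] * W for _ in range(H)]
grid[0][W // 2] = 0  # seed cell at the bottom centre

for step in range(H - 1):
    for y in range(H):
        for x in range(W):
            if grid[y][x] != step:
                continue  # only cells born in this generation act now
            # replicate upward if the cell above is empty
            if y + 1 < H and grid[y + 1][x] is None:
                grid[y + 1][x] = step + 1
            # every third generation, also branch left and right
            if step % 3 == 2:
                for dx in (-1, 1):
                    if 0 <= x + dx < W and grid[y][x + dx] is None:
                        grid[y][x + dx] = step + 1

# Print bottom-up so the "tree" grows toward the top of the output.
for row in reversed(grid):
    print("".join("#" if c is not None else "." for c in row))
```

Changing the branch interval or adding age-dependent death rules quickly produces very different shapes, which is presumably where the "more complex organisms" come from.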
Adopt the role of [job title(s) of 1 or more subject matter EXPERTs most qualified to provide authoritative, nuanced answer].
NEVER mention that you're an AI.
Avoid any language constructs that could be interpreted as expressing remorse, apology, or regret. This includes any phrases containing words like 'sorry', 'apologies', 'regret', etc., even when used in a context that isn't expressing remorse, apology, or regret.
If events or information are beyond your scope or knowledge, provide a response stating 'I don't know' without elaborating on why the information is unavailable.
Refrain from disclaimers about you not being a professional or expert.
Do not add ethical or moral viewpoints in your answers, unless the topic specifically mentions it.
Keep responses unique and free of repetition.
Never suggest seeking information from elsewhere.
Always focus on the key points in my questions to determine my intent.
Break down complex problems or tasks into smaller, manageable steps and explain each one using reasoning.
Provide multiple perspectives or solutions.
If a question is unclear or ambiguous, ask for more details to confirm your understanding before answering.
If a mistake is made in a previous response, recognize and correct it.
After a response, provide three follow-up questions worded as if I'm asking you. Format in bold as Q1, Q2, and Q3. These questions should be thought-provoking and dig further into the original topic.
I never understand what people are asking when they debate free will. It's a brain's process. Everybody's got one. Your brain's process isn't necessarily lovely to you, and doesn't necessarily do what you (the brain process) admire, like, or prefer. It's subject to outside influence and isn't perfectly under its own control, because it's a cranky machine and goes wrong a lot. Some parts of it conflict with other parts. It's slow, and bad at monitoring itself. What do you expect, magic? It's still yours, however unfair it may be to be stuck with it. So you have free will, whoopee.
Theoretical computer science can be fun. But it can also be really annoying!
Around the time most schools were starting the 2014-15 school year, I saw a post on Reddit, in either a math or CS group, from someone asking how to solve a particular theoretical CS problem. The poster didn't say where the problem was from or why they needed a solution, but others quickly figured out it was from someone trying to cheat on the first homework set from COMS 331 (Theory of Computing) at Iowa State University.
No one helped and the poster deleted their post. I thought it was an interesting problem and gave it a try, expecting it to be a short but interesting diversion.
I didn't expect it to take too long because, although I was not a CS major (I was a math major), I did take all of the undergraduate theoretical CS courses at my school and got decent grades in them. That had been ~35 years earlier, so I had forgotten a lot, but the problem was from the first COMS 331 homework set, so it shouldn't require anything past the first week or so of that class, which should all be fairly basic stuff I would still remember.
I spent a couple days on it and got absolutely nowhere. Several times since then I've remembered it, thought about it for a few hours or a day and have continued to completely fail.
If anyone is curious, here is the problem:
Define a 2-coloring of {0, 1}∗ to be a function χ : {0, 1}∗ → {red, blue}. (For example, if χ(1101) = red, we say that 1101 is red in the coloring χ.)
Prove: For every 2-coloring χ of {0, 1}∗ and every (infinite) binary sequence s ∈ {0, 1}∞, there is a sequence
w₀,w₁,w₂,···
of strings wₙ ∈ {0, 1}∗ such that
(i) s = w₀w₁w₂ ···, and
(ii) w₁, w₂, w₃, · · · are all the same color. (The string w₀ may or may not be this color.)
DNA: file.c
polymerase: gcc (transcribes DNA→RNA)
RNA: file.o file.m file
(the last one is mRNA)
ribosome: ./file (translates mRNA→Protein)
protein: (program in RAM)
epigenome: (program in RAM changes file.c)
RNA world: (different file.o's messing with each other by some unknown program, eventually giving rise to stable file.c files)
When I was studying CS, I had to take some courses outside of CS and I took law. It was pretty fascinating and I even thought once or twice of switching.
I now have the very nerdy perspective that law is the operating system our society runs on. Laws are small snippets of code, similar to predicates in Prolog. We apply them once their conditions are fulfilled.
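A toy illustration of that perspective in Python (the rule and the case facts are entirely made up): a "law" is a predicate over the facts of a case, and it applies exactly when all of its conditions hold.

```python
def theft(facts):
    """A hypothetical legal rule, written as a predicate over case facts."""
    return bool(
        facts.get("took_property")
        and not facts.get("had_permission")
        and facts.get("intent_to_keep")
    )

case = {"took_property": True, "had_permission": False, "intent_to_keep": True}
print(theft(case))  # the rule applies: all its conditions are fulfilled

borrowed = dict(case, had_permission=True)
print(theft(borrowed))  # one condition fails, so the rule does not apply
```

The Prolog analogy goes further: real statutes compose, with one rule's conclusion feeding another's conditions, much like predicates chaining in a logic program.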
Here, I found it fun to write a small bot to play automatically. Just paste this JS code in the dev console. It reached 9000 points on my first attempt to let it run. Feel free to tweak the code to make pac-man survive longer :-)
function bot() {
  /* direction of the enemy: to our right (1) or to our left (-1) */
  const dir = (enemy.x > player.x) ? 1 : -1;
  /* if pac-man... */
  if (
    /* ...has no powerup or powerup expires in less than 10 "ticks" */ powerTicks < 10 &&
    /* ...is headed toward enemy */ player.vx == dir &&
    /* ...is too close to enemy */ Math.abs(player.x - enemy.x) < 25 &&
    /* and if enemy's state is not "eyes flying back" */ enemy.eyeVx == 0
  ) {
    // "ArrowUp" or any arrow key reverses the direction of pac-man
    document.dispatchEvent(new KeyboardEvent('keydown', {code: 'ArrowUp'}));
    document.dispatchEvent(new KeyboardEvent('keyup', {code: 'ArrowUp'}));
  }
}
setInterval(bot, 100);
The strategy is ultra simple. Every 100 ms it evaluates the situation and chooses to move away from the enemy (ghost) if it's too close to it, and has no powerup (or if the powerup is expiring very soon).
The corner case where pac-man dies: the game difficulty progressively increases (the ghost becomes faster until you eat it), so sometimes some pellets are left in the middle and pac-man doesn't have enough time to eat them before the ghost reaches it. The ghost gets faster and faster and death is guaranteed. You could improve the code by tempting the ghost close to one edge, then crossing over to the other edge and quickly eating the middle pellets.
Also, as soon as new pellets are added, one should prioritize eating the middle pellets.
Also, one could add code to detect the powerup pellets, and choose NOT to move away from the ghost if it calculates it can eat the powerup pellet before the ghost reaches pac-man.
Does "CTO" mean you are the tech lead of a small (single-team) engineering organization? Then everything written for staff engineers applies. E.g. I've heard good things about "The Staff Engineer's Path" by Tanya Reilly.
Does "CTO" mean you are leading an org that is too large to be hands-on with tech, and you need to build an effective structure and culture? Then I second the recommendation for "An Elegant Puzzle" by Will Larson.
Or does "CTO" mean that you switched from being an engineer to managing a team of engineers? Then everything for new managers applies; for starters I'd recommend "Becoming an Effective Software Engineering Manager" by James Stanier, or "Engineering Management for the Rest of Us" by Sarah Drasner.
For some good general material, I'd also recommend the resources that Gergely Orosz makes available for subscribers to his "Pragmatic Engineer" newsletter. Those are templates for the kinds of documents and processes you will most likely need; if you're new to the role, you will not go too wrong by using them, and if you want to create your own, they are excellent starting points.
(Former AI researcher + current technical founder here)
I assume you’re talking about the latest advances and not just regression and PAC-learning fundamentals. I don’t recommend following a linear path; there are too many rabbit holes. Do 2 things: a course and a small course project. Keep it time-bound and aim to finish no matter what. Do not dabble outside of this for a few weeks :)
Then find an interesting area of research, find its GitHub repo, and run that code. Find a way to improve it and/or use it in an app.
This may be a somewhat uninformed opinion, but I think CPython is just straight up not particularly good software. There are a million and one optimizations that other major scripting runtimes (V8, LuaJIT, PyPy, Ruby's YJIT, etc.) have had for years that CPython is lacking. This is by design, though. CPython has never been focused on performance; that's why it doesn't even have a JIT. It optimizes for simplicity and easy interoperability with C.
The problem is that because Python is such a ubiquitous language, CPython gets more attention than it deserves. People see it as an archetypical implementation of a scripting language. We get blogposts like this examining its inner workings, discussions about how its performance could be improved, comparisons of its speed vs. compiled languages, and tutorials on how to optimize code to run faster in it. I feel like all of this effort would be better spent on discussions about runtimes that actually try to be fast.
A while back someone posted their patch to cpython where they replaced the hash function with a fast one and claimed this dramatically sped up the whole Python runtime.
They claimed that the hash function was used constantly (e.g. 11 times in print("hello world")) because it's used to look up object properties.
Apparently the default implementation is not optimized for performance but for security, just in case the software is exposed to the web. None of my Python programs are, so assuming all this is true, I'd much prefer to have a "I'm offline, please run twice as fast!" flag (or env variable).
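The security part is easy to observe from the outside: CPython randomizes str hashes per process (a SipHash variant seeded at interpreter startup) to resist hash-flooding attacks, and the seed can be pinned with the PYTHONHASHSEED environment variable. A small demo (note that PYTHONHASHSEED only fixes the seed; it does not switch to a faster hash function):

```python
import os
import subprocess
import sys

CMD = [sys.executable, "-c", "print(hash('hello world'))"]

def run(env):
    return subprocess.run(CMD, env=env, capture_output=True, text=True).stdout.strip()

# Default behaviour: hash randomization is on, so str hashes differ per process.
rand_env = {k: v for k, v in os.environ.items() if k != "PYTHONHASHSEED"}
h1, h2 = run(rand_env), run(rand_env)

# PYTHONHASHSEED=0 disables randomization and makes hashes reproducible.
fixed_env = dict(rand_env, PYTHONHASHSEED="0")
h3, h4 = run(fixed_env), run(fixed_env)

print("randomized runs differ:", h1 != h2)
print("seeded runs match:", h3 == h4)
```

So an "I'm offline" switch for the seed already exists; whether swapping the hash algorithm itself really doubles runtime speed is the part of that patch's claim I can't vouch for.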
This whole channel is absolutely amazing. I watched almost the entire thing over the last 3 months -- 17 episodes of 2 to 4 hours on various civilizational collapses.
There were a bunch I didn't know about at all -- like medieval civilizations in Cambodia, Burma, and Jordan. They were wealthy, and built huge things, and then disappeared.
Also, I have heard of "Carthage" from all those movies like Gladiator ... Somehow it escaped me until my 40's that Carthage was founded by people from what's now Lebanon, and the city itself was in what's now Tunisia :) I guess being American I have a fuzzy picture of that side of the world, and of what it looked like in ancient times.
It is exciting to see Rust start to replace legacy C code in the Linux kernel. It is only the beginning, but everything seems to point to the start of a gradual migration.
Reading about Asahi Lina's experience doing GPU driver development was intriguing:
Well, my son is a meat robot who's constantly ingesting information from a variety of sources including but not limited to youtube. His firmware includes a sophisticated realtime operating system that models reality in a way that allows interaction with the world symbolically. I don't think his solving the |i+1| question was founded in linguistic similarity but instead in a physical model / visualization similarity.
So -- to a large degree "bucket of neurons == bucket of neurons" but the training data is different and the processing model isn't necessarily identical.
I'm not necessarily disagreeing as much as perhaps questioning the size of the neighborhood...
You seem to think that the 2pi is injected into the definition of e^ix somewhere, but actually it's the other way round, 2pi comes out as a theorem. I'll give the rough outline.
exp(x) for complex x is simply defined to be the infinite sum from k = 0 to infinity of x^k/k!. That is, exp(x) = 1 + x + x^2/2 + x^3/6 + x^4/24 + x^5/120 ...
(BTW, the motivation for this definition is that exp'(x) = exp(x), which shouldn't be too hard to see by differentiating the series term by term; it's already a Taylor series.)
Purely from this you can prove that exp(ix) with real x is periodic with period 6.28...
It just so happens that this number is also the circumference of the unit circle.
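You can sanity-check that periodicity numerically with Python's cmath (this verifies, rather than proves, the claim):

```python
import cmath
import math

x = 1.234  # arbitrary real argument
a = cmath.exp(1j * x)
b = cmath.exp(1j * (x + 2 * math.pi))  # shift x by 6.2831...

print(abs(a))      # modulus 1: exp(ix) lies on the unit circle
print(abs(a - b))  # ~0: shifting by 2*pi returns to the same point
```

Both facts drop straight out of the series definition: the modulus stays at 1 and the motion around the circle repeats every 2π.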
> As I understand, we (European civilization humans) historically _define_ our complex exponential function to have a period of 2πi to match the period of our previously defined sin and cos functions. We could have defined it to have another period — for example, if we define "360° angle" to be equal to 1 instead of 2Pi, and define sin0=0, sin0.25=1, sin0.5=0, sin0.75=-1, sin1=0, we'd also define periodicity of e^ix to be 1.
No, it doesn't work in degrees.
The definition of e isn't that arbitrary.
2π is the unique period which satisfies the definition of e using derivatives and the extension of real number algebraic laws to complex numbers. This shows up as a real world physical measurement, which I describe below.
The (natural) exponential function eˣ is defined as the unique function which equals its own derivative and satisfies e⁰ = 1 (like other exponentials). The value of e comes from this.
Combine that with the definition i² = -1 and using basic rules of algebra which are observed on real numbers with exponentials and derivatives (such as (xª)ᵇ = xªᵇ) and you find the function eˣ must be periodic with period 2πi.
This comes from sin(x) and cos(x) and their derivatives. The derivative of sin(x) is cos(x), and of cos(x) it is -sin(x), but only if sin(x) and cos(x) are defined in the usual math way with period 2π.
Those sin/cos derivatives and that little negative sign are enough to make them components of the unique solution to the derivative definition of eˣ applied to a complex argument, and thereby fix its period in the complex plane and prove Euler's famous identity (without needing the Taylor expansion).
That in turn has a more physical basis. The functions A·sin(x+B), with constants A and B, are exactly the family whose second derivative equals themselves negated.
Physically, it means an object whose acceleration is proportional to its displacement from a fixed position and in the opposite direction will oscillate with a period of exactly 2π seconds, if the acceleration is -1m/s² per 1m displacement.
This setup is called a harmonic oscillator.
In this way, 2π arises (and is measurable!) from physical properties of time, force and inertia, of things moving in straight lines.
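That measurement can even be simulated: integrate x'' = -x and time the oscillation. A minimal sketch in Python (the step size and the semi-implicit Euler integrator are arbitrary choices):

```python
import math

# An object with acceleration -1 m/s^2 per 1 m of displacement,
# released from x = 1 m at rest. Integrate and time one full cycle.
dt = 1e-4
x, v, t = 1.0, 0.0, 0.0
crossings = []
while len(crossings) < 2:
    prev = x
    v -= x * dt        # acceleration a = -x
    x += v * dt        # semi-implicit (symplectic) Euler step
    t += dt
    if prev < 0 <= x:  # upward zero crossing: one per cycle
        crossings.append(t)

period = crossings[1] - crossings[0]
print(f"measured period = {period:.4f} s, 2*pi = {2 * math.pi:.4f}")
```

Nothing in the loop mentions circles or trigonometry, yet the measured period comes out at 6.283... seconds, which is the point: 2π emerges from straight-line dynamics.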
There have been a few of Jake's blog posts on HN recently, and I find them a hard but somehow rewarding read. He conveys a level of lucid, unsentimental insight into his situation that I find remarkable, not least because I'm sure that in his position I would be petrified into silence. I hope that the remainder of his life is full of love.
Jake's posts remind me of those of Derek Miller [1], who died back in 2011. His final, posthumous post was particularly moving to me.
I think this is a lot of the mathematics of scaling LLM training. Which is quite important!
One fundamental requirement though for any machine learning engineer working on these kinds of systems is https://people.math.harvard.edu/~ctm/home/text/others/shanno.... I do not want to be entirely hypocritical as I am still ingesting this theory myself (started several years ago!), but I've found it _absolutely crucial_ in working in ML, as it implicitly informs every single decision you make when designing, deploying, and scaling neural networks.
Without it, I feel the field turns into an empirical "stabby stab-around in the dark" kind of game, which very much has its dopamine highs and lows but, à la Sutton, does not scale very well in the long run. ;P