I love bringing up Pokémon as a humbling example of how counterintuitive AI progress is. Naively we might see games like Go, Chess, StarCraft, and Dota as more "complex," yet current approaches still tend to fail at Pokémon, a game we would expect an early grade-school child to have no problem completing, because its reward function is so incredibly sparse. I hope that one day we'll get a satisfying tabula rasa solution to narrative/world-model games like Pokémon, rather than something like "well, it turns out a GPT ingested a gameplay walkthrough during training and can regurgitate it as gameplay inputs, womp womp."
To be fair, even those early-grade-school children come in with the ability to read the text, while game-playing AI usually does not. If a human tried to play Pokémon Red without being able to read anything (even the numbers!), they would probably succeed eventually, but it would be quite difficult and frustrating. And even without reading, the human would still be making inferences based on how the graphics and sound resemble real-world objects. So for an AI to play the game “properly”, it really shouldn’t be a complete tabula rasa; it should have some of that knowledge too.
> even those early-grade-school children come in with the ability to read the text, while game-playing AI usually does not.
You sure? :D No idea what a grade school is, but I couldn't read a word of English at the age I played Pokémon on the Game Boy!
I did need occasional hints (every couple of weeks) from my older cousin to get unstuck, but I still loved the game, I guess just because it's about Pokémon. I think Cut (which means both pussy and fuck in Dutch, fun fact) was one of the things I got stuck on, perhaps needing to teach it to a Pokémon, but probably how to use it. It's a number of different things to connect, so you can't button-mash your way through, iirc (it has been a few decades).
Quick anecdote: a nephew of mine, who was in third grade at the time, was playing my brother-in-law’s old Gameboy and some version of Pokémon.
He wasn’t reading the text. He would then get stuck, not knowing what to do, and start over. He did this several times and complained that the game was broken.
Some time later I gave him Pokémon Sword for his Switch, as I thought it’d be good for him. Some time after that he was diagnosed with ADD (to the surprise of no one). Some time after that he proudly told me he beat it.
He also said he didn’t like it that much. Hah. I’ll still count it as a win.
Go-Explore could probably solve this, but yeah, I wouldn't describe that as a satisfying solution yet. DRL has been in hibernation because generative AI has sucked all the oxygen out of the room, but I remain optimistic about directions like Gato working.
> Naively we might see games like Go, Chess, StarCraft, and Dota as more "complex"
The strong AIs made for those games had access to some crucial human "help" though.
* AlphaStar (for StarCraft) was trained on a massive database of human games, while this project started Pokémon Red with a blank slate.
* AlphaGo Zero and AlphaZero (for chess/shogi/Go) did start from a blank slate, but had a hand-coded, domain-specific search algorithm (MCTS, roughly the loop sketched below) to explore candidate actions (800 simulations per move, IIRC) before actually committing to one. There's no obvious way to do something similar for Pokémon Red.
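For anyone who hasn't seen it, here's a minimal sketch of that search loop. To keep it short and runnable I'm assuming plain UCT with random rollouts standing in for AlphaZero's network-guided evaluation, and a toy Nim game standing in for the real environment; all the names here are mine, not DeepMind's.

```python
import math
import random

class Nim:
    """Toy two-player game: remove 1-3 stones; whoever takes the last stone wins."""
    def __init__(self, stones=10):
        self.stones = stones
    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, move):
        return Nim(self.stones - move)
    def is_terminal(self):
        return self.stones == 0
    def result(self):
        # The player who just moved took the last stone, so from the
        # perspective of the player *to move* this is a loss.
        return -1.0

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = list(state.legal_moves())
        self.visits = 0
        self.value = 0.0  # summed results, from the perspective of the player who moved into this node

def uct_child(node, c=1.4):
    # Exploit (mean value) + explore (visit-count bonus).
    return max(node.children,
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(root_state, n_simulations=800):
    root = Node(root_state)
    for _ in range(n_simulations):
        node = root
        # 1. Selection: walk down fully expanded nodes via UCT.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried move as a new leaf.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.play(move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Rollout: random playout (AlphaZero uses a value network here instead).
        state, sign = node.state, 1.0
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
            sign = -sign
        # result() is from the terminal player-to-move's perspective; convert it
        # to the perspective of the player who moved *into* `node`.
        reward = -sign * state.result()
        # 4. Backpropagation: flip the sign at each level (two-player, zero-sum).
        while node is not None:
            node.visits += 1
            node.value += reward
            reward = -reward
            node = node.parent
    # Like AlphaZero, act with the most-visited child.
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(Nim(10)))  # optimal play here is 2 (leave the opponent a multiple of 4)
```

The point being: even the "blank slate" systems got this whole hand-built planning scaffold for free, and it only works because the game hands you a perfect simulator you can reset and replay at will, which Pokémon Red as an RL environment does not really give you in the same clean way.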
Was it a salvage title car? A year ago a clean title FRS with 100k miles would still easily pull $10k at auction, so you got a heck of a deal if it didn't come with major defects.
I'm not sure how to reconcile the purported gains with the fact that matrix multiplies are empirically the most heavily accelerated primitive [1] on current-gen hardware, and that the "digital ops" shown here aren't even a blip in the "fraction of total compute" Figure 6. Sure, they're very small in terms of FLOPs, but they take up a disproportionate amount of time because they're bandwidth-bound. Intuitively, adding another hop off-chip plus A/D or D/A conversion doesn't sound great, and I wonder if that's why this work sticks to efficiency rather than end-to-end throughput. Given that GPUs today mostly trade efficiency for clock rate and speed (think about how a single GPU can run at >300 W TDP), how much efficiency could we gain by simply inverting that tradeoff?
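To make the bandwidth-bound point concrete, here's a rough roofline-style back-of-the-envelope. The peak numbers are approximate public A100-class figures and the op shapes are just illustrative, not anything taken from the paper:

```python
# Rough roofline back-of-envelope: an op is bandwidth-bound when its arithmetic
# intensity (FLOP per byte moved) falls below the machine balance.
PEAK_FLOPS = 312e12   # ~A100 fp16 tensor-core peak, FLOP/s (approximate)
PEAK_BW    = 2.0e12   # ~A100 HBM bandwidth, bytes/s (approximate)
BALANCE    = PEAK_FLOPS / PEAK_BW   # ~156 FLOP/byte to be compute-bound

def gemm_intensity(m, k, n, bytes_per_elem=2):
    flops = 2 * m * k * n                                # multiply-accumulates
    traffic = (m * k + k * n + m * n) * bytes_per_elem   # ideal single pass over operands
    return flops / traffic

def pointwise_intensity(ops_per_elem=5, bytes_per_elem=2):
    # e.g. an activation/normalization-style op: read + write each element once
    return ops_per_elem / (2 * bytes_per_elem)

print(f"machine balance ~{BALANCE:.0f} FLOP/byte")
print(f"4096^3 GEMM     ~{gemm_intensity(4096, 4096, 4096):.0f} FLOP/byte (compute-bound)")
print(f"pointwise op    ~{pointwise_intensity():.2f} FLOP/byte (bandwidth-bound)")
```

So a big GEMM sits comfortably above the balance point while the cheap pointwise/digital ops sit orders of magnitude below it, which is exactly why their tiny FLOP count doesn't translate into tiny wall-clock time.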
I haven't read this paper yet, but I'm familiar with the general line of work. The aspect that everyone ignores is that, yes, linear transformations like matrix operations or Fourier transforms are incredibly fast in optics, but the nonlinearity is the sticking point. While optical propagation can be nonlinear, you need very high intensities for that. The elephant in the room is that the linear operations rely on parallelism, i.e. they split the optical power up into multiple paths, so each path has very low intensity and thus exhibits little nonlinearity. The solution so far has been that everyone simply uses optical-to-electrical conversion and does the nonlinearity digitally (or sometimes in analog electronics). That sort of works for one layer, but completely falls apart for multiple layers; it is neither cost- nor energy-efficient to have hundreds or possibly thousands of A/D converters.
It's interesting because of the scaling law. No matter how much acceleration matrix multiplication gets on an electronic circuit, its energy usage is always going to scale as O(n^2.something). The implication here is that the energy usage by doing it optically is O(1). At least, that's how I read "We found that the optical energy per multiply-accumulate (MAC) scales as 1/d where d is the Transformer width". The best you can hope for is to stay on the right side of the constant factors (which, currently, the GPU world is).
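One way to make that reading explicit, assuming the O(1) refers to the energy per output element of a width-d matrix-vector product (d MACs each), and with c_elec and c_opt as my own constant prefactors rather than anything from the paper:

```latex
% Energy per output element of a width-d matrix-vector product (d MACs each);
% c_elec and c_opt are illustrative constant prefactors, not the paper's notation.
E_{\mathrm{elec}} \approx d \cdot c_{\mathrm{elec}} = O(d)
\qquad
E_{\mathrm{opt}} \approx d \cdot \frac{c_{\mathrm{opt}}}{d} = c_{\mathrm{opt}} = O(1)
```

Whether that ever pays off in practice then comes down to exactly those constant factors, plus the conversion overhead mentioned in the comment above.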
What is a bit concerning is the speculation that the flight continued to the intended destination in an effort to avoid having the incident captured on the cockpit voice recorder (which apparently only has a recording window of 2 hours???!?).
If the problem was understood and corrected, I don't see any operational reason to not continue to the destination. (Same story on the AA runway incursion in NY. They knew what happened; it was corrected; they called company and continued the trip.) The CVR over-write is a side-effect of those correct decisions.
It gets exhausting seeing the same NIMBY talking points trotted out every once in a while. I guess the "new developments are just luxury housing" talking points have started to smell, so we're back to the good old "character of the neighborhood."
Looking at the "character of the neighborhood" in the South Bay, I couldn't be more thrilled that developers are putting up inoffensive boxes rather than some rehash of the "Spanish/Mediterranean revival" that seemed to dominate circa 2000-2015. I have a little more hope that these "bland boxes" won't look garishly outdated in a generation. And that's neglecting entirely the irony of "European" styling applied to single-family sprawl.
EDIT: Spoke too soon, "For many people, 5-over-1s have come to symbolize the most painful aspects of today’s housing crisis — stand-ins for gentrification, corporate landlords and excessively high rents." Do better, NYT.
The frustrating thing is that these units aren't even luxurious; they are just new and relatively expensive. Having lived in this type of construction and visited others who live in it, you consistently see that these are slapped together to look nice in photos, while the laminate floors, sinks, etc. are poorly installed and the wear shows pretty quickly.
It says a lot about the state of America that in-unit laundry and ~500 sq ft is "luxury".
And yet it's still a valid point. Some things can be trite and true; why spend energy entertaining contrarianism?
Being bold and out of character may inadvertently build character (brutalism). Movies have done the same: appeal to the masses by not being noticeably placeable.
Chicago has this many (holds up one finger) new floor plans replacing the brownstones/greystones, which were themselves ubiquitous, and it's soul-crushing. Every year I live in this city adds to the bane of noticing all the "misses" on long-term investment and diversity (housing, minimal investment in public spaces, say "bike lane" as many times as you can in one sentence).
Keep contention high by being noticeable and actually trying, not bland by contrast.
I was recently in a similar boat, as the combination of Discord + Twitch was really making my 11-year-old i5-2520M-based T420 show its age (WHY IS A DESKTOP CHAT CLIENT AN ELECTRON APP??!??!), even with 16 GiB of RAM on Arch Linux. I settled on a 16 GB X13 Gen 2 from the outlet store (which seemed like a deal at ~520 USD). Coincidentally, that's where I bought the T420 in 2011. The outlet store seems like a good way for Linux users to snag a bargain if you have a bit of patience, as I would be looking at a last-gen model anyway (which seems to be primarily what the store stocks) for better compatibility.
>WHY IS A DESKTOP CHAT CLIENT AN ELECTRON APP??!??!
Because the software development approaches/practices/tools used nearly everywhere are fundamentally flawed, driving development costs to absurdity and making it not viable to skip Electron (which lets you reuse most of the web-client code). At best you get a specialized version for phone apps, but even that is often not worth the money.
And it's not something as simple as "just use approach X" or "tool Y" or "language Z"; it's a much more complex, subtle, and deeply rooted problem across the industry, universities, and everyone else driving software development.
My particular deal seems to be gone now, but they come and go periodically. I also didn't bother with the (bait?) models at the top of the page that never seem to be in stock.