I have ghostty set up with this “starfield” shader: https://github.com/0xhckr/ghostty-shaders/blob/main/starfiel...

I also have it set up to do adaptive theme, so in light mode the galaxy is mostly just a little noise on the black text but in dark mode it’s like I’m piloting a space ship. Highly recommend.

I also documented a few other shaders on my blog here: https://catskull.net/fun-with-ghostty-shaders.html

Edit: I use the "starfield" shader, not the "galaxy" shader. Doh!


Our agentic builder has a single tool.

It is called graphql.

The agent writes a query and executes it. If the agent does not know how to do a particular type of query, it can use graphql introspection. The agent only receives the minimal amount of data specified by the graphql query, saving valuable tokens.

It works better!

Not only do we avoid loading 50+ tools (our entire SDK), but it also solves the N+1 problem you run into with traditional REST APIs. Also, you don't need to fall back to writing code, especially for queries and mutations. But if you do need to, the SDK is always available and follows the graphql typed schema - which helps agents write better code!
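
For a concrete picture, the whole tool surface is roughly this (a minimal sketch in Python with just the stdlib; the endpoint is a hypothetical placeholder and real auth is omitted):

  import json, urllib.request

  GRAPHQL_ENDPOINT = "https://api.example.com/graphql"  # hypothetical endpoint

  def graphql(query: str, variables: dict | None = None) -> dict:
      # The one and only tool the agent gets: run a GraphQL document with
      # optional variables and return exactly the fields it asked for.
      payload = json.dumps({"query": query, "variables": variables or {}}).encode()
      req = urllib.request.Request(
          GRAPHQL_ENDPOINT,
          data=payload,
          headers={"Content-Type": "application/json"},
      )
      with urllib.request.urlopen(req) as resp:
          return json.loads(resp.read())

  # When the agent doesn't know the schema, it calls the same tool with an
  # introspection query instead of loading dozens of per-endpoint tools.
  INTROSPECTION = "{ __schema { queryType { fields { name description } } } }"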

While I was never a big fan of graphql before, considering the state of MCP, I strongly believe it is one of the best technologies for AI agents.

I wrote more about this here if you are interested: https://chatbotkit.com/reflections/why-graphql-beats-mcp-for...


I’ve been using (and loving) https://github.com/raine/workmux which brings together tmux, git worktrees, and CLI agents into an opinionated workflow.

In case this can be helpful to somebody else, I spent my ~twenties ignorant of what attachment styles were, while definitely exhibiting some very, very obvious attachment patterns. And I made a lot of mistakes, and made a lot of people close to me sad.

Reading the "Attached" book was a huge wake-up call. According to the questionnaire, for what it's worth, I was exhibiting ~100% avoidant behavior.

This led to therapy, and to a lot of atonement, and growth.

I just came here to say - if you have a minute, give it a read. And for fun, try the questionnaire:

https://archive.org/details/AttachementTheory/page/n37/mode/...

Best of luck


Mmh, ime you need to discard the session/rewrite the failing prompt instead of continuing and correcting on failures. Once errors occur you've basically introduced a poison pill which will continuously make things go haywire. Spelling out what it did wrong is the most destructive thing you can do - at least in my experience.

Good read minimaxir! From the article:

> Nano Banana supports a context window of 32,768 tokens: orders of magnitude above T5’s 512 tokens and CLIP’s 77 tokens.

In my pipeline for generating highly complicated images (particularly comics [1]), I take advantage of this by sticking a Mistral 7b LLM in-between that takes a given prompt as an input and creates 4 variations of it before sending them all out.
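
Roughly, that in-between step looks like this (a sketch assuming an OpenAI-compatible server hosting Mistral 7B locally; the base URL and model name are placeholders, not my actual setup):

  from openai import OpenAI

  # Any OpenAI-compatible local server works here (llama.cpp, vLLM, Ollama, ...).
  client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

  def vary_prompt(prompt: str, n: int = 4) -> list[str]:
      # Ask the small LLM to rewrite the image prompt n times; each variation
      # is then sent to the image model as its own generation request.
      out = client.chat.completions.create(
          model="mistral-7b-instruct",  # placeholder model name
          messages=[
              {"role": "system", "content": "Rewrite the user's image prompt. "
               "Return one rewritten prompt per line, nothing else."},
              {"role": "user", "content": f"Give {n} variations of: {prompt}"},
          ],
      )
      lines = out.choices[0].message.content.splitlines()
      return [l.strip() for l in lines if l.strip()][:n]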

> Surprisingly, Nano Banana is terrible at style transfer even with prompt engineering shenanigans, which is not the case with any other modern image editing model.

This is true - though I find it works better by providing a minimum of two images. The first image is intended to be transformed, and the second image is used as "stylistic aesthetic reference". This doesn't always work since you're still bound by the original training data, but it is sometimes more effective than attempting to type out a long flavor text description of the style.
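
For what it's worth, the two-image setup looks roughly like this with the google-genai SDK (just a sketch - the model id and the prompt wording are assumptions on my part, not something from the article):

  from google import genai
  from PIL import Image

  client = genai.Client()  # reads GEMINI_API_KEY from the environment

  content = Image.open("photo.png")          # the image to be transformed
  style = Image.open("style_reference.png")  # the "stylistic aesthetic reference"

  resp = client.models.generate_content(
      model="gemini-2.5-flash-image",  # assumed model id for Nano Banana
      contents=[
          "Redraw the first image in the style of the second image. "
          "Keep the composition of the first image.",
          content,
          style,
      ],
  )

  # Save the first returned image part, if any.
  for part in resp.candidates[0].content.parts:
      if part.inline_data is not None:
          with open("styled.png", "wb") as f:
              f.write(part.inline_data.data)
          break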

[1] - https://mordenstar.com/portfolio/zeno-paradox


SAM3 is cool - you can already do this more interactively on chat.vlm.run [1], and do much more. It's built on our new Orion [2] model; we've been able to integrate with SAM and several other computer-vision models in a truly composable manner. Video segmentation and tracking is also coming soon!

[1] https://chat.vlm.run

[2] https://vlm.run/orion


The idea behind search itself is very simple, and it's a fun problem domain that I encourage anyone to explore[1].

The difficulties in search are almost entirely dealing with the large amounts of data, both logistically and in handling underspecified queries.

A DBMS-backed approach breaks down surprisingly fast. Probably perfectly fine if you're indexing your own website, but will likely choke on something the size of English wikipedia.
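
To show how simple the core idea really is, here's a toy inverted index in a few lines of Python (no ranking, no stemming, just term -> posting list):

  from collections import defaultdict

  docs = {
      1: "the quick brown fox",
      2: "the lazy brown dog",
      3: "quick quick slow",
  }

  # Inverted index: term -> set of document ids containing that term.
  index = defaultdict(set)
  for doc_id, text in docs.items():
      for term in text.lower().split():
          index[term].add(doc_id)

  def search(query: str) -> set[int]:
      # AND semantics: return docs containing every query term.
      terms = query.lower().split()
      if not terms:
          return set()
      result = index[terms[0]].copy()
      for term in terms[1:]:
          result &= index[term]
      return result

  print(search("quick brown"))  # {1}

Everything hard in real search - scale, ranking, understanding underspecified queries - sits on top of this.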

[1] The SEIRiP e-book (Search Engines: Information Retrieval in Practice) is a good (free) starting point: https://ciir.cs.umass.edu/irbook/


Reminds me of reading Programming Collective Intelligence by Toby Segaran, which inspired me with a range of things, like building search, recommenders, classifiers etc.

> doing the extra digging instead of just going along with the claim.

That's the intention of intermediary liability laws - to make meritless censorship the easy, no-risk way out. To deputize corporations to act as police under a guilty-until-proven-innocent framework.


    People defer thinking about what correct and incorrect actually
    looks like for a whole wide scope of scenarios and instead choose
    to discover through trial and error.
LLMs are _still_ terrible at deriving even the simplest logical entailments. I've had the latest and greatest Claude and GPT derive 'B instead of '(not B) from '(and A (not B)) when 'A and 'B are anything but the simplest of English sentences.

I shudder to think what they decide the correct interpretation of a spec written in prose is.
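
The frustrating part is that the check is mechanical - a brute-force truth table settles it (sketch in Python):

  from itertools import product

  def entails(premise, conclusion):
      # premise/conclusion are functions of (a, b); check every assignment
      # that makes the premise true.
      return all(conclusion(a, b)
                 for a, b in product([True, False], repeat=2)
                 if premise(a, b))

  premise = lambda a, b: a and not b           # (and A (not B))
  print(entails(premise, lambda a, b: not b))  # True:  entails (not B)
  print(entails(premise, lambda a, b: b))      # False: does not entail B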


True! I've been doing this for years on Linux. I use a dedicated Chromium instance in app mode:

  /usr/bin/chromium --ozone-platform=wayland --enable-features=UseOzonePlatform,WaylandWindowDecorations,WebRTCPipeWireCapturer --user-data-dir=/home/myuser/.config/chromium-ilri --app=https://teams.microsoft.com
Works incredibly well (put this in a `.desktop` file with `Exec=` and you can launch it via your desktop's launcher). Some of the settings may not be needed anymore, as Chromium has come a long way in terms of Wayland support. I use Firefox for everything else, but haven't tried Teams there.
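
For reference, the `.desktop` file is just a few lines (name and paths are whatever you prefer; drop it in ~/.local/share/applications/):

  [Desktop Entry]
  Type=Application
  Name=Teams (Chromium app)
  Exec=/usr/bin/chromium --ozone-platform=wayland --enable-features=UseOzonePlatform,WaylandWindowDecorations,WebRTCPipeWireCapturer --user-data-dir=/home/myuser/.config/chromium-ilri --app=https://teams.microsoft.com
  Terminal=false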

@simonw: slight tangent but super curious how you managed to generate the preview of that gemini-cli terminal session gist - https://gistpreview.github.io/?17290c1024b0ef7df06e9faa4cb37...

is this just a manual copy/paste into a gist with some html css styling; or do you have a custom tool à la amp-code that does this more easily?


We've done this, and it works. Our setup is to have some agents that synthesize Prolog and other types of symbolic and/or probabilistic models. We then use these models to increase our confidence in LLM reasoning and iterate if there is some mismatch. Making synthesis work reliably on a massive set of queries is tricky, though.
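
As a toy illustration of the shape of that loop (not our actual system - the facts and the drug example are invented, and pyswip needs SWI-Prolog installed):

  from pyswip import Prolog

  prolog = Prolog()
  # Pretend these facts were synthesized by an agent from the case evidence.
  prolog.assertz("interacts(warfarin, aspirin)")
  prolog.assertz("takes(patient1, aspirin)")

  # What the symbolic model actually derives...
  derived = bool(list(prolog.query("takes(patient1, D), interacts(warfarin, D)")))

  # ...compared against the LLM's free-text conclusion; on mismatch, iterate.
  llm_claim = True  # e.g. parsed from "warfarin is risky for patient1"
  print("agree" if derived == llm_claim else "mismatch - ask the LLM to retry")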

Imagine a medical doctor or a lawyer. At the end of the day, their entire reasoning process can be abstracted into some probabilistic logic program which they synthesize on-the-fly using prior knowledge, access to their domain-specific literature, and observed case evidence.

There is a growing body of publications exploring various aspects of synthesis, e.g. references included in [1] are a good starting point.

[1] https://proceedings.neurips.cc/paper_files/paper/2024/file/8...


This is a philosophical argument.

The way to look at this is first to pin down what we mean when we say Human Commonsense Reasoning (https://en.wikipedia.org/wiki/Commonsense_reasoning). Obviously this is quite nebulous and cannot be defined precisely, but OG AI researchers have done a lot to identify and formalize subsets of Human Reasoning so that it can be automated by languages/machines.

See the section Successes in automated commonsense reasoning in the above wikipedia page - https://en.wikipedia.org/wiki/Commonsense_reasoning#Successe...

Prolog implements a language that can interpret logic only within the formalized subset of human reasoning mentioned above. Now note that all our scientific advances have come from our ability to formalize, and thus automate, what were previously only heuristics. Thus if I were to move more real-world heuristics (which is what a lot of human reasoning consists of) into some formal model, then Prolog (or, say, LLMs) can be made to reason about it better.

See the paper Commonsense Reasoning in Prolog for some approaches - https://dl.acm.org/doi/10.1145/322917.322939

Note, however, how beautifully the paper puts it at the end:

Prolog itself is all form and no content and contains no knowledge. All the tasks, such as choosing a vocabulary of symbols to represent concepts and formulating appropriate sentences to represent knowledge, are left to the users and are obviously domain-dependent. ... For each particular application, it will be necessary to provide some domain-dependent information to guide the program writing. This is true for any formal languages. Knowledge is power. Any formalism provides us with no help in identifying the right concepts and knowledge in the first place.

So Real-World Knowledge encoded into a formalism can be reasoned about by Prolog. LLMs claim to do the same on unstructured/non-formalized data, which is untenable. A machine cannot do "magic"; it can only interpret formalized/structured data according to some rules. Note that the set of rules can be dynamically extended by ML, but ultimately they are just rules which interact with one another in unpredictable ways.

Now you can see where Prolog might be useful with LLMs. You can impose structure on the view of the World seen by the LLM, and also force it to confine itself to the reasoning it can do within this world-view, by asking it to do predominantly Prolog-like reasoning - but you don't turn the LLM into just a Prolog interpreter. We don't know how this interacts with the other heuristic/formal reasoning parts of LLMs (eg. reinforcement learning), but it does seem to give more predictable and more correct output. This can then be iterated upon to get a final acceptable result.

PS: You might find the book Thinking and Deciding by Jonathan Baron useful for background knowledge - https://www.cambridge.org/highereducation/books/thinking-and...


Very easy to solve, just like many others are once you know the tricks.

I recommend this book: https://www.amazon.com/Secrets-Mental-Math-Mathemagicians-Ca...


I don't want to keep editing the above comment, so I'm starting a new one.

I really recommend that anyone with an interest in CS and AI read at least J. Alan Robinson's paper above. For me it really blew my mind when I finally found the courage to do it (it's old and a bit hard to read). I think there's a trope in wuxia where someone finds an ancient scroll that teaches them a long-lost kung-fu and they become enlightened? That's how I felt when I read that paper, like I gained a few levels in one go.

Resolution is a unique gem of symbolic AI, one of its major achievements and a workhorse: used not only in Prolog but also in one of the two dominant branches of SAT-solving (i.e. the one that leads from Davis-Putnam to Conflict Driven Clause Learning), and even in machine learning, in one of the two main branches of Inductive Logic Programming (which I study), which is based on trying to perform induction by inverting deduction, and so by inverting Resolution. There's really an ocean of knowledge that flows never-ending from Resolution. It's the bee's knees and the aardvark's nightgown.
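
If you've never seen it, the core step is tiny: resolve two clauses on a complementary pair of literals. A quick propositional sketch in Python, with clauses as sets of signed literals:

  def resolve(c1, c2):
      # Return all resolvents of two clauses; a literal is ("p", True/False).
      resolvents = []
      for name, sign in c1:
          if (name, not sign) in c2:
              merged = (c1 - {(name, sign)}) | (c2 - {(name, not sign)})
              resolvents.append(frozenset(merged))
      return resolvents

  # {A, B} and {not B, C} resolve on B to give {A, C}.
  c1 = {("A", True), ("B", True)}
  c2 = {("B", False), ("C", True)}
  print(resolve(c1, c2))  # one resolvent: {("A", True), ("C", True)}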

I sincerely believe that the reason so many CS students seem to be positively traumatised by their contact with Prolog is that the vast majority of courses treat Prolog as any other programming language and jump straight to the peculiarities of the syntax and how to code with it, and completely fail to explain Resolution theorem proving. But that's the whole point of the language! What they get instead is some lyrical waxing about the "declarative paradigm", which makes no sense unless you understand why it's even possible to let the computer handle the control flow of your program while you only have to sort out the logic. Which is to say: because FOL is a computational paradigm, not just an academic exercise. No wonder so many students come off those courses thinking Prolog is just some stupid academic faffing about, and that it's doing things differently just to be different (not a strawman- actual criticism that I've heard).

In this day and age where confusion reigns about what it even means to "reason", it's a shame that the answer, which is to be found right there under our noses, is neglected or ignored because of a failure to teach it right.



WebSockets are the secret ingredient to amazing low- to medium-user-count software. If you practice using them enough and build a few abstractions over them, you can produce incredible “live” features that REST-designs struggle with.

Having used WebSockets a lot, I’ve realised that it’s not the simple fact that WebSockets are duplex or that they’re more efficient than HTTP long-polling or SSEs or something else… No, the real benefit is that once you have a “socket” object in your hands, one that lives beyond the normal “request->response” lifecycle, you realise that your users DESERVE a persistent presence on your server.

You start letting your route handlers run longer, so that you can send the result of an action, rather than telling the user to “refresh the page” with a 5-second refresh timer.

You start connecting events/pubsub messages to your users and forwarding relevant updates over the socket you already hold. (Trying to build a delta update system for polling is complicated enough that the developers of most bespoke business software I’ve seen do not go to the effort of building such things… But with WebSockets it’s easy, as you just subscribe before starting the initial DB query and send all broadcasted updates events for your set of objects on the fly.)

You start wanting to output the progress of a route handler to the user as it happens (“Fetching payroll details…”, “Fetching timesheets…”, “Correlating timesheets and clock in/out data…”, “Making payments…”).

Suddenly, as a developer, you can get live debug log output IN THE UI as it happens. This is amazing.

AND THEN YOU WANT TO CANCEL SOMETHING because you realise you accidentally put in the actual payroll system API key. And that gets you thinking… can I add a cancel button in the UI?

Yes, you can! Just make a ‘ctx.progress()’ method. When called, if the user has cancelled the current RPC, then throw an RPCCancelled error that’s caught by the route handling system. There’s an optional first argument for a progress message to the end user. Maybe add a “no-cancel” flag too for critical sections.
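
A bare-bones version of that context object, to make the idea concrete (a sketch in Python; ‘RPCCancelled’ and ‘progress’ mirror the names above, everything else is made up):

  class RPCCancelled(Exception):
      """Raised inside a handler when the client has cancelled the RPC."""

  class RPCContext:
      def __init__(self, send):
          self.send = send          # async fn that writes a frame to the socket
          self.cancelled = False    # flipped by the socket layer on a cancel frame

      async def progress(self, message=None, no_cancel=False):
          # Cancellation checkpoint plus optional progress message to the UI.
          if self.cancelled and not no_cancel:
              raise RPCCancelled()
          if message is not None:
              await self.send({"type": "progress", "message": message})

  async def run_payroll(ctx):
      await ctx.progress("Fetching payroll details...")
      await ctx.progress("Fetching timesheets...")
      await ctx.progress("Making payments...", no_cancel=True)  # critical section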

And then you think about live collaboration for a bit… that’s a fun rabbit hole to dive down. I usually just do “this is locked for editing” or check the per-document incrementing version number and say “someone else edited this before you started editing, your changes will be lost — please reload”. Figma cracked live collaboration, but it was very difficult based on what they’ve shared on their blog.

And then… one day… the big one hits… where you have a multistep process and you want Y/N confirmation from the user or some other kind of selection. The sockets are duplex! You can send a message BACK to the RPC client, and have it handled by the initiating code! You just need to make it so devs can add event listeners on the RPC call handle on the client! Then, your server-side route handler can just “await” a response! No need to break up the handler into multiple functions. No need to pack state into the DB for resumability. Just await (and make sure the Promise is rejected if the RPC is cancelled).
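
The ‘await a reply from the user’ part falls out of a pending-reply table keyed by message id (another sketch, asyncio-flavoured; the frame shapes are invented):

  import asyncio, itertools, json

  class Session:
      def __init__(self, send):
          self.send = send                  # async fn that writes text frames
          self._pending = {}                # message id -> Future
          self._ids = itertools.count()

      async def ask(self, question):
          # Handler code just awaits this; no state machine, no DB resumability.
          mid = next(self._ids)
          fut = asyncio.get_running_loop().create_future()
          self._pending[mid] = fut
          await self.send(json.dumps({"type": "ask", "id": mid, "question": question}))
          return await fut                  # resolved when the client replies

      def on_message(self, raw):
          msg = json.loads(raw)
          if msg.get("type") == "reply":
              fut = self._pending.pop(msg["id"], None)
              if fut is not None:
                  fut.set_result(msg["answer"])

Rejecting those pending futures when the RPC is cancelled gives you the ‘reject the Promise on cancel’ behaviour for free.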

If you have a very complex UI page with live-updating pieces, and you want parts of it to be filterable or searchable… This is when you add “nested RPCs”. And if the parent RPC is cancelled (because the user closes that tab, or navigates away, or such) then that RPC and all of its children RPCs are cancelled. The server-side route handler is a function closure, that holds a bunch of state that can be used by any of the sub-RPC handlers (they can be added with ‘ctx.addSubMethod’ or such).

The end result is: while building out any feature of any “non-web-scale” app, you can easily add levels of polish that are simply too annoying to obtain when stuck in a REST point of view. Sure, it’s possible to do the same thing there, but you’ll get frustrated (and so development of such features will not be prioritised). Also, perf-wise, REST is good for “web scale” / high user counts, but you will hit weird latency issues if you try to use it for live, duplex comms.

WebSockets (and soon the WebTransport API over HTTP/3) are game-changing. I highly recommend trying some of these things.


Love all of these tips. I've hosted dozens of events since moving to NYC and figured I'd add 5 more:

1. If this is a dinner party (or people are all seated), force people to get up and move in a way that they'll meet new people. Do this when you're about 2/3 of the way through the party. Some will complain - do it anyway.

2. Plan 1 (ideally 2) interludes. It can be a small speech, moving people around, changing locations, having people vote on something, etc. For whatever reason, they make the night more memorable.

3. Do your best to make introductions natural and low-pressure. Saying things like "you two would really get along" can put pressure on people - especially shy ones. Bring up something they have in common and let them chat while you back away.

4. Go easy on folks who cancel last minute. They often don't feel good about doing it and you don't want to add more stress to them or yourself.

5. More music != more fun. Some music is good, but if people can't hear each other, turn it down.

If you're interested in reading more about this stuff, read The Art of Gathering by Priya Parker.


Wonderful project.

There's also mermaidjs to excalidraw https://github.com/excalidraw/mermaid-to-excalidraw


AFAIK, the Automerge people are working pretty hard on Beehive and Keyhive. Once released, that’ll be exactly what you asked for: https://www.inkandswitch.com/keyhive/notebook/05/ You can also use Yjs over Matrix (which has e2e encryption): https://github.com/YousefED/Matrix-CRDT

Shameless plug: I'm betting that a lot of applications could use some form of CRDT as a Database, which would allow a fully decentralized backend/database for local-first apps. So I've been building one.

Still working on good blog posts to explain and introduce it though.

https://github.com/arcuru/eidetica


While new works do trickle into the public domain every year, it's worth noting that copyright status is not currently the main bottleneck to public domain cultural works being made widely available to the public under F.A.I.R. (Findable, Accessible, Interoperable, Reusable) principles. There's a huge amount of extant literature and music in the public domain that someone has scanned and perhaps put up online as raw page images, but no one has yet made the effort to transcribe/index/classify it and thus make it easily available for most uses - such that it might as well not exist as far as most people are concerned. This is where efforts like Project Gutenberg can be especially valuable.

This is exactly why I made Yaak [1]. It's fully offline, no telemetry, open source, and can even sync with Git.

https://yaak.app


> Overusing DISTINCT to “Fix” Duplicates

I wrote a small tutorial (~9000 words in two parts) on how to design complicated queries so that they don't need DISTINCT and are basically correct by construction.

https://kb.databasedesignbook.com/posts/systematic-design-of...


For reference, Libreture maintains a list of non-DRM bookshops [1].

[1] https://libreture.com/bookshops/


Me too. When they removed the option to download books I liberated everything I had ever bought, moved to Kavita+koreader and will never buy a kindle book again.

I jailbroke both kindles and use koreader on them, which now supports progress sync with Kavita - which is amazing! So I don't really lose functionality.


I vibecoded a similar app. Here’s the open source link, if folks want to build their own:

https://github.com/naveedn/audio-transcriber


If you want to migrate off Spotify but are worried you’ll lose your library, feel free to check out my tool Libx (libx.stream). It’s a tool to export your entire Spotify library to a nice and neat CSV file.
