There's an anecdote that's somewhat connected to this topic.
Some decades ago, a manufacturer from East Germany, the former GDR, was exhibiting at a trade fair for lights and light bulbs. This manufacturer had invented a light bulb whose filament never burns out.
At some point during the fair the companies from West Germany had a big laugh at that manufacturer, mocking him and his invention. Their argument: if you build a bulb like this, how are you going to make money?
Now, I cannot say why we don't have glasses like this already, but my assumption is that the monetary incentive is seen as running counter to such an invention.
In the US, consumers like stuff that is cheap, and don't seem to care much if it is poor quality and breaks - they'll just buy another.
In the UK, at least when I lived there 30 years ago, people seemed content to pay more for quality items that would last longer.
I noticed this when I moved to the US and saw the same brands, e.g. Black & Decker, selling cheap plastic US-only versions of products, compared to the heavy-duty cast-iron counterparts sold in the UK that would last forever.
That, and producing these glasses with said technique is a lot more expensive. You need to heat the glass and the potassium nitrate to 500 °C, mostly over hours, because otherwise the glass breaks. Then you need to hold that temperature for a couple of hours, then cool down slowly. What made the initial East German production work is that they did it on a large industrial scale, but even then the energy you need makes the glasses quite expensive to produce. It's hard to justify paying 6-7€ for a regular drinking glass when a comparable one costs around 1€ in this region.
The gist is that by running bulbs at lower power you can greatly prolong their life, but the downside is that the filament doesn't heat up as much, and since the emission spectrum correlates with temperature, the bulb ends up much worse at converting electricity into light, which makes it not worth it.
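As a rough illustration, here are the commonly quoted rule-of-thumb rerating exponents for incandescent lamps (approximate textbook values, not something from this thread, so treat the numbers as ballpark only):

    # approximate incandescent rerating rules of thumb; real bulbs vary
    v = 0.90                  # run the bulb at 90% of its rated voltage
    life     = v ** -13       # ~3.9x the rated life
    light    = v ** 3.4       # ~0.70x the light output
    power    = v ** 1.6       # ~0.85x the power draw
    efficacy = light / power  # ~0.83x the lumens per watt
    print(life, light, power, efficacy)

So underrunning at 90% voltage buys you roughly 4x the life, but you pay with about 30% less light and noticeably worse lumens per watt, which is why it rarely made economic sense.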
This actually happened much earlier, in the first half of the 20th century. It was an international cartel of household names (GE, Osram, Philips, etc.).
"The cartel tested their bulbs and fined manufacturers for bulbs that lasted more than 1,000 hours."
I was answering yours (perhaps not reading it the way you intended) and the grandparent's "HOWEVER 'ClassicBondedOnly=true' is commented out". No big deal. This style of option plus comment, with the option commented out (default or not), is common autopilot - no special intent here.
That's a tricky question. You're going to have to multiplex the use of the device, but since these are mostly 'ping-pong' style uses you can use something called a 'utilization factor' to figure out a reasonable upper bound at which you still get an answer to your query in acceptable time. The typical mechanism is an input queue with a single worker that uses the device. The cut-off is when the queue becomes unacceptably long, in which case you either have to throw an error or be content with waiting (possibly much) longer for your answer. This is usually capped by some hard limit on the length of the queue (for instance: available memory) or by the queue filling up faster than it can empty, even over a complete daily cycle. Once that happens you need more hardware.
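A minimal sketch of that pattern in Python (all names and the 1-second "work" are made up for illustration; the real device call would go where use_device is):

    import queue, threading, time

    MAX_QUEUE = 64                       # hard cap on outstanding requests

    jobs = queue.Queue(maxsize=MAX_QUEUE)

    def use_device(payload):
        time.sleep(1.0)                  # stand-in for the single shared device
        return f"result for {payload}"

    def worker():
        while True:
            payload, done = jobs.get()   # one worker, so the device is never contended
            done["result"] = use_device(payload)
            done["event"].set()
            jobs.task_done()

    threading.Thread(target=worker, daemon=True).start()

    def submit(payload, timeout=30.0):
        done = {"event": threading.Event()}
        try:
            jobs.put_nowait((payload, done))   # fail fast instead of queueing forever
        except queue.Full:
            raise RuntimeError("queue full - time for more hardware")
        if not done["event"].wait(timeout):
            raise TimeoutError("gave up waiting for the device")
        return done["result"]

    print(submit("hello"))

The utilization factor here is simply (arrival rate) x (seconds per job); once it creeps toward 1 the queue only ever grows and no cap will save you.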
Actually, many inference systems instead batch all requests within a time window and submit them as a single shot. It increases the average latency but handles more requests per unit time. (At least, this is my understanding of how production serving of expensive models that support batching works.)
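A rough sketch of that kind of micro-batching loop (the 10 ms window and the batch cap of 32 are arbitrary here, and run_batch stands in for whatever batched call the model actually exposes):

    import queue, threading, time

    WINDOW_MS = 10   # how long to wait while accumulating a batch (arbitrary)
    MAX_BATCH = 32   # ideally matched to what the hardware can do in one shot

    requests = queue.Queue()

    def run_batch(prompts):
        # stand-in for a real batched forward pass over all prompts at once
        return [p.upper() for p in prompts]

    def batcher():
        while True:
            batch = [requests.get()]                 # block until there is any work
            deadline = time.monotonic() + WINDOW_MS / 1000
            while len(batch) < MAX_BATCH:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(requests.get(timeout=remaining))
                except queue.Empty:
                    break
            outputs = run_batch([prompt for prompt, _ in batch])
            for (_, done), out in zip(batch, outputs):
                done["result"] = out
                done["event"].set()

    threading.Thread(target=batcher, daemon=True).start()

    def submit(prompt):
        done = {"event": threading.Event()}
        requests.put((prompt, done))
        done["event"].wait()
        return done["result"]

    print(submit("hello"))

Each request can wait up to one window longer than it strictly needs to, but the device sees one big batch instead of 32 small calls.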
I've done a bunch of optimization for GPU code (in CUDA) and there are typically a few bottlenecks that really matter:
- memory bandwidth
- interconnect bandwidth between the CPU and GPU
- interconnect bandwidth between GPUs
- thermals and power if you're doing a good job of optimizing the rest
I don't see how a batching mechanism would improve on any of those, superficially it looks as though that would make matters worse rather than better. Can you explain where the advantage comes from?
It's a latency vs. throughput tradeoff. I was surprised as well. But most GPUs can do 32 inferences in the same time as they can do 1 inference. They have all the parallel units required and there are significant setup costs that can be amortized since all the inferences share the same model, weights, etc.
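A toy way to see the amortization, using numpy on a CPU (the absolute numbers mean nothing for a GPU, but the shape of the effect is similar, because the weight matrix only has to be read once either way):

    import time
    import numpy as np

    d = 4096
    W = np.random.randn(d, d).astype(np.float32)     # one "layer" of weights
    x1 = np.random.randn(d, 1).astype(np.float32)    # a single request
    x32 = np.random.randn(d, 32).astype(np.float32)  # a batch of 32 requests

    def bench(f, n=50):
        f()                                          # warm-up
        t = time.perf_counter()
        for _ in range(n):
            f()
        return (time.perf_counter() - t) / n

    t1, t32 = bench(lambda: W @ x1), bench(lambda: W @ x32)
    print(f"batch 1: {t1*1e3:.2f} ms, batch 32: {t32*1e3:.2f} ms, "
          f"{t32/t1:.1f}x the time for 32x the work")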
Very interesting, thank you. I will point one of my colleagues who is busy with this stuff to these, and I thank you on his behalf as well; it is exactly the kind of thing they are engaged in.
I think in the case of LLM inference the main bottleneck is streaming the weights from VRAM to CU/SM/EU (whatever naming your GPU vendor of choice uses).
If you're doing inference on multiple prompts at the same time via batching, streaming doesn't take any more time, but each streamed weight gets used for, say, 32 calculations instead of 1, making better use of the GPU's compute resources.
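Back-of-envelope version with made-up but plausible numbers (a 7B-parameter model in fp16 and roughly 1 TB/s of VRAM bandwidth; assumes decoding stays memory-bound at batch 32):

    # every decode step has to stream all the weights from VRAM at least once
    params, bytes_per_param, bandwidth = 7e9, 2, 1e12
    step = params * bytes_per_param / bandwidth        # ~0.014 s per step
    print(f"~{step*1e3:.0f} ms per step whether the batch is 1 or 32")
    for batch in (1, 32):
        print(f"batch {batch}: ~{batch/step:.0f} tokens/s total")

Same ~14 ms step either way, but 32x the tokens coming out of it, which is the whole trick.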
"Scalability" and "Single Board Computer" don't really belong in the same sentence. That said, today you can get a refurbished mini PC with a lot more power, for a lot less money than the higher end SBCs. But I didn't see any info on how portable this project is to other hardware.
I think the biggest advantage here is that you can run it on the GPU using shared memory, and I'm not sure how widespread that is on mini PCs (it isn't on Intel NUCs, at least).
You could run it using OpenVINO on Intel CPUs, but the performance would probably take a hit. It would be a lot easier, though, since you can just use ggml.
That's not how you play. The AI doesn't know what word was chosen. You need to ask it questions that will make it include the chosen word in the answer.
Nice that they keep improving the app. However, OsmAnd leaves me with mixed feelings.
Examples:
1) Going from Munich main train station south (Hauptbahnhof Süd) to Berlin main train station (Berlin, S+U Hauptbahnhof). OsmAnd tells me to download the Czechia Northwest map. That doesn't make sense; the route doesn't even go through Czechia. OK, I think, and download the missing map - the app crashes. Restart the app, do the same search again. From one major German city to another, basically straight down one highway (the A9). It takes 4-5 minutes to calculate on a Samsung S22. That's just meh.
2) Do the same search again, but this time set it to avoid highways. After 15 minutes there was still no result, so I gave up.
3) The search function is counter-intuitive.
4) Even with a Samsung S22 the map is kind of slow if you just pan around. For driving it seems OK though. The new engine might have sped things up, but it's still a long way from being smooth.
I'm not complaining, but I wish they would address the route planning and search in particular.
Another excellent overview of the wider problem behind the use of antibiotics at that scale can be found in the Meat Atlas, published by the Heinrich Böll Foundation.
It delivers an excellent compilation of the issues at play that will keep the problem going. As long as there's no change in policies, consumer behavior, and/or some mad disease that brings down the meat industry, it's going to continue.
Great. For Ubuntu it has technically already landed at https://kernel.ubuntu.com/~kernel-ppa/mainline/ - however, I say technically because the latest kernel build failed, as do many other versions there. I wonder why ...
While I have no idea if this is another such case, Ubuntu seems to have a tendency to carry Ubuntu-specific patches that may cause unforeseen consequences.
I've had an Ubuntu patch cause squid to crash when the config had an empty line at the end of the file or something like that.
And Ubuntu's "openbsd-netcat" requires an EOF marker at the end of input, otherwise it won't terminate the connection. The original "openbsd-netcat" does that. Ubuntu even patched in a parameter to make their "openbsd-netcat" behave like the original.
Maybe they should have called it "ubuntu-netcat" instead.
These antics, combined with the mess that is the "snaps" system, made me swear off Ubuntu and leave for RockyLinux.
This is their mainline kernel channel, delivered via a Personal Package Archive (PPA), a repo that's fairly simple to add/enable on Ubuntu; there are no patches on top of those builds.
If your issue with Ubuntu is Ubuntu-specific patches, I don't think Debian is a solution. They patch a lot.
I personally think it's too much, mostly because I don't value most of the reasons Debian patches (I don't care about exotic architectures; I value conformance with upstream more than having modular and small packages; I don't care about having some non-free parts in my packages).
I like the blog post and the thought experiment. However, I wish the author hadn't stopped with the reasoning and had pushed his arguments further.
An example where his reasoning in the article comes up short - one might answer: Yes, I wanted to go to store a, and yes, after my 'highway hypnosis' I went, or was brought, to store b instead. So what? It doesn't really matter whether I go shopping in store a or b. The important thing is that I am at a store now and can start my shopping.
If we evaluate that from an ethical point of view, then we have to ask about emancipation and sovereignty with regard to the choices we make, and where the fine line is at which it really starts to matter whether we go to store a or b.
Even though I've read the GNU Taler FAQ, please excuse any lack of deeper knowledge about it.
When dealing with those commercial banks, what is currently the biggest challenge? Is it more the political arguments or the technical arguments for such a payment system that you need to stress? And, since I couldn't find a definite answer to this: could GNU Taler ultimately replace Bitcoin?
Given that for many larger banks, medieval things like overdraft fees and "re-ordering" same-day transactions so that deposits appear later in order to create additional overdraft fees are a huge part of their bottom line, it doesn't surprise me that they would avoid relinquishing control to an open and fair standard they can't override.
Also, I'm sure anti-money-laundering legislation complicates things in terms of the anonymity feature, despite the fact that merchants are fully auditable, though I assume users aren't able to transfer funds to non-merchants? Still, I'm sure the laws as written complicate this in some jurisdictions.
For example, you could have a completely legitimate business registered as a merchant function as a money-laundering front that could then take in illicit funds from anonymous users working for the illicit org which actually secretly owns the merchant. I'm fine with that happening in the wild because the benefits of privacy for consumers are obvious to me and outweigh the negatives, but I bet regulators aren't so forgiving.
In the wild this often happens -- there are plenty of "DDoS protection" services that by day offer legitimate services in the open and by night actually attack potential customers who they then offer their protection services to, so it is not at all unheard of for an illegitimate org to have a legitimate front. This sort of thing is rampant in the world of high-end minecraft servers, I'm told.