montanalow's comments (Hacker News)

We weren’t expecting 35x speedup. The difference was pretty surprising.


We're launching the PostgresML Gym today, a free, hosted version of Postgres with our machine learning extension built in. It enables you to train supervised ML algorithms (regression, classification) on real-time data in your database and make predictions without any additional infrastructure.

We'd really appreciate it if you could kick the tires and let us know how we can make ML more approachable.
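To make the workflow concrete, here is a hedged sketch of what a training/inference round trip might look like from a Python client. The `pgml.train` and `pgml.predict` calls follow PostgresML's documented SQL API; the project name, table, and feature values are made up for illustration.

```python
# Hypothetical PostgresML usage; requires a database with the pgml extension.
TRAIN_SQL = """
SELECT * FROM pgml.train(
    'Sales Forecast',        -- project name (made up)
    'regression',            -- task: 'regression' or 'classification'
    'public.sales_history',  -- relation holding the training data
    'units_sold'             -- label column to predict
);
"""

PREDICT_SQL = """
SELECT pgml.predict('Sales Forecast', ARRAY[2.5, 10.0, 1.0]) AS prediction;
"""

# Against a live PostgresML instance you would run these with any
# Postgres driver, e.g.:
#
#   import psycopg2
#   with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
#       cur.execute(TRAIN_SQL)
#       cur.execute(PREDICT_SQL)
#       print(cur.fetchone())
```

The point of the design is that both calls are plain SQL, so "additional infrastructure" reduces to whatever Postgres client you already have.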


Go to huggingface.co and start with some of the tutorials. The operational/engineering skill sets alone are all you need to treat modern ML models like any other black box API/SDK.


They call it ‘Tasks’

https://huggingface.co/tasks



I went there and there is indeed lots of stuff, but I failed to find anything related to "operational/engineering skill sets"?


Yep, they will use the specialized hardware by default if CUDA/cuBLAS has been installed on the system.


The entire crew was rescued. The article is nonsense.


What's your point? Of course the avoidance of a human tragedy is great but the cost of this incident (to the shipping company, automotive manufacturers, insurers, environment) was enormous. It makes perfect sense to try to understand it better.


My point is that the ship's telemetry was not an issue. Blaming electric vehicles with no basis, or lack of telemetry with clear evidence to the contrary, is…something.


The issue is that they don't know what caused the ship to sink, because the data recorder is at the bottom of the ocean. If they'd streamed the data continuously, we'd have useful information to prevent future sinkings.


Of course it's an issue; the cost of finding out why it sank is huge because it didn't have telemetry.


This time. A fire on board a vessel is highly dangerous and should always be avoided if possible. If a car was the cause of the fire, additional safety measures should be implemented on these types of vessel to potentially save the lives of the crew.


This article is pure speculation by an insurance company that isn't involved and admits it has no idea what happened. FUD.


> speculation by an insurance company

Speculation about risk is literally what insurance companies do; it's how they determine the terms of insurance.

I learned a lot from this.

One, shipping is pretty fly-by-night. Operators are winging it and trusting to luck in a lot of cases; failing to secure cargo for expediency, for example. That's more corroboration than new knowledge.

Two, fires are common on cargo ships: 14 times a year some cargo ship starts burning for some reason, car carriers being among the more frequent. Small fires that are quickly contained aren't included in that figure because they don't get reported.

Three, car-carrying ships aren't prepared to deal with cascading lithium battery fires. They can handle ICE fires but haven't yet adapted to the electric vehicles they're actually starting to carry in quantity.

Four, the people who have to analyze the risks involved because they're liable for the costs are more candid about electric car fires; they say straight up that "lithium-ion batteries—they can ignite a lot more vigorously as compared to any other cars" and "a high impact on these cars and a lithium-ion battery can ignite them." A refreshing change from the obligatory handwaving and cognitive dissonance one gets at all other times.

In electronics there is the concept of a bathtub curve. The frequency of component failure is high when the components are new. The frequency decreases afterwards and later increases again with old age.

Applied to cargo ships full of electric cars with millions of brand new battery cells, obviously the risk that some battery will ignite and sink the ship has to be considered. And it will be considered -- if only by insurance companies -- whether electric car proponents like it or not.
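The bathtub curve described above can be sketched numerically: a decaying "infant mortality" term, a constant random-failure floor, and a polynomial wear-out term. All constants here are illustrative, not fitted to any real failure data.

```python
import math

def hazard(t, infant=0.05, decay=50.0, base=0.002, wear=1e-11, k=3):
    """Toy bathtub-curve hazard rate at age t (arbitrary time units).

    infant * exp(-t/decay): early failures, fading as defects shake out
    base:                   constant random-failure floor
    wear * t**k:            wear-out failures, growing with age
    """
    return infant * math.exp(-t / decay) + base + wear * t**k

early, mid, late = hazard(1), hazard(300), hazard(1000)
# hazard is high for brand-new components, lowest mid-life,
# and climbs again with old age
```

Brand-new battery cells sit on the left edge of this curve, which is the insurer's point: a ship full of them concentrates the highest-failure-rate phase of every cell in one hold.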


It isn't just speculation; it has already occurred [1]. The fire has had an impact on the availability of Porsche cars in Europe, as more are being sent to the US to replace the ones lost.

[1] https://en.wikipedia.org/wiki/Felicity_Ace


They're providing background, which they have more than enough relevant expertise to provide. I found it a useful perspective; in particular an un-connected party can often speak more freely about issues.

What exactly did you identify as FUD in this article?


Landed a PR to enable GridSearchCV and RandomSearchCV. https://github.com/postgresml/postgresml/pull/83
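For readers unfamiliar with these, the PR presumably wraps scikit-learn's hyperparameter search (PostgresML delegates to scikit-learn under the hood). A minimal sketch of what grid search does, with a toy dataset and parameter grid chosen purely for illustration:

```python
# GridSearchCV cross-validates every combination in the parameter grid
# and keeps the best-scoring one. Dataset and grid are toy choices.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [2, None]},
    cv=3,  # 3-fold cross-validation per combination
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Randomized search is the same idea but samples a fixed number of combinations from the grid instead of trying them all, which scales better when the grid is large.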


Outside of academia, you need to be able to constantly adapt to changing training data. That is a constant pitfall for many projects that try to transition from offline to online.


Outside academia you need to do a lot of data cleaning and feature engineering, and if you’re constantly changing the model as well as the data you’ll never be able to attribute changes to either.

Chances are a DBA wouldn’t consider letting you do data engineering in a live production database anyway, so this really is all academic.


Haven't played with it yet, but this tooling is exactly what I'm looking for for my SQL-first, Python-second "data lake" workflows. It doesn't much matter to me if data in a table gets rewritten in place, since everything is built from an original source file and meant to be destroyed and rebuilt at any time - generally write-once data from FOIA or government sources.

In a way, having to pull things out of the DB and into Python requires change attribution on its own, since it's a conversion between abstractions, which I tend to think of as a lossy process (even if it isn't). This sort of tooling keeps the abstractions localized, so it's much easier to maintain a mental model of the changes.


> Outside academia you need to do a lot of data cleaning and feature engineering, and if you’re constantly changing the model as well as the data you’ll never be able to attribute changes to either.

I get the concern but sometimes I really just do want a black box regressor or classifier. Model performance monitoring is important, but I don't care about attribution.

> Chances are a DBA wouldn’t consider letting you do data engineering in a live production database anyway, so this really is all academic.

Maybe it isn't data engineering, but I'm curious what you'd call using Google's BigQuery ML? "BigQuery ML enables users to create and execute machine learning models in BigQuery by using standard SQL queries."

I haven't used it in production, but I'd use it in a heartbeat if I was on BigQuery.


Stateful services like Postgres cannot be rapidly scaled up/down to adjust to daily loads, even when those loads are very predictable: busy at noon, often idle overnight. Scheduling ML jobs for the hours when the database would otherwise be wasting resources might be an efficient strategy.
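One way to act on that: gate heavy training jobs behind a clock-time check that a scheduler (cron, pg_cron, etc.) evaluates before kicking anything off. The helper below is hypothetical and the window boundaries are made-up examples.

```python
from datetime import time

def in_idle_window(t, start=time(22, 0), end=time(6, 0)):
    """True if clock time `t` falls inside the low-traffic window.

    Handles windows that wrap past midnight (e.g. 22:00 -> 06:00),
    which is the common shape for an overnight lull.
    """
    if start <= end:
        return start <= t < end
    return t >= start or t < end  # window wraps past midnight

# e.g. a nightly job could call in_idle_window(datetime.now().time())
# and only start training while the database is otherwise idle
```

The interesting part is the wraparound branch: an overnight window has start > end, so membership becomes "after start OR before end" rather than a simple range check.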


Nice! Feel free to file any issues on Github if something doesn't work. I'd love to understand more about the internals of the hydra architecture.

