montanalow's comments (Hacker News)

We weren’t expecting 35x speedup. The difference was pretty surprising.


We're launching the PostgresML Gym today, a free, hosted version of Postgres with our machine learning extension built in. It enables you to train supervised ML algorithms (regression, classification) on real-time data in your database and make predictions without any additional infrastructure.

We'd really appreciate it if you could kick the tires and let us know how we can make ML more approachable.
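To make the workflow concrete, here is a hedged sketch of what a training/inference round trip might look like from a Python client. The `pgml.train` and `pgml.predict` calls follow PostgresML's documented SQL API; the project name, table, and feature values are made up for illustration.

```python
# Hypothetical PostgresML usage; requires a database with the pgml extension.
TRAIN_SQL = """
SELECT * FROM pgml.train(
    'Sales Forecast',        -- project name (made up)
    'regression',            -- task: 'regression' or 'classification'
    'public.sales_history',  -- relation holding the training data
    'units_sold'             -- label column to predict
);
"""

PREDICT_SQL = """
SELECT pgml.predict('Sales Forecast', ARRAY[2.5, 10.0, 1.0]) AS prediction;
"""

# Against a live PostgresML instance you would run these with any
# Postgres driver, e.g.:
#
#   import psycopg2
#   with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
#       cur.execute(TRAIN_SQL)
#       cur.execute(PREDICT_SQL)
#       print(cur.fetchone())
```

The point of the design is that both calls are plain SQL, so "additional infrastructure" reduces to whatever Postgres client you already have.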


Go to huggingface.co and start with some of the tutorials. The operational/engineering skill sets alone are all you need to treat modern ML models like any other black box API/SDK.


They call it ‘Tasks’

https://huggingface.co/tasks



I went there and there is indeed lots of stuff, but I failed to find anything related to "operational/engineering skill sets"?


Yep, they will use the specialized hardware by default if CUDA/cuBLAS has been installed on the system.


The entire crew was rescued. The article is nonsense.


What's your point? Of course the avoidance of a human tragedy is great but the cost of this incident (to the shipping company, automotive manufacturers, insurers, environment) was enormous. It makes perfect sense to try to understand it better.


My point is that the ship's telemetry was not an issue. Blaming electric vehicles with no basis, or lack of telemetry with clear evidence to the contrary, is…something.


The issue is that they don't know what caused the ship to sink, because the data recorder is at the bottom of the ocean. If they'd streamed the data continuously, we'd have useful information to prevent future sinkings.


Of course it's an issue; the cost of finding out why it sank is huge because it didn't have telemetry.


This time. A fire on board a vessel is highly dangerous and should always be avoided if possible. If a car was the cause of the fire, additional safety measures should be implemented on these types of vessel to potentially save the lives of the crew.


This article is pure speculation by an insurance company that isn't involved and admits it has no idea what happened. FUD.


> speculation by an insurance company

Speculation about risk is literally what insurance companies do; it's how they determine the terms of insurance.

I learned a lot from this.

One, shipping is pretty fly-by-night. Operators are winging it and trusting to luck in a lot of cases; failing to secure cargo for expediency, for example. That's more corroboration than new knowledge.

Two, fires are common on cargo ships: 14 times a year some cargo ship starts burning for some reason, car carriers being among the more frequent. Small fires that are quickly contained aren't included in that figure because they don't get reported.

Three, car-carrying ships aren't prepared to deal with cascading lithium battery fires. They can handle ICE fires but haven't yet adapted to the electric vehicles they're actually starting to carry in quantity.

Four, the people who have to analyze the risks involved because they're liable for the costs are more candid about electric car fires; they say straight up that "lithium-ion batteries—they can ignite a lot more vigorously as compared to any other cars" and "a high impact on these cars and a lithium-ion battery can ignite them." A refreshing change from the obligatory handwaving and cognitive dissonance one gets at all other times.

In electronics there is the concept of a bathtub curve. The frequency of component failure is high when the components are new. The frequency decreases afterwards and later increases again with old age.

Applied to cargo ships full of electric cars with millions of brand new battery cells, obviously the risk that some battery will ignite and sink the ship has to be considered. And it will be considered -- if only by insurance companies -- whether electric car proponents like it or not.
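The bathtub curve described above can be sketched numerically: a decaying "infant mortality" term, a constant random-failure floor, and a polynomial wear-out term. All constants here are illustrative, not fitted to any real failure data.

```python
import math

def hazard(t, infant=0.05, decay=50.0, base=0.002, wear=1e-11, k=3):
    """Toy bathtub-curve hazard rate at age t (arbitrary time units).

    infant * exp(-t/decay): early failures, fading as defects shake out
    base:                   constant random-failure floor
    wear * t**k:            wear-out failures, growing with age
    """
    return infant * math.exp(-t / decay) + base + wear * t**k

early, mid, late = hazard(1), hazard(300), hazard(1000)
# hazard is high for brand-new components, lowest mid-life,
# and climbs again with old age
```

Brand-new battery cells sit on the left edge of this curve, which is the insurer's point: a ship full of them concentrates the highest-failure-rate phase of every cell in one hold.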


It isn't just speculation; it has already occurred [1]. The fire has had an impact on the availability of Porsche cars in Europe, as more are being sent to the US to replace the ones lost.

[1] https://en.wikipedia.org/wiki/Felicity_Ace


They're providing background, which they have more than enough relevant expertise to provide. I found it a useful perspective; in particular an un-connected party can often speak more freely about issues.

What exactly did you identify as FUD in this article?


Landed a PR to enable GridSearchCV and RandomSearchCV. https://github.com/postgresml/postgresml/pull/83
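For readers unfamiliar with these, the PR presumably wraps scikit-learn's hyperparameter search (PostgresML delegates to scikit-learn under the hood). A minimal sketch of what grid search does, with a toy dataset and parameter grid chosen purely for illustration:

```python
# GridSearchCV cross-validates every combination in the parameter grid
# and keeps the best-scoring one. Dataset and grid are toy choices.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [2, None]},
    cv=3,  # 3-fold cross-validation per combination
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Randomized search is the same idea but samples a fixed number of combinations from the grid instead of trying them all, which scales better when the grid is large.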


Outside of academia, you need to be able to constantly adapt to changing training data. That is a constant pitfall for many projects that try to transition from offline to online.


Outside academia you need to do a lot of data cleaning and feature engineering, and if you’re constantly changing the model as well as the data you’ll never be able to attribute changes to either.

Chances are a DBA wouldn’t consider letting you do data engineering in a live production database anyway, so this really is all academic.


Haven't played with it yet, but this tooling is exactly what I'm looking for for my SQL-first, Python-second "data lake" workflows. It doesn't much matter to me if data in a table gets rewritten in place, since everything is built from an original source file and meant to be destroyed and rebuilt at any time - generally write-once data from FOIA or government sources.

In a way, having to pull things out of the DB and into Python requires change attribution on its own, since it's a conversion between abstractions, which I tend to think of as a lossy process (even if it isn't). This sort of tooling keeps the abstractions localized, so it's much easier to maintain a mental model of the changes.


> Outside academia you need to do a lot of data cleaning and feature engineering, and if you’re constantly changing the model as well as the data you’ll never be able to attribute changes to either.

I get the concern but sometimes I really just do want a black box regressor or classifier. Model performance monitoring is important, but I don't care about attribution.

> Chances are a DBA wouldn’t consider letting you do data engineering in a live production database anyway, so this really is all academic.

Maybe it isn't data engineering, but I'm curious what you'd call using Google's BigQuery ML? "BigQuery ML enables users to create and execute machine learning models in BigQuery by using standard SQL queries."

I haven't used it in production, but I'd use it in a heartbeat if I was on BigQuery.


Stateful services like Postgres cannot be rapidly scaled up/down to adjust to daily loads, even when those loads are very predictable: busy at noon, often idle overnight. Scheduling ML jobs for the hours when the database would otherwise be wasting resources might be an efficient strategy.
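One way to act on that: gate heavy training jobs behind a clock-time check that a scheduler (cron, pg_cron, etc.) evaluates before kicking anything off. The helper below is hypothetical and the window boundaries are made-up examples.

```python
from datetime import time

def in_idle_window(t, start=time(22, 0), end=time(6, 0)):
    """True if clock time `t` falls inside the low-traffic window.

    Handles windows that wrap past midnight (e.g. 22:00 -> 06:00),
    which is the common shape for an overnight lull.
    """
    if start <= end:
        return start <= t < end
    return t >= start or t < end  # window wraps past midnight

# e.g. a nightly job could call in_idle_window(datetime.now().time())
# and only start training while the database is otherwise idle
```

The interesting part is the wraparound branch: an overnight window has start > end, so membership becomes "after start OR before end" rather than a simple range check.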


Nice! Feel free to file any issues on Github if something doesn't work. I'd love to understand more about the internals of the hydra architecture.

