Ask HN: What's your biggest complaint about Apache Spark?
5 points by orbOfOrthanc on Jan 1, 2022 | 1 comment
Could be anything from its current state, to a lack of features, to something else.


My biggest complaint is not about Spark itself, but about what people make of it. I'm currently at a company (and I have seen this before at others) where we are spending millions of dollars a year to run huge Spark infrastructures for data processing that could be replaced by a couple dozen servers running well-architected apps.

I think there are certain use cases/envs where Spark makes a lot of sense, but I don't think it is viable for most cases/teams, especially if you don't plan to use Databricks. The vanilla developer experience is pretty rough: automation is lacking, the UI is pretty bad, local dev environments (beyond "hello world" level) are hard to set up, etc.; and that's not even accounting for the infrastructure deployment/management side of things.

Mix all this with the fact that (at least in my experience) data engineers (DE) and data scientists (DS) are not known for writing robust/defensive code, and you get systems that break very often.



