Launch HN: ContainIQ (YC S21) – Kubernetes Native Monitoring with eBPF

rlyshw · on Jan 6, 2022

I recently had an issue where my UDP service worked fine exposed directly as a NodePort type, but not through an nginx UDP ingress. I _think_ the issue was that the ingress controller forwarding operation was just too slow for the service's needs, but I had no way of really knowing.

Now if I had this kernel level network monitoring system, I probably could have had a clearer picture as to what is going on.

Really one of the hardest problems I've had with learning/deploying in k8s is trying to trace down the multiple levels of networking, from external TLS termination to LoadBalancers, through ingress controllers, all the way down to application-level networking, I've found more often than not the easiest path is to just get rid of those layers of complexity completely.

In the end I just exposed my server on NodePort, forwarded my NAT to it, and called it done. But it sounds like something like ContainIQ can really add to a k8s admin's toolset for troubleshooting these complex network issues. I also agree with other comments here that a limited, personal-use/community tier would be great for wider adoption and home-lab users like me :)

NWMatherson · on Jan 6, 2022

Appreciate this insight and I agree with you.

And I can definitely circle back here when our limited use tier goes live. Agree on that too.

gigatexal · on Jan 6, 2022

A community edition/non-paid would be quite nice to be able to trial this out before paying.

This is how an old employer adopted CockroachDB because we trialed the non-enterprise version and then ultimatley bought a license.

NWMatherson · on Jan 6, 2022

I agree. We are planning to launch a free edition with limited size and data retention. For users to try / play with before paying. It is in the works and we hope to have this out in the next few months.

We are also thinking about launching trials too.

permalac · on Jan 6, 2022

Agreed.

Our employer does not invest on this kind of tools, so when a free version does not exist for us the tool does not exist.

We would be happy to provide usage metrics and reports, our company is full open source and open data, and we work/invest time on open projects when possible.

nyellin · on Jan 6, 2022

Not the OP, but I develop a different open source tool for Kubernetes and would love to talk! (Email is in my profile)

HatchedLake721 · on Jan 6, 2022

Does your company use paid versions of the open source tools or pay support?

NWMatherson · on Jan 6, 2022

We are planning to open source our agents in 2022!

nodesocket · on Jan 6, 2022

Hello. I own and run a DevOps consulting company and use DataDog exclusively for clients. DD works pretty well as it integrates with cloud providers (such as AWS), physical servers (agent), and Kubernetes (helm chart). The pain point is still creating all the custom dashboards, alerts, and DataDog integrations and configuration. Managing the DataDog account can almost be a full-time job for somebody. Especially with clients who have lots of independent k8s clusters all in a single DD account (lots of filtering on tags and labels).

What does ContainIQ offer in terms of benefits over well established players like DataDog? I will say, the Traefik DataDog integration is horrible and hasn't been updated in years so that's something I wish was better. DataDog does support Kubernetes events (into the feed), and their logging offering is quite good (though very expensive).

NWMatherson · on Jan 6, 2022

The dashboard configuration issue was actually one of the pain points we targeted initially. It was an issue we experienced too. And we talked to a lot of our friends who had spent significant time setting these dashboards up in Datadog. One of our initial goals has been to try to automate to get you 95% of the way there without any configuration on your end. We've also tried to make alerting really easy and are working to automate the process of setting smart alerts. Would love to chat more about your experience if you are open to it. My email is nate (at) containiq (dot) com

kolanos · on Jan 6, 2022

How does this compare to Pixie? [0]

[0]: https://github.com/pixie-io/pixie

outgame · on Jan 6, 2022

Polar signals develops Parca [0] which is another eBPF observability tool, and Isovalent develops Cilium [1] which is built on eBPF as well. Genuinely curious if there are differences, or if eBPF only allows for specific observability functionality and each tool has it all.

[0]: https://github.com/parca-dev/parca

[1]: https://github.com/cilium/cilium

brancz · on Jan 6, 2022

Polar Signals founder and one of the creators of Parca here. From what I can tell ContainIQ is distinct from Parca and Polar Signals as we only concern ourselves with continuous profiling, which is complementary to metrics, logs and traces. From our experience, while eBPF is certainly limited and it can be painful to work with the verifier at times, it hits a sweet spot for observability collection because of low overhead and you really only read some structs from memory somewhere for which the limitations of eBPF tend to be plentiful.

Definitely excited to see more eBPF tooling appear in the observability space.

NWMatherson · on Jan 6, 2022

Well said, we are excited to see more eBPF tooling appear as well.

NWMatherson · on Jan 6, 2022

Pixie is definitely similar in their eBPF based approach. I believe there are differences in the types of data they collect and correlate with. For example we collect logs and state information (node status, node conditions, pod scheduled ect) along side our eBPF based metrics like latency. I'm sure there are things they collect that we don’t as well.

nyellin · on Jan 6, 2022

Nice to see a new eBPF based solution out there. Good luck.

NWMatherson · on Jan 6, 2022

Thanks so much!

MoSattler · on Jan 6, 2022

How does this compare to Opstrace? [0]

[0]: https://opstrace.com

NWMatherson · on Jan 6, 2022

OpsTrace took an interesting approach (and was a YC company too, recently acquired by GitLab). We are a managed solution, whereas OpsTrace was a self hosted open source solution. And we are not building on top of other open source tools. With ContainIQ, you can get metrics natively and other features that you wouldn't otherwise be able to get (ex p95 latency by endpoint) with OpsTrace and its integrations.

Kletiomdm · on Jan 7, 2022

Gcp wants 50 cent per ingested log gb.

Gcp is already quite expensive in this regard and you want double.

I think that's way to expensive.