I am building a new analytics sub-system that I get to design and implement from scratch. I have a bunch of unique users (say a few thousand). I need to track each of these users in my API server and do some analytics: which places are they logging in from? How long do their sessions last on average? What are the usual links/endpoints they visit? In total there are about two dozen parameters I want to track and analyze.
I have never implemented this kind of analytics system, and I want to learn from people who have already built similar ones. Are there any good tech talks, engineering blog posts, video courses, etc. that highlight the design/technology/architecture choices and their benefits?
You'll be a lot better off spending your mental energy thinking about the outcomes you want to achieve (user engagement, upselling, growth, etc.) and the types of analysis you'll need to understand what changes will produce those outcomes. Protip: this is actually really hard, and people underestimate it by orders of magnitude. A relevant blog post by Roger Peng (with indirect commentary from John Tukey): https://simplystatistics.org/2019/04/17/tukey-design-thinkin...
One other immediate tip is to start thinking about correlating your telemetry with user surveys - again, strongly focusing on outcomes and the controllable aspects of those outcomes.
Don't let the data lead the discussion; decide on the question you're asking, and the implications of all of the possible answers to that question (clearly yes, clearly no, mixed, etc.), before you ask it.
Then engineer the lightest weight system possible to ingest, process, store, analyze, and visualize that data.
For me, that would just be:
1. Log data in whatever logging tool you like. Persist the raw stuff forever in a cheap data lake.
2. Batch at some fixed interval into a staging area of a relational DB.
3. Transform it with stored procedures for now (while you figure out what the right transforms are) into a flat fact table.
4. Visualize in Superset or PowerBI or even plain old Excel.
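To make steps 2–4 concrete, here's a minimal sketch using Python's stdlib sqlite3 as a stand-in for the relational DB; the event fields (`user_id`, `endpoint`, `region`, `duration_s`) and table names are illustrative assumptions, not anything prescribed above.

```python
import json
import sqlite3

# Step 1 (simulated): raw log lines, persisted as-is in the cheap data lake.
RAW_EVENTS = [
    '{"user_id": "u1", "endpoint": "/home", "region": "EU", "duration_s": 30}',
    '{"user_id": "u2", "endpoint": "/api/items", "region": "US", "duration_s": 12}',
    '{"user_id": "u1", "endpoint": "/api/items", "region": "EU", "duration_s": 45}',
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE staging (payload TEXT)")
db.execute("CREATE TABLE fact_sessions "
           "(user_id TEXT, endpoint TEXT, region TEXT, duration_s INTEGER)")

# Step 2: batch-load this interval's raw events into the staging area.
db.executemany("INSERT INTO staging VALUES (?)", [(e,) for e in RAW_EVENTS])

# Step 3: transform staging rows into the flat fact table (a stored
# procedure would do this in-database; plain Python here for illustration).
for (payload,) in db.execute("SELECT payload FROM staging"):
    e = json.loads(payload)
    db.execute("INSERT INTO fact_sessions VALUES (?, ?, ?, ?)",
               (e["user_id"], e["endpoint"], e["region"], e["duration_s"]))

# Step 4: the flat fact table is now trivially queryable from any BI tool.
avg = db.execute(
    "SELECT user_id, AVG(duration_s) FROM fact_sessions GROUP BY user_id"
).fetchall()
print(dict(avg))  # per-user average session duration
```

The point of the flat fact table is that every downstream question (per-user, per-endpoint, per-region) becomes a plain GROUP BY, which any visualization tool can issue.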
Once you've got the patterns of analysis at least fundamentally right, you can consider stream processing (Flink or Kafka Streams are fine) to replace steps 2 and 3.
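The essence of that replacement is keyed, incremental aggregation: instead of batching and re-transforming, per-user state is updated one event at a time. A toy plain-Python illustration of what a Flink or Kafka Streams job would compute (field names again assumed, not from any real API):

```python
from collections import defaultdict

# Per-user running state, the analogue of a keyed state store
# in a stream processor.
state = defaultdict(lambda: {"sessions": 0, "total_s": 0})

def on_event(event):
    """Update per-user state incrementally, as each event arrives."""
    s = state[event["user_id"]]
    s["sessions"] += 1
    s["total_s"] += event["duration_s"]
    return s["total_s"] / s["sessions"]  # running average session length

for ev in [{"user_id": "u1", "duration_s": 30},
           {"user_id": "u1", "duration_s": 50},
           {"user_id": "u2", "duration_s": 10}]:
    on_event(ev)

print(state["u1"])  # {'sessions': 2, 'total_s': 80}
```

The real systems add what this toy lacks: durable state, exactly-once delivery, and windowing, which is exactly why they're worth deferring until the batch version has proven which aggregations you actually need.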