Outside of looking pretty and complex, I just can't figure out what marginal benefit I have looking at this graph vs this one from FinViz [0]. It's clearer, easier to read, and not three-dimensional.
It's an interesting way to visualize relationships between companies beyond just sector/industry (as in FinViz).
> Each point on the Market Map represents a distinct company traded on the NYSE or NASDAQ and positioned according to a series of market metrics such as the Market Capitalization, the Price to Earnings ratio, EBITDA, and others.
This is really cool! I especially like your Tour, very fun way to get introduced. Here are a couple of points of feedback:
- For the tour, I notice that pressing Esc exits it. It would be nice if there were a close button somewhere as well, or at least some way to let the user know that they can press Esc. Currently, someone using the mouse can only exit using the left (Previous) button of the first slide or the right (Start Exploring) button of the last slide.
- Clicking into some nodes and then Reset View seems to spin the view many more times than necessary. Not sure if that's a bug or by design, but I would prefer it if it used the least amount of camera movement necessary.
Overall this was really cool to see. I have been wanting to do something similar but based on price action correlation instead of fundamentals. Actually I have just launched a related feature this week called Similar Charts.
If you are interested you can see that here: https://base.report/ticker/ACLS/similar-charts. Please note that the free version only shows you tickers that start with the letter A. The paid version shows you 50 matches.
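For anyone curious what "price action correlation" could look like in its simplest form, here is a minimal sketch. It ranks tickers by the Pearson correlation of their period-over-period returns. The tickers and prices are made up, the function names are hypothetical, and this is not a description of how Similar Charts actually works.

```python
from math import sqrt

def returns(prices):
    """Simple period-over-period returns from a list of closing prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(vx * vy)

def most_similar(target, universe, top_n=3):
    """Rank the other tickers in `universe` by return correlation with `target`."""
    target_r = returns(universe[target])
    scores = [
        (pearson(target_r, returns(prices)), ticker)
        for ticker, prices in universe.items()
        if ticker != target
    ]
    return [(ticker, score) for score, ticker in sorted(scores, reverse=True)[:top_n]]

# Made-up closing prices, purely illustrative (assumption, not real data).
prices = {
    "AAA": [10.0, 10.5, 10.2, 10.8, 11.0],
    "BBB": [20.0, 21.0, 20.4, 21.6, 22.0],  # same percentage moves as AAA
    "CCC": [30.0, 29.0, 29.8, 28.5, 28.0],  # moves roughly opposite to AAA
}
print(most_similar("AAA", prices))
```

Correlating returns rather than raw prices is a deliberate choice here: two stocks that both drift upward would otherwise look similar regardless of how they actually move day to day.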
This looks pretty, but it has the rather serious limitation that you can't search for a company, and then see what its neighbors are.
Er, actually it is possible if you click one of the (unlabeled) neighbors, then zoom back in and hover around to find the original company again. But whether this works seems to depend on which color scheme is active.
In general, most interactions force you to zoom out and lose your place, and sometimes it just keeps rotating for no apparent reason.
NuScale Power (SMR) is categorized as "Life Sciences / Agricultural Production-crops", which makes me question the rest of the data.
No, UMAP is nonlinear. The general idea is that you generate a neighborhood graph of your data points, do a spectral embedding on that to get your initial result, and then do gradient descent to make its neighborhood graph closer to the high-dimensional one.
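To make the first of those steps concrete, here is a rough sketch of building a k-nearest-neighbor graph in plain Python. It is only the neighborhood step, it uses Euclidean distance as an assumption, and it leaves out the fuzzy edge weights, the spectral initialization, and the gradient descent that a real UMAP implementation performs. The feature rows are invented.

```python
from math import dist

def knn_graph(points, k):
    """Return {i: [indices of the k nearest other points]} by Euclidean distance."""
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    graph = {}
    for i, p in enumerate(points):
        # Sort every other point by its distance to p, nearest first.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

# Hypothetical 4-feature rows (think: scaled market metrics); two obvious groups.
points = [
    (0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0),
    (0.0, 0.2, 0.0, 0.0),
    (5.0, 5.0, 5.0, 5.0),
    (5.1, 5.0, 5.0, 5.0),
]
graph = knn_graph(points, k=2)
for node, neighbors in graph.items():
    print(node, neighbors)
```

Note that with k=2 the two points in the far group are forced to pick a second neighbor from the other group, which is one small way the choice of k shapes what ends up looking connected.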
I guess it's a Principal Component Analysis (PCA) dimensionality reduction so the axes are not necessarily concepts/features with names. More just "abstract dimensions of similarity."
The underlying UMAP model is actually pretty interesting. It's linked to in the tour, though I would have expected it to be featured more prominently: https://pair-code.github.io/understanding-umap/
I've experimented with zillions of 3d graphing layouts, usually in the context of PDM/ERP for manufacturing/logistics. A couple of roadblocks I've encountered are also obstacles here (although he does a MUCH better job than I did in overcoming them, including dynamic distance between nodes, which I can't get away with, sad to say):
First is perspective (what I'd loosely call parallax): things appear larger when they are closer to the observer. What this means is that the node size CAN'T be significant in a 3d network unless the projection is set to orthographic / isometric, because perspective is going to screw with it. How can you tell if the node is actually larger, or if it's just closer?
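The size ambiguity is easy to show with numbers. Below is a minimal sketch assuming a simple pinhole camera model, where on-screen size scales as 1/depth, compared with an orthographic projection that ignores depth. The function names, focal length, and node values are illustrative assumptions.

```python
def apparent_radius_perspective(radius, depth, focal_length=1.0):
    """On-screen radius under a simple pinhole model: shrinks as 1/depth."""
    if depth <= 0:
        raise ValueError("depth must be positive (in front of the camera)")
    return focal_length * radius / depth

def apparent_radius_orthographic(radius, depth, scale=1.0):
    """On-screen radius under orthographic projection: depth is ignored."""
    return scale * radius

# (radius, depth) pairs: a small node close up and a large node far away.
small_near = (1.0, 2.0)
large_far = (2.0, 4.0)

# Under perspective both draw at the same size, so size no longer encodes data.
print(apparent_radius_perspective(*small_near), apparent_radius_perspective(*large_far))

# Under orthographic projection the real 2x difference is preserved.
print(apparent_radius_orthographic(*small_near), apparent_radius_orthographic(*large_far))
```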
Second interesting thing about 3d networks is how the (Levenshtein or whatever parm) distance resolves in 3d space, and how that's going to be legible given that we don't have a fourth dimension to stick a camera in. On a 2d surface, the distance-driven force resolves in a 2d vector, so that looking down on it from above, no matter where the force vectors go, all the nodes will be theoretically visible. If you just plot plain distance as a force into 3 dimensions, just using geodesic or straightest line distance, the most tightly gathered nodes will disappear, i.e., be completely occluded. You won't see them!
One possible resolution for this problem, I've found, is classification of distance and assigning this class / category to a specific axis. For example, X axis can be time, Y axis can be a single vector (like, say, military / civilian adoption of a particular dual use part number, expressed as n), and Z axis can represent actual "real" distance (based on tokens, references, "where used", and whatever other factors, either all of them or some of them). This gives you structure where dimensions in the data viz are immediately significant, and simple isometric distance doesn't pile all the nodes in front of each other because they share the same space as the audience.
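As a rough sketch of that axis assignment, assuming each record already carries a year, a single adoption ratio, and a precomputed "real" distance (all field names and values here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    year: int                   # hypothetical field, mapped to the X axis (time)
    military_share: float       # hypothetical 0..1 ratio, mapped to the Y axis
    similarity_distance: float  # hypothetical precomputed distance, mapped to Z

def to_position(part, year_origin=2000):
    """Give each axis one meaning instead of letting a force layout use all three."""
    x = float(part.year - year_origin)
    y = part.military_share
    z = part.similarity_distance
    return (x, y, z)

parts = [
    Part("bolt-A", 2005, 0.25, 1.5),
    Part("bolt-B", 2012, 0.80, 0.3),
]
for part in parts:
    print(part.name, to_position(part))
```

Because each coordinate is read straight off one attribute, two nodes can only overlap if they agree on all three, which is a much rarer event than tight clusters collapsing in a free 3d force layout.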
The takeaway here is that a 3d graph can't just use the same parameters as a 2d graph. The data has to be summarized differently so that the graph remains meaningful. Nothing WRONG with just dumping distance into straight 3d distance, but from the perspective of visual storytelling, it's not optimal.
Also, use isometric cameras. Sure, it's very pretty to have a camera swoop and dive, but it's not going to tell the data's story as well as an isometric camera. (Yes I know I am misusing "isometric" here, but it's the word most people recognize).
If you're using it for actual market analysis, then yes. I completely agree that the 3rd dimension makes it more complicated! But there's a 3D/2D toggle in the top left corner of the Overview page if you're opening it from the desktop.
Would be nice to be able to use custom metrics or scripts or indicators for the various axes/features and then show a movie of the display over time. Do the stocks that take off appear randomly or do they exhibit similar behavior?
I suppose an unavoidable effect of the dimensionality reduction process is that clusters of correlations across dozens of dimensions are by nature hard to describe.
This actually seems like a potentially great application for LLMs – generating a semantic description of an n-dimensional correlation.
Thanks @let_var! On the front end, it's a Next.js app with three.js / D3 visuals. The map rendering and the UI were done almost from scratch (well, on top of the mentioned libraries).
And on the backend it's a simple Node.js Express server. Data-wise, I can't share the exact APIs I'm using but they're easily searchable.
this is just UMAP on some arbitrary set of financial metrics. outside of looking cool, there are _a lot_ of limitations with trying to interpret this kind of output.
what it does is place points that are close in cosine distance near each other (in a stochastic, globally-on-average sort of way). a lot of the clusters and other formations are artifacts of the graph layout method more so than anything else. this has been extensively studied.
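for reference, a minimal sketch of cosine distance in plain Python. whether the site actually uses this metric is an assumption taken from the comment above, and the vectors are made up. the thing to notice is that it only compares direction, so two companies with the same metric profile at very different scales come out as identical.

```python
from math import sqrt

def cosine_distance(u, v):
    """1 - cosine similarity: 0 means same direction, 2 means opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

# Made-up metric vectors (assumption, not real company data).
a = (1.0, 2.0, 3.0)
b = (10.0, 20.0, 30.0)  # same profile as a, ten times the magnitude
c = (3.0, -1.0, 0.5)    # a different profile

print(cosine_distance(a, b))  # very close to 0: magnitude is ignored
print(cosine_distance(a, c))  # noticeably larger
```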
Sorry to place this here. I've been working on some stock market data visualization over the past couple days -- not as pretty but it makes the point -- and would love some feedback.
Analyzing the Direct Correlation between Federal Reserve's Reverse Repo Operations and S&P 500 Stock Prices
[0]: https://finviz.com/map.ashx