
This is addressed in the "information asymmetry" section of the article.

What do you believe needs improving and why?

I think the ambiguity was deliberate.

And very, very clever.

The table is a bit misleading. Most of the resources of a website are loaded concurrently and are not on the critical path of the "first contentful paint", so latency does not compound as quickly as the table implies. For web apps, much of the end-to-end latency hides lower in the networking stack. Here's the worst-case latency for a modern Chrome browser performing a cold load of an SPA website:

DNS-over-HTTPS-over-QUIC resolution: 2 RTTs

TCP handshake: 1 RTT

TLS v1.2 handshake: 2 RTTs

HTTP request/response (HTML): 1 RTT

HTTP request/response (bundled JS that actually renders the content): 1 RTT

That's 7 round trips. If your connection crosses a continent, that's easily a 1-2 second time-to-first-byte for the content you actually care about. And no amount of bandwidth will decrease that, since the bottlenecks are the speed of light and router hop latencies. Weak 4G/WiFi signal and/or network congestion will worsen that latency even further.
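A rough back-of-envelope check of that claim (hypothetical numbers: ~150 ms per transcontinental round trip, counts taken from the breakdown above):

    fn main() {
        // Hypothetical per-round-trip latency for a transcontinental path.
        let rtt_ms = 150.0;

        // (phase, round trips) taken from the breakdown above.
        let phases = [
            ("DoH-over-QUIC resolution", 2.0),
            ("TCP handshake", 1.0),
            ("TLS 1.2 handshake", 2.0),
            ("HTTP request/response (HTML)", 1.0),
            ("HTTP request/response (JS bundle)", 1.0),
        ];

        let total_rtts: f64 = phases.iter().map(|&(_, n)| n).sum();
        // 7 round trips * 150 ms = 1050 ms before the content you care about,
        // and that's before any server processing time or congestion.
        println!("{} RTTs -> ~{} ms", total_rtts, total_rtts * rtt_ms);
    }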


For context, this article was written when 95%+ of websites used HTTP/1.1 (and <50% used HTTPS).

The reason a CDN is so effective at improving the perceived performance of a web site is that it reduces the length (and hence the speed-of-light delay) of those first 7 round trips by moving the static parts of the web app (HTML+JS) to the "edge", which is just a bunch of cache boxes scattered around the world.

The user no longer has to connect to the central app server; they can connect to their nearest edge cache box instead, which is probably a lot closer to them (1-10 ms is typical).

Note that stateful API calls will still need to go back to the central app server, potentially an intercontinental hop.


Indeed, at some point, you can't lower tail latencies any further without moving closer to your users. But of the 7 round trips that I mentioned above, you have control over 3 of them: 2 round trips can be eliminated by supporting HTTP/3 over QUIC (and adding HTTPS DNS records to your zone file), and 1 round trip can be eliminated by server-side rendering. That's a 40-50% reduction before you even need to consider a CDN setup, and depending on your business requirements, it may very well be enough.

Yes? Funnily enough, I don't often use indexed access in Rust. Either I'm looping over elements of a data structure (in which case I use iterators), or I'm using an untrusted index value (in which case I explicitly handle the error case). In the rare case where I'm using an index value that I can guarantee is never invalid (e.g. graph traversal where the indices are never exposed outside the scope of the traversal), then I create a safe wrapper around the unsafe access and document the invariant.
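A minimal sketch of that last pattern (names are hypothetical; the point is that the invariant is stated right where the unsafe access lives):

    /// Invariant: every index stored in `edges` was validated against
    /// `nodes.len()` in `add_edge`, and `nodes` never shrinks.
    struct Graph {
        nodes: Vec<&'static str>, // node payloads
        edges: Vec<Vec<usize>>,   // edges[i] = neighbors of node i
    }

    impl Graph {
        fn add_node(&mut self, label: &'static str) -> usize {
            self.nodes.push(label);
            self.edges.push(Vec::new());
            self.nodes.len() - 1
        }

        /// Public API: indices come from the caller, so check them here.
        fn add_edge(&mut self, from: usize, to: usize) {
            assert!(from < self.nodes.len() && to < self.nodes.len());
            self.edges[from].push(to);
        }

        /// Calls `f` on each neighbor's payload; the indices never escape.
        fn for_each_neighbor(&self, start: usize, mut f: impl FnMut(&str)) {
            assert!(start < self.nodes.len(), "start must be a valid node id");
            for &n in &self.edges[start] {
                // SAFETY: `n` was validated in `add_edge`, and `nodes`
                // never shrinks (invariant documented on the struct).
                f(*unsafe { self.nodes.get_unchecked(n) });
            }
        }
    }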

If that's the case then hats off. What you're describing is definitely not what I've seen in practice. In fact, I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access. Even security-critical cryptography crates that passed audits don't do that. Personally, I found it quite hard to avoid indexing for graph-heavy code, so I'm always on the lookout for interesting ways to enforce access safety. If you have some code to share that would be very interesting.

My rule of thumb is that unchecked access is okay in scenarios where both the array/map and the indices/keys are private implementation details of a function or struct, since an invariant is easy to manually verify when it is tightly scoped as such. I've seen it used in (the binary-search case is sketched after this list):

* Graph/tree traversal functions that take a visitor function as a parameter

* Binary search on sorted arrays

* Binary heap operations

* Probing buckets in open-addressed hash tables
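For example, here's a rough sketch of the binary-search case (hypothetical code, not from any particular crate) - every index is derived from the slice length inside the function, so the invariant never leaks out:

    /// Returns the index of `target` in `sorted`, if present.
    fn binary_search(sorted: &[i64], target: i64) -> Option<usize> {
        let (mut lo, mut hi) = (0usize, sorted.len());
        while lo < hi {
            let mid = lo + (hi - lo) / 2; // lo <= mid < hi <= sorted.len()
            // SAFETY: mid < sorted.len() by the loop invariant above.
            let v = *unsafe { sorted.get_unchecked(mid) };
            match v.cmp(&target) {
                std::cmp::Ordering::Less => lo = mid + 1,
                std::cmp::Ordering::Greater => hi = mid,
                std::cmp::Ordering::Equal => return Some(mid),
            }
        }
        None
    }

Whether the unsafe actually buys anything here is a question for the profiler, since the optimizer can often elide the bounds check in a loop like this anyway.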


> I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access.

The smoltcp crate typically uses runtime checks to ensure slice accesses made by the library do not cause a panic. It's not exactly equivalent to GP's assertion, since it doesn't cover "every single slice access", but it at least covers slice accesses triggered by the library's public API. (i.e. none of the public API functions should cause a panic, assuming that the runtime validation after the most recent mutation succeeds).

Example: https://docs.rs/smoltcp/latest/src/smoltcp/wire/ipv4.rs.html...
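The shape of the pattern, as I understand it (a simplified illustration, not the actual smoltcp code - the header length and field offsets are made up):

    /// A packet view over a caller-supplied buffer. `check_len` is the one
    /// place where bounds are validated; accessors after that index freely.
    struct PacketView<T: AsRef<[u8]>> {
        buffer: T,
    }

    const HEADER_LEN: usize = 20; // hypothetical fixed header size

    impl<T: AsRef<[u8]>> PacketView<T> {
        /// Runtime validation: return an error instead of panicking later.
        fn new_checked(buffer: T) -> Result<Self, &'static str> {
            let packet = PacketView { buffer };
            packet.check_len()?;
            Ok(packet)
        }

        fn check_len(&self) -> Result<(), &'static str> {
            if self.buffer.as_ref().len() < HEADER_LEN {
                Err("buffer too short for header")
            } else {
                Ok(())
            }
        }

        /// Safe to index: `new_checked` guaranteed at least HEADER_LEN bytes,
        /// as long as the buffer hasn't been shrunk since.
        fn dst_port(&self) -> u16 {
            let b = self.buffer.as_ref();
            u16::from_be_bytes([b[2], b[3]]) // hypothetical field offset
        }
    }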


I think this goes against Rust's goals in terms of performance. Good for safe code, of course, but Rust users usually prefer compile-time safety that makes runtime safety checks unnecessary.

> graph-heavy code

Could you share some more details, maybe one fully concrete scenario? There are lots of techniques, but there's no one-size-fits-all solution.


Sure, these days I'm mostly working on a few compilers. Let's say I want to make a fixed-size SSA IR. Each instruction has an opcode and two operands (which are essentially pointers to other instructions). The IR is populated in one phase, and then lowered in the next. During lowering I run a few peephole and code motion optimizations on the IR, and then do regalloc + asm codegen. During that pass the IR is mutated and indices are invalidated/updated. The important thing is that this phase is extremely performance-critical.
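Roughly this shape, to make it concrete (a stripped-down sketch, names hypothetical):

    /// Newtype index into the instruction arena; u32 keeps the struct compact.
    #[derive(Copy, Clone, PartialEq, Eq, Debug)]
    struct InstId(u32);

    #[derive(Copy, Clone, Debug)]
    enum Opcode {
        Const(i32),
        Add,
        Mul,
        Ret,
    }

    /// Fixed-size instruction: an opcode plus two operands that refer to
    /// other instructions by index. Unused operands are simply ignored.
    #[derive(Copy, Clone, Debug)]
    struct Inst {
        op: Opcode,
        lhs: InstId,
        rhs: InstId,
    }

    struct Function {
        insts: Vec<Inst>,
    }

    impl Function {
        fn push(&mut self, op: Opcode, lhs: InstId, rhs: InstId) -> InstId {
            let id = InstId(self.insts.len() as u32);
            self.insts.push(Inst { op, lhs, rhs });
            id
        }

        fn get(&self, id: InstId) -> &Inst {
            // Checked access here; the perf-critical lowering/regalloc pass
            // is where the temptation to go unchecked shows up, because
            // node removal invalidates and remaps these indices.
            &self.insts[id.0 as usize]
        }
    }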

And it's fine for a compiler to panic when it violates an assumption. Not so with the Cloudflare code under discussion.

Idiomatic Rust would have been to return a Result<> to the caller, not to surprise them with a panic.

The developer was lazy.

A lot of Rust developers are: https://github.com/search?q=unwrap%28%29+language%3ARust&typ...


One common "trick" is phantom typing. You create a type representing indices and have a small, well-audited portion of unsafe code handling their creation/unpacking, while the rest of the code is completely safe.

The details depend a lot on what you're doing and how you're doing it. Does the graph grow? Shrink? Do you have more than one? Do you care about programmer error types other than panic/UB?

Suppose, e.g., that your graph doesn't change sizes, you only have one, and you only care about panics/UB. Then you can get away with the following (sketched in code after the list):

1. A dedicated index type, unique to that graph (shadow / strong-typedef / wrap / whatever), corresponding to whichever index type you're natively using to index nodes.

2. Some mechanism for generating such indices. E.g., during graph population phase you have a method which returns the next custom index or None if none exist. You generated the IR with those custom indexes, so you know (assuming that one critical function is correct) that they're able to appropriately index anywhere in your graph.

3. You have some unsafe code somewhere which blindly trusts those indices when you start actually indexing into your array(s) of node information. However, since the very existence of such an index is proof that you're allowed to access the data, that access is safe.
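A rough sketch of 1-3 for the single, fixed-size graph case (hypothetical code; the module boundary is what keeps index creation small and auditable):

    mod graph {
        /// 1. Dedicated index type. The field is private, so the only way to
        ///    obtain one outside this module is via `Graph::node_ids`.
        #[derive(Copy, Clone)]
        pub struct NodeId(usize);

        pub struct Graph {
            payloads: Vec<u64>,
        }

        impl Graph {
            pub fn new(payloads: Vec<u64>) -> Self {
                Graph { payloads }
            }

            /// 2. The only source of NodeIds: every id handed out is in
            ///    bounds, and the graph never changes size afterwards.
            pub fn node_ids(&self) -> impl Iterator<Item = NodeId> {
                (0..self.payloads.len()).map(NodeId)
            }

            /// 3. The existence of a NodeId is proof the access is valid.
            pub fn payload(&self, id: NodeId) -> u64 {
                // SAFETY: NodeIds are only created in `node_ids`, which only
                // yields indices < payloads.len(), and payloads never shrink.
                *unsafe { self.payloads.get_unchecked(id.0) }
            }
        }
    }

The obvious caveat is that nothing stops a NodeId from one Graph being used on a different Graph - that's exactly the "only one graph" limitation, and lifetime branding (as in GhostCell, below) is how you lift it.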

Techniques vary from language to language and depend on your exact goals. GhostCell [0] in Rust is one way of relegating literally all of the unsafe code to a well-vetted library, and it uses tagged types (via lifetimes), so you can also do away with the "only one graph" limitation. It's been a while since I've looked at it, but resizes might also be safe pretty trivially (or might not be).

The general principle though is to structure your problem in such a way that a very small amount of code (so that you can more easily prove it correct) can provide promises that are enforceable purely via the type system (so that if the critical code is correct then so is everything else).

That's trivial by itself (e.g., just rely on option-returning .get operators), so the rest of the trick is to find a cheap place in your code which can provide stronger guarantees. For many problems, initialization is the perfect place: you can bounds-check on init and then not worry about it again. And if even bounds-checking on initialization is too slow, you can still use that opportunity to write out a proof of why some invariant holds and then blindly/unsafely assert it to be true - but you then immediately pack that hard-won information into a dedicated type, so that the only place you ever have to think about it is initialization.

[0] https://plv.mpi-sws.org/rustbelt/ghostcell/


I do use a combination of newtyped indices + singleton arenas for data structures that only grow (like the AST). But for the IR, being able to remove nodes from the graph is very important. So phantom typing wouldn't work in that case.

I realize that this is meant as an exercise to demonstrate a property of variance. But most investors are risk-averse when it comes to their portfolio - for the example given, a more practical target to optimize would be the worst-case or near-worst-case return (e.g. p99). For calculating that, a summary measure like variance or mean does not suffice - you need the full distribution of the RoR of assets A and B, and then find the value of t that optimizes the p99 of tA + (1-t)B.
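A rough sketch of what that looks like with empirical return samples (everything here is hypothetical - a coarse grid search over t, using the 1st-percentile return of the blend as the "p99 bad outcome"):

    /// Given paired samples of the rates of return of assets A and B,
    /// find the weight t in [0, 1] that maximizes the near-worst-case
    /// (1st percentile) return of the blend t*A + (1-t)*B.
    fn best_weight(a: &[f64], b: &[f64]) -> (f64, f64) {
        assert_eq!(a.len(), b.len());
        let mut best = (0.0, f64::NEG_INFINITY);
        // Coarse grid search; fine for a one-dimensional decision variable.
        for step in 0..=100 {
            let t = step as f64 / 100.0;
            let mut blended: Vec<f64> = a
                .iter()
                .zip(b)
                .map(|(ra, rb)| t * ra + (1.0 - t) * rb)
                .collect();
            blended.sort_by(|x, y| x.total_cmp(y));
            // Empirical 1st percentile: the near-worst-case return.
            let p01 = blended[blended.len() / 100];
            if p01 > best.1 {
                best = (t, p01);
            }
        }
        best // (weight t, near-worst-case return at that weight)
    }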

It's hard enough to get a reliable variance-covariance estimate.

They are absolutely aware of these sorts of abuses. I'll bet my spleen that it shows up as a line item in the roadmapping docs of their content integrity/T&S teams.

The root problem is twofold: the inability to reliably automate distinguishing a "good actor" from a "bad actor", and a lack of will to throw serious resources at solving the problem via manual, high-precision moderation.


This is often parroted, but the reasoning is flawed. The vast majority of the platform's growth will come from new users, who are entering the dating scene. If they fail to capture that audience (say, by having a reputation of not performing as advertised), then no amount of upsells or string-alongs of existing users will sustain them, as their user base will only ever decrease, and investors will see that and withdraw accordingly.


Everything about this is wrong.

1) The platforms aren't growing that impressively. Most of their users have been on the platform for a while, were previous users, etc.

2) It doesn't matter how good the app is; you need a network effect. New users are going to go to where the potential dates are.

3) Marketing does wonders. An app can suck and have great marketing. It will get users over an app that actually works and doesn't have good marketing.

4) Lots of people on dating apps are looking for dates (hookups), not partners. If the apps can keep you getting dates, not partners, they can keep you on the app and happy.


"If the apps can keep you getting dates, not partners, they can keep you on the app and happy."

I know this sounds judgemental but I'm not convinced the people going on lots of dates are "Happy" even if they're being successful in dating and hookups.


Happy in terms of being a customer of the app perhaps.


Match's growth peaked a long time ago. The site is now trying to grow by "offering new products" and "cutting operational costs."

The relative newcomers - Bumble and Hinge - grew by trying to offer a better experience, especially for women, who are traditionally overwhelmed with unreciprocated interest on conventional apps. Both seem to have admitted defeat now and moved to the usual model.

In terms of revenue, the incentive to keep millions of users spending is far higher than the nominal gains from persuading friends of a successful couple to join up. Given that most users aren't successful, that network effect is tiny.

There's an opposing network effect of keeping customers unmatched, because this provides gossip and entertainment among friends, which gives them a reason to continue using a service.

We know that string-alongs are a real thing on dating sites - especially, but not exclusively, for men.

There's also a small but not negligible subculture of (mostly) women who use dates for free meals and get a good return on their monthly subscription.

And a lot of sites - not just Tinder - overlap hook-up culture with people seeking marriage and kids. If anything the former is a more popular option now.


FWIW, Hinge has been owned by Match for some years now. Bumble is still independent, but their stock is down ~92% over 5 years. I think they will eventually be bought out by Match.


Wrong claim: "There's an opposing network effect of keeping customers unmatched, because this provides gossip and entertainment among friends, which gives them a reason to continue using a service."

No, this implies they are doing it intentionally - which they don't even need to! They can keep the users because matching through an app does not work in 99.99% of real cases. So if you treat them well, they will stay anyway, unless your product is shitty.


I don't think this counterargument holds. It's a hell of a lot easier to get a customer who already paid once to pay a second time than it is to get a customer to pay for the first time. Also, I think most people are well aware that by and large, dating apps have a very low success rate for the majority of their users. People use them anyway.


Since your marginal costs per customer are veeeery low, you can hammer them with a 50-60% discount, which on most platforms ends up at 10-20 bucks per month, and if you get 3-4 dinners out of it, you get much more value than an evening at the cinema.


And to add to that: seeing a real-world friend go on dates or start a relationship because of an app is better than any marketing you could ever buy.

If you want to drive top-of-the-funnel growth, make the product good even if it causes some folks to drop out once they’re in a relationship.


They don't need that. What option do most young people have?

Most young men can't approach women, most young women can't handle being approached, and we don't have shared spaces where people can get to know each other and pair off anymore. Young people think the apps are dumpster fires; they hate them, but the alternative is sadly worse.


> The vast majority of the platform's growth will come from new users...

Userbase expansion is new users less leaving users for a time period. So there are two factors, not just "new users."

In any case, Match Group apps are well into the phase of focusing on extracting the most money possible from their paying users as opposed to gaining new users.

After all, infinite users are useless to a company, even if it costs nothing to support them, if none of them pay.


This only applies in perfectly elastic systems, where the bodies can convert kinetic energy to potential energy and back with perfect restitution. Which, thanks to the second law of thermodynamics, doesn't exist in reality. It's only a question of how much energy is lost. (Unless, of course, you include the medium into which the energy dissipates as heat into the system itself. But such a model is not useful in almost all practical scenarios.)
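To make that concrete, a small sketch (1D collision with a coefficient of restitution e; e = 1 is the perfectly elastic ideal, anything less loses kinetic energy to the surroundings):

    /// Final velocities of two bodies after a 1D collision with
    /// coefficient of restitution `e` (0 = perfectly inelastic, 1 = elastic).
    fn collide(m1: f64, v1: f64, m2: f64, v2: f64, e: f64) -> (f64, f64) {
        let total_m = m1 + m2;
        let momentum = m1 * v1 + m2 * v2; // conserved regardless of e
        let v1_after = (momentum + m2 * e * (v2 - v1)) / total_m;
        let v2_after = (momentum + m1 * e * (v1 - v2)) / total_m;
        (v1_after, v2_after)
    }

    fn kinetic_energy(m1: f64, v1: f64, m2: f64, v2: f64) -> f64 {
        0.5 * m1 * v1 * v1 + 0.5 * m2 * v2 * v2
    }

    fn main() {
        for e in [1.0, 0.9, 0.5] {
            let (v1, v2) = collide(1.0, 5.0, 1.0, 0.0, e);
            let lost = kinetic_energy(1.0, 5.0, 1.0, 0.0)
                - kinetic_energy(1.0, v1, 1.0, v2);
            // Only e = 1.0 conserves kinetic energy; real materials have e < 1.
            println!("e = {e}: v1' = {v1:.2}, v2' = {v2:.2}, KE lost = {lost:.2}");
        }
    }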


The ELF loading logic in the Linux kernel is intentionally very simple, so it's more like a bare-bones subset of what the dynamic linker handles. matheusmoreira summarizes it well in a previous discussion [0]:

> Yeah it turns out the kernel doesn't care about sections at all. It only ever cares about the PT_LOAD segments in the program header table, which is essentially a table of arguments for the mmap system call. Sections are just dynamic linker metadata and are never covered by PT_LOAD segments.

The simplicity of the ELF loader in Linux can be exploited to make extremely small executables [1], since most of the data in the ELF header is stuff that the kernel doesn't care about.

[0] https://news.ycombinator.com/item?id=45706380#45709203

[1] https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...
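To illustrate how little the kernel actually looks at, here's a rough sketch that pulls just the PT_LOAD entries out of an ELF64 program header table (little-endian only, no external crates, minimal error handling):

    use std::{env, fs};

    // Offsets from the ELF64 layout (little-endian assumed).
    const E_PHOFF: usize = 0x20;     // program header table offset (u64)
    const E_PHENTSIZE: usize = 0x36; // size of one program header entry (u16)
    const E_PHNUM: usize = 0x38;     // number of entries (u16)
    const PT_LOAD: u32 = 1;

    fn u16_at(b: &[u8], o: usize) -> u16 { u16::from_le_bytes([b[o], b[o + 1]]) }
    fn u32_at(b: &[u8], o: usize) -> u32 { u32::from_le_bytes(b[o..o + 4].try_into().unwrap()) }
    fn u64_at(b: &[u8], o: usize) -> u64 { u64::from_le_bytes(b[o..o + 8].try_into().unwrap()) }

    fn main() {
        let path = env::args().nth(1).unwrap_or_else(|| "/proc/self/exe".into());
        let elf = fs::read(&path).expect("read ELF file");

        let phoff = u64_at(&elf, E_PHOFF) as usize;
        let phentsize = u16_at(&elf, E_PHENTSIZE) as usize;
        let phnum = u16_at(&elf, E_PHNUM) as usize;

        for i in 0..phnum {
            let ph = &elf[phoff + i * phentsize..phoff + (i + 1) * phentsize];
            if u32_at(ph, 0) != PT_LOAD {
                continue; // the kernel ignores everything except PT_LOAD
            }
            // These fields are, in effect, the arguments the kernel feeds to mmap.
            println!(
                "PT_LOAD: file offset {:#x}, vaddr {:#x}, filesz {:#x}, memsz {:#x}, flags {:#x}",
                u64_at(ph, 0x08), // p_offset
                u64_at(ph, 0x10), // p_vaddr
                u64_at(ph, 0x20), // p_filesz
                u64_at(ph, 0x28), // p_memsz
                u32_at(ph, 0x04), // p_flags
            );
        }
    }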


Yep, good points. FWIW I do share roughly the same sentiment despite how I worded that last part of my post.

