I've just published v1.0.1. It's about 2x faster, and should have no other observable changes. The speedup is mainly from avoiding allocation and string slicing as much as possible, plus an internal refactor to bind the parser and tokenizer more tightly together.
Previously the parser would get back an array of tokens each time it pushed data into the tokenizer. This was easy to write, but it meant allocating a token object for every token. Now the tokenizer has a reference to the parser and calls token-specific methods on it directly. Since most tokens carry no data, this saves us from jumping all over the heap so much. If we were parsing a more complicated language this might become a huge pain in the butt, but JSON is simple enough, and the test suite is exhaustive enough, that we can afford a little nightmare spaghetti if it improves speed.
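The shape of that refactor can be sketched roughly like this (all names here are illustrative, not jsonriver's actual internals): the tokenizer takes a sink and dispatches each token as a direct method call, so dataless tokens like `[`, `]`, `true`, and `null` cost nothing to pass along.

```typescript
// Illustrative sketch of the pattern described above: the tokenizer
// holds a reference to the parser and calls token-specific methods
// on it instead of allocating token objects. Handles just a few
// token kinds; strings and numbers are omitted for brevity.

interface TokenSink {
  onArrayStart(): void;
  onArrayEnd(): void;
  onTrue(): void;
  onNull(): void;
}

class Parser implements TokenSink {
  events: string[] = [];
  onArrayStart() { this.events.push('['); }
  onArrayEnd() { this.events.push(']'); }
  onTrue() { this.events.push('true'); }
  onNull() { this.events.push('null'); }
}

class Tokenizer {
  constructor(private sink: TokenSink) {}
  push(chunk: string) {
    for (let i = 0; i < chunk.length; i++) {
      const c = chunk[i];
      // Dataless tokens become direct method calls: no allocation,
      // no intermediate token array.
      if (c === '[') this.sink.onArrayStart();
      else if (c === ']') this.sink.onArrayEnd();
      else if (c === 't') { this.sink.onTrue(); i += 3; }  // 'true'
      else if (c === 'n') { this.sink.onNull(); i += 3; }  // 'null'
      // commas and whitespace are skipped
    }
  }
}

const parser = new Parser();
new Tokenizer(parser).push('[true,null]');
// parser.events is now ['[', 'true', 'null', ']']
```

The coupling goes both ways, which is the "spaghetti" part: the tokenizer can't be reused without something shaped like the parser on the other end.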
I want to ditch stream-json so hard (it needs polyfills in the browser and is cumbersome to use), but I need only one feature: invoking a callback by path (e.g. for `user.posts`, invoke once per post in the array), and only for complete objects. Is this something that jsonriver can support?
jsonriver's invariants do give you enough info to notice which values are and aren't complete. They also mean that you can mutate the objects and arrays it returns to drop data that you don't care about.
There might be room for some helper functions in something like a 'jsonriver/helpers.js' module. I'll poke around at it.