Hacker News

If you're building a UI that renders output from a streaming LLM, you might get back something that looks like this:

  {"role": "assistant", "text": "Here's that Python code you aske
Incremental parsing that yields incomplete strings is still useful: it lets you render the partial text to your end user while it's still streaming in.
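As a rough illustration of the idea (this is a hand-rolled sketch, not any particular library's API): one naive way to get a usable object out of a truncated payload like the one above is to close the dangling string and any unbalanced braces before parsing. It only handles the simple flat-object case shown here.

```typescript
// Naively "repair" a truncated JSON object by closing an open string
// literal and any unbalanced braces, then parse it. Sketch only: it does
// not handle nested arrays, escapes inside keys, etc.
function parsePartial(chunk: string): unknown {
  let repaired = chunk;
  // An odd number of unescaped quotes means a string literal is still open.
  const quotes = (repaired.match(/(?<!\\)"/g) ?? []).length;
  if (quotes % 2 === 1) repaired += '"';
  // Close any braces that were opened but not yet closed.
  const open = (repaired.match(/{/g) ?? []).length;
  const close = (repaired.match(/}/g) ?? []).length;
  repaired += "}".repeat(Math.max(0, open - close));
  return JSON.parse(repaired);
}

const partial =
  '{"role": "assistant", "text": "Here\'s that Python code you aske';
// parsePartial(partial) →
//   { role: "assistant", text: "Here's that Python code you aske" }
```

Real streaming parsers keep state across chunks instead of re-repairing the whole buffer, but the observable behavior is the same: you get back the value's prefix so far.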


In this example the value is incomplete, not the key.


Incomplete strings could be fun in certain cases:

{"cleanup_cmd":"rm -rf /home/foo/.tmp" }


If any part of that value actually made it, unchecked, to execution, then you have bigger problems than partial JSON keys/values.
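To make the hazard in the comment above concrete: truncating that payload mid-string yields a prefix that is itself a well-formed, and much broader, command. A lenient partial parser would happily hand it to the UI.

```typescript
// Truncating the streamed payload mid-value produces a dangerous prefix.
const full = '{"cleanup_cmd":"rm -rf /home/foo/.tmp" }';
const truncated = full.slice(0, 32); // '{"cleanup_cmd":"rm -rf /home/foo'

// What a lenient partial parser would report as the value so far:
const partialValue = truncated.slice('{"cleanup_cmd":"'.length);
// partialValue is "rm -rf /home/foo" — executing it would delete far more
// than the intended .tmp directory.
```

Which is exactly the point: partial values are fine to display, never fine to execute.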


Incremental JSON parsing is key for LLM apps, but safe progressive UIs also need to track incompleteness and per-chunk diffs. LangDiff [1] would help with that.

[1]: https://github.com/globalaiplatform/langdiff/tree/main/ts


Why not just chunk the JSON packets instead?


Just because you have access to the incomplete value doesn't mean you should treat it like the complete one...


Yeah, another fun one is string enums. You could treat "DeleteIfEmpty" as "Delete".
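The enum hazard is easy to demonstrate: while an enum value streams in, it passes through prefixes that may themselves be valid members of the enum. (The `Action` type and member names below are hypothetical, chosen to match the comment.)

```typescript
// A streamed enum value can pass through states that are themselves valid
// enum members. "DeleteIfEmpty" streams through the prefix "Delete".
const VALID: ReadonlySet<string> = new Set(["Delete", "DeleteIfEmpty", "Keep"]);

const target = "DeleteIfEmpty";
// Every prefix the consumer would observe while the value streams in:
const prefixes = [...target].map((_, i) => target.slice(0, i + 1));
const dangerous = prefixes.filter((p) => VALID.has(p) && p !== target);
// dangerous → ["Delete"] — a naive consumer acting mid-stream would
// dispatch the wrong (and more destructive) action.
```

The usual fix is the same as for strings: only compare or dispatch on enum values once the field is known to be complete.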


I imagine that if you reason about incomplete strings as a sort of "unparsed data", something you might store, transport, or render raw (like a string version of printing response.data instead of response.json()) but not act on (compare, concat, etc.), it's a reasonably safe model?

I’m imagining it in my mental model as being typed “unknown”. Anything that prevents accidental use as if it were a whole string… I imagine a more complex type with an “isComplete” flag of sorts would be more powerful but a bit of a blunderbuss.
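One way to sketch that "prevent accidental use as a whole string" idea in TypeScript is a branded wrapper type: the raw text is reachable only through an explicit property access, so it can't be passed where a plain `string` is expected. All names here (`PartialText`, `partialText`, `render`) are hypothetical.

```typescript
// A branded type that holds streamed-but-incomplete text. Because it is
// not assignable to `string`, it can't be accidentally compared,
// concatenated, or dispatched on as if it were complete.
type PartialText = { readonly raw: string; readonly __partial: true };

function partialText(raw: string): PartialText {
  return { raw, __partial: true };
}

// Rendering is an explicit, deliberate unwrap:
function render(p: PartialText): string {
  return p.raw;
}

const chunk = partialText("Here's that Python code you aske");
// render(chunk) → "Here's that Python code you aske"
// But `chunk === "Delete"` or `chunk + "!"` are type errors, which is
// exactly the accidental use the comment wants to rule out.
```

Compared with an `isComplete` flag, the brand pushes the check to compile time: code that wants a whole `string` simply can't accept a `PartialText` without an explicit unwrap.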




