You have to provide precise instructions to LLMs to get anything useful.
The instructions for operating the "Holy Hand Grenade of Antioch" from Monty Python and the Holy Grail are a good example:
"First shalt thou take out the Holy Pin. Then shalt thou count to three, no more, no less. Three shall be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, neither count thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then lobbest thou thy Holy Hand Grenade of Antioch towards thy foe, who, being naughty in My sight, shall snuff it."
I see that the handful of Ellipsis "buddies" here are upvoting your post. :)
We at CodeRabbit wouldn't usually comment here to spoil your "moment," but this reply is wrong on so many levels. The fact is that CR is much further along in both traction (several hundred paying customers and thousands of GitHub app installs) and product quality. Most of the CR clones, including Ellipsis, are just copying the CR UX (and its OSS prompts) at this point. The chat feature in CR is also pretty advanced: it even comes with a sandbox environment for executing AI-generated shell commands that help it deep-dive into the codebase.
Again, I am sorry that we had to push back on this reply; we usually don't respond to competitors - but this statement was plain wrong, so we had to flag it.
Anyone looking to build a practical solution involving weighted-fair queueing for request prioritization and load shedding should check out Aperture: https://github.com/fluxninja/aperture
The overload problem is quite common in generative AI apps and calls for a sophisticated approach. Even when using external models (e.g., OpenAI's), developers have to deal with overloads in the form of rate limits imposed by those providers. Here is a blog post on how Aperture helps manage OpenAI gpt-4 overload with WFQ scheduling: https://blog.fluxninja.com/blog/coderabbit-openai-rate-limit...
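To make the WFQ idea concrete, here is a minimal sketch in Python. This is not Aperture's API, just a toy illustration: each request carries a token-cost estimate, each workload gets a weight, and the queue dequeues in order of virtual finish time, so heavier-weighted workloads drain proportionally faster while lower-priority work absorbs the delay (or gets shed). The workload names and cost numbers below are made up.

```python
import heapq
import itertools

class WeightedFairQueue:
    """Toy weighted-fair queue: higher-weight workloads are served
    proportionally more often; low-priority work absorbs the delays."""

    def __init__(self, weights):
        self.weights = weights                        # e.g. {"interactive": 4, "batch": 1}
        self.virtual_time = 0.0                       # global virtual clock
        self.last_finish = {w: 0.0 for w in weights}  # per-workload virtual finish time
        self.heap = []                                # (finish_time, seq, workload, request)
        self.seq = itertools.count()                  # tie-breaker for equal finish times

    def enqueue(self, workload, request, cost_tokens):
        # Virtual finish time grows more slowly for heavier weights,
        # so those requests are picked earlier for the same token cost.
        start = max(self.virtual_time, self.last_finish[workload])
        finish = start + cost_tokens / self.weights[workload]
        self.last_finish[workload] = finish
        heapq.heappush(self.heap, (finish, next(self.seq), workload, request))

    def dequeue(self):
        if not self.heap:
            return None
        finish, _, workload, request = heapq.heappop(self.heap)
        self.virtual_time = max(self.virtual_time, finish)
        return workload, request

# Hypothetical usage: favor interactive review requests over batch summarization.
q = WeightedFairQueue({"interactive": 4, "batch": 1})
q.enqueue("batch", "summarize repository", cost_tokens=2000)
q.enqueue("interactive", "answer reviewer question", cost_tokens=2000)
print(q.dequeue())  # ('interactive', 'answer reviewer question') despite arriving later
```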
Interesting use-case: We recently started using ast-grep at CodeRabbit[0] to review pull request changes with AI.
We use gpt-4 to generate ast-grep patterns to deep-dive into and verify pull-request integrity. We rolled this feature out just 3 days ago and are getting excellent results!
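For a sense of what this looks like in practice, here is a hedged sketch (not CodeRabbit's actual pipeline): the model proposes an ast-grep pattern for something it wants to verify, and we shell out to the ast-grep CLI with JSON output to collect matches. The helper `llm_generate_pattern` is a placeholder for a gpt-4 call, and the CLI flags/JSON field names are taken from recent ast-grep releases and may differ in yours.

```python
import json
import subprocess

def llm_generate_pattern(hypothesis: str) -> str:
    """Placeholder for a gpt-4 call that turns a review hypothesis
    (e.g. 'leftover debug logging') into an ast-grep pattern."""
    return "console.log($$$ARGS)"  # hard-coded here for illustration

def run_ast_grep(pattern: str, lang: str, paths: list) -> list:
    # --json makes the matches easy to feed back into the model as context.
    cmd = ["ast-grep", "run", "--pattern", pattern, "--lang", lang, "--json", *paths]
    out = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(out.stdout or "[]")

matches = run_ast_grep(llm_generate_pattern("leftover debug logging"), "ts", ["src/"])
for m in matches:
    # Field names follow ast-grep's JSON output; treat them as assumptions.
    print(m.get("file"), m.get("text", "")[:80])
```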
FluxNinja [0] founder here. I developed an in-house AI-based code review tool [1] that CodeRabbit is now commercializing [2].
I built it out of frustration with the time-consuming, manual code review process. We tried several techniques to improve velocity, e.g., stacked pull requests, but the AI tool helped the most.
In addition to code generation, our team has found the new AI code review tools quite useful. We use CodeRabbit, and we keep finding issues/improvements in every other PR.
We have been doing this at CodeRabbit[0] for incrementally reviewing PRs and for conversations in the context of code changes, which gives the impression that the bot has much more context than it actually does. It's one of several tricks we use to scale AI code review to even large PRs (100+ files).
For each commit, we summarize the diff for each file. Then we create a summary of summaries, which is incrementally updated as further commits land on the pull request. This summary of summaries is saved, hidden inside a comment on the pull request, and is used while reviewing each file and answering the user's queries.
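Here is a minimal sketch of that "summary of summaries" loop, assuming `summarize` stands in for an LLM call and the hidden-comment storage is reduced to a plain string; none of the names below are CodeRabbit's actual code.

```python
from typing import Dict

def summarize(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. a chat completion)."""
    return f"<summary of: {prompt[:60]}...>"

def update_overall_summary(previous_overall: str, commit_diffs: Dict[str, str]) -> str:
    # 1. Summarize the diff of each file touched by the new commit.
    per_file = {path: summarize(diff) for path, diff in commit_diffs.items()}
    # 2. Fold those per-file summaries into the running summary of summaries.
    combined = previous_overall + "\n" + "\n".join(
        f"{path}: {s}" for path, s in per_file.items())
    return summarize("Update the PR summary given these changes:\n" + combined)

# The result would be stashed in a hidden PR comment and passed as context
# when reviewing each file or answering a user's question.
overall = ""
overall = update_overall_summary(overall, {"src/app.ts": "@@ -1,4 +1,9 @@ ..."})
overall = update_overall_summary(overall, {"src/db.ts": "@@ -10,6 +10,12 @@ ..."})
```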
We haven't used functions because our text parsing has been pretty high fidelity so far; we provide an example of the format we expect, and we didn't feel like fighting too hard with LLMs to get structured output. You'll also notice that our input format isn't structured either: instead of the unidiff format, we give the AI a side-by-side diff with line-number annotations so that it can comment accurately - similar to how humans prefer to look at diffs.
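As an illustration of that annotated side-by-side format (not our exact prompt, just a toy renderer of a unified-diff hunk), the point is that every row carries explicit old/new line numbers the model can cite in its comments.

```python
def side_by_side(hunk_lines, old_start, new_start):
    # Render a unified-diff hunk as two columns with explicit line numbers,
    # so a model (or a human) can reference exact lines on either side.
    rows, o, n = [], old_start, new_start
    for line in hunk_lines:
        tag, text = line[0], line[1:]
        if tag == "-":                 # only on the old side
            rows.append(f"{o:>4} {text:<40} | {'':>4}")
            o += 1
        elif tag == "+":               # only on the new side
            rows.append(f"{'':>4} {'':<40} | {n:>4} {text}")
            n += 1
        else:                          # context line, present on both sides
            rows.append(f"{o:>4} {text:<40} | {n:>4} {text}")
            o, n = o + 1, n + 1
    return "\n".join(rows)

hunk = [
    " def total(items):",
    "-    return sum(items)",
    "+    return sum(i.price for i in items)",
]
print(side_by_side(hunk, old_start=10, new_start=10))
```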
Our OSS code is far behind our proprietary version. We have a lot more going on over there, and we don't use functions in that version either.
Let's talk LLMs instead.