I have just implemented chunking with overlap for larger documents: texts are split into smaller, overlapping chunks so your RAG can reach all of the documentation instead of truncating it. It's currently in the testing phase, and I'd like to experiment with different models to optimize the process. Once I confirm that everything is working correctly, I can merge the PR into the main branch, and you'll just need to update rlama with `rlama update`.
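To make the idea concrete, here is a minimal sketch of chunking with overlap in Go. The function name, signature, and the sizes used are illustrative assumptions, not rlama's actual implementation; the real parameters may differ.

```go
package main

import "fmt"

// ChunkWithOverlap splits text into chunks of at most chunkSize runes,
// where each chunk shares its first `overlap` runes with the end of the
// previous one, so sentences that straddle a boundary stay retrievable.
// (Hypothetical helper; rlama's real chunker may work differently.)
func ChunkWithOverlap(text string, chunkSize, overlap int) []string {
	if chunkSize <= 0 || overlap < 0 || overlap >= chunkSize {
		return nil
	}
	runes := []rune(text)
	step := chunkSize - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}

func main() {
	// 10 runes, chunks of 4 with an overlap of 2 → "abcd", "cdef", "efgh", "ghij"
	fmt.Println(ChunkWithOverlap("abcdefghij", 4, 2))
}
```

The overlap trades a little index size for recall: a fact cut in half by one chunk boundary is intact in the neighboring chunk.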
This is my next step. Currently, I've built an MVP to test the features and integrations and see how far I can go with rlama. I'm already developing a RAG on my end by chunking the data, adding overlap, and using metadata to retrieve the best possible context. This should be deployed soon. The version on GitHub was pushed days ago and was only meant to showcase the features. I can't wait to improve it and make it useful for everyone!
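As a rough sketch of what "using metadata" can mean here: each chunk carries information about where it came from, which can be used to label retrieved context or narrow the candidate set before similarity ranking. The struct and field names below are hypothetical, not rlama's actual schema.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk pairs a piece of text with retrieval metadata.
// (Hypothetical schema for illustration only.)
type Chunk struct {
	Text   string
	Source string // originating file
	Index  int    // position of the chunk within that file
}

// FilterBySource keeps only chunks from a given file: a simple
// metadata-based narrowing step before embedding similarity ranking.
func FilterBySource(chunks []Chunk, source string) []Chunk {
	var out []Chunk
	for _, c := range chunks {
		if strings.EqualFold(c.Source, source) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	chunks := []Chunk{
		{Text: "installation steps", Source: "README.md", Index: 0},
		{Text: "api reference", Source: "docs/api.md", Index: 0},
	}
	// Only the README chunk survives the filter.
	fmt.Println(len(FilterBySource(chunks, "README.md")))
}
```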
I've already put together some examples, including one with my own codebase, to show how it can be used to understand projects, and I want to demonstrate how it works with documentation or study materials. I will publish them next week.
Hey! Yes, that's something I was planning to do—complete documentation covering the code, its architecture, and the entire stack so that others can develop alongside me. I just deployed a functional version, and soon the website will have documentation with the architecture and a visualization of the entire codebase.
But for now, here is the stack used:
Core Language: Go (chosen for performance, cross-platform compatibility, and single binary distribution)
CLI Framework: Cobra (for command-line interface structure)
LLM Integration: Ollama API (for embeddings and completions)
Storage: Local filesystem-based storage (JSON files for simplicity and portability)
Vector Search: Custom implementation of cosine similarity for embedding retrieval
Hi, if you want to keep using a Go embedded/in-process vector store, but with some additional features, you can check out my project https://github.com/philippgille/chromem-go
I've thought about adding an API interface for it; it's on my to-do list of possible additions. For now, I'm gathering feedback to see what people like about it and what they don't.
Imagine you did something on your laptop two months ago and now want to find where you did it. If you don't remember, it will find it for you. If you want a detailed daily summary of your activities, it will produce one. If you want to summarize a meeting or your university lectures in detail, it will do that too. People were afraid of this kind of technology because it wasn't private, and that's completely understandable. Now imagine the same technology, but completely local: all your data remains confidential and encrypted, and you can ask it any question about your digital activity.