
I don’t know why it’s not working, but here it is: https://www.codex-nuvia.ca/app



I have just implemented chunking with overlap for larger documents: texts are split into smaller chunks so your RAG can access all of your documentation. It's currently in the testing phase, and I'd like to experiment with different models to optimize the process. Once I confirm that everything is working correctly, I'll merge the PR into the main branch, and you'll just need to update rlama with `rlama update`.
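To make the idea concrete, here is a minimal Go sketch of fixed-size chunking with overlap (the function and parameter names are illustrative, not the actual rlama implementation):

    package main

    import "fmt"

    // ChunkText splits text into fixed-size chunks that overlap by `overlap`
    // characters, so content that straddles a chunk boundary still appears
    // in full in at least one chunk.
    func ChunkText(text string, chunkSize, overlap int) []string {
        if chunkSize <= 0 || overlap >= chunkSize {
            return []string{text}
        }
        var chunks []string
        runes := []rune(text)
        step := chunkSize - overlap
        for start := 0; start < len(runes); start += step {
            end := start + chunkSize
            if end > len(runes) {
                end = len(runes)
            }
            chunks = append(chunks, string(runes[start:end]))
            if end == len(runes) {
                break
            }
        }
        return chunks
    }

    func main() {
        for i, c := range ChunkText("some long document text ...", 512, 64) {
            fmt.Printf("chunk %d: %d chars\n", i, len(c))
        }
    }

The overlap keeps context that crosses a chunk boundary retrievable from at least one chunk, at the cost of a bit of duplicated text in the index.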


This is my next step. Currently, I've built an MVP to test the features and integrations and see how far I can go with rlama. I'm already developing a RAG on my end by chunking the data, adding overlap, and using metadata to retrieve the best possible context. This should be deployed soon. The version on GitHub was pushed a few days ago and was only meant to showcase the features. I can't wait to improve it and make it useful for everyone!


I've already put together some examples, including one on my own codebase, to show how it can be used to understand a project, and I want to show how it works with documentation and study material as well. I will publish them next week.


Hey! Yes, that's something I was planning to do: complete documentation of the code, its architecture, and the entire stack so that others can develop alongside me. I just deployed a functional version, and soon the website will have documentation covering the architecture and a visualization of the entire codebase.

But for now, here is the stack used:

Core language: Go (chosen for performance, cross-platform compatibility, and single-binary distribution)
CLI framework: Cobra (for command-line interface structure)
LLM integration: Ollama API (for embeddings and completions)
Storage: local filesystem-based storage (JSON files for simplicity and portability)
Vector search: custom implementation of cosine similarity for embedding retrieval
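The vector search part is the simplest piece; cosine similarity over embeddings looks roughly like this (an illustrative sketch, not the exact rlama code):

    package main

    import (
        "fmt"
        "math"
    )

    // CosineSimilarity returns dot(a, b) / (||a|| * ||b||), the cosine of the
    // angle between two embedding vectors. Higher means more similar.
    func CosineSimilarity(a, b []float64) float64 {
        if len(a) != len(b) || len(a) == 0 {
            return 0
        }
        var dot, normA, normB float64
        for i := range a {
            dot += a[i] * b[i]
            normA += a[i] * a[i]
            normB += b[i] * b[i]
        }
        if normA == 0 || normB == 0 {
            return 0
        }
        return dot / (math.Sqrt(normA) * math.Sqrt(normB))
    }

    func main() {
        query := []float64{0.1, 0.9, 0.2}
        doc := []float64{0.2, 0.8, 0.1}
        fmt.Printf("similarity: %.3f\n", CosineSimilarity(query, doc))
    }

With embeddings stored as JSON on disk, a linear scan with a function like this is enough for small document sets; that is the trade-off of avoiding an external vector database.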


Hi, if you want to keep using a Go embedded/in-process vector store, but with some additional features, you can check out my project https://github.com/philippgille/chromem-go


Why not use an established open-source vector DB like pgvector? I imagine your implementation is not going to be as performant.


Defeats the point of the single binary installation if you have to set up dependencies.


rlama requires a Python install (and several dependencies via pip) to extract text.

https://github.com/DonTizi/rlama/blob/main/internal/service/...


I recommend using this hybrid vector/full text search engine that works across many runtimes: https://github.com/oramasearch/orama


No, for now I've only made it work with Ollama, but it would be ideal to support llama.cpp directly. Thank you, I'll take note of it.


That would be great. llama.cpp's built-in server offers HTTP embedding endpoints.
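For reference, here is a rough Go sketch of what calling such an endpoint could look like, assuming a llama.cpp server running locally with embeddings enabled on port 8080 (the exact flags, port, and response fields depend on your server version):

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    // Response shape of the OpenAI-compatible /v1/embeddings endpoint.
    type embeddingResponse struct {
        Data []struct {
            Embedding []float64 `json:"embedding"`
        } `json:"data"`
    }

    func main() {
        // Assumes a local llama.cpp server started with embeddings enabled.
        body, _ := json.Marshal(map[string]any{
            "input": "What does rlama do?",
        })
        resp, err := http.Post("http://localhost:8080/v1/embeddings",
            "application/json", bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        var out embeddingResponse
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            panic(err)
        }
        if len(out.Data) > 0 {
            fmt.Printf("embedding dimension: %d\n", len(out.Data[0].Embedding))
        }
    }

Since the endpoint follows the OpenAI shape, switching between Ollama and llama.cpp could stay behind one small embedding interface.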


I've thought about adding an API interface for it; it's on my to-do list. For now, I'm gathering feedback to see what people like about it and what they don't.


Just added an Apache License


Imagine you did something on your laptop two months ago and now want to find where you did it. If you don't remember, it will find it for you. If you want a detailed daily summary of your activities, it will do that for you. If you want a detailed summary of a meeting or your university lectures, it will do that too. People were afraid of this technology because it wasn't private, and that's completely understandable. Now imagine having the same technology but completely local: all your data remains confidential and encrypted, and you can ask it any question about your digital activity.

