Hey fellow open-source enthusiasts, We built Korvus, an open-source RAG (Retriev...

kaspermarstal · on July 11, 2024

Very cool! I assume you use Postgres' native full-text search capabilities? Any plans for BM25 or similar? This would make Korvus the end-game for open-source RAG IMO.

darby_nine · on July 12, 2024

How do you resolve the disparity between semantic and text search? Surely these rankings are difficult to combine.

whakim · on July 12, 2024

I’d start with something very simple such as Reciprocal Rank Fusion. I’d also want to make sure I really trusted the outputs of each search pipeline before worrying too much about the appropriate algorithm for combining the rankings.
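
For anyone unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the idea. It only needs each pipeline's ranked list of document ids, not their raw scores, which is why it sidesteps the problem of comparing semantic similarity with text-search relevance. The function name, the sample ids, and the constant k=60 (a commonly used default) are assumptions for illustration; none of this is taken from Korvus itself.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a Korvus setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical outputs from a semantic search and a keyword search.
semantic_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # -> ['b', 'a', 'd', 'c']
```

Documents that both pipelines rank highly ("a" and "b" here) float to the top, while a document found by only one pipeline still gets partial credit.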

mdaniel · on July 11, 2024

I find it misleading to use an f-string containing encoded `{CONTEXT}` <https://github.com/postgresml/korvus/blob/bce269a20a1dbea933...>, and after digging into TFM <https://postgresml.org/docs/open-source/korvus/guides/rag#si...> it seems that it is not, in fact, an f-string artifact but rather the literal characters "{"+"CONTEXT"+"}", which are the same in all the language bindings?

IMHO it would be much clearer if you just used the normal %s for the "outer" string and left the implicit f-string syntax as it is, e.g.

                    {
                        "role": "user",
                        # {CONTEXT} is not an f-string field; it is replaced by TODO FIXME
                        "content": "Given the context\n:{CONTEXT}\nAnswer the question: %s" % query,
                    },
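
To make the distinction concrete, here is a small plain-Python sketch (no Korvus calls, and the query text is made up) showing that both spellings produce the same string, with the literal characters `{CONTEXT}` surviving in each. The f-string version only gets there by doubling the braces, which is the part that reads as misleading.

```python
query = "Is Korvus fast?"  # hypothetical user question

# %-formatting: braces are ordinary characters, so {CONTEXT} passes through as-is.
content_percent = "Given the context\n:{CONTEXT}\nAnswer the question: %s" % query

# f-string: braces must be doubled to come out as a literal {CONTEXT}.
content_fstring = f"Given the context\n:{{CONTEXT}}\nAnswer the question: {query}"

# Both spellings yield the identical string with a literal placeholder inside.
assert content_percent == content_fstring
assert "{CONTEXT}" in content_percent
```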

The way the example (in both the readme and the docs) is written, it seems to imply that I can put my own fields as siblings to the chat key and they, too, will be resolved:

    results = await collection.rag(
        {
            "EXAMPLE": {
              "uh-huh": True
            },
            "CONTEXT": {
                "vector_search": {
                    "query": {
                        "fields": {"text": {"query": query}},
                    },
                    "document": {"keys": ["id"]},
                    "limit": 1,
                },
                "aggregate": {"join": "\n"},
            },
            "chat": {
                "messages": [
                    {"content": "Given Context:\n{CONTEXT}\nAn Example:\n{EXAMPLE}"}
                ],
            },
        }
    )

One could not fault the user for thinking such a thing since the *API* docs say "see the *GUIDE*" :-( https://postgresml.org/docs/open-source/korvus/api/collectio...

smarvin2 · on July 11, 2024

This section of the docs may be confusing. What you described will actually almost work. See: https://postgresml.org/docs/open-source/korvus/guides/rag#ra...