
There's far more to a RAG pipeline than chunking documents; chunking is just one way to interface with a file. In our case we use query decomposition, document summaries, and chunking to achieve strong results.

You're right that chunking is just one piece of this. But without quality chunks you're either going to miss context at query time (bad chunks) or use 100X the tokens (full file context).



Can you describe your strategy for query decomposition in a little more detail?


Here is a description: https://js.langchain.com/docs/use_cases/query_analysis/techn...

> When a user asks a question there is no guarantee that the relevant results can be returned with a single query. Sometimes to answer a question we need to split it into distinct sub-questions, retrieve results for each sub-question, and then answer using the cumulative context.

> For example if a user asks: “How is Web Voyager different from reflection agents”, and we have one document that explains Web Voyager and one that explains reflection agents but no document that compares the two, then we’d likely get better results by retrieving for both “What is Web Voyager” and “What are reflection agents” and combining the retrieved documents than by retrieving based on the user question directly.

> This process of splitting an input into multiple distinct sub-queries is what we refer to as query decomposition. It is also sometimes referred to as sub-query generation.
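
To make the quoted description concrete, here is a rough Python sketch of the retrieve-per-sub-question idea. Everything in it is an assumption for illustration, not what LangChain or the commenter above actually runs: `stub_decompose` stands in for an LLM call, `keyword_retrieve` is a toy word-overlap ranker instead of a vector search, and the three documents are placeholder text.

```python
from typing import Callable, Dict, List

Docs = Dict[str, str]


def keyword_retrieve(query: str, docs: Docs, k: int = 1) -> List[str]:
    """Toy retriever (assumption): rank docs by count of shared lowercase words."""
    q_words = set(query.lower().replace("?", "").split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_words & set(text.lower().replace(".", "").split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by doc id so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def retrieve_with_decomposition(
    question: str,
    decompose: Callable[[str], List[str]],
    docs: Docs,
    k: int = 1,
) -> List[str]:
    """Retrieve for each sub-question and merge the hits, dropping duplicates."""
    merged: List[str] = []
    for sub_question in decompose(question):
        for doc_id in keyword_retrieve(sub_question, docs, k):
            if doc_id not in merged:
                merged.append(doc_id)
    return merged


def stub_decompose(question: str) -> List[str]:
    # Hypothetical stand-in for an LLM call; hard-coded to the quoted example.
    return ["What is Web Voyager", "What are reflection agents"]


# Placeholder corpus (made up for this sketch).
DOCS: Docs = {
    "voyager": "Web Voyager is described in this document.",
    "reflection": "Reflection agents are described in this document.",
    "unrelated": "Chunking splits a file into smaller passages.",
}

QUESTION = "How is Web Voyager different from reflection agents"

if __name__ == "__main__":
    print("direct:", keyword_retrieve(QUESTION, DOCS, k=1))
    print("decomposed:", retrieve_with_decomposition(QUESTION, stub_decompose, DOCS, k=1))
```

With a top-1 budget, the direct query only surfaces one of the two documents, while the decomposed version retrieves one per sub-question and passes both on as context. In a real pipeline the decomposition step would be a model call and the merged documents would go into the final answer prompt.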



