# Answering Queries using LlamaIndex in a Django Application
This document explains how queries are answered in a Django application using LlamaIndex, and how LlamaIndex is leveraged for this purpose.
## Query Answering Process
- A user submits a query through the Django application, which is associated with a specific corpus (a collection of documents).
- The query is saved in the database as a `CorpusQuery` object, and a Celery task (`run_query`) is triggered to process the query asynchronously.
- Inside the `run_query` task (a sketch of this flow appears after the list):
    - The `CorpusQuery` object is retrieved from the database using the provided `query_id`.
    - The query's `started` timestamp is set to the current time.
    - The necessary components for query processing are set up, including the embedding model (`HuggingFaceEmbedding`), language model (`OpenAI`), and vector store (`DjangoAnnotationVectorStore`).
    - The `DjangoAnnotationVectorStore` is initialized with the `corpus_id` associated with the query, allowing it to retrieve the relevant annotations for the specified corpus.
    - A `VectorStoreIndex` is created from the `DjangoAnnotationVectorStore`, which serves as the index for the query engine.
    - A `CitationQueryEngine` is instantiated with the index, specifying the number of top similar results to retrieve (`similarity_top_k`) and the granularity of the citation sources (`citation_chunk_size`).
    - The query is passed to the `CitationQueryEngine`, which processes the query and generates a response. The response includes the answer to the query along with the source annotations used to generate the answer.
    - The source annotations are parsed and converted into markdown format, with each citation linked to the corresponding annotation ID.
    - The query's `sources` field is updated with the annotation IDs used in the response.
    - The query's `response` field is set to the generated markdown text.
    - The query's `completed` timestamp is set to the current time.
    - If an exception occurs during query processing, the query's `failed` timestamp is set, and the stack trace is stored in the `stacktrace` field.
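To make this flow concrete, here is a minimal sketch of what such a task could look like. The LlamaIndex calls (`Settings`, `VectorStoreIndex.from_vector_store`, `CitationQueryEngine.from_args`) are real APIs, but the model names, the `similarity_top_k`/`citation_chunk_size` values, the import paths, and the exact `CorpusQuery` field access are illustrative assumptions rather than the actual implementation:

```python
import traceback

from celery import shared_task
from django.utils import timezone
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.query_engine import CitationQueryEngine
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI

from myapp.models import CorpusQuery  # hypothetical import path
from myapp.vector_stores import DjangoAnnotationVectorStore  # hypothetical import path


@shared_task
def run_query(query_id: int) -> None:
    query = CorpusQuery.objects.get(id=query_id)
    try:
        # Mark the query as started.
        query.started = timezone.now()
        query.save()

        # Configure the embedding and language models (names are illustrative).
        Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en")
        Settings.llm = OpenAI(model="gpt-4o")

        # Scope retrieval to the corpus this query belongs to.
        vector_store = DjangoAnnotationVectorStore(corpus_id=query.corpus_id)
        index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

        # Retrieve the top-k most similar annotations and cite them in the answer.
        engine = CitationQueryEngine.from_args(
            index,
            similarity_top_k=8,
            citation_chunk_size=512,
        )
        response = engine.query(query.query)

        # Collect the annotation IDs behind the cited source nodes
        # (assumes each node carries an "annotation_id" metadata key).
        annotation_ids = [
            node.node.metadata["annotation_id"] for node in response.source_nodes
        ]
        query.sources.set(annotation_ids)  # assumes a many-to-many `sources` field
        # The real task converts citations into markdown links before saving.
        query.response = str(response)
        query.completed = timezone.now()
        query.save()
    except Exception:
        # Record the failure and keep the stack trace for debugging.
        query.failed = timezone.now()
        query.stacktrace = traceback.format_exc()
        query.save()
```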
## Leveraging LlamaIndex
LlamaIndex is leveraged in the following ways to enable query answering in the Django application:
- Vector Store: LlamaIndex provides the `BasePydanticVectorStore` class, which serves as the foundation for the custom `DjangoAnnotationVectorStore`. The `DjangoAnnotationVectorStore` integrates with Django's ORM to store and retrieve annotations efficiently, allowing seamless integration with the existing Django application (a skeleton appears after this list).
- Indexing: LlamaIndex's `VectorStoreIndex` is used to create an index from the `DjangoAnnotationVectorStore`. The index facilitates fast and efficient retrieval of relevant annotations based on the query.
- Query Engine: LlamaIndex's `CitationQueryEngine` is employed to process the queries and generate responses. The query engine leverages the index to find the most relevant annotations and uses the language model to generate a coherent answer.
- Embedding and Language Models: LlamaIndex provides abstractions for integrating various embedding and language models. In this implementation, the `HuggingFaceEmbedding` and `OpenAI` models are used, but LlamaIndex allows flexibility in choosing different models based on requirements (a configuration example follows below).
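To illustrate the first point, a custom vector store only needs to implement a few methods on `BasePydanticVectorStore`. The skeleton below is a simplified sketch rather than the actual implementation: the `Annotation` model, its `raw_text` and pgvector `embedding` fields, and the import paths are assumptions.

```python
from typing import Any, List

from llama_index.core.schema import BaseNode, TextNode
from llama_index.core.vector_stores.types import (
    BasePydanticVectorStore,
    VectorStoreQuery,
    VectorStoreQueryResult,
)
from pgvector.django import CosineDistance

from myapp.models import Annotation  # hypothetical import path


class DjangoAnnotationVectorStore(BasePydanticVectorStore):
    """Reads annotation rows for one corpus straight from the Django ORM."""

    stores_text: bool = True
    corpus_id: int

    @property
    def client(self) -> None:
        # No external client: the Django ORM itself is the storage backend.
        return None

    def add(self, nodes: List[BaseNode], **kwargs: Any) -> List[str]:
        # Annotations are written through normal Django flows, not via LlamaIndex.
        return []

    def delete(self, ref_doc_id: str, **kwargs: Any) -> None:
        pass

    def query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResult:
        # Rank this corpus's annotations by cosine distance to the query
        # embedding (assumes a pgvector column) and keep the top-k rows.
        rows = (
            Annotation.objects.filter(corpus_id=self.corpus_id)
            .annotate(distance=CosineDistance("embedding", query.query_embedding))
            .order_by("distance")[: query.similarity_top_k]
        )
        return VectorStoreQueryResult(
            nodes=[
                TextNode(text=row.raw_text, metadata={"annotation_id": row.id})
                for row in rows
            ],
            ids=[str(row.id) for row in rows],
            similarities=[1.0 - row.distance for row in rows],
        )
```

Because the store is a thin layer over the ORM, corpus scoping, permissions, and filtering can reuse the same queryset logic as the rest of the application.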
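On the last point, swapping models is a matter of configuring different components; the model names below are illustrative, and each integration lives in its own `llama-index-*` package:

```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.anthropic import Anthropic

# Any embedding/LLM pair supported by LlamaIndex can be dropped in
# without touching the vector store or query engine code.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.llm = Anthropic(model="claude-3-5-sonnet-latest")
```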
By leveraging LlamaIndex, the Django application benefits from a structured and efficient approach to query answering. LlamaIndex provides the necessary components and abstractions to handle vector storage, indexing, and query processing, allowing the application to focus on integrating these capabilities into its existing architecture.