Pluggability of Vector Store
This document outlines the steps for creating a custom vector store class that inherits from the provided BaseVectorStore interface.
Many LLM applications use vector stores to retrieve relevant information when generating responses. A vector store is a specialized database designed to store and retrieve high-dimensional vector representations of data, such as documents. These vectors capture the semantic meaning of the data and the relationships between its concepts.
Understanding the Base Interface:
To create your own vector store, extend the BaseVectorStore class and implement the following methods:
Method/Property | Description | Required/Optional |
---|---|---|
get_client | An abstract method that subclasses must implement to retrieve the client object used to interact with the specific vector store backend (e.g., Pinecone, Faiss). | Required |
chunk_list | Helper function that splits a document list into batches of a specified size. | Optional |
add_documents | An abstract method for adding documents to the vector store. Subclasses implement their specific logic for document insertion. | Required |
similarity_search_with_score | An abstract method for performing similarity search on the vector store. Subclasses implement their specific logic for retrieving similar documents and scores based on a query string. | Required |
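As a rough illustration of the table above, the interface could look like the following sketch. The method signatures, parameter names (`documents`, `query`, `k`, `batch_size`), and return types are assumptions made for illustration; check the project's actual BaseVectorStore before relying on them.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, List, Tuple


class BaseVectorStore(ABC):
    """Hypothetical stand-in for the project's BaseVectorStore."""

    @abstractmethod
    def get_client(self) -> Any:
        """Return the client object for the vector store backend."""

    @staticmethod
    def chunk_list(items: List[Any], batch_size: int) -> Iterator[List[Any]]:
        """Yield consecutive batches of at most batch_size items."""
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]

    @abstractmethod
    def add_documents(self, documents: List[Any]) -> None:
        """Insert documents into the backend."""

    @abstractmethod
    def similarity_search_with_score(
        self, query: str, k: int = 4
    ) -> List[Tuple[Any, float]]:
        """Return up to k (document, score) pairs for the query string."""
```

Because the three required methods are declared abstract, Python refuses to instantiate a subclass that leaves any of them unimplemented, while chunk_list is inherited as a ready-made helper.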
Implementation
Let's implement a custom vector store, YourVectorStoreClass, that inherits from BaseVectorStore, adds documents, and returns the documents whose text is semantically closest to the user query.
Create a new file named <your_vector_store_name>.py in the vectorstores folder to define YourVectorStoreClass.
YourVectorStoreClass provides concrete implementations for the abstract methods defined in BaseVectorStore:
get_client: Retrieves the specific client used to interact with the vector store backend.
chunk_list: Optionally, splits documents into chunks for more manageable processing.
add_documents: Adds documents to the vector store.
similarity_search_with_score: Performs a similarity search and returns documents along with their similarity scores based on the query.
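The four methods listed above can be sketched with a small in-memory store. Everything beyond the method names is an assumption: the base class here is a stand-in for the project's BaseVectorStore, documents are plain strings, and the "embedding" is a toy word-count vector compared by cosine similarity rather than a real embedding model or backend client.

```python
import math
from abc import ABC, abstractmethod
from collections import Counter
from typing import Any, List, Tuple


class BaseVectorStore(ABC):
    # Stand-in for the project's base class; the real one may differ.
    @abstractmethod
    def get_client(self) -> Any: ...

    @abstractmethod
    def add_documents(self, documents: List[str]) -> None: ...

    @abstractmethod
    def similarity_search_with_score(
        self, query: str, k: int = 4
    ) -> List[Tuple[str, float]]: ...


def embed(text: str) -> Counter:
    # Toy embedding: lowercase word counts (assumption, not a real model).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class YourVectorStoreClass(BaseVectorStore):
    def __init__(self) -> None:
        # A plain list plays the role of the backend client.
        self._client: List[Tuple[str, Counter]] = []

    def get_client(self) -> List[Tuple[str, Counter]]:
        return self._client

    def add_documents(self, documents: List[str]) -> None:
        for text in documents:
            self._client.append((text, embed(text)))

    def similarity_search_with_score(
        self, query: str, k: int = 4
    ) -> List[Tuple[str, float]]:
        query_vector = embed(query)
        scored = [(text, cosine(query_vector, vec)) for text, vec in self._client]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]
```

A real implementation would replace the list with the backend's client (for example a Pinecone or Faiss handle) and delegate embedding and search to it; the overall shape of the class stays the same.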
Go to the vectorstores folder and update __init__.py with the module lookup entry for YourVectorStoreClass.
Modify env_manager.py to import YourVectorStoreClass and add a mapping in the self.indexes dictionary.
This setup ensures that YourVectorStoreClass can be instantiated based on specific environment variables, integrating it into the environment management system. The self.indexes dictionary now includes a mapping where customVector corresponds to YourVectorStoreClass and uses VECTOR_STORE_TYPE as the environment key.
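A minimal sketch of such a mapping follows. Only `self.indexes`, `customVector`, `VECTOR_STORE_TYPE`, and `YourVectorStoreClass` come from the text; the `EnvManager` class name, the `"class"` and `"env_key"` entry keys, and the `create_vector_store` method are assumptions, and the real env_manager.py may be organized differently.

```python
import os
from typing import Any, Dict


class YourVectorStoreClass:
    """Placeholder; in the project this is imported from the vectorstores package."""


class EnvManager:  # hypothetical class name
    def __init__(self) -> None:
        # Each entry maps a store key to its class and the environment
        # variable whose value selects it.
        self.indexes: Dict[str, Dict[str, Any]] = {
            "customVector": {
                "class": YourVectorStoreClass,
                "env_key": "VECTOR_STORE_TYPE",
            },
        }

    def create_vector_store(self) -> Any:
        # Instantiate the first store whose environment variable matches its key.
        for name, entry in self.indexes.items():
            if os.environ.get(entry["env_key"]) == name:
                return entry["class"]()
        raise ValueError("no vector store is selected by the environment")
```

With `VECTOR_STORE_TYPE=customVector` set, `EnvManager().create_vector_store()` returns a YourVectorStoreClass instance; with any other value it raises ValueError.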
Configuration
Configure your environment variables in the .env file for connecting to the vector store.
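The only variable named in this document is VECTOR_STORE_TYPE; connection settings such as URLs or API keys depend on the backend and are not guessed here. The sketch below shows the .env line as text and a small hand-written parser (`parse_env`, a hypothetical helper) so the result can be inspected without extra dependencies.

```python
from typing import Dict

# Contents of a .env file selecting the custom store (illustrative).
ENV_TEXT = """
# Selects the store registered under the key customVector
VECTOR_STORE_TYPE=customVector
"""


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


settings = parse_env(ENV_TEXT)
print(settings["VECTOR_STORE_TYPE"])
```

In practice the project likely loads the .env file through its own mechanism; the parser only illustrates what the file is expected to contain.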
Example Usage
The following sections show how to add and query documents using vectorstore_class.
Adding/Appending Documents
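The snippet below sketches adding documents in batches and then appending more later. The class is a minimal in-memory stand-in so the example runs on its own; the plain-string document format and the `batch_size` parameter are assumptions.

```python
from typing import Iterator, List


class YourVectorStoreClass:
    """Minimal in-memory stand-in so the snippet runs on its own."""

    def __init__(self) -> None:
        self.stored: List[str] = []

    @staticmethod
    def chunk_list(items: List[str], batch_size: int) -> Iterator[List[str]]:
        # Split the document list into batches of at most batch_size.
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]

    def add_documents(self, documents: List[str]) -> None:
        self.stored.extend(documents)


vectorstore_class = YourVectorStoreClass()
documents = [
    "Paris is the capital of France.",
    "Faiss is a similarity search library.",
    "Pinecone is a managed vector database.",
]

# Initial load, two documents per batch.
for batch in vectorstore_class.chunk_list(documents, batch_size=2):
    vectorstore_class.add_documents(batch)

# Appending later uses the same call.
vectorstore_class.add_documents(["A document added afterwards."])
```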
Querying Documents
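A query could then look like the following sketch. The stand-in class scores documents by word overlap (Jaccard similarity) purely so the example is runnable; a real store would compare embedding vectors, and the `k` parameter and the (text, score) return shape are assumptions.

```python
from typing import List, Tuple


class YourVectorStoreClass:
    """Toy stand-in that ranks documents by word overlap with the query."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = list(documents)

    def similarity_search_with_score(
        self, query: str, k: int = 2
    ) -> List[Tuple[str, float]]:
        query_words = set(query.lower().split())
        scored = []
        for text in self.documents:
            words = set(text.lower().split())
            union = query_words | words
            # Jaccard similarity: shared words divided by all distinct words.
            score = len(query_words & words) / len(union) if union else 0.0
            scored.append((text, score))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


vectorstore_class = YourVectorStoreClass([
    "vector stores hold embeddings",
    "bananas are yellow",
])

results = vectorstore_class.similarity_search_with_score(
    "what do vector stores hold", k=1
)
for text, score in results:
    print(f"{score:.2f}  {text}")
```

Whether a higher score means a closer match depends on the backend: some stores return distances, where lower is better, so check the convention before thresholding on the score.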
By following this structure, you can efficiently interact with your custom vector store, adding and querying documents as needed.