Best Practices
Providing a reference to sources in responses increases trustworthiness.
Tune the document chunking strategy: try chunk sizes of 512, 768, or 1024 with a text overlap of 150 or 200, and check the quality of the retrieval results on at least 30 queries.
Ensure your prompt instructs the bot not to respond to queries involving politics, political personalities, religion, caste, skin colour, or any other sensitive topic. Run a round of tests on the bot that covers these corner cases.
Add a disclaimer (in the bot welcome message or any other suitable place) stating that bot responses are generated by GenAI and may not be 100% accurate.
Try different embedding models or sentence transformers to verify and improve indexing and retrieval quality. The leaderboard at https://huggingface.co/spaces/mteb/leaderboard is a useful reference when choosing candidates.
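The chunking and retrieval-check advice above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's own implementation: the source does not say whether chunk sizes are measured in tokens or characters, so the sketch counts whitespace-separated words as a stand-in. The names `chunk_words`, `hit_rate`, and the `retrieve` callback are hypothetical; `retrieve` stands for whatever search call your bot uses.

```python
from typing import Callable, List, Sequence, Tuple


def chunk_words(text: str, chunk_size: int = 512, overlap: int = 150) -> List[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words. Words are an assumed unit;
    swap in a tokenizer if your chunk sizes are measured in tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def hit_rate(
    retrieve: Callable[[str, int], Sequence[str]],
    labelled_queries: Sequence[Tuple[str, str]],
    k: int = 5,
) -> float:
    """Fraction of queries whose top-k chunks contain the expected snippet.

    `retrieve(query, k)` is a hypothetical callback returning up to k chunks.
    Each labelled query is a (query, expected_snippet) pair written by hand.
    """
    if not labelled_queries:
        raise ValueError("provide at least one labelled query")

    hits = 0
    for query, expected in labelled_queries:
        top_chunks = retrieve(query, k)
        if any(expected.lower() in chunk.lower() for chunk in top_chunks):
            hits += 1
    return hits / len(labelled_queries)
```

One way to use this is to re-index the same documents for each size and overlap combination (512/150, 768/200, and so on), run the same set of 30 or more hand-labelled queries through `hit_rate`, and keep the configuration with the highest score. Substring matching is a deliberately crude relevance check; a manual review of the retrieved chunks remains worthwhile.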