Sunbird AI Assistant
  • Overview
  • Functional Overview
    • The Problem
    • The Solution
    • Use Cases
      • e-Jaadui Pitara
    • Capabilities
  • Technical Overview
    • Architecture
    • Technology Stack
  • Get Started with AI Assistant
    • Key Steps to role out an AI Assistant Solution
    • Pre-requisites
    • Installation
    • Data Ingestion Process
    • Configuration
    • APIs
    • Bot Creation 101
  • Components
    • Sakhi API Service
      • Environment Variables
      • Pluggability of LLM Chat Model
      • Pluggability of Cloud Storage
      • Pluggability of Transaltion service
      • Pluggability of Vector Store
  • Release Notes
    • Release Convention
    • 3.0.0 (Latest)
    • 2.0.0
    • 1.0.0
  • Roadmap
  • Contribution Guide
  • FAQs
  • Knowledge Base
    • Best Practices
    • Indexing CSV Data
  • Contact us
Powered by GitBook
On this page
  1. Knowledge Base

Best Practices

PreviousKnowledge BaseNextIndexing CSV Data

Last updated 10 months ago

  1. Providing a reference to sources in responses increases trustworthiness.

  2. Work on the documents chunking strategy like keeping chunk size of 512/768/1024 with text overlap as 150/200 and verify the retrieval results efficiency for minimum of 30 queries.

  3. Ensure your prompt has instructions to avoid responding to queries involving politics, political personalities, religion, caste, skin colour or any other sensitive topics. Perform a round of testing of the bot involving these corner cases.

  4. Put a disclaimer (in the bot welcome message or any other suitable place) stating that bot responses are from GenAI and may not be 100% accurate.

  5. Try different embedding models/sentence transformers to verify/improve indexing+retrieval efficiency. You can also refer to

  6. Good to know: ,

https://huggingface.co/spaces/mteb/leaderboard
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard