Agent skill

rag-hybrid-search

Hybrid search combining semantic and keyword retrieval for RAG pipelines. Implement BM25 + dense vector search with fusion strategies.

Stars 514
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/a5c-ai/babysitter/tree/main/library/specializations/ai-agents-conversational/skills/rag-hybrid-search

SKILL.md

rag-hybrid-search

Implement hybrid search combining semantic vector retrieval with keyword-based BM25 search for improved RAG pipeline accuracy and recall.

Overview

Hybrid search addresses the limitations of pure semantic or pure keyword search:

  • Semantic search excels at conceptual similarity but may miss exact matches
  • Keyword search finds exact terms but lacks semantic understanding
  • Hybrid combines both for superior retrieval performance

Capabilities

Search Strategies

  • Dense vector semantic search (embeddings)
  • Sparse vector keyword search (BM25, TF-IDF)
  • Hybrid fusion with configurable weighting
  • Reciprocal Rank Fusion (RRF) combination

Retrieval Configuration

  • Configure embedding models for dense search
  • Tune BM25 parameters (k1, b values)
  • Set retrieval limits and thresholds
  • Apply metadata filtering

Ranking & Reranking

  • Score normalization across search types
  • Weighted score fusion
  • Cross-encoder reranking
  • MMR (Maximum Marginal Relevance) diversity

Index Management

  • Create and update hybrid indexes
  • Batch indexing with progress tracking
  • Index optimization and maintenance
  • Multi-index federation

Usage

Basic Hybrid Search with LangChain

python
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain.retrievers import EnsembleRetriever
from langchain_openai import OpenAIEmbeddings

# Create documents
docs = [...]  # Your document chunks

# Dense retriever (semantic)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 5

# Hybrid ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, dense_retriever],
    weights=[0.4, 0.6]  # Adjust based on use case
)

# Query
results = hybrid_retriever.invoke("How do I configure the system?")

Reciprocal Rank Fusion

python
def reciprocal_rank_fusion(results_lists: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k is a constant (typically 60) for smoothing.
    """
    fused_scores = {}

    for results in results_lists:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", str(doc.page_content[:50]))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    sorted_docs = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )

    return [item["doc"] for item in sorted_docs]

# Use with multiple retrievers
semantic_results = dense_retriever.invoke(query)
keyword_results = bm25_retriever.invoke(query)
hybrid_results = reciprocal_rank_fusion([semantic_results, keyword_results])

Pinecone Hybrid Search

python
from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")
index = pc.Index("hybrid-index")

# Prepare sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)  # Fit on your document corpus

def hybrid_query(query: str, alpha: float = 0.5, top_k: int = 10):
    """
    Query with hybrid search.
    alpha: weight for dense vectors (1-alpha for sparse)
    """
    # Get dense embedding
    dense_embedding = embeddings.embed_query(query)

    # Get sparse embedding
    sparse_embedding = bm25.encode_queries([query])[0]

    # Hybrid query
    results = index.query(
        vector=dense_embedding,
        sparse_vector=sparse_embedding,
        top_k=top_k,
        include_metadata=True
    )

    return results

Weaviate Hybrid Search

python
import weaviate

client = weaviate.Client("http://localhost:8080")

def weaviate_hybrid_search(query: str, alpha: float = 0.5, limit: int = 10):
    """
    Weaviate native hybrid search.
    alpha: 0 = pure BM25, 1 = pure vector
    """
    result = (
        client.query
        .get("Document", ["content", "title", "metadata"])
        .with_hybrid(
            query=query,
            alpha=alpha,
            properties=["content", "title"]
        )
        .with_limit(limit)
        .do()
    )

    return result["data"]["Get"]["Document"]

Task Definition

javascript
const ragHybridSearchTask = defineTask({
  name: 'rag-hybrid-search-setup',
  description: 'Configure hybrid search for RAG pipeline',

  inputs: {
    vectorStore: { type: 'string', required: true },  // 'pinecone', 'weaviate', 'chroma', etc.
    embeddingModel: { type: 'string', default: 'text-embedding-3-small' },
    bm25Params: { type: 'object', default: { k1: 1.5, b: 0.75 } },
    fusionStrategy: { type: 'string', default: 'rrf' },  // 'rrf', 'weighted', 'custom'
    denseWeight: { type: 'number', default: 0.6 },
    topK: { type: 'number', default: 10 }
  },

  outputs: {
    retrieverConfigured: { type: 'boolean' },
    indexStats: { type: 'object' },
    artifacts: { type: 'array' }
  },

  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure hybrid search with ${inputs.vectorStore}`,
      skill: {
        name: 'rag-hybrid-search',
        context: {
          vectorStore: inputs.vectorStore,
          embeddingModel: inputs.embeddingModel,
          bm25Params: inputs.bm25Params,
          fusionStrategy: inputs.fusionStrategy,
          denseWeight: inputs.denseWeight,
          topK: inputs.topK,
          instructions: [
            'Validate vector store connection and configuration',
            'Set up dense embedding pipeline',
            'Configure BM25/sparse encoding',
            'Implement fusion strategy',
            'Test retrieval quality with sample queries',
            'Document configuration and tuning parameters'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});

Applicable Processes

  • rag-pipeline-implementation
  • advanced-rag-patterns
  • knowledge-base-qa
  • vector-database-setup

External Dependencies

  • Vector database (Pinecone, Weaviate, Chroma, Milvus, Qdrant)
  • Embedding provider (OpenAI, Cohere, Hugging Face)
  • BM25 encoder (rank_bm25, pinecone-text)

References

Related Skills

  • SK-RAG-001 rag-chunking-strategy
  • SK-RAG-004 rag-reranking
  • SK-RAG-005 rag-query-transformation
  • SK-VDB-001 through SK-VDB-005 (vector database integrations)

Related Agents

  • AG-RAG-001 rag-pipeline-architect
  • AG-RAG-003 vector-db-specialist
  • AG-RAG-004 retrieval-optimizer

Expand your agent's capabilities with these related and highly-rated skills.

a5c-ai/babysitter

gsd-tools

Central utility skill for GSD operations. Provides config parsing, slug generation, timestamps, path operations, and orchestrates calls to other specialized skills. Acts as the unified entry point that the original gsd-tools.cjs provided via its lib/ modules (commands, config, core, init).

514 31
Explore
a5c-ai/babysitter

model-profile-resolution

Resolve model profile (quality/balanced/budget) at orchestration start and map agents to specific models. Enables cost/quality tradeoffs by selecting appropriate AI models for each agent role.

514 31
Explore
a5c-ai/babysitter

verification-suite

Plan structure validation, phase completeness checks, reference integrity verification, and artifact existence confirmation. Provides the structured verification layer ensuring GSD artifacts are well-formed and complete.

514 31
Explore
a5c-ai/babysitter

state-management

STATE.md reading, writing, and field-level updates. Provides cross-session state persistence via .planning/STATE.md with structured fields for current task, completed phases, blockers, decisions, and quick tasks.

514 31
Explore
a5c-ai/babysitter

git-integration

Git commit patterns, formats, and conventions for GSD methodology. Provides atomic commits per task, structured commit messages, planning file commits, branch management, and milestone tag operations.

514 31
Explore
a5c-ai/babysitter

frontmatter-parsing

YAML frontmatter parsing and manipulation for .planning/ documents. Provides read, write, update, query, and validation operations on frontmatter blocks in GSD markdown artifacts.

514 31
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results