Embeddings Skill

This skill computes semantic embeddings for text and measures the semantic distance between sentences using sentence transformers.

Instructions

To compute embeddings and measure distance:

Accept one or more sentences as input
Use the sentence-transformers library with the 'all-MiniLM-L6-v2' model
Compute embeddings (384-dimensional vectors)
Calculate cosine distance or Euclidean distance between embeddings
Return the distance metric

What are Embeddings?

Embeddings are numerical vector representations of text that capture semantic meaning. Sentences with similar meanings have embeddings that are close together in vector space.

Distance Metrics

Cosine Distance

Range: [0, 2]
0 = identical meaning
2 = opposite meaning
Most commonly used for semantic similarity

Euclidean Distance

Range: [0, ∞)
Measures straight-line distance in embedding space
Sensitive to magnitude differences

Python Code Usage

python

import sys
sys.path.append('.claude/skills/embeddings')
from embedding_utils import (
    compute_embedding,
    cosine_distance,
    semantic_similarity
)

# Compute embeddings
sentence1 = "The cat sits on the mat"
sentence2 = "A feline rests on the rug"

emb1 = compute_embedding(sentence1)
emb2 = compute_embedding(sentence2)

# Calculate distance
distance = cosine_distance(emb1, emb2)
print(f"Semantic distance: {distance:.4f}")

# Or use the convenience function
distance = semantic_similarity(sentence1, sentence2, distance_metric='cosine')

Usage Example

Similar sentences:

Sentence 1: "The dog runs in the park"
Sentence 2: "A canine sprints through the garden"
Cosine Distance: ~0.15 (very similar)

Different sentences:

Sentence 1: "The dog runs in the park"
Sentence 2: "Quantum physics explains particle behavior"
Cosine Distance: ~0.85 (very different)

Functions Available

compute_embedding(sentence) - Get embedding vector for a sentence
compute_embeddings_batch(sentences) - Get embeddings for multiple sentences
cosine_distance(emb1, emb2) - Calculate cosine distance
euclidean_distance(emb1, emb2) - Calculate Euclidean distance
semantic_similarity(sent1, sent2, metric) - End-to-end similarity calculation

Model Information

Model: all-MiniLM-L6-v2
Embedding dimensions: 384
Performance: Fast and efficient
Use case: General-purpose sentence embeddings

Notes

The model is loaded once and cached for efficiency
Requires sentence-transformers package
Works well for short texts (sentences, paragraphs)
Language: Primarily English, with some multilingual capability

Search AI Tools

embeddings

Install this agent skill to your Project

SKILL.md