Embedding transforms text summaries into high-dimensional vectors (embeddings) that capture semantic meaning. Similar conversations produce similar embeddings, which enables mathematical clustering. Kura supports multiple embedding providers through the BaseEmbeddingModel interface.
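To make the clustering intuition concrete, the sketch below compares toy embedding vectors with cosine similarity. The vectors and texts are invented for illustration only; real embeddings come from one of the embedding models described below.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Higher values mean the two texts are semantically closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up vectors standing in for real embeddings of three summaries
refund_request = [0.12, 0.87, 0.33]   # "User asks how to get a refund"
billing_issue = [0.10, 0.82, 0.41]    # "User was charged twice this month"
recipe_question = [0.91, 0.05, 0.22]  # "User asks how long to bake sourdough"

print(cosine_similarity(refund_request, billing_issue))   # high: related topics
print(cosine_similarity(refund_request, recipe_question)) # low: unrelated topics
```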
Implement BaseEmbeddingModel for custom providers:
```python
from kura.base_classes import BaseEmbeddingModel

class CustomEmbeddingModel(BaseEmbeddingModel):
    def slug(self) -> str:
        """Unique identifier for this model configuration."""
        return f"custom-model-{self.version}"

    async def embed(self, texts: list[str]) -> list[list[float]]:
        """Convert texts to embeddings.

        Args:
            texts: List of text strings to embed

        Returns:
            List of embedding vectors (one per input text)
        """
        # Your implementation here
        embeddings = await self.api_client.embed_batch(texts)
        return embeddings
```
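A minimal usage sketch follows, assuming slug() and embed() are the only methods you need to override. The FakeAPIClient and the way version and api_client are attached are assumptions for illustration; the template above leaves provider setup to you.

```python
import asyncio

class FakeAPIClient:
    """Stand-in for a real provider client; returns fixed-size dummy vectors."""
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [[0.0] * 8 for _ in texts]

async def main() -> None:
    model = CustomEmbeddingModel()
    # The template references self.version and self.api_client but does not
    # define them; wiring them up directly here is purely illustrative.
    model.version = "v1"
    model.api_client = FakeAPIClient()

    embeddings = await model.embed([
        "How do I reset my password?",
        "Cancel my subscription",
    ])
    print(len(embeddings), "embeddings of dimension", len(embeddings[0]))

asyncio.run(main())
```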
```python
from kura.embedding import OpenAIEmbeddingModel, SentenceTransformerEmbeddingModel

# OpenAI: Larger batches = fewer API calls
embedding_model = OpenAIEmbeddingModel(
    model_batch_size=2048  # Max allowed by OpenAI
)

# Sentence Transformers: Tune based on GPU memory
embedding_model = SentenceTransformerEmbeddingModel(
    model_batch_size=256,  # Increase if you have GPU memory
    device="cuda"
)
```
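The arithmetic behind model_batch_size is simple: texts are split into fixed-size chunks and each chunk becomes one embedding request. The helper below is not Kura's internal code, just an illustration of how the batch size determines the number of API calls (matching the "20 batches of size 50" log line shown later).

```python
def split_into_batches(texts: list[str], batch_size: int) -> list[list[str]]:
    """Chunk texts into consecutive batches of at most batch_size items."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

texts = [f"summary {i}" for i in range(1000)]
batches = split_into_batches(texts, batch_size=50)
print(len(batches))  # 20 batches -> 20 embedding calls instead of 1000
```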
```python
# OpenAI: Balance rate limits vs speed
embedding_model = OpenAIEmbeddingModel(
    n_concurrent_jobs=10  # Higher = faster, but may hit rate limits
)

# Monitor rate limits: 5,000 RPM typical for most accounts
```
Store embeddings in checkpoints to avoid re-computation:
```python
# Embeddings are automatically saved when using checkpoint managers
from kura.checkpoints import HFDatasetCheckpointManager

checkpoint_mgr = HFDatasetCheckpointManager("./checkpoints")

# First run: computes embeddings
summaries = await summarise_conversations(
    conversations=conversations,
    model=summary_model,
    checkpoint_manager=checkpoint_mgr
)

# Embeddings are in summary.metadata["embedding"] if cached
```
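On a repeat run with the same checkpoint directory, results are loaded from the checkpoint instead of being re-computed. Reading a cached embedding back out is sketched below, assuming metadata behaves like a dictionary as the comment above suggests.

```python
# Second run: same checkpoint manager -> loads from ./checkpoints
summaries = await summarise_conversations(
    conversations=conversations,
    model=summary_model,
    checkpoint_manager=checkpoint_mgr,
)

first = summaries[0]
embedding = first.metadata.get("embedding")  # None if not cached yet
if embedding is not None:
    print("cached embedding dimension:", len(embedding))
```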
```python
import logging

logging.basicConfig(level=logging.INFO)

# Output:
# INFO:kura.embedding:Initialized OpenAIEmbeddingModel with model=text-embedding-3-small
# INFO:kura.embedding:Starting embedding of 1000 texts using text-embedding-3-small
# DEBUG:kura.embedding:Split 1000 texts into 20 batches of size 50
# INFO:kura.embedding:Successfully embedded 1000 texts, produced 1000 embeddings
```