Bala - Embedding Generation Agent

Bala is the agent responsible for generating and managing text embeddings in the TKM AI Agency Platform. It provides vector representations of text that can be used for semantic search, similarity comparisons, and other NLP tasks.

Overview

The Bala agent handles:

  • Text embedding generation using OpenAI's models
  • Efficient processing of large texts
  • Embedding storage and retrieval
  • Integration with other agents for text processing

Directory Structure

Bala/
├── bala.py              # Main agent implementation
├── tools.py             # Core functionality and utilities
├── tools_schema.py      # Data models and schemas
├── tools_definitions.py # Embedding configurations
├── data_validation.py   # Data validation utilities
└── data/                # Storage directory for embeddings

Core Components

Embedding Models

from enum import Enum

class EmbeddingModels(str, Enum):
    OPENAI_3_SMALL = "text-embedding-3-small"
    OPENAI_3_LARGE = "text-embedding-3-large"

Configuration

EMBEDDING_CONFIG = {
    "default_model": EmbeddingModels.OPENAI_3_SMALL,
    "max_tokens_per_request": 8191,
    "batch_size": 100,
    "retry_attempts": 3,
    "dimensions": {
        EmbeddingModels.OPENAI_3_SMALL: 1536,
        EmbeddingModels.OPENAI_3_LARGE: 3072
    },
    "cost_per_1k_tokens": {
        EmbeddingModels.OPENAI_3_SMALL: 0.00002,
        EmbeddingModels.OPENAI_3_LARGE: 0.00013
    }
}
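As a quick illustration, per-request cost can be estimated directly from the pricing table in this configuration. The `estimate_cost` helper below is a hypothetical sketch, not part of the Bala codebase; it restates the relevant config entries so the snippet is self-contained:

```python
from enum import Enum

class EmbeddingModels(str, Enum):
    OPENAI_3_SMALL = "text-embedding-3-small"
    OPENAI_3_LARGE = "text-embedding-3-large"

# Subset of EMBEDDING_CONFIG needed for cost estimation.
COST_PER_1K_TOKENS = {
    EmbeddingModels.OPENAI_3_SMALL: 0.00002,
    EmbeddingModels.OPENAI_3_LARGE: 0.00013,
}

def estimate_cost(token_count: int, model: EmbeddingModels) -> float:
    """Estimate the USD cost of embedding `token_count` tokens with `model`."""
    return token_count / 1000 * COST_PER_1K_TOKENS[model]
```

For example, embedding 10,000 tokens with the small model costs roughly $0.0002, while the large model costs about 6.5x more per token.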

Data Models

Embedding Request

from typing import List, Optional

from pydantic import BaseModel

class EmbeddingRequest(BaseModel):
    texts: List[str]
    user_id: str
    session_id: Optional[str] = None
    context_id: str
    organization_id: str

Embedding Result

class EmbeddingResult(BaseModel):
    success: bool
    embeddings: Optional[Dict[str, List[float]]] = None
    metadata: Optional[Dict[str, Union[str, int]]] = None
    embedding_id: Optional[str] = None
    error: Optional[str] = None

Main Features

Text Processing

  • Automatic text chunking for long inputs
  • Batch processing capabilities
  • Retry mechanism for API calls
  • Support for multiple text sources (raw text, images, audio transcripts)
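The chunking step can be sketched roughly as follows. The `chunk_text` name and the word-based token approximation are assumptions; production code would count tokens with a real tokenizer such as tiktoken:

```python
def chunk_text(text: str, max_tokens: int = 8191, tokens_per_word: float = 1.3) -> list[str]:
    """Split text into chunks that stay under an approximate token budget.

    Token counts are approximated from word counts (~1.3 tokens per English
    word); real code would measure tokens with an actual tokenizer.
    """
    words = text.split()
    words_per_chunk = max(1, int(max_tokens / tokens_per_word))
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
```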

Embedding Generation

  • High-dimensional vector generation
  • Model selection based on requirements
  • Efficient handling of rate limits
  • Error handling and retries

Storage Management

  • Organized file structure by user and session
  • Metadata tracking
  • Efficient retrieval system
  • Integration with Atta for conversation context
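Assuming a per-user, per-session layout under the `data/` directory, path construction might look like the sketch below. The exact directory layout and file extension are assumptions, not documented behavior:

```python
from pathlib import Path

def embedding_path(base: Path, user_id: str, session_id: str, embedding_id: str) -> Path:
    """Build a storage path organized by user and session (hypothetical layout)."""
    return base / user_id / session_id / f"{embedding_id}.json"
```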

API Reference

Actions

process_text

Processes text and generates embeddings.

{
    "action": "process_text",
    "data": {
        "texts": List[str],
        "user_id": str,
        "session_id": str,
        "context_id": str,
        "organization_id": str,
        "source_type": str  # Optional: "text", "audio", "image"
    }
}
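A caller might assemble this payload as follows. The `build_process_text_request` helper is purely illustrative and not part of the Bala API; it only mirrors the field names documented above:

```python
def build_process_text_request(texts, user_id, context_id, organization_id,
                               session_id=None, source_type="text"):
    """Assemble a process_text action payload (hypothetical helper)."""
    if source_type not in ("text", "audio", "image"):
        raise ValueError(f"unsupported source_type: {source_type}")
    if not texts:
        raise ValueError("texts must be a non-empty list")
    return {
        "action": "process_text",
        "data": {
            "texts": list(texts),
            "user_id": user_id,
            "session_id": session_id,
            "context_id": context_id,
            "organization_id": organization_id,
            "source_type": source_type,
        },
    }
```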

Integration

With Atta

Bala automatically updates Atta with embedding information:

await self.mediator.route_request(
    from_agent="bala",
    to_agent="atta",
    data={
        "action": "update_message_embedding",
        "conversation_id": context_id,
        "message_timestamp": timestamp,
        "embedding_id": embedding_id
    }
)

Performance Features

Text Chunking

  • Automatic splitting of long texts
  • Maximum token limit handling (8,191 tokens per request)
  • Chunk averaging for consistent embeddings
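Chunk averaging can be illustrated with a minimal sketch: per-chunk vectors are combined by an elementwise mean, which keeps the output dimensionality identical to a single-chunk embedding. The `average_embeddings` helper is hypothetical:

```python
def average_embeddings(chunks: list[list[float]]) -> list[float]:
    """Combine per-chunk embedding vectors into one vector by elementwise mean."""
    if not chunks:
        raise ValueError("no chunk embeddings to average")
    n = len(chunks)
    # zip(*chunks) iterates over each dimension across all chunk vectors.
    return [sum(values) / n for values in zip(*chunks)]
```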

Error Handling

  • Retry mechanism with exponential backoff
  • Detailed error logging
  • Graceful failure handling
  • API timeout management
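A retry loop with exponential backoff might be sketched as follows. The `with_retries` helper and its delay schedule are assumptions; Bala's actual attempt count comes from `EMBEDDING_CONFIG["retry_attempts"]`:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying on failure with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```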

Optimization

  • Batch processing for multiple texts
  • Caching of frequently used embeddings
  • Efficient memory management
  • Cost optimization through model selection
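Caching of repeated inputs can be sketched as an in-memory store keyed by a hash of the text plus the model name. The `EmbeddingCache` class below is a hypothetical illustration, not Bala's actual storage layer:

```python
import hashlib

class EmbeddingCache:
    """In-memory embedding cache keyed by (model, sha256-of-text). Sketch only."""

    def __init__(self):
        self._store = {}

    def _key(self, text: str, model: str):
        return (model, hashlib.sha256(text.encode("utf-8")).hexdigest())

    def get_or_compute(self, text: str, model: str, compute):
        """Return the cached embedding, calling compute(text) only on a miss."""
        key = self._key(text, model)
        if key not in self._store:
            self._store[key] = compute(text)
        return self._store[key]
```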