Bala - Embedding Generation Agent
Bala is the agent responsible for generating and managing text embeddings in the TKM AI Agency Platform. It provides vector representations of text that can be used for semantic search, similarity comparisons, and other NLP tasks.
Overview
The Bala agent handles:
- Text embedding generation using OpenAI's models
- Efficient processing of large texts
- Embedding storage and retrieval
- Integration with other agents for text processing
Directory Structure
Bala/
├── bala.py # Main agent implementation
├── tools.py # Core functionality and utilities
├── tools_schema.py # Data models and schemas
├── tools_definitions.py # Embedding configurations
├── data_validation.py # Data validation utilities
└── data/ # Storage directory for embeddings
Core Components
Embedding Models
from enum import Enum

class EmbeddingModels(str, Enum):
    OPENAI_3_SMALL = "text-embedding-3-small"
    OPENAI_3_LARGE = "text-embedding-3-large"
Configuration
EMBEDDING_CONFIG = {
    "default_model": EmbeddingModels.OPENAI_3_SMALL,
    "max_tokens_per_request": 8191,
    "batch_size": 100,
    "retry_attempts": 3,
    "dimensions": {
        EmbeddingModels.OPENAI_3_SMALL: 1536,
        EmbeddingModels.OPENAI_3_LARGE: 3072
    },
    "cost_per_1k_tokens": {
        EmbeddingModels.OPENAI_3_SMALL: 0.00002,
        EmbeddingModels.OPENAI_3_LARGE: 0.00013
    }
}
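The dimensions and cost tables make it easy to estimate the size and price of a request before sending it. Below is a minimal illustrative helper (not part of the agent's code) that assumes the EMBEDDING_CONFIG shown above:

def estimate_request(model: EmbeddingModels, token_count: int) -> dict:
    """Estimate vector size and API cost for a request (illustrative helper)."""
    return {
        "model": model.value,
        "dimensions": EMBEDDING_CONFIG["dimensions"][model],
        "estimated_cost_usd": round(token_count / 1000 * EMBEDDING_CONFIG["cost_per_1k_tokens"][model], 6),
    }

# estimate_request(EmbeddingModels.OPENAI_3_SMALL, 8000)
# -> {'model': 'text-embedding-3-small', 'dimensions': 1536, 'estimated_cost_usd': 0.00016}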
Data Models
Embedding Request
from typing import Dict, List, Optional, Union

from pydantic import BaseModel

class EmbeddingRequest(BaseModel):
    texts: List[str]
    user_id: str
    session_id: Optional[str] = None
    context_id: str
    organization_id: str
Embedding Result
class EmbeddingResult(BaseModel):
    success: bool
    embeddings: Optional[Dict[str, List[float]]] = None
    metadata: Optional[Dict[str, Union[str, int]]] = None
    embedding_id: Optional[str] = None
    error: Optional[str] = None
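For reference, a round trip through these models might look like the following; every value here is hypothetical and only illustrates how the request and result relate (real vectors have 1536 or 3072 elements):

request = EmbeddingRequest(
    texts=["How do I reset my password?"],
    user_id="user_123",
    session_id="session_456",
    context_id="conv_789",
    organization_id="org_001",
)

result = EmbeddingResult(
    success=True,
    embeddings={"0": [0.013, -0.082, 0.045]},  # truncated vector, shown for illustration
    metadata={"model": "text-embedding-3-small", "total_tokens": 9},
    embedding_id="emb_20240101_abc123",
    error=None,
)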
Main Features
Text Processing
- Automatic text chunking for long inputs
- Batch processing capabilities
- Retry mechanism for API calls
- Support for multiple text sources (raw text, images, audio transcripts)
Embedding Generation
- High-dimensional vector generation
- Model selection based on requirements
- Efficient handling of rate limits
- Error handling and retries
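Generation ultimately goes through OpenAI's embeddings endpoint. A minimal sketch of that call with the current openai Python client (model selection, batching, and error handling simplified relative to the actual agent):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_texts(texts: list[str], model: str = "text-embedding-3-small") -> list[list[float]]:
    """Return one embedding vector per input text, in input order."""
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]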
Storage Management
- Organized file structure by user and session
- Metadata tracking
- Efficient retrieval system
- Integration with Atta for conversation context
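On disk, embeddings are grouped under data/ by user and session. The sketch below shows one plausible read/write layout; the JSON-file-per-embedding_id naming is an assumption for illustration, not the agent's exact scheme:

import json
from pathlib import Path

DATA_DIR = Path("Bala/data")

def save_embedding(user_id: str, session_id: str, embedding_id: str, payload: dict) -> Path:
    """Persist an embedding plus its metadata as JSON (illustrative layout)."""
    path = DATA_DIR / user_id / session_id / f"{embedding_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload))
    return path

def load_embedding(user_id: str, session_id: str, embedding_id: str) -> dict:
    path = DATA_DIR / user_id / session_id / f"{embedding_id}.json"
    return json.loads(path.read_text())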
API Reference
Actions
process_text
Processes text and generates embeddings.
{
    "action": "process_text",
    "data": {
        "texts": List[str],
        "user_id": str,
        "session_id": str,
        "context_id": str,
        "organization_id": str,
        "source_type": str  # Optional: "text", "audio", "image"
    }
}
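A concrete message for this action might look like the sketch below (all identifiers are hypothetical; how the message is delivered, e.g. via the mediator's route_request shown in the Integration section, depends on the calling agent):

message = {
    "action": "process_text",
    "data": {
        "texts": ["The quarterly report is attached."],
        "user_id": "user_123",
        "session_id": "session_456",
        "context_id": "conv_789",
        "organization_id": "org_001",
        "source_type": "text",
    },
}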
Integration
With Atta
Bala automatically updates Atta with embedding information:
await self.mediator.route_request(
    from_agent="bala",
    to_agent="atta",
    data={
        "action": "update_message_embedding",
        "conversation_id": context_id,
        "message_timestamp": timestamp,
        "embedding_id": embedding_id
    }
)
Performance Features
Text Chunking
- Automatic splitting of long texts
- Maximum token limit handling (8000 tokens)
- Chunk averaging for consistent embeddings
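A simplified view of the chunk-and-average strategy: text that exceeds the limit is split, each chunk is embedded, and the element-wise mean becomes the single vector for the whole input. The character-based splitting below is a deliberately naive stand-in for the agent's token-aware chunker, and embed_texts() refers to the sketch in the Embedding Generation section:

def average_embeddings(chunk_embeddings: list[list[float]]) -> list[float]:
    """Element-wise mean of per-chunk vectors, yielding one vector for the whole text."""
    n = len(chunk_embeddings)
    return [sum(values) / n for values in zip(*chunk_embeddings)]

def embed_long_text(text: str, max_chars: int = 8000) -> list[float]:
    # Naive character-based splitting; the real chunker counts tokens.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return average_embeddings(embed_texts(chunks))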
Error Handling
- Retry mechanism with exponential backoff
- Detailed error logging
- Graceful failure handling
- API timeout management
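The retry logic pairs a bounded number of attempts (retry_attempts in the config) with exponentially growing waits between them. A generic sketch, assuming the embedding call is an async callable:

import asyncio
import logging

logger = logging.getLogger("bala")

async def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Run an async callable, retrying failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except Exception as exc:  # in practice, narrow this to API and timeout errors
            logger.warning("Embedding call failed (attempt %d/%d): %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...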
Optimization
- Batch processing for multiple texts
- Caching of frequently used embeddings
- Efficient memory management
- Cost optimization through model selection
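Caching can be as simple as keying previously computed vectors by a hash of the model and input text; the in-memory dictionary below illustrates the idea (the agent may instead persist its cache under data/), reusing embed_texts() from the earlier sketch:

import hashlib

_embedding_cache: dict[str, list[float]] = {}

def cached_embedding(text: str, model: str = "text-embedding-3-small") -> list[float]:
    """Return a cached vector when the same model/text pair has already been embedded."""
    key = hashlib.sha256(f"{model}:{text}".encode()).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = embed_texts([text], model=model)[0]
    return _embedding_cache[key]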