Hova - Image Processing Agent
Overview
Hova is the agent responsible for image processing in the TKM AI Agency Platform. Its main functions include image analysis, OCR text extraction, image description generation, and coordination with other agents for comprehensive image processing.
Directory Structure
Hova/
├── data/ # Directory for processed images and metadata
├── hova.py # Main agent implementation
├── api_hova.py # FastAPI endpoints
├── tools.py # Core image processing utilities
├── tools_schema.py # Data models and schemas
├── tools_definitions.py # Tool definitions
└── data_validation.py # Data validation utilities
Main Components
HovaAgent Class
The main class handles image processing and coordination. It is responsible for:
- Integration with mediator and event bus
- Image processing pipeline management
- Data path and storage handling
- Metadata management
Image Processing Pipeline
- Image validation and preprocessing
- OCR text extraction
- AI-powered image description generation
- Thumbnail creation
- Metadata extraction and storage
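The stages above could be chained as in the following minimal sketch. Everything in it is hypothetical: PipelineResult, the step functions, and the placeholder OCR, description, and thumbnail steps are illustrative and are not taken from hova.py.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    """Accumulates stage outputs (hypothetical structure, not from hova.py)."""
    image_bytes: bytes
    ocr_text: str = ""
    description: str = ""
    thumbnail: bytes = b""
    metadata: dict = field(default_factory=dict)


def validate(result: PipelineResult) -> PipelineResult:
    # Stage 1: reject empty payloads before any expensive work.
    if not result.image_bytes:
        raise ValueError("empty image payload")
    return result


def extract_text(result: PipelineResult) -> PipelineResult:
    # Stage 2: placeholder standing in for the real OCR call.
    result.ocr_text = "<ocr text>"
    return result


def describe(result: PipelineResult) -> PipelineResult:
    # Stage 3: placeholder standing in for the AI description request.
    result.description = "<description>"
    return result


def make_thumbnail(result: PipelineResult) -> PipelineResult:
    # Stage 4: placeholder; a real step would resize the decoded image.
    result.thumbnail = result.image_bytes
    return result


def extract_metadata(result: PipelineResult) -> PipelineResult:
    # Stage 5: record only what is known without decoding the image.
    result.metadata = {"size": f"{len(result.image_bytes)} bytes"}
    return result


STEPS = (validate, extract_text, describe, make_thumbnail, extract_metadata)


def run_pipeline(image_bytes: bytes, steps=STEPS) -> PipelineResult:
    """Run each stage in order, passing the accumulated result along."""
    result = PipelineResult(image_bytes=image_bytes)
    for step in steps:
        result = step(result)
    return result
```

Keeping each stage as a plain function makes it easy to swap a placeholder for a real implementation or to test a stage in isolation.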
Key Features
Image Analysis
- OCR text extraction from images
- AI-generated image descriptions using Groq
- Image metadata extraction (format, size, dimensions)
- Thumbnail generation
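As an illustration of metadata extraction, the sketch below reads the format, size, and dimensions of a PNG directly from its header using only the standard library. The function name png_metadata and the string formats for size and dimensions are assumptions; Hova may well use an imaging library and different formats.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_metadata(data: bytes) -> dict:
    """Return format, size, and dimensions for PNG data (hypothetical helper)."""
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then big-endian width and height.
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return {
        "format": "PNG",
        "size": f"{len(data)} bytes",        # assumed string format
        "dimensions": f"{width}x{height}",   # assumed string format
    }


# A synthetic 24-byte header is enough to exercise the parser.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)
info = png_metadata(header)
```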
Data Management
- Structured image storage
- Metadata organization
- Integration with Niger for data persistence
- File path and reference management
Integration Features
- Embedding generation through Bala
- Message storage with Atta
- Document classification with Orion
- Data persistence with Niger
API Endpoints
Image Processing
/process_image: Main endpoint for image processing
- Accepts image file uploads
- Returns processed results including:
  - OCR text
  - Image description
  - Metadata
  - File references
  - Folder structure
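A client could upload an image to this endpoint roughly as follows. This is a sketch under several assumptions that the source does not confirm: the HTTP method (POST), the form field name (file), and the host and port (localhost:8000). Only the standard library is used, and the request is built but not sent.

```python
import urllib.request
import uuid


def build_upload_request(url: str, filename: str, image_bytes: bytes,
                         content_type: str = "image/png") -> urllib.request.Request:
    """Build a multipart/form-data upload request (field name 'file' is assumed)."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",                          # blank line separates part headers from content
        image_bytes,
        f"--{boundary}--".encode(),   # closing boundary
        b"",
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # assumed; the source does not state the method
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical host and port; adjust to the actual deployment.
request = build_upload_request(
    "http://localhost:8000/process_image", "scan.png", b"\x89PNG\r\n\x1a\n"
)
# To send it: urllib.request.urlopen(request) and parse the JSON response.
```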
Integration
Agent Communication
- Atta: Message storage and conversation management
- Bala: Text embedding generation for OCR and descriptions
- Orion: Document classification and folder structure
- Niger: Data persistence and storage
Data Flow
- Image reception and validation
- Processing and feature extraction
- Embedding generation
- Metadata storage
- Classification and organization
- Message delivery to conversation
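The flow above can be sketched as a single coroutine that calls each agent in turn. The agent names come from this document, but the functions bala_embed, niger_store, orion_classify, and atta_send are hypothetical stand-ins that return fixed values; the real calls presumably go through the mediator or event bus.

```python
import asyncio


# Hypothetical stand-ins for the agent clients.
async def bala_embed(text: str) -> str:
    return "emb-001"


async def niger_store(record: dict) -> bool:
    return True


async def orion_classify(text: str) -> dict:
    return {"folder_name": "unsorted", "category": "unknown",
            "subcategory": "unknown", "state": "pending", "confidence": 0.0}


async def atta_send(message: str) -> None:
    return None


async def handle_image(image_bytes: bytes) -> dict:
    # 1. Image reception and validation
    if not image_bytes:
        raise ValueError("empty image payload")
    # 2. Processing and feature extraction (placeholders)
    ocr_text, description = "<ocr text>", "<description>"
    # 3. Embedding generation (Bala)
    embedding_id = await bala_embed(f"{ocr_text}\n{description}")
    record = {"ocr_text": ocr_text, "description": description,
              "embedding_id": embedding_id}
    # 4. Metadata storage (Niger)
    if not await niger_store(record):
        raise RuntimeError("persistence failed")
    # 5. Classification and organization (Orion)
    record["folder_structure"] = await orion_classify(description)
    # 6. Message delivery to the conversation (Atta)
    await atta_send(description)
    return record
```

Running it with asyncio.run(handle_image(data)) returns the assembled record once all six steps have completed.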
Error Handling
- Image format validation
- Processing error management
- Integration error handling
- Data persistence verification
- Comprehensive error logging
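One way to combine format validation with error logging is shown below. It is a sketch only: ImageFormatError, detect_format, safe_process, the logger name, and the set of accepted formats are all assumptions, not Hova's actual implementation.

```python
import logging

logger = logging.getLogger("hova")  # logger name is an assumption


class ImageFormatError(ValueError):
    """Raised when uploaded data is not a recognized image format."""


# Leading magic bytes for a few common formats (assumed supported set).
MAGIC_BYTES = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def detect_format(data: bytes) -> str:
    """Identify the image format from its leading bytes."""
    for magic, name in MAGIC_BYTES.items():
        if data.startswith(magic):
            return name
    raise ImageFormatError("unsupported or corrupt image data")


def safe_process(data: bytes) -> dict:
    """Validate the format, logging and reporting failures instead of raising."""
    try:
        image_format = detect_format(data)
    except ImageFormatError as exc:
        logger.error("image rejected: %s", exc)
        return {"status": "error", "error": str(exc)}
    return {"status": "ok", "format": image_format}
```

Returning a structured error instead of letting the exception propagate keeps one bad upload from interrupting the rest of the pipeline.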
Performance Features
- Asynchronous image processing
- Efficient storage management
- Thumbnail generation
- Optimized integration with other agents
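As a rough illustration of asynchronous processing, independent steps such as OCR and description generation can run concurrently instead of one after the other. The functions run_ocr and describe_image below are hypothetical blocking placeholders; asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import time


def run_ocr(image_bytes: bytes) -> str:
    time.sleep(0.1)  # simulates a blocking OCR call
    return "<ocr text>"


def describe_image(image_bytes: bytes) -> str:
    time.sleep(0.1)  # simulates a blocking description request
    return "<description>"


async def analyze(image_bytes: bytes) -> dict:
    # Run both blocking calls in worker threads and wait for both results.
    ocr_text, description = await asyncio.gather(
        asyncio.to_thread(run_ocr, image_bytes),
        asyncio.to_thread(describe_image, image_bytes),
    )
    return {"ocr_text": ocr_text, "description": description}
```

Because the two calls overlap, the total wait is close to the slower of the two rather than their sum.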
Data Storage
Image Metadata
{
    "file_path": str,
    "thumbnail_path": str,
    "metadata": {
        "format": str,
        "size": str,
        "dimensions": str
    },
    "embedding_id": str,
    "tokens_info": {
        "prompt_tokens": int,
        "completion_tokens": int,
        "total_tokens": int
    },
    "folder_structure": {
        "folder_name": str,
        "category": str,
        "subcategory": str,
        "state": str,
        "confidence": float
    }
}
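The structure above could be expressed as typed dictionaries, which lets a type checker catch misspelled keys. The class names, the missing_keys helper, and every value in the example record are illustrative assumptions; the actual models live in tools_schema.py and may differ.

```python
from typing import TypedDict, get_type_hints


class Metadata(TypedDict):
    format: str
    size: str
    dimensions: str


class TokensInfo(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


class FolderStructure(TypedDict):
    folder_name: str
    category: str
    subcategory: str
    state: str
    confidence: float


class ImageRecord(TypedDict):
    file_path: str
    thumbnail_path: str
    metadata: Metadata
    embedding_id: str
    tokens_info: TokensInfo
    folder_structure: FolderStructure


def missing_keys(record: dict) -> list:
    """Return the top-level ImageRecord keys that are absent from record."""
    return sorted(set(get_type_hints(ImageRecord)) - set(record))


# Illustrative values only; none of them come from a real Hova run.
example: ImageRecord = {
    "file_path": "data/example.png",
    "thumbnail_path": "data/example_thumb.png",
    "metadata": {"format": "PNG", "size": "24 bytes", "dimensions": "640x480"},
    "embedding_id": "emb-001",
    "tokens_info": {"prompt_tokens": 120, "completion_tokens": 45, "total_tokens": 165},
    "folder_structure": {"folder_name": "unsorted", "category": "unknown",
                         "subcategory": "unknown", "state": "pending",
                         "confidence": 0.0},
}
```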