Rufa - Audio Processing Agent
Overview
Rufa is the audio processing agent of the TKM AI Agency Platform. It validates and preprocesses audio files, transcribes them with Groq's Whisper model, and stores the resulting transcriptions and processed recordings.
Directory Structure
Rufa/
├── data/                  # Directory for transcription storage
├── data_recordings/       # Directory for processed audio files
├── rufa.py                # Main agent implementation
├── api_rufa.py            # FastAPI endpoints
├── tools.py               # Audio processing utilities
├── tools_schema.py        # Data models and schemas
├── tools_definitions.py   # Constants and definitions
└── data_validation.py     # Audio validation utilities
Main Components
RufaAgent Class
The main class handles audio processing and transcription. It covers:
- Groq API integration
- Audio file management
- Transcription processing
- Data path handling
Audio Processing Pipeline
- Audio file validation
- Audio preprocessing
- Transcription generation
- Result storage and management
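The four stages above can be read as a chain of functions that each take and return the working state. The following is only a sketch under that assumption: `run_pipeline`, `STAGES`, and the stage functions are hypothetical names, and the stage bodies are placeholders rather than the logic in rufa.py.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]


def validate(state: State) -> State:
    # Reject requests that do not point at an audio file.
    if not state.get("audio_path"):
        raise ValueError("audio_path is required")
    return state


def preprocess(state: State) -> State:
    # Placeholder: a real step might convert or normalise the file.
    return {**state, "processed_path": state["audio_path"]}


def transcribe(state: State) -> State:
    # Placeholder standing in for the call to the transcription service.
    return {**state, "text": "<transcript>"}


def store(state: State) -> State:
    # Placeholder: a real step would write the transcript under data/.
    return {**state, "stored": True}


def run_pipeline(state: State, stages: List[Stage]) -> State:
    """Run the stages in order and stop at the first one that fails."""
    for stage in stages:
        try:
            state = stage(state)
        except Exception as exc:
            return {**state, "success": False, "error": f"{stage.__name__}: {exc}"}
    return {**state, "success": True}


STAGES = [validate, preprocess, transcribe, store]
```

Stopping at the first failing stage keeps the error message tied to the step that produced it, which is useful when a later stage (for example storage) depends on the output of an earlier one.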
Key Features
Audio Processing
- Audio file validation
- Format compatibility checks
- Duration calculation
- File preprocessing
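As an illustration of the format check and duration calculation, here is a minimal sketch using only the Python standard library. The extension allow-list and both function names are assumptions (the real list presumably lives in tools_definitions.py), and the duration helper only handles WAV files; other formats would need an audio library.

```python
import wave
from pathlib import Path

# Assumed allow-list for illustration only.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}


def is_supported(path: str) -> bool:
    """Check the file extension against the allow-list, ignoring case."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def wav_duration_seconds(path: str) -> float:
    """Duration of a WAV file: number of frames divided by the frame rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```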
Transcription Service
- Integration with Groq's Whisper model
- JSON response format
- Error handling
- Quality validation
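A transcription call with error handling and a basic quality check might look like the sketch below. This README does not say which Whisper variant or SDK call Rufa uses, so the model name `whisper-large-v3`, the helper name `transcribe_file`, and the "empty text" quality check are all assumptions; the `client.audio.transcriptions.create` call follows Groq's Python SDK. The client is passed in so that it can be replaced by a stub in tests.

```python
from typing import Any, Dict


def transcribe_file(client: Any, audio_path: str,
                    model: str = "whisper-large-v3") -> Dict[str, Any]:
    """Transcribe one file; `client` is expected to behave like groq.Groq()."""
    try:
        with open(audio_path, "rb") as fh:
            audio_bytes = fh.read()
    except OSError as exc:
        return {"success": False, "text": None, "error": f"cannot read audio: {exc}"}

    try:
        result = client.audio.transcriptions.create(
            file=(audio_path, audio_bytes),
            model=model,               # assumed model name
            response_format="json",    # JSON response format, as listed above
        )
    except Exception as exc:  # network, authentication, or rate-limit errors
        return {"success": False, "text": None, "error": f"transcription failed: {exc}"}

    # Assumed minimal quality check: reject empty or whitespace-only output.
    text = (getattr(result, "text", None) or "").strip()
    if not text:
        return {"success": False, "text": None, "error": "empty transcription"}
    return {"success": True, "text": text, "error": None}
```

With the Groq SDK installed, the client would typically be created with `from groq import Groq` and `client = Groq()`, which reads the API key from the `GROQ_API_KEY` environment variable.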
Data Management
- Structured file storage
- Metadata tracking
- Organization-level isolation
- User-specific data handling
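One way to get organization-level isolation and user-specific storage is to encode both IDs in the file path. The layout below (`<base>/<organization_id>/<user_id>/<conversation_id>.json`) and the helper names are assumptions for illustration, not the layout Rufa is documented to use. The segment check rejects values that could escape the organization's directory.

```python
from pathlib import Path


def _safe_segment(value: str) -> str:
    """Reject empty values, '.', '..', and anything containing a path separator."""
    if not value or value in {".", ".."} or "/" in value or "\\" in value:
        raise ValueError(f"unsafe path segment: {value!r}")
    return value


def transcription_path(base_dir: str, organization_id: str,
                       user_id: str, conversation_id: str) -> Path:
    """Build a per-organization, per-user path for one conversation's transcript."""
    return (Path(base_dir)
            / _safe_segment(organization_id)
            / _safe_segment(user_id)
            / f"{_safe_segment(conversation_id)}.json")
```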
API Operations
Audio Transcription
# Transcription Request
{
    "audio_path": str,
    "user_id": str,
    "conversation_id": str,
    "organization_id": str
}
# Transcription Response
{
    "success": bool,
    "text": str,
    "file_path": str,
    "duration": float,
    "message": {
        "type": "audio",
        "content": str,
        "timestamp": str,
        "source_agent": "rufa"
    }
}
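The request and response shapes above could be checked and assembled roughly as follows. This is a dependency-free sketch: in a FastAPI service these would more likely be Pydantic models, and the helper names are hypothetical. Two details are assumptions rather than documented behaviour: that `message.content` carries the same text as `text`, and that `timestamp` is an ISO 8601 string in UTC.

```python
from datetime import datetime, timezone
from typing import Any, Dict

REQUIRED_FIELDS = ("audio_path", "user_id", "conversation_id", "organization_id")


def validate_request(payload: Dict[str, Any]) -> None:
    """Raise ValueError if a required field is missing, empty, or not a string."""
    bad = [name for name in REQUIRED_FIELDS
           if not isinstance(payload.get(name), str) or not payload[name]]
    if bad:
        raise ValueError("missing or invalid fields: " + ", ".join(bad))


def build_response(text: str, file_path: str, duration: float) -> Dict[str, Any]:
    """Assemble a successful transcription response in the documented shape."""
    return {
        "success": True,
        "text": text,
        "file_path": file_path,
        "duration": duration,
        "message": {
            "type": "audio",
            "content": text,  # assumption: same text as the top-level field
            "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
            "source_agent": "rufa",
        },
    }
```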
Integration
Agent Communication
- Atta: Conversation verification and message storage
- Bala: Optional text embedding generation
- Niger: Data persistence (when available)
Data Flow
- Audio file reception
- Validation and preprocessing
- Transcription generation
- Result storage
- Message delivery to conversation
Error Handling
- Audio file validation errors
- Processing failures
- Transcription service errors
- Storage issues
- Detailed error logging
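The error categories above map naturally onto a small exception hierarchy with one logging helper. The class names, the `stage` attribute, and `to_error_result` are hypothetical; this is a sketch of the idea, not the error types defined in the Rufa code base.

```python
import logging

logger = logging.getLogger("rufa")


class RufaError(Exception):
    """Base class; `stage` names the step that failed."""
    stage = "processing"


class AudioValidationError(RufaError):
    stage = "validation"


class TranscriptionError(RufaError):
    stage = "transcription"


class StorageError(RufaError):
    stage = "storage"


def to_error_result(exc: Exception) -> dict:
    """Log the failure with its traceback and return an error payload."""
    stage = getattr(exc, "stage", "processing")
    logger.error("rufa %s failure: %s", stage, exc, exc_info=exc)
    return {"success": False, "error": f"{stage}: {exc}"}
```

Callers can then wrap each step in `try`/`except RufaError` and return `to_error_result(exc)`, so every failure is logged once and reported in a uniform shape.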
Performance Features
Audio Processing
- Efficient file handling
- Format optimization
- Resource management
- Processing queue handling
Storage Management
- Organized file structure
- Metadata tracking
- Space optimization
- Cleanup routines
Data Models
Audio Transcription Result
{
    "success": bool,
    "transcription": Optional[str],
    "file_path": Optional[str],
    "duration": Optional[float],
    "error": Optional[str]
}
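Expressed as a Python dataclass, the result model could look like the sketch below. The field names and types come from the model above; the class name and the `ok` and `failed` constructors are hypothetical conveniences, and the actual definition in tools_schema.py may differ (for example, it may be a Pydantic model).

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AudioTranscriptionResult:
    success: bool
    transcription: Optional[str] = None
    file_path: Optional[str] = None
    duration: Optional[float] = None
    error: Optional[str] = None

    @classmethod
    def ok(cls, transcription: str, file_path: str,
           duration: float) -> "AudioTranscriptionResult":
        # Successful result: all data fields set, no error.
        return cls(True, transcription, file_path, duration, None)

    @classmethod
    def failed(cls, error: str) -> "AudioTranscriptionResult":
        # Failed result: only the error message is set.
        return cls(False, error=error)
```

`asdict(result)` then yields a plain dictionary with exactly the five keys listed above.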
Audio Metadata
{
    "original_filename": str,
    "processed_filename": str,
    "duration": float,
    "created_at": str,
    "organization_id": str,
    "user_id": str
}
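The metadata record can be serialized to JSON and read back without loss, for example to keep it in a file next to the processed audio. The sketch below assumes `created_at` is an ISO 8601 UTC string; the class and helper names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AudioMetadata:
    original_filename: str
    processed_filename: str
    duration: float
    created_at: str
    organization_id: str
    user_id: str


def new_metadata(original_filename: str, processed_filename: str, duration: float,
                 organization_id: str, user_id: str) -> AudioMetadata:
    """Create a record stamped with the current UTC time (assumed format)."""
    return AudioMetadata(
        original_filename=original_filename,
        processed_filename=processed_filename,
        duration=duration,
        created_at=datetime.now(timezone.utc).isoformat(),
        organization_id=organization_id,
        user_id=user_id,
    )


def to_json(meta: AudioMetadata) -> str:
    return json.dumps(asdict(meta), sort_keys=True)


def from_json(raw: str) -> AudioMetadata:
    return AudioMetadata(**json.loads(raw))
```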