AI Providers
AI services you can pay for through Valta wallets.
Valta acts as the financial layer between your AI agents and the AI services they consume. Instead of managing separate billing accounts, API keys, and invoices for each provider, Valta consolidates all AI spending into a single wallet system with unified metering, cost tracking, and budget enforcement. Your agents call any supported provider, and Valta handles the billing.
OpenAI
Full support for the OpenAI API, including chat completions, embeddings, image generation, audio transcription, and function calling. Valta meters each request at the token level, tracking both input and output tokens separately to match OpenAI's pricing structure. Cost is deducted from the calling agent's wallet immediately upon response.
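As a rough illustration of token-level metering with separate input and output rates, the sketch below computes a per-request cost. The rates and function name are illustrative assumptions, not Valta's actual prices or API:

```python
# Illustrative token-level metering. Rates are placeholder values,
# NOT real OpenAI or Valta pricing.
INPUT_RATE = 0.01 / 1000   # USD per input token (illustrative)
OUTPUT_RATE = 0.03 / 1000  # USD per output token (illustrative)

def meter_request(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, with input and output tokens tracked separately."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

cost = meter_request(input_tokens=1200, output_tokens=300)
```

Because output tokens are priced higher than input tokens, two requests with the same total token count can cost very different amounts, which is why the two counts are metered independently.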
Supported Models
- GPT-4 and GPT-4 Turbo — advanced reasoning, complex task execution, multi-step planning
- GPT-3.5 Turbo — fast, cost-effective for high-volume tasks like classification, summarization, and extraction
- DALL-E 3 — image generation, metered per image at the resolution requested
- Whisper — audio transcription, metered per minute of audio processed
- text-embedding-3-small and text-embedding-3-large — vector embeddings for retrieval and search
Anthropic
Valta supports the full Anthropic Messages API. Claude models are metered with input and output tokens tracked independently, reflecting Anthropic's asymmetric pricing. Agents using Claude benefit from the same real-time cost deduction, budget enforcement, and transaction logging as any other provider on Valta.
Supported Models
- Claude 3.5 Sonnet — high performance with strong cost efficiency, suitable for most production workloads
- Claude 3 Opus — maximum capability for complex analysis, long-form reasoning, and nuanced tasks
- Claude 3 Haiku — fastest and most affordable, optimized for high-throughput classification and routing
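The budget enforcement and real-time deduction mentioned above can be pictured roughly as follows. This is a minimal sketch with hypothetical class and field names, not Valta's actual wallet implementation:

```python
class InsufficientBudget(Exception):
    """Raised when a request's cost exceeds the wallet's available funds."""

class Wallet:
    """Hypothetical agent wallet with a balance and a budget cap."""

    def __init__(self, balance: float, budget_remaining: float):
        self.balance = balance
        self.budget_remaining = budget_remaining

    def deduct(self, cost: float) -> None:
        # Enforce the budget before touching the balance, so an
        # over-budget request never partially settles.
        if cost > self.budget_remaining or cost > self.balance:
            raise InsufficientBudget(f"cost {cost} exceeds available funds")
        self.balance -= cost
        self.budget_remaining -= cost

wallet = Wallet(balance=10.0, budget_remaining=5.0)
wallet.deduct(2.0)  # balance and budget both shrink in real time
```

The key property is that the check and the deduction happen per request, at response time, rather than on a delayed billing cycle.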
Google AI
Valta integrates with Google's Gemini model family through both the Google AI Studio API and Vertex AI. Metering accounts for Google's character-based and token-based pricing depending on the model and endpoint. Agents can use Gemini models for multimodal tasks, including text, image, and video understanding.
Supported Models
- Gemini Pro — general-purpose model for text generation, summarization, and conversational tasks
- Gemini Ultra — highest capability model for complex reasoning and multimodal input processing
- Gemini Flash — optimized for speed and cost, suitable for latency-sensitive applications
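Since Google prices some endpoints per character and others per token, metering has to normalize both into a single cost figure. A minimal sketch of that normalization, with hypothetical dictionary keys and illustrative rates:

```python
def billable_cost(usage: dict, rates: dict) -> float:
    """Normalize character-based and token-based pricing into one USD figure.

    Keys here are assumptions for illustration: `usage` carries either a
    "characters" or a "tokens" count, and `rates` carries the matching rate.
    """
    if "characters" in usage:
        return usage["characters"] * rates["per_character"]
    return usage["tokens"] * rates["per_token"]

# Two requests, metered differently by the provider, land in the same ledger.
char_cost = billable_cost({"characters": 500_000}, {"per_character": 0.000001})
token_cost = billable_cost({"tokens": 2_000}, {"per_token": 0.00025})
```

Whichever unit the provider bills in, the agent's wallet only ever sees a normalized cost.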
Open-Source Models
Valta is expanding support to open-source model inference providers. These providers host open-weight models like Llama, Mistral, and Mixtral, offering competitive pricing and full control over model selection. Metering works identically — per-request, per-token, deducted from the agent's wallet in real time.
- Replicate — on-demand inference for hundreds of open-source models, billed per compute second
- Together AI — optimized hosting for Llama, Mistral, and other popular open-weight models
- Additional providers under evaluation based on reliability and cost transparency
How Metering Works
Valta's provider-agnostic billing layer normalizes cost tracking across all supported providers. Regardless of whether an agent calls OpenAI, Anthropic, or Google, the billing process is identical:
- Per-request metering — every API call is an individual billable event, recorded with full metadata
- Cost transparency — agents and operators see an estimated cost before each request and the exact cost after execution
- Provider-agnostic billing — one wallet, one balance, one transaction log regardless of how many providers an agent uses
- Real-time deduction — wallet balance updates the moment a provider responds, not on a delayed billing cycle
- Cost comparison — dashboard analytics show cost-per-task across providers so you can optimize spend
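The billing steps above can be sketched as a normalized event record plus an immediate wallet deduction. All names here are hypothetical, chosen only to illustrate the flow:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class BillingEvent:
    """One per-request billable event; field names are illustrative."""
    provider: str          # e.g. "openai", "anthropic", "google"
    model: str
    cost: float            # already normalized to USD across providers
    metadata: dict = field(default_factory=dict)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

def record_and_deduct(log: list, balance: float, event: BillingEvent) -> float:
    """Append the event to one unified transaction log and deduct at once."""
    log.append(event)
    return balance - event.cost

log = []
balance = record_and_deduct(
    log, 100.0, BillingEvent(provider="openai", model="gpt-4", cost=0.02)
)
```

Because every provider's usage collapses into the same event shape, one wallet and one transaction log suffice no matter how many providers an agent calls.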
Want to see a specific provider supported? Reach out through support with details on your use case.