Create a Google Cloud project at console.cloud.google.com, enable the Vertex AI API, create a service account key, store it securely, then integrate with Bifrost for virtual keys, budgets, and cost governance. Complete setup in minutes.
Bifrost supports Vertex AI for unified governance across Google-hosted foundation models.
| Property | Details |
|---|---|
| Description | Vertex AI provides access to Gemini, Claude, Llama, and other models via Google Cloud with enterprise controls. |
| Provider route on Bifrost | vertex-ai/<model> |
| Provider doc | Vertex AI |
| API endpoint for provider | https://{region}-aiplatform.googleapis.com |
| Supported endpoints | /v1/models, /v1/chat/completions, /v1/responses, /v1/images/generations, /v1/images/edits, /v1/embeddings, /v1/count-tokens, /v1/rerank, /v1/videos |
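The endpoint pattern and provider route in the table can be composed programmatically. This is a minimal sketch, assuming the region and model ID come from your own configuration. The helper names `vertex_endpoint` and `bifrost_model_route` are hypothetical and not part of any SDK.

```python
def vertex_endpoint(region: str) -> str:
    """Return the regional Vertex AI API host, following the pattern in the table."""
    if not region:
        raise ValueError("region must be a non-empty string such as 'us-central1'")
    return f"https://{region}-aiplatform.googleapis.com"


def bifrost_model_route(model: str) -> str:
    """Prefix a model ID with the Bifrost provider route (vertex-ai/<model>)."""
    if not model:
        raise ValueError("model must be a non-empty model ID")
    return f"vertex-ai/{model}"


print(vertex_endpoint("us-central1"))
print(bifrost_model_route("gemini-2.5-flash"))
```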
Google Cloud console and Vertex AI documentation.
Before you begin, you will need:
[ QUICK START ]
Use console.cloud.google.com.
Sign in to Google Cloud Console and create a new project (or select an existing one).
In APIs & Services, search for Vertex AI API and click Enable for your project.
Link a billing account under Billing. This is required for production use, even when using free credits.
In IAM & Admin → Service Accounts, create a service account and grant Vertex AI User.
Open the service account → Keys → Add key → JSON. Set GOOGLE_APPLICATION_CREDENTIALS to the file path.
Store the file securely and never commit it to version control.
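Before sending requests, it can help to confirm that the key file from the steps above is actually reachable. The following is a minimal sketch using only the standard library. It assumes the JSON key has the usual service account fields (`type`, `project_id`, `client_email`, `private_key`); the function name `load_service_account_key` is hypothetical.

```python
import json
import os
from pathlib import Path


def load_service_account_key(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> dict:
    """Load and sanity-check the service account JSON key referenced by env_var."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"No key file at {key_file}")
    data = json.loads(key_file.read_text(encoding="utf-8"))
    # Assumption: keys downloaded from the console carry these fields.
    required = {"type", "project_id", "client_email", "private_key"}
    missing = required - data.keys()
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"Expected a service_account key, got {data['type']!r}")
    return data
```

Calling `load_service_account_key()["client_email"]` at startup gives an early, readable failure if the variable is unset or points at the wrong file. Avoid logging the `private_key` field.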
[ MODELS ]
| Model | API ID | Best for |
|---|---|---|
| Gemini 2.5 Pro | gemini-2.5-pro | Flagship reasoning on Vertex AI. |
| Gemini 2.5 Flash | gemini-2.5-flash | Fast multimodal workloads at scale. |
| Gemini 2.5 Flash-Lite | gemini-2.5-flash-lite | Cost-optimized high-volume inference. |
| Gemini 2.0 Flash | gemini-2.0-flash | Prior-gen fast multimodal model. |
| Claude Sonnet 4.5 | claude-sonnet-4-5@20250929 | Anthropic model hosted on Vertex (MaaS). |
| Claude Haiku 4.5 | claude-haiku-4-5@20251001 | Fast Anthropic tier on Vertex. |
| Llama 3.3 70B Instruct | meta/llama-3.3-70b-instruct-maas | Meta open model on Vertex Model Garden. |
| Llama 3.1 405B Instruct | meta/llama-3.1-405b-instruct-maas | Largest Llama 3.1 on Vertex. |
| Llama 3.1 70B Instruct | meta/llama-3.1-70b-instruct-maas | Production open-weight chat on Vertex. |
| Llama 3.1 8B Instruct | meta/llama-3.1-8b-instruct-maas | Efficient Llama 3.1 on Vertex. |
| Mistral Large | mistral-large@2411 | Mistral flagship on Vertex. |
| Imagen 3 | imagen-3.0-generate-002 | Image generation on Vertex. |
| Veo 2 | veo-2.0-generate-001 | Video generation on Vertex. |
| text-embedding-005 | text-embedding-005 | Google text embeddings for RAG. |
Models and availability change over time. See the Vertex AI models documentation for the latest list and pricing.
[ TROUBLESHOOTING ]
| Error | Likely Cause | What to Do |
|---|---|---|
| 401 Unauthorized | Invalid or missing credentials. | Verify that GOOGLE_APPLICATION_CREDENTIALS points to a valid service account key. Generate a new key if needed. |
| 400 Bad Request | Invalid request format or unsupported model. | Check the request format and confirm the model ID is valid. |
| 429 Rate Limited | Rate limit or quota exceeded for your project. | Request a quota increase or implement exponential backoff. Use Bifrost for intelligent load distribution. |
| 502/503 Service Error | Temporary Vertex AI service unavailability. | Retry after a delay. Check the Google Cloud status page. Configure failover with Bifrost. |
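The retry advice for 429 and 502/503 errors can be sketched as exponential backoff with jitter. This is an illustrative pattern, not Bifrost's or Google's implementation. `HttpError` is a hypothetical exception type standing in for whatever error your client raises, and the delay values are assumptions to tune for your workload.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes from the troubleshooting table that are worth retrying.
RETRYABLE_STATUS = {429, 502, 503}


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"HTTP {status}")
        self.status = status


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable statuses with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except HttpError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUS or last_attempt:
                raise
            # Double the delay ceiling each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter spreads retries so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
    raise RuntimeError("unreachable")
```

Errors such as 400 and 401 are re-raised immediately, since retrying an invalid request or bad credentials will not help.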
[ PRODUCTION-READY ]
Bifrost is a drop-in replacement for Vertex AI SDKs: keep your client code and change the base URL to your gateway. Bifrost handles cost tracking, virtual keys, budgets, and failover automatically.
Run the Bifrost gateway and configure your Vertex AI credentials in the Web UI.
$ npx -y @maximhq/bifrost
✓ Bifrost started
├─ HTTP server listening on http://localhost:8080
├─ Web UI available at http://localhost:8080
└─ Configure providers and virtual keys in the dashboard
Update your Vertex AI SDK client to route through the Bifrost gateway.
from google import genai

client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1",
    http_options={"base_url": "http://localhost:8080/genai"},
)

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Hello from Bifrost!",
)
print(response.text)
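The `http_options` dictionary in the client above can also carry a Bifrost virtual key. The following is a sketch, assuming the `x-bf-vk` header and the `headers` field of the SDK's HTTP options. `bifrost_http_options` is an illustrative helper and `BIFROST_VIRTUAL_KEY` is a hypothetical environment variable name. Whether a virtual key is required depends on how your gateway is configured.

```python
import os
from typing import Optional


def bifrost_http_options(base_url: str, virtual_key: Optional[str] = None) -> dict:
    """Build an http_options dict that points the SDK at a Bifrost gateway."""
    options: dict = {"base_url": base_url}
    if virtual_key:
        # Assumption: the gateway reads the virtual key from the x-bf-vk header.
        options["headers"] = {"x-bf-vk": virtual_key}
    return options


# BIFROST_VIRTUAL_KEY is a hypothetical variable name; use your own secret store.
options = bifrost_http_options(
    "http://localhost:8080/genai",
    virtual_key=os.environ.get("BIFROST_VIRTUAL_KEY"),
)
print(sorted(options))
```

Pass the result as `http_options=options` when constructing `genai.Client(...)`. Reading the key from the environment keeps it out of source control.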
To authenticate with a Bifrost virtual key, send x-bf-vk or Authorization: Bearer sk-bf-* per the Bifrost documentation.
[ WHAT'S NEXT ]
You have your service account credentials. Add governance, guardrails, and MCP controls for production.
[ BIFROST FEATURES ]
Everything you need to run AI in production, from free open source to enterprise-grade features.
01 Governance
SAML support for SSO, plus role-based access control and policy enforcement for team collaboration.
02 Adaptive Load Balancing
Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.
03 Cluster Mode
High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.
04 Alerts
Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.
05 Log Exports
Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.
06 Audit Logs
Comprehensive logging and audit trails for compliance and debugging.
07 Vault Support
Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.
08 VPC Deployment
Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.
09 Guardrails
Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.
[ SHIP RELIABLE AI ]
Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.
[ FAQ ]
Yes. Billing must be enabled on your Google Cloud project, though new accounts receive free credits.
A service account lets applications authenticate to Google Cloud APIs using a JSON key file instead of user login.
Vertex AI is available in regions such as us-central1, europe-west1, and asia-southeast1. Pick one close to your users.
To rotate a service account key, create a new JSON key in IAM, update your apps to use it, then delete the old key.
Yes. Configure Vertex AI in Bifrost and route via http://localhost:8080/vertex-ai.
Pricing varies by model and region. Use Bifrost for real-time cost attribution across providers.