How to Get a Vertex AI API Key

Create a Google Cloud account at console.cloud.google.com, create a service account key, store it securely, then integrate with Bifrost for virtual keys, budgets, and cost governance. Setup takes a few minutes.

Google Cloud · Service account · Gemini & Claude · Enterprise SLAs · Bifrost gateway

Vertex AI provider summary

Bifrost supports Vertex AI for unified governance across Google-hosted foundation models.

Property	Details
Description	Vertex AI provides access to Gemini, Claude, Llama, and other models via Google Cloud with enterprise controls.
Provider route on Bifrost	vertex-ai/<model>
Provider doc	Vertex AI
API endpoint for provider	https://{region}-aiplatform.googleapis.com
Supported endpoints	/v1/models, /v1/chat/completions, /v1/responses, /v1/images/generations, /v1/images/edits, /v1/embeddings, /v1/count-tokens, /v1/rerank, /v1/videos
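
As a rough illustration of how the regional endpoint in the table is used, the sketch below builds a REST URL for a Google-published model. The path layout follows Google's public REST reference for `generateContent`; the function name and the project ID are placeholders, not part of Bifrost or the Vertex AI SDK.

```python
def vertex_generate_content_url(project: str, region: str, model: str) -> str:
    """Build the regional generateContent URL for a Google-published model.

    Hypothetical helper for illustration only. It assumes the standard
    https://{region}-aiplatform.googleapis.com host from the table above.
    """
    host = f"https://{region}-aiplatform.googleapis.com"
    path = (
        f"/v1/projects/{project}/locations/{region}"
        f"/publishers/google/models/{model}:generateContent"
    )
    return host + path


if __name__ == "__main__":
    # "your-project-id" is a placeholder; substitute your own project ID.
    print(vertex_generate_content_url("your-project-id", "us-central1", "gemini-2.5-flash"))
```

When requests go through Bifrost, the gateway handles this routing, so a helper like this is mainly useful for debugging direct calls.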

Official Vertex AI Resources

Google Cloud console and Vertex AI documentation.

Prerequisites

Before you begin, you will need:

Google account
Google Cloud project
Billing enabled on the project

Free credits: New Google Cloud accounts receive $300 in credits. Vertex AI also offers free-tier quotas for testing.

[ QUICK START ]

How Do You Set Up Vertex AI Credentials in 5 Steps?

Step 1: Create a Google Cloud project

In console.cloud.google.com, create a new project or select an existing one.

Step 2: Enable the Vertex AI API

In APIs & Services, search for Vertex AI API and click Enable for your project.

Step 3: Enable billing

Link a billing account under Billing. This is required for production even when using free credits.

Step 4: Create a service account

In IAM & Admin → Service Accounts, create a service account and grant it the Vertex AI User role.

Step 5: Download the JSON key

Open the service account → Keys → Add key → JSON. Set GOOGLE_APPLICATION_CREDENTIALS to the file path. Store the file securely and never commit it.

Sensitive credentials: The JSON key is equivalent to a password. Use secret managers in production.
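
Before wiring the key into an application, it can help to confirm that GOOGLE_APPLICATION_CREDENTIALS points at a readable service account key. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the field list reflects the usual layout of a service account JSON key rather than an exhaustive schema.

```python
import json
import os
from pathlib import Path

# Fields normally present in a service account JSON key (not exhaustive).
REQUIRED_FIELDS = ("type", "project_id", "client_email", "private_key")


def check_service_account_key(path: str) -> list[str]:
    """Return a list of problems found in the key file (empty if none)."""
    key_file = Path(path)
    if not key_file.is_file():
        return [f"file not found: {path}"]
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["key file is not a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']}")
    return problems


if __name__ == "__main__":
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        issues = check_service_account_key(key_path)
        # Only report field names; never print the private key itself.
        print("key file looks OK" if not issues else "\n".join(issues))
```

This check only inspects the file locally. It does not confirm that the key is still active or that the account holds the Vertex AI User role.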

[ MODELS ]

Available Vertex AI Models

Model	API ID	Best for
Gemini 2.5 Pro	gemini-2.5-pro	Flagship reasoning on Vertex AI.
Gemini 2.5 Flash	gemini-2.5-flash	Fast multimodal workloads at scale.
Gemini 2.5 Flash-Lite	gemini-2.5-flash-lite	Cost-optimized high-volume inference.
Gemini 2.0 Flash	gemini-2.0-flash	Prior-gen fast multimodal model.
Claude Sonnet 4.5	claude-sonnet-4-5@20250929	Anthropic model hosted on Vertex (MaaS).
Claude Haiku 4.5	claude-haiku-4-5@20251001	Fast Anthropic tier on Vertex.
Llama 3.3 70B Instruct	meta/llama-3.3-70b-instruct-maas	Meta open model on Vertex Model Garden.
Llama 3.1 405B Instruct	meta/llama-3.1-405b-instruct-maas	Largest Llama 3.1 on Vertex.
Llama 3.1 70B Instruct	meta/llama-3.1-70b-instruct-maas	Production open-weight chat on Vertex.
Llama 3.1 8B Instruct	meta/llama-3.1-8b-instruct-maas	Efficient Llama 3.1 on Vertex.
Mistral Large	mistral-large@2411	Mistral flagship on Vertex.
Imagen 3	imagen-3.0-generate-002	Image generation on Vertex.
Veo 2	veo-2.0-generate-001	Video generation on Vertex.
text-embedding-005	text-embedding-005	Google text embeddings for RAG.

Models and availability change over time. See the Vertex AI models documentation for the latest list and pricing.
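
The API IDs in the table follow a few visible conventions: partner models carry an `@version` suffix or a `publisher/` prefix. The sketch below guesses the publisher from those conventions and formats the `vertex-ai/<model>` route from the provider summary. Both function names are hypothetical, and the prefix rules are a heuristic based only on the IDs listed here.

```python
def vertex_publisher(model_id: str) -> str:
    """Guess the Vertex AI publisher from an API ID (heuristic, not authoritative)."""
    if "/" in model_id:
        # e.g. "meta/llama-3.3-70b-instruct-maas" -> "meta"
        return model_id.split("/", 1)[0]
    base = model_id.split("@", 1)[0]  # drop an "@version" suffix if present
    if base.startswith("claude"):
        return "anthropic"
    if base.startswith("mistral"):
        return "mistralai"
    return "google"


def bifrost_route(model_id: str) -> str:
    """Format the provider route shown in the summary table: vertex-ai/<model>."""
    return f"vertex-ai/{model_id}"


if __name__ == "__main__":
    for mid in ("gemini-2.5-pro", "claude-sonnet-4-5@20250929", "meta/llama-3.3-70b-instruct-maas"):
        print(mid, "->", vertex_publisher(mid), "|", bifrost_route(mid))
```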

[ TROUBLESHOOTING ]

Troubleshooting Common Vertex AI Issues

Error	Likely Cause	What to Do
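`401 Unauthorized`	Invalid, expired, or missing credentials.	Verify that GOOGLE_APPLICATION_CREDENTIALS points to a valid service account JSON key and that the account has the Vertex AI User role. Generate a new key if needed.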
`400 Bad Request`	Invalid request format or unsupported model.	Check request format and confirm model ID is valid.
`429 Rate Limited`	Quota or rate limit exceeded for your project.	Request a quota increase or implement exponential backoff. Use Bifrost for intelligent load distribution.
`502/503 Service Error`	Temporary Vertex AI service unavailability.	Retry after a delay. Check the Google Cloud status page. Configure failover with Bifrost.
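
The retry advice for 429 and 502/503 responses can be sketched as exponential backoff with jitter. This is a minimal, library-agnostic example. `HTTPStatusError` and `call_with_backoff` are hypothetical names, and `send` stands in for whatever function issues your request and raises on a failed status.

```python
import random
import time

# Statuses from the table that are worth retrying.
RETRYABLE_STATUS = {429, 502, 503}


class HTTPStatusError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send(), retrying retryable statuses with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except HTTPStatusError as exc:
            is_last = attempt == max_attempts - 1
            if exc.status not in RETRYABLE_STATUS or is_last:
                raise  # non-retryable error, or retries exhausted
            # Delay cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time in [0, delay] to avoid synchronized retries.
            sleep(random.uniform(0, delay))
```

Errors such as 400 or 401 are re-raised immediately, since retrying does not fix a malformed request or bad credentials.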

[ PRODUCTION-READY ]

Use Vertex AI with Bifrost

Bifrost is a drop-in replacement for Vertex AI SDKs: keep your client code and change the base URL to your gateway. Bifrost handles cost tracking, virtual keys, budgets, and failover automatically.

Step 1: Start Bifrost and register Vertex AI

Run the Bifrost gateway and configure your Vertex AI credentials in the Web UI.

Terminal

$ npx -y @maximhq/bifrost

OUTPUT

✓ Bifrost started
├─ HTTP server listening on http://localhost:8080
├─ Web UI available at   http://localhost:8080
└─ Configure providers and virtual keys in the dashboard

Add the Vertex AI integration in the Web UI. For details, read Vertex AI on Bifrost.

Step 2: Point your Vertex AI SDK at Bifrost

Update your Vertex AI SDK client to route through the Bifrost gateway.

example.py

from google import genai

client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1",
    http_options={"base_url": "http://localhost:8080/genai"},
)

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Hello from Bifrost!",
)
print(response.text)

Virtual keys can be sent as x-bf-vk or Authorization: Bearer sk-bf-* per the Bifrost documentation.
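
To make the two header options concrete, the sketch below builds either form as a plain dictionary. The function name is hypothetical and the key value is a placeholder. Only the header names and the `sk-bf-` prefix come from the sentence above.

```python
def virtual_key_headers(virtual_key: str, use_bearer: bool = False) -> dict[str, str]:
    """Return request headers carrying a Bifrost virtual key.

    Hypothetical helper: picks between the x-bf-vk header and the
    Authorization: Bearer form mentioned in the text.
    """
    if not virtual_key:
        raise ValueError("virtual_key must not be empty")
    if use_bearer:
        return {"Authorization": f"Bearer {virtual_key}"}
    return {"x-bf-vk": virtual_key}


if __name__ == "__main__":
    # "sk-bf-example" is a placeholder, not a real key.
    print(virtual_key_headers("sk-bf-example"))
    print(virtual_key_headers("sk-bf-example", use_bearer=True))
```

With the google-genai client from example.py, such a dictionary could likely be passed as a `headers` entry in `http_options` next to `base_url`. Treat that wiring as an assumption to confirm against the Bifrost documentation.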

[ WHAT'S NEXT ]

Explore Bifrost Resources

You have your API key. Add governance, guardrails, and MCP controls for production.

Access Control

Governance

Virtual keys, budgets, rate limits, routing, and enterprise RBAC with SSO.

Security

Guardrails

PII detection, content moderation, prompt injection defense, and compliance.

MCP

MCP Gateway

High-performance tool execution for AI agents with approvals and audit trails.

Ready to Route Vertex AI Through Bifrost?

Bifrost is open source and production-ready. Get started in minutes with cost tracking, virtual keys, and failover built in.

[ BIFROST FEATURES ]

Open Source & Enterprise

Everything you need to run AI in production, from free open source to enterprise-grade features.

01 Governance

SAML-based SSO, role-based access control, and policy enforcement for team collaboration.

02 Adaptive Load Balancing

Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.

03 Cluster Mode

High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.

04 Alerts

Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.

05 Log Exports

Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.

06 Audit Logs

Comprehensive logging and audit trails for compliance and debugging.

07 Vault Support

Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.

08 VPC Deployment

Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.

09 Guardrails

Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.

[ SHIP RELIABLE AI ]

Try Bifrost Enterprise with a 14-day Free Trial

[quick setup]

Drop-in replacement for any AI SDK

Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.

import os
from anthropic import Anthropic

# Point the Anthropic client at the Bifrost gateway instead of the provider API.
anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://<bifrost_url>/anthropic",
)

message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ],
)

Drop in once, run everywhere.

[ FAQ ]

Frequently Asked Questions

Yes. Billing must be enabled on your Google Cloud project, though new accounts receive free credits.

A service account lets applications authenticate to Google Cloud APIs using a JSON key file instead of user login.

Vertex AI is available in regions such as us-central1, europe-west1, and asia-southeast1. Pick one close to your users.

Create a new JSON key in IAM, update apps, then delete the old key.

Yes. Configure Vertex AI in Bifrost and route via http://localhost:8080/vertex-ai.

Pricing varies by model and region. Use Bifrost for real-time cost attribution across providers.