
How to Get a Groq API Key

Create a Groq account at console.groq.com, generate your API key, configure billing for production, then integrate with Bifrost for ultra-fast LLM inference with failover and cost governance. Complete in minutes.

Console & keys · Bearer auth · OpenAI compatible · LPU acceleration · Bifrost gateway

Groq provider summary

Bifrost supports Groq models through OpenAI-compatible HTTP APIs and standard JSON request shapes. Groq uses LPU hardware for ultra-fast inference.

Property	Details
Description	Groq provides ultra-fast LLM inference using Language Processing Units (LPUs) for chat, reasoning, and coding workloads.
Provider route on Bifrost	groq/<model>
Provider doc	Groq Documentation
API endpoint for provider	https://api.groq.com/openai/v1
Supported endpoints	/v1/models, /v1/completions, /v1/chat/completions, /v1/responses, /v1/audio/speech, /v1/audio/transcriptions
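
The groq/<model> route in the table can be sketched in a few lines of Python. This is an illustration only: the helper names `to_bifrost_route` and `split_route` are hypothetical, and treating everything before the first slash as the provider is an assumption about how the route is parsed, chosen because some Groq model IDs (for example openai/gpt-oss-120b) contain a slash themselves.

```python
def to_bifrost_route(model_id, provider="groq"):
    """Prefix a provider model ID with the provider name, e.g. groq/<model>."""
    if not model_id:
        raise ValueError("model_id must not be empty")
    return f"{provider}/{model_id}"


def split_route(route):
    """Split a route on the FIRST slash only (assumed parsing rule).

    Model IDs such as "openai/gpt-oss-120b" keep their own slash intact.
    """
    provider, separator, model_id = route.partition("/")
    if not separator or not provider or not model_id:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model_id
```

Under that assumption, `to_bifrost_route("openai/gpt-oss-120b")` gives `groq/openai/gpt-oss-120b`, and `split_route` recovers the provider and the original model ID.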

Official Groq Resources

Use these Groq-hosted links for console access, API documentation, and authentication details.

Prerequisites

Before you begin, you will need:

Groq account
Email address
Payment method for production (optional for free tier)

Generous free tier: Groq's free tier includes substantial token allowances for testing and development. Upgrade to a paid plan for production workloads.

[ QUICK START ]

How Do You Get a Groq API Key in 5 Steps?

Create or sign in to a Groq account

Use the Groq Console.

Go to console.groq.com and sign up with your email address, Google, GitHub, or SSO.

[Screenshot: GroqCloud create account or login page with Google, GitHub, SSO, and email sign-in options]

Navigate to API Keys

In the Groq console, click "API Keys". You'll see your existing keys and the option to create a new one.

Generate and copy your API key

Your key is displayed once. Copy it immediately and store it securely.

Click "Create API Key" and give the key a descriptive name. Because the key is shown only once, copy it right away and store it as an environment variable.

Terminal (macOS/Linux)

export GROQ_API_KEY="gsk_..."

Treat keys like passwords: Never expose API keys in client-side code or commit them to version control. Store them in .env files and add those files to .gitignore.
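
In application code, one way to follow this advice is to read the key from the environment at startup and fail early when it is missing. The sketch below is a minimal illustration; `load_groq_api_key` and `mask_key` are hypothetical helper names, not part of any Groq or Bifrost SDK.

```python
import os


def load_groq_api_key(env=None):
    """Return the Groq API key from the environment or raise a clear error.

    Hypothetical helper; `env` defaults to os.environ and exists for testing.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; export it in your shell or load it from a .env file"
        )
    return key


def mask_key(key, visible=4):
    """Mask a key for log output so the full secret is never printed."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask_key(load_groq_api_key())` instead of the raw value keeps the secret out of log files while still showing which key is in use.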

Set up billing for production

Add a payment method when ready for higher limits.

Groq offers a free tier with generous token allowances. When you're ready for production use or exceed free limits, add a payment method in the Billing section.

Make your first Chat Completions call

Authenticate with Bearer tokens per Groq's OpenAI-compatible API.

Groq's API is OpenAI-compatible and expects an Authorization: Bearer $GROQ_API_KEY header on REST calls:

Terminal

$ curl https://api.groq.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role":"user","content":"Hello!"}]
  }'
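
The same request can be sketched with Python's standard library alone. This is an illustration under a few assumptions: `build_chat_request` and `send_chat_request` are hypothetical helper names, the model ID is taken from the model table in this guide, and the explicit User-Agent value is an arbitrary placeholder added as a precaution rather than a documented requirement.

```python
import json
import urllib.request

GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) a Chat Completions request with Bearer auth."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        GROQ_CHAT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            # Placeholder value; an assumption, not a documented requirement.
            "User-Agent": "groq-quickstart-example/0.1",
        },
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

A typical call would be `send_chat_request(build_chat_request(os.environ["GROQ_API_KEY"], "llama-3.3-70b-versatile", "Hello!"))` after `import os`. Keeping the build step separate makes it possible to inspect the headers and body without touching the network.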

[ MODELS ]

Available Groq Models

Model	API ID	Best for
Llama 3.3 70B Versatile	llama-3.3-70b-versatile	Production chat, reasoning, and coding at scale.
Llama 3.1 8B Instant	llama-3.1-8b-instant	Fast, cost-efficient inference for high volume.
GPT OSS 120B	openai/gpt-oss-120b	Open-weight flagship with tool use and reasoning.
GPT OSS 20B	openai/gpt-oss-20b	Lower-latency open-weight model for real-time apps.
Groq Compound	groq/compound	Agentic system with web search and code execution.
Groq Compound Mini	groq/compound-mini	Lighter compound system for faster agent workflows.
Llama 4 Scout 17B	meta-llama/llama-4-scout-17b-16e-instruct	Preview multimodal Llama 4 on Groq.
Qwen3 32B	qwen/qwen3-32b	Preview dense Qwen3 reasoning model.
Whisper Large V3	whisper-large-v3	Speech-to-text transcription.
Whisper Large V3 Turbo	whisper-large-v3-turbo	Faster, lower-cost transcription.
GPT OSS Safeguard 20B	openai/gpt-oss-safeguard-20b	Preview safety-classification GPT-OSS variant.
Llama Prompt Guard 2 22M	meta-llama/llama-prompt-guard-2-22m	Input safety classification (preview).
Llama Prompt Guard 2 86M	meta-llama/llama-prompt-guard-2-86m	Stronger prompt guard model (preview).

Models and availability change over time. See Groq's documentation for the latest list and pricing.

[ TROUBLESHOOTING ]

Troubleshooting Common Groq API Errors

Error	Likely Cause	What to Do
`401 Unauthorized`	Invalid or missing API key.	Verify your API key is correct. Generate a new key if needed.
`400 Bad Request`	Invalid request format or unsupported model.	Check the request format against the OpenAI API reference and verify the model ID.
`429 Rate Limited`	Rate limit exceeded for your plan.	Upgrade your plan or implement exponential backoff. Use Bifrost for intelligent load distribution.
`502/503 Service Error`	Temporary Groq service unavailability.	Retry after a delay. Check Groq status page. Configure failover with Bifrost.
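
The exponential backoff suggested for 429 and 502/503 responses can be sketched as follows. This is a minimal illustration, not a Groq or Bifrost API: `ApiError` is a hypothetical exception type carrying an HTTP status code, `send` stands for whatever function performs your request, and the delay values are arbitrary defaults.

```python
import random
import time

RETRYABLE_STATUS = {429, 502, 503}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call send() and retry on 429/502/503 with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ApiError as error:
            last_attempt = attempt == max_attempts - 1
            # Errors such as 400 or 401 will not succeed on retry, so re-raise.
            if error.status not in RETRYABLE_STATUS or last_attempt:
                raise
            # The base delay doubles each attempt (0.5s, 1s, 2s, ...) up to
            # max_delay; random jitter keeps many clients from retrying in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Non-retryable errors are raised immediately, so a wrong key or malformed request surfaces on the first attempt instead of after several delays.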

[ PRODUCTION-READY ]

Use Your Groq Key with Bifrost

Bifrost works as a drop-in gateway in front of Groq: update your base URL and keep your existing client code. Bifrost handles cost tracking, virtual keys, budgets, and intelligent failover.

Step 1: Start Bifrost and register Groq

Run the Bifrost gateway and configure your Groq credentials in the Web UI.

Terminal

$ npx -y @maximhq/bifrost

OUTPUT

✓ Bifrost started
├─ HTTP server listening on http://localhost:8080
├─ Web UI available at   http://localhost:8080
└─ Configure providers and virtual keys in the dashboard


Add the Groq integration in the Web UI. For details, read Groq on Bifrost.

Step 2: Point your Groq SDK at Bifrost

Update your SDK to route through Bifrost's OpenAI-compatible gateway.

example.py

from openai import OpenAI

# BEFORE
# client = OpenAI(api_key="your-groq-key", base_url="https://api.groq.com/openai/v1")

# AFTER: route via Bifrost + virtual key
client = OpenAI(
    api_key="sk-bf-your-virtual-key",
    base_url="http://localhost:8080/openai"
)

response = client.chat.completions.create(
    model="groq/llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello from Bifrost!"}]
)

print(response.choices[0].message.content)


Virtual keys can be sent as x-bf-vk or Authorization: Bearer sk-bf-* per the Bifrost documentation.
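
The two header styles mentioned above can be sketched as a small helper. The header names come from the sentence above; the function name `virtual_key_headers` and its `style` parameter are hypothetical, so check the Bifrost documentation before relying on either form.

```python
def virtual_key_headers(virtual_key, style="bearer"):
    """Return request headers carrying a Bifrost virtual key.

    style="bearer"  -> Authorization: Bearer sk-bf-...
    style="x-bf-vk" -> x-bf-vk: sk-bf-...
    """
    if not virtual_key:
        raise ValueError("virtual_key must not be empty")
    if style == "bearer":
        return {"Authorization": f"Bearer {virtual_key}"}
    if style == "x-bf-vk":
        return {"x-bf-vk": virtual_key}
    raise ValueError(f"unknown style: {style!r}")
```

The Bearer form is what an OpenAI-style SDK sends when the virtual key is passed as `api_key`; the dedicated header can be useful when the Authorization header is already used for something else.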

[ WHAT'S NEXT ]

Explore Bifrost Resources

You have your API key. Add governance, guardrails, and MCP controls for production.

Access Control

Governance

Virtual keys, budgets, rate limits, routing, and enterprise RBAC with SSO.

Security

Guardrails

PII detection, content moderation, prompt injection defense, and compliance.

MCP

MCP Gateway

High-performance tool execution for AI agents with approvals and audit trails.


Ready to Route Groq Through Bifrost?

Bifrost is open source and production-ready. Get started in minutes with cost tracking, virtual keys, and failover built in.

[ BIFROST FEATURES ]

Open Source & Enterprise

Everything you need to run AI in production, from free open source to enterprise-grade features.

01 Governance

SAML-based SSO, role-based access control, and policy enforcement for team collaboration.

02 Adaptive Load Balancing

Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.

03 Cluster Mode

High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.

04 Alerts

Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.

05 Log Exports

Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.

06 Audit Logs

Comprehensive logging and audit trails for compliance and debugging.

07 Vault Support

Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.

08 VPC Deployment

Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.

09 Guardrails

Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.

[ SHIP RELIABLE AI ]

Try Bifrost Enterprise with a 14-day Free Trial

[quick setup]

Drop-in replacement for any AI SDK

Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.

import os
from anthropic import Anthropic

# Point the Anthropic SDK at Bifrost instead of the provider's default endpoint.
anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://<bifrost_url>/anthropic",
)

message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

Drop in once, run everywhere.

[ FAQ ]

Frequently Asked Questions

Groq uses Language Processing Units (LPUs) instead of GPUs. LPUs are purpose-built hardware optimized for sequential token generation, delivering significantly lower latency than general-purpose GPUs.

Groq provides a free tier with generous token allowances for testing and development. For production use, upgrade to a paid plan for higher rate limits and throughput.

Groq provides an OpenAI-compatible API, so you can use the official OpenAI Python and JavaScript SDKs by changing the base URL to the Groq endpoint (https://api.groq.com/openai/v1) and providing your Groq API key.

Groq supports popular open-weight models such as Llama 3.3 70B, Llama 3.1 8B, GPT OSS 120B, and Qwen3 32B, along with Whisper models for transcription. Check the Groq console for the latest available models and their specifications.

To handle rate limits, implement exponential backoff in your application. For production workloads, upgrade your plan for higher limits. Use Bifrost to intelligently distribute requests across multiple providers.

Groq also works well as a failover or fallback provider because of its low-latency inference. Configure Bifrost to route requests to Groq when your primary provider is unavailable or slow.
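
The fallback idea can be illustrated with a short client-side sketch. This is not how Bifrost implements failover (the gateway handles routing for you); it only shows the ordering logic, and `complete_with_fallback` plus the provider callables are hypothetical names.

```python
def complete_with_fallback(providers, prompt):
    """Try each (name, call) pair in order and return the first success.

    `providers` is a list of (name, callable) pairs; each callable takes the
    prompt and returns a reply or raises an exception on failure.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as error:  # any failure moves on to the next provider
            errors.append(f"{name}: {error}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Listing the primary provider first and Groq second, for example `[("primary", call_primary), ("groq", call_groq)]`, sends traffic to Groq only when the primary call raises.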