

Denvr AI Inference on Intel Gaudi AI Accelerators
Managed API endpoints for serverless or dedicated model hosting. Deploy and scale your GenAI applications with foundation and private models.

Experience Intel Gaudi: price-performant AI inference at scale, with OpenAI API compatibility.
The Denvr and Intel Partnership



Managed Endpoints
Use OpenAI-compatible APIs with leading open-source models such as Llama, Qwen, and DeepSeek, with support for shared or dedicated deployments for reliability and privacy.

Strategic AI Collaboration
Denvr Dataworks and Intel partner to deliver high-performance acceleration for GenAI and LLMs, powered by cost-efficient Intel Gaudi 2 AI accelerators.

Maximized Efficiency
Leverage Intel Gaudi 2 AI acceleration to increase compute density, lower infrastructure costs, and drive AI scalability across demanding enterprise applications.
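Because the endpoints follow the standard OpenAI chat-completions schema, a request can be assembled with nothing but the stdlib. A minimal sketch, assuming a hypothetical base URL and model ID (substitute the values from your Denvr console; the bearer token is a placeholder):

```python
import json

# Assumptions for illustration -- not real Denvr values.
API_BASE = "https://api.example-inference.cloud/v1"   # hypothetical endpoint
MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"        # hypothetical model ID

def build_chat_request(prompt, model=MODEL_ID, max_tokens=256):
    """Return the URL, headers, and JSON body for an OpenAI-style
    chat-completions call."""
    url = f"{API_BASE}/chat/completions"
    headers = {
        "Authorization": "Bearer $DENVR_API_KEY",  # placeholder credential
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })
    return url, headers, body

url, headers, body = build_chat_request("Summarize Gaudi 2 in one line.")
```

The same payload works with the official `openai` client by pointing its `base_url` at the Denvr endpoint, which is what makes migration from OpenAI-hosted models a configuration change rather than a code change.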

Foundation Models Supported

MODEL NAME        PARAMS              CONTEXT   PRECISION
Llama 3.3         70B                 128k      BF16
Llama 3.2         1B, 3B              128k      BF16
Llama 3.1         8B, 70B, 405B       128k      BF16, FP8
DeepSeek-R1       671B                128k      FP8
Qwen 2.5          7B, 14B, 32B, 72B   128k      BF16
Mistral v0.1      7B, 8x7B            32k       BF16
Falcon 3          7B, 10B             32k       BF16
BGE M3 Embedder   108M                8k        BF16
BGE M3 Reranker   568M                1k        BF16
Private models    Any                 128k      BF16
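The context column above is worth checking before a request is submitted, since a prompt that exceeds the window is rejected by the server. A minimal pre-flight sketch, treating the advertised "128k"/"32k"/"8k" figures nominally as thousands of tokens (actual served limits may differ slightly):

```python
# Nominal context windows (tokens) taken from the model table above.
CONTEXT_WINDOW = {
    "Llama 3.3": 128_000,
    "Llama 3.1": 128_000,
    "DeepSeek-R1": 128_000,
    "Qwen 2.5": 128_000,
    "Mistral v0.1": 32_000,
    "Falcon 3": 32_000,
    "BGE M3 Embedder": 8_000,
}

def fits_context(model, prompt_tokens, completion_tokens=0):
    """True if the prompt plus the requested completion budget fits
    within the model's advertised context window."""
    return prompt_tokens + completion_tokens <= CONTEXT_WINDOW[model]
```

For example, a 30k-token prompt with a 1k-token completion fits Mistral v0.1's 32k window, while the same request against Falcon 3 with a 32k prompt would not.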
-> Built for large-scale AI, delivering enterprise-grade performance for first-token and inter-token latency.
-> Native OpenAI API compatibility, allowing quick model migration and efficient inference deployment.
-> Scalable infrastructure optimized for LLM and retrieval-augmented generation (RAG) workloads.
-> Cost-efficient managed services that reduce your hosting and operational expenses.
-> Expanded model flexibility, giving developers the freedom to choose LLMs based on specific needs, price, and availability.
-> Private endpoints for predictable performance, no rate limits, and model selections not available on serverless endpoints.
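The RAG workflow mentioned above reduces to ranking document embeddings by similarity to a query embedding and feeding the top matches to the LLM. A stdlib-only sketch of the retrieval step; in practice the vectors would come from the BGE M3 embedder endpoint, and the BGE M3 reranker could re-score the shortlisted candidates:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k document embeddings most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The retrieved passages are then concatenated into the chat prompt, which is why long context windows (128k on the Llama and Qwen families) matter for RAG at scale.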