Denvr AI Inference

Intel Gaudi AI Accelerators

Managed API endpoints for serverless or dedicated model hosting.

Deploy and scale your GenAI applications with foundation and private models.

Experience Intel Gaudi: price-performant AI inference at scale with OpenAI API compatibility.

Denvr AI Inference Services

Denvr Dataworks

Managed Endpoints

Use OpenAI-compatible APIs with leading open-source models like Llama, Qwen, and DeepSeek. Support for shared or dedicated deployments for reliability and privacy.

Intel Collaboration

Partnership with Intel to develop enterprise-ready inference engines, powered by cost-efficient Intel Gaudi 2 AI accelerators.

Maximized Efficiency

Leverage Intel Gaudi 2 AI acceleration to increase compute density, lower infrastructure costs, and drive AI scalability across demanding enterprise applications.


Foundation Models Supported

MODEL NAME       | PARAMS                | CONTEXT | PRECISION
Llama 3.3        | 70B                   | 32k     | BF16
Llama 3.2        | 1B, 3B                | 32k     | BF16
Llama 3.1        | 8B, 70B               | 32k     | BF16
Llama 3.1        | 405B (available soon) | 32k     | FP8
DeepSeek-R1      | 671B (available soon) | 32k     | FP8
Mistral v0.1     | 7B, 8x7B              | 32k     | BF16
Qwen 2.5         | 7B, 14B, 32B, 72B     | 32k     | BF16
Falcon 3         | 7B, 10B               | 32k     | BF16
ALLaM-AI Preview | 7B                    | 32k     | FP16
BGE M3 Embedder  | 108M                  | 8k      | BF16
BGE M3 Reranker  | 568M (available soon) | 1k      | BF16
Private models   | Any                   | Any     | Any

-> Native OpenAI API compatibility, allowing quick model migration and efficient inference deployment.

-> Cost-efficient managed services that reduce your hosting and operational expenses.

-> Model serving optimized for either interactive latency or batch throughput.

-> Serverless endpoints limited to published models and up to 60 requests per second.

-> Private endpoints for predictable performance and no rate limits.
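Because the endpoints above are OpenAI API-compatible, existing client code can target them by swapping the base URL. A minimal sketch of building a standard /chat/completions request body with only the Python standard library; the endpoint URL and model identifier are illustrative placeholders, not confirmed Denvr endpoint details:

```python
import json

# Placeholder base URL -- substitute the actual Denvr endpoint.
API_BASE = "https://inference.example.com/v1"

def chat_request_body(model: str, prompt: str, stream: bool = False) -> dict:
    """Return a request body accepted by any OpenAI-compatible
    /chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

# Model name shown as an example only; see the supported-models table.
body = chat_request_body("meta-llama/Llama-3.3-70B-Instruct", "Hello!")
payload = json.dumps(body)
# POST `payload` to f"{API_BASE}/chat/completions" with an
# "Authorization: Bearer <api-key>" header to obtain a completion.
```

The same body works with the official `openai` SDK by passing `base_url` when constructing the client, which is what makes migration from other OpenAI-compatible hosts quick.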

Beta Program

Partner Feedback Goals

-> Validate use of serverless endpoints and preferred models

-> Validate use of private endpoints for user workloads

-> Validate performance requirements (time-to-first-token and inter-token latency)

-> Provide feedback on feature prioritization

-> Consult on pricing and SLA requirements
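The two latency metrics named in the goals above can be measured client-side from a streaming response: time-to-first-token is the delay before the first streamed token arrives, and inter-token latency is the average gap between subsequent tokens. A small sketch, assuming you have recorded the request timestamp and each token's arrival timestamp:

```python
def ttft_and_itl(request_time: float, token_times: list[float]) -> tuple[float, float]:
    """Compute time-to-first-token and mean inter-token latency
    (both in seconds) from a request timestamp and the arrival
    timestamps of streamed tokens."""
    if not token_times:
        raise ValueError("no tokens received")
    ttft = token_times[0] - request_time
    if len(token_times) < 2:
        return ttft, 0.0
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    return ttft, sum(gaps) / len(gaps)

# Example: request sent at t=0.0 s, first token at 0.25 s,
# then one token every 20 ms.
ttft, itl = ttft_and_itl(0.0, [0.25, 0.27, 0.29, 0.31])
# ttft is 0.25 s; itl is ~0.02 s
```

In practice the timestamps would come from wrapping a streaming client call (e.g. `time.monotonic()` at request time and on each received chunk).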

Pricing guidance is expected to be in line with market rates for serverless; private endpoints are priced at $2.50 per Intel Gaudi 2 accelerator-hour (on-demand), with discounts for term-based commitments. Pricing is under review with beta partners.
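At the quoted on-demand rate, a rough monthly cost per dedicated accelerator is easy to estimate. A sketch of that arithmetic; the discount figure in the usage line is hypothetical, since committed-term discounts are still under review:

```python
# Quoted on-demand rate from the pricing guidance above.
ON_DEMAND_RATE = 2.50      # USD per Intel Gaudi 2 accelerator-hour
HOURS_PER_MONTH = 24 * 30  # approximate 30-day month

def monthly_cost(accelerators: int, discount: float = 0.0) -> float:
    """Approximate monthly cost for `accelerators` dedicated Gaudi 2
    cards, optionally applying a term-commitment discount (0.0-1.0)."""
    return accelerators * ON_DEMAND_RATE * HOURS_PER_MONTH * (1 - discount)

print(monthly_cost(1))        # 1800.0 USD for a single card
print(monthly_cost(8, 0.20))  # eight cards with a hypothetical 20% discount
```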

Upcoming Features

-> UI and API access for serverless and private endpoints

-> Full context size models (128K)

-> Self-service API key management

-> On-demand management of private endpoints

-> Model fine-tuning workflow

-> Detailed utilization metrics

-> Flexible billing via pre-paid VISA or post-paid invoice
