top of page
Denvr Dataworks Website Updates 2025 - Home and Services pages V6.1 -07.png

Enterprise-grade inference in minutes

Managed services for serverless and dedicated model endpoints.

Denvr AI Inference Services

Price-performant inference at scale with OpenAI API compatibility.

Denvr Dataworks

Serverless Endpoints

Use OpenAI API-compatible APIs with leading open source foundation models like Llama, Qwen, and DeepSeek. 

Denvr Dataworks

Dedicated Endpoints

Leverage private endpoints for reliability and privacy. For use with open weight and private fine-tuned models.

Denvr Dataworks

Intel Collaboration

Partnership with Intel to develop enterprise-ready inference powered by cost-efficient Intel Xeon and Gaudi AI accelerators.

Serverless Models Supported

Open-weight foundation models available with API endpoints for rapid integration.

MODEL NAME

PARAMS

CONTEXT

PRECISION

Llama 3.3

70B

32k

BF16

Llama 3.2

1B, 3B

32k

BF16

Llama 3.1

8B, 70B

32k

BF16

Llama 3.1 (soon)

405B

32k

FP8

DeepSeek R1 (soon)

671B

32k

FP8

Mistral v0.1

7B, 8x7B

32k

BF16

Qwen 2.5

7B, 14B, 32B, 72B

32k

BF16

Falcon 3

7B, 10B

32k

BF16

ALLam-AI Preview

7B

32k

BF

BGE M3 Embedder

108M

8k

BF16

BGE M3 Reranker

568M

1k

BF16

-> Native OpenAI API compatibility allowing quick model migration and efficient inference deployment.

-> Cost efficient managed services that reduces your hosting and operational expenses.

-> Model serving optimized for first-token latency, or batch throughput.

-> Serverless endpoints limited to published models and up 60 requests per second.

 

Early Access Program

Apply now for early access to Denvr AI Inference Services.

Partner Feedback Goals

-> Validate use of serverless models and preferred models
-> Validate use of private endpoints for user workloads
-> Validate performance requirements (Time-to-first-token and inter-token-latency)
-> Provide feedback on feature prioritization
-> Consult on pricing and SLA requirements

Upcoming Features

-> UI and API access for serverless and private endpoints

-> Full context size models (128K)

-> Self-service API key management

-> On-demand management of private endpoints
-> Model fine-tuning workflow

-> Detailed utilization metrics

-> Flexible billing via pre-paid VISA or post-paid Invoice

Pricing guidance inline with market for Serverless, private endpoints at $2.50 per Intel Gaudi2 GPU hour (on-demand) and discounts for term-based commitments.  Pricing is under review with Early Access Partners.

Easy Deployment of Models with Denvr AI Inference Services

High value Inference with no unnecessary overhead.

inference api_edited.png
Denvr Dataworks Website Updates 2025 - Home and Services pages V6.1 -10

Pre-Trained Models

Easy access and deployment of common ready-to-use (pre-trained) AI models.

Denvr Dataworks Website Updates 2025 - Home and Services pages V6.1 -10

No Hardware Management

Model deployment does not require any management, maintenance or operational overhead of hardware infrastructure.

Denvr Dataworks Website Updates 2025 - Home and Services pages V6.1 -10

Custom Model Support

Support for custom model hosting and deployment.

Denvr Dataworks
Denvr Dataworks
Denvr Dataworks
Denvr Dataworks
Denvr Dataworks

Adaptability

Experiment with different stack configurations and optimize compute resource dynamically.

Denvr Dataworks

Pay-Per-Use

Only pay for the compute resources used, reducing costs and eliminating wastage.

Denvr Dataworks

Flexibility & Scalability

Scale compute resources up or down quickly based on immediate needs.

Experiment with different stack configurations and optimize compute resource dynamically.

Adaptability

Only pay for the compute resources used, reducing costs and eliminating wastage.

Pay-Per-Use

Experiment with different stack configurations and optimize compute resource dynamically.

Immediate Access

Scale compute resources up or down quickly based on immediate needs.

Flexibility & Scalability

AI Services - Primary

For Native AI developers that are commercializing developed models for generative AI and agentic AI workflows.

For businesses that are adopting AI to improve business efficiencies and insights.

Frequently asked questions

Website - July 2025 - V2.0-08.jpg

Ready to get started?

Apply now for early access to Denvr AI Inference Services.

bottom of page