
Denvr AI Inference Services
Price-performant inference at scale with OpenAI API compatibility.

Serverless Endpoints
Call leading open-source foundation models like Llama, Qwen, and DeepSeek through OpenAI-compatible APIs (see the sketch below these highlights).

Dedicated Endpoints
Leverage private endpoints for reliability and privacy, with support for open-weight and privately fine-tuned models.

Intel Collaboration
Partnership with Intel to develop enterprise-ready inference powered by cost-efficient Intel Xeon processors and Gaudi AI accelerators.
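
Because the serverless endpoints expose the OpenAI API, an existing client usually needs only a base-URL and key change. A minimal sketch using the official openai Python SDK; the base URL, API key, and model identifier below are placeholders, not published Denvr values:

```python
# Minimal sketch of calling a Denvr serverless endpoint through the
# standard OpenAI Python SDK. Base URL, key, and model name are
# placeholders -- substitute the values from your Denvr account.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-denvr-endpoint.com/v1",  # placeholder URL
    api_key="YOUR_DENVR_API_KEY",                          # placeholder key
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "Summarize Denvr AI Inference Services."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```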
Serverless Models Supported
Open-weight foundation models available with API endpoints for rapid integration.
MODEL NAME         | PARAMS            | CONTEXT | PRECISION
Llama 3.3          | 70B               | 32k     | BF16
Llama 3.2          | 1B, 3B            | 32k     | BF16
Llama 3.1          | 8B, 70B           | 32k     | BF16
Llama 3.1 (soon)   | 405B              | 32k     | FP8
DeepSeek R1 (soon) | 671B              | 32k     | FP8
Mistral v0.1       | 7B, 8x7B          | 32k     | BF16
Qwen 2.5           | 7B, 14B, 32B, 72B | 32k     | BF16
Falcon 3           | 7B, 10B           | 32k     | BF16
ALLaM-AI Preview   | 7B                | 32k     | BF16
BGE M3 Embedder    | 108M              | 8k      | BF16
BGE M3 Reranker    | 568M              | 1k      | BF16
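
The BGE M3 entries cover retrieval workloads as well as generation. A hedged sketch of calling the embedder through the OpenAI-compatible embeddings route; the base URL, API key, and model identifier are illustrative assumptions, not documented Denvr identifiers:

```python
# Hypothetical sketch: generating embeddings with BGE M3 via the
# OpenAI-compatible /v1/embeddings route. All endpoint details and
# the model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-denvr-endpoint.com/v1",  # placeholder
    api_key="YOUR_DENVR_API_KEY",                          # placeholder
)

result = client.embeddings.create(
    model="BAAI/bge-m3",  # assumed identifier for the BGE M3 embedder
    input=["price-performant inference at scale"],
)
vector = result.data[0].embedding
print(len(vector))  # dimensionality of the returned embedding
```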
-> Native OpenAI API compatibility enables quick model migration and efficient inference deployment.
-> Cost-efficient managed services reduce your hosting and operational expenses.
-> Model serving optimized for first-token latency or batch throughput.
-> Serverless endpoints are limited to published models and up to 60 requests per second (see the backoff sketch after this list).
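
Given the 60 requests-per-second ceiling, clients should expect occasional HTTP 429 responses under load. A sketch of exponential backoff with the openai SDK; the endpoint details and model name are assumptions:

```python
# Sketch of client-side handling for the serverless rate limit:
# retry with exponential backoff when the endpoint returns HTTP 429.
import time

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.example-denvr-endpoint.com/v1",  # placeholder
    api_key="YOUR_DENVR_API_KEY",                          # placeholder
)

def chat_with_backoff(prompt: str, retries: int = 5) -> str:
    delay = 0.5
    for _ in range(retries):
        try:
            resp = client.chat.completions.create(
                model="meta-llama/Llama-3.1-8B-Instruct",  # assumed identifier
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except RateLimitError:
            time.sleep(delay)  # back off, then retry
            delay *= 2
    raise RuntimeError("rate limited after all retries")
```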
Early Access Program
Apply now for early access to Denvr AI Inference Services.
Partner Feedback Goals
-> Validate the use of serverless models and identify preferred models
-> Validate the use of private endpoints for user workloads
-> Validate performance requirements (time-to-first-token and inter-token latency; see the measurement sketch after this list)
-> Provide feedback on feature prioritization
-> Consult on pricing and SLA requirements
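
One way partners can measure the latency targets above is to stream a completion and timestamp each token. A minimal sketch; the endpoint, key, and model identifier are placeholders rather than documented values:

```python
# Sketch: measure time-to-first-token (TTFT) and mean inter-token
# latency (ITL) over a streaming chat completion.
import time

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-denvr-endpoint.com/v1",  # placeholder
    api_key="YOUR_DENVR_API_KEY",                          # placeholder
)

start = time.perf_counter()
stamps = []
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed identifier
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,
)
for chunk in stream:
    # skip role-only or empty chunks; record arrival time of each token
    if chunk.choices and chunk.choices[0].delta.content:
        stamps.append(time.perf_counter())

ttft = stamps[0] - start
itl = (stamps[-1] - stamps[0]) / max(len(stamps) - 1, 1)
print(f"TTFT: {ttft:.3f}s, mean ITL: {itl * 1000:.1f}ms")
```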
Upcoming Features
-> UI and API access for serverless and private endpoints
-> Full context size models (128K)
-> Self-service API key management
-> On-demand management of private endpoints
-> Model fine-tuning workflow
-> Detailed utilization metrics
-> Flexible billing via pre-paid VISA or post-paid Invoice
Pricing guidance is in line with the market for serverless; private endpoints are priced at $2.50 per Intel Gaudi 2 accelerator hour (on-demand), with discounts for term-based commitments. Pricing is under review with Early Access Partners.
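For rough budgeting only (an illustration at the on-demand rate, not a quote): one Gaudi 2 accelerator running around the clock costs about $2.50 × 730 hours ≈ $1,825 per month, before any term-commitment discount.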
Easy Deployment of Models with Denvr AI Inference Services
High-value inference with no unnecessary overhead.


Pre-Trained Models
Easy access and deployment of common ready-to-use (pre-trained) AI models.

No Hardware Management
Model deployment requires no management, maintenance, or operational overhead for hardware infrastructure.

Custom Model Support
Support for custom model hosting and deployment.

Adaptability
Experiment with different stack configurations and optimize compute resources dynamically.

Pay-Per-Use
Only pay for the compute resources used, reducing costs and eliminating wastage.

Flexibility & Scalability
Scale compute resources up or down quickly based on immediate needs.

Immediate Access
