Systems Online

Private AI
Infrastructure
Coming Soon

Enterprise-grade language model inference and dedicated cloud VMs. Sovereign compute. No rate limits. No data harvesting. Your workload runs on bare metal, not shared cloud. For custom compute before the site launches, email us at Investigations@obsidianwatch.org.

<80ms Avg. Latency
99.9% Uptime SLA
128K Context Window

Available Models

Open-weight models on dedicated GPU hardware. No queuing. No throttling. OpenAI-compatible API.

Fast
Mistral 7B
Ultra-low latency for high-throughput tasks. Ideal for real-time applications and classification.
Context: 32K tokens
Speed: ~120 tok/s
Price: $0.08 / 1M tok
Balanced
Llama 3.1 70B
Best-in-class open model. Strong reasoning, code generation, and complex instruction following.
Context: 128K tokens
Speed: ~45 tok/s
Price: $0.40 / 1M tok
Pro
DeepSeek Coder V2
Specialized for software engineering. Exceptional at multi-file context and debugging.
Context: 128K tokens
Speed: ~38 tok/s
Price: $0.35 / 1M tok
Balanced
Qwen 2.5 32B
Multilingual powerhouse with strong STEM reasoning and instruction-following capabilities.
Context: 128K tokens
Speed: ~55 tok/s
Price: $0.25 / 1M tok
Fast
Mistral NeMo 12B
Fast, capable mid-size model. Excellent for summarization, extraction, and multi-turn conversation.
Context: 128K tokens
Speed: ~90 tok/s
Price: $0.12 / 1M tok
Pro
Llama 3.1 405B
Frontier-class open model. Near-GPT-4 performance on complex reasoning, math, and coding tasks.
Context: 128K tokens
Speed: ~18 tok/s
Price: $1.20 / 1M tok
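The per-1M-token rates above make spend easy to estimate: monthly cost is simply (tokens / 1,000,000) × rate. A minimal sketch; the model IDs used as dictionary keys here are illustrative assumptions (only llama3.1-70b appears in the API example on this page):

```python
# Listed per-1M-token rates, keyed by assumed model IDs.
PRICE_PER_1M = {
    "mistral-7b": 0.08,
    "llama3.1-70b": 0.40,
    "deepseek-coder-v2": 0.35,
    "qwen2.5-32b": 0.25,
    "mistral-nemo-12b": 0.12,
    "llama3.1-405b": 1.20,
}

def monthly_cost(model: str, tokens_per_month: int) -> float:
    """Estimated USD cost for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * PRICE_PER_1M[model]

# e.g. 50M tokens/month on Llama 3.1 70B
print(f"${monthly_cost('llama3.1-70b', 50_000_000):.2f}")
```

At these rates, 50M tokens a month on Llama 3.1 70B comes to about $20.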

Dedicated Virtual Machines

Full VMs on bare-metal Xeon hosts. GPU-accelerated options available. No noisy neighbours.

Scout
$29/mo
4 vCPU
16 GB RAM
100 GB NVMe
1 TB / mo
View Details
Ranger
$59/mo
8 vCPU
32 GB RAM
250 GB NVMe
3 TB / mo
View Details
Patriot
$119/mo
16 vCPU
64 GB RAM
500 GB NVMe
5 TB / mo
View Details
Sentinel
GPU Ready
$249/mo
32 vCPU
128 GB RAM
1 TB NVMe
Unmetered
View Details
See All VM Packages & GPU Options

API Access Tiers

Transparent token-based pricing. No hidden fees. Cancel anytime.

Consumer
$19/mo
Full chat interface with generous monthly token allocation
  • Unlimited chat sessions
  • 5M tokens included
  • All available models
  • 128K context window
  • Zero data retention
Get Started
Enterprise
Custom
Dedicated capacity, SLAs, and white-glove onboarding for high-volume workloads.
  • Dedicated GPU allocation
  • Unlimited tokens
  • Custom model deployment
  • 99.9% uptime SLA
  • Private VPN endpoint
  • WireGuard tunnel access
  • 24/7 priority support
Contact Sales

Drop-in OpenAI Compatibility

Change one line. Keep your existing code. Our API is fully compatible with the OpenAI client spec — works with any SDK that supports a custom base URL.

  • REST + streaming (SSE)
  • Chat completions endpoint
  • Function calling support
  • Bearer token authentication
  • JSON mode
  • Python & JavaScript SDKs
  • Per-key rate limiting
  • Usage webhooks
Python
from openai import OpenAI

# Change only the base_url; keep everything else
client = OpenAI(
    base_url="https://api.patriotsci.com/v1",
    api_key="ps-your-api-key-here",
)

response = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[{"role": "user", "content": "Analyze this dataset..."}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
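Clients without an SDK can hit the same endpoint over plain REST. A minimal sketch of the wire format, assuming the base URL and model ID from the example above; the payload shape (including the JSON-mode response_format field listed among the features) follows the standard OpenAI chat-completions spec:

```python
import json
import urllib.request

# Standard OpenAI-style chat completion request body.
payload = {
    "model": "llama3.1-70b",
    "messages": [{"role": "user", "content": "Analyze this dataset..."}],
    "response_format": {"type": "json_object"},  # JSON mode
}

req = urllib.request.Request(
    "https://api.patriotsci.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer ps-your-api-key-here",  # bearer token auth
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending the request requires a live API key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request works from curl or any HTTP client that can set a bearer token header.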

Bare Metal. No Cloud.

Your requests never touch AWS, Azure, or GCP. Dedicated hardware. Sovereign compute. Columbia, TN.

[GPU] A4500 w/ NVLink NVIDIA Pro GPUs
[RAM] 1800GB+ System Memory
[CPU] 652c Xeon Cores
[NET] 10GbE Network Fabric
[STG] iSCSI SAN Storage
[SEC] WireGuard VPN Access
[HYP] Type 1 Hypervisor
[FW] Palo Alto Networks Firewall