Systems Online
Private AI
Infrastructure
Coming Soon
Enterprise-grade language model inference and dedicated cloud VMs.
Sovereign compute. No rate limits. No data harvesting.
Your workloads run on bare metal, not shared cloud. While we finish the site, email Investigations@obsidianwatch.org for custom compute.
<80ms
Avg. Latency
99.9%
Uptime SLA
128K
Context Window
AI Models
Available Models
Open-weight models on dedicated GPU hardware. No queuing. No throttling. OpenAI-compatible API.
Fast
Mistral 7B
Ultra-low latency for high-throughput tasks. Ideal for real-time applications and classification.
Context: 32K tokens
Speed: ~120 tok/s
Price: $0.08 / 1M tok
Balanced
Llama 3.1 70B
Best-in-class open model. Strong reasoning, code generation, and complex instruction following.
Context: 128K tokens
Speed: ~45 tok/s
Price: $0.40 / 1M tok
Pro
DeepSeek Coder V2
Specialized for software engineering. Exceptional at multi-file context and debugging.
Context: 128K tokens
Speed: ~38 tok/s
Price: $0.35 / 1M tok
Balanced
Qwen 2.5 32B
Multilingual powerhouse with strong STEM reasoning and instruction-following capabilities.
Context: 128K tokens
Speed: ~55 tok/s
Price: $0.25 / 1M tok
Fast
Mistral NeMo 12B
Fast, capable mid-size model. Excellent for summarization, extraction, and multi-turn conversation.
Context: 128K tokens
Speed: ~90 tok/s
Price: $0.12 / 1M tok
Pro
Llama 3.1 405B
Frontier-class open model. Near-GPT-4 performance on complex reasoning, math, and coding tasks.
Context: 128K tokens
Speed: ~18 tok/s
Price: $1.20 / 1M tok
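The per-token prices and decode speeds above can be combined into a rough per-request estimate. A minimal sketch using the figures from the table; model IDs other than `llama3.1-70b` (the one used in the API example below) are assumed, and real billing may differ:

```python
# Rough per-request estimator built from the published model table.
# Prices: $ per 1M tokens; speeds: approximate decode rate in tok/s.
MODELS = {
    "mistral-7b":        {"usd_per_1m": 0.08, "tok_per_s": 120},
    "llama3.1-70b":      {"usd_per_1m": 0.40, "tok_per_s": 45},
    "deepseek-coder-v2": {"usd_per_1m": 0.35, "tok_per_s": 38},
    "qwen2.5-32b":       {"usd_per_1m": 0.25, "tok_per_s": 55},
    "mistral-nemo-12b":  {"usd_per_1m": 0.12, "tok_per_s": 90},
    "llama3.1-405b":     {"usd_per_1m": 1.20, "tok_per_s": 18},
}

def estimate(model: str, tokens: int) -> tuple[float, float]:
    """Return (cost in USD, pure generation time in seconds) for a token count."""
    m = MODELS[model]
    return tokens / 1_000_000 * m["usd_per_1m"], tokens / m["tok_per_s"]

cost, _ = estimate("llama3.1-70b", 500_000)  # cost → 0.20
_, secs = estimate("mistral-7b", 1_200)      # secs → 10.0
```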
Cloud VMs
Dedicated Virtual Machines
Full VMs on bare-metal Xeon hosts. GPU-accelerated options available. No noisy neighbours.
Scout
$29/mo
- 4 vCPU
- 16 GB RAM
- 100 GB NVMe
- 1 TB / mo
View Details
Ranger
$59/mo
- 8 vCPU
- 32 GB RAM
- 250 GB NVMe
- 3 TB / mo
View Details
Patriot
$119/mo
- 16 vCPU
- 64 GB RAM
- 500 GB NVMe
- 5 TB / mo
View Details
Sentinel
GPU Ready
$249/mo
- 32 vCPU
- 128 GB RAM
- 1 TB NVMe
- Unmetered
View Details
Pricing
API Access Tiers
Transparent token-based pricing. No hidden fees. Cancel anytime.
Consumer
$19/mo
Full chat interface with generous monthly token allocation
- Unlimited chat sessions
- 5M tokens included
- All available models
- 128K context window
- Zero data retention
Get Started
Most Popular
Developer
$49/mo
Full API access + chat UI. Build production applications on private infrastructure.
- Full REST API access
- 10M tokens included
- OpenAI-compatible endpoint
- All consumer features
- Usage dashboard
- Webhook support
- Priority inference queue
Get Started
Enterprise
Custom
Dedicated capacity, SLAs, and white-glove onboarding for high-volume workloads.
- Dedicated GPU allocation
- Unlimited tokens
- Custom model deployment
- 99.9% uptime SLA
- Private VPN endpoint
- WireGuard tunnel access
- 24/7 priority support
Contact Sales
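If you expect to use your full allocation each month, the included token counts above imply an effective per-million-token price for each fixed tier. A quick sanity check (overage pricing isn't published here, so it is deliberately not modeled):

```python
# Effective $/1M tokens when the full included allocation is consumed.
TIERS = {
    "Consumer":  {"usd_per_month": 19, "included_tokens": 5_000_000},
    "Developer": {"usd_per_month": 49, "included_tokens": 10_000_000},
}

def effective_usd_per_1m(tier: str) -> float:
    t = TIERS[tier]
    return t["usd_per_month"] / (t["included_tokens"] / 1_000_000)

print(effective_usd_per_1m("Consumer"))   # → 3.8
print(effective_usd_per_1m("Developer"))  # → 4.9
```

Developer costs slightly more per included token but adds full API access, webhooks, and the priority inference queue.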
Developer API
Drop-in OpenAI Compatibility
Change one line. Keep your existing code. Our API is fully compatible with the OpenAI client spec — works with any SDK that supports a custom base URL.
- REST + streaming (SSE)
- Chat completions endpoint
- Function calling support
- Bearer token authentication
- JSON mode
- Python & JavaScript SDKs
- Per-key rate limiting
- Usage webhooks
from openai import OpenAI

client = OpenAI(
    base_url="https://api.patriotsci.com/v1",
    api_key="ps-your-api-key-here",
)

response = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[{
        "role": "user",
        "content": "Analyze this dataset..."
    }],
    stream=True,
)

# Stream tokens as they arrive; the final chunk's delta may be empty.
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
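For clients without an OpenAI SDK, the same call is a plain HTTP POST. A sketch of the request an OpenAI-compatible endpoint expects, assuming the standard `/chat/completions` path and the base URL and key format from the example above:

```python
import json

def build_chat_request(api_key: str, prompt: str, model: str = "llama3.1-70b"):
    """Build the URL, headers, and JSON body for a streaming chat completion."""
    url = "https://api.patriotsci.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # bearer token authentication
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # server replies with SSE chunks instead of one JSON object
    }
    return url, headers, json.dumps(body)

url, headers, payload = build_chat_request("ps-your-api-key-here", "Hello")
# POST `payload` to `url` with any HTTP client that can read SSE.
```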
Infrastructure
Bare Metal. No Cloud.
Your requests never touch AWS, Azure, or GCP. Dedicated hardware. Sovereign compute. Columbia, TN.
[GPU]
A4500 w/ NVLink
NVIDIA Pro GPUs
[RAM]
1,800 GB+
System Memory
[CPU]
652
Xeon Cores
[NET]
10GbE
Network Fabric
[STG]
iSCSI
SAN Storage
[SEC]
VPN
WireGuard Access
[HYP]
Type 1
Hypervisors
[FW]
Palo Alto Networks
Firewall