Everything you can run on a DGX Spark (and what I actually installed)

📌
Series: Running Local, building a private AI stack from scratch. This is Part 3. Part 1 and Part 2 are here.

After getting vLLM and VS Code Copilot running in part 2, I spent a few days going through everything else the DGX can run. The local AI ecosystem is enormous right now. Tools launching every week, repositories with five-digit star counts, every framework claiming to be production-ready.

I made a list, researched each one, and came out with strong opinions about what is worth your time. Most of it I skipped. Here is what I actually looked at and what ended up on the machine.

What was already running

Coming out of part 2, I had three things: vLLM serving Qwen3-Coder-Next-FP8 on port 8000, Open WebUI as a chat interface, Tailscale keeping the Mac connected. That covered model serving and basic access. The gap was everything else: a second model tier for cheaper work, a proper router, coding agents for autonomous tasks, and anything in the personal productivity space.

Coding agents

I had VS Code Copilot pointed at the DGX for in-editor work. That covers interactive coding: you open a file, ask a question, get an answer inline. What it does not cover is autonomous work. Read this codebase, find the bug, write the fix, open a PR. That is a different class of tool.

ToolStarsUXStatus
OpenCode172kTerminal, model-agnostic, MCP support✅ Installed
OpenHands74kWeb UI, Docker sandbox, async autonomous✅ Installed
Cline63kVS Code extension, plan/actSkipped
Aider46kTerminal, git-native, direct file editsSkipped
Continue.dev24kVS Code/JetBrains, autocomplete-heavySkipped

OpenCode (172k stars) is the closest open-source equivalent to Claude Code. Terminal-based, model-agnostic, MCP support, slash commands. I run it on the Mac pointed at the DGX. It is what I use for longer sessions where I want to stay in the terminal.

OpenHands (74k stars) handles async autonomous work. You give it a task, it runs in a Docker sandbox on the DGX, you come back when it is done. Slower and less reliable than interactive agents, but for batch tasks it is the right shape. The two complement each other: OpenCode for interactive, OpenHands for autonomous.

The rest: Cline overlaps with VS Code Copilot. Aider is faster for quick file edits but I did not need both. Continue.dev is autocomplete-heavy rather than agent-mode. None of them made the cut.

The routing layer

LiteLLM. OpenAI-compatible router, sits between every tool and the models. Every agent, every UI, every script points at LiteLLM and never touches vLLM directly. Swapping models or adding a new tier is a config change, not a rewiring job. Nothing else in this category is worth switching to.

A quick sweep of everything else

Vector databases: I had LanceDB from OpenClaw's memory plugin and pgvector available through Postgres. Neither hit a wall, so I added nothing. If either does, Qdrant is the pick: Rust-based, fast, production-grade for self-hosted RAG.

Agent frameworks: I looked at LangChain, Dify, Flowise, Langflow, and CrewAI. The one I would actually install is n8n, not because it is the most capable framework but because it solves a specific problem: workflow automation with AI nodes. GitHub webhook fires, DGX processes it, Slack gets a message. Four hundred integrations, runs as a Docker container. The rest try to be everything. n8n solves one thing well.

Image generation: ComfyUI. It is the standard, supports FLUX and WAN 2.2, and the node graph makes complex workflows reproducible. The other options (A1111, Forge, InvokeAI) are fine but ComfyUI has won this category.

Personal tools

ToolWhat it doesStatus
ImmichSelf-hosted Google Photos, face recognition, semantic search✅ Installed
Paperless-ngxDocument OCR and search, wireable to local LLM for classification✅ Installed
KarakeepSelf-hosted Pocket, bookmarks and notes with AI auto-tagging✅ Installed
AnythingLLMTurnkey RAG over docs, drop files and chat with themOnly if you need turnkey RAG

This was the category that surprised me most. None of these require a DGX: any always-on home server would do. But since the machine is already running, they are free to add.

Immich is self-hosted Google Photos. Face recognition, semantic search, all the features you would pay for, running locally. Paperless-ngx does document management with OCR: drop a receipt, contract, or tax scan and it finds it when you need it. Karakeep is self-hosted Pocket for bookmarks and notes with AI auto-tagging. I use all three.

Voice: faster-whisper on the DGX as an always-on transcription endpoint. Everything else in the voice category requires a specific use case I do not have yet.

What I deliberately did not install

I looked at Dify, LibreChat, Lobe Chat, Weaviate, AnythingLLM, LM Studio, Langflow, and several others. None went on the machine.

The filter I kept using: does this solve a problem I have today, or one I am imagining? Most tools that seemed fascinating failed that test. If you cannot name the specific workflow it enables, skip it for now. The trap with a machine this capable is installing twelve things you never use.

What is actually running on the DGX now is what I built SparkMonitor to track, and I will cover that in a future post.

Subscribe to Sahil's Playbook

Clear thinking on product, engineering, and building at scale. No noise. One email when there's something worth sharing.
[email protected]
Subscribe
Mastodon