Nemotron 3 Super + NemoClaw: NVIDIA and Ollama Just Made Local Agents Practical

By Prahlad Menon · 2 min read

Two weeks after NVIDIA announced NemoClaw at GTC 2026, the stack just got meaningfully more usable. Ollama and NVIDIA have shipped a set of updates that together make running a local, private, enterprise-safe AI agent practical for the first time without significant infrastructure overhead.

Here’s what changed.


Nemotron 3 Super: The Model That Matters

  • Architecture: 120B total parameters, 12B active (Mixture-of-Experts)
  • PinchBench: 85.6%, #1 among open models for OpenClaw tasks
  • Throughput: 5x faster than the previous Nemotron generation
  • Local requirement: 96GB VRAM (DGX Spark or RTX PRO)
  • Cloud: free on Ollama’s cloud (nemotron-3-super:cloud)

The MoE architecture is why the throughput jump is possible: 120B parameters but only 12B active per token, so the model gets the reasoning depth of a 120B model at a fraction of the inference cost.
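The ratio falls straight out of the stated parameter counts; a back-of-envelope check of how much of the model fires per token:

```shell
# Only 12B of the 120B parameters are active per token in the MoE,
# so per-token compute is roughly a tenth of a dense 120B model's.
awk 'BEGIN { printf "%.0f%% of parameters active per token\n", 12 / 120 * 100 }'
# → 10% of parameters active per token
```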

PinchBench is the number that matters. Generic LLM benchmarks (HumanEval, MMLU, SWE-bench) test coding ability and knowledge, not agentic performance. PinchBench specifically measures the tool-calling, multi-step planning, and task execution patterns that OpenClaw relies on. 85.6% and #1 among open models is a meaningful result — it means Nemotron 3 Super has been deliberately optimized for agent workflows, not just general capability.

To run it on Ollama’s cloud:

ollama launch openclaw --model nemotron-3-super:cloud

To run locally (96GB VRAM required):

ollama pull nemotron-3-super
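Before pulling the full model, it is worth confirming the GPU actually has the headroom. A minimal check using `nvidia-smi` (present on any machine with the NVIDIA driver installed):

```shell
# Report each GPU's name and total VRAM; the full nemotron-3-super needs ~96GB.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "nvidia-smi not found: no NVIDIA driver on this machine"
fi
```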

NemoClaw + Native Ollama: The Setup Is Now One Command

Our previous NemoClaw post covered the initial announcement — a privacy and security wrapper for OpenClaw with OpenShell sandboxing. The friction was configuration: wiring Ollama support in manually.

That’s gone. The updated installer handles it:

curl -fsSL https://nvidia.com/nemoclaw.sh | bash

During install, select 2 for Ollama when prompted for your runtime. When prompted for a model, select nemotron-3-super:cloud. Then connect:

nemoclaw my-assistant connect

What you get:

  • OpenClaw with OpenShell sandboxing (isolated agent execution)
  • Privacy guardrails on cloud model calls
  • Nemotron 3 Super as the model — optimized for agent tasks
  • Pre-configured Ollama runtime — no manual wiring

The result is a local (or cloud-backed) agent with enterprise-grade security controls that previously required significant manual configuration to achieve.
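If the installer's wiring ever needs sanity-checking, Ollama's local API answers on port 11434 by default; a quick reachability probe (assuming the stock port, which the post does not state NemoClaw changes):

```shell
# Ollama serves a REST API on localhost:11434 by default;
# /api/tags lists the models it has pulled.
if curl -fsS http://localhost:11434/api/tags >/dev/null 2>&1; then
  echo "ollama: reachable"
else
  echo "ollama: not reachable on :11434"
fi
```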


OpenShell: Now for Claude Code, Codex, and OpenCode Too

This is the underreported piece of the update.

OpenShell — NVIDIA’s sandboxed execution runtime — has extended beyond NemoClaw. It now works as a safety wrapper for other coding agents:

# Create a sandboxed environment from Ollama
openshell sandbox create --from ollama

# Launch any agent inside it
ollama

This means Claude Code, Codex, and OpenCode can now run inside an OpenShell sandbox — getting the same policy-based security, network controls, and execution isolation that NemoClaw has, without requiring NemoClaw itself.

The practical implication: if you’re already using Claude Code for development, you can wrap it in OpenShell to prevent it from accidentally exfiltrating credentials, hitting external endpoints it shouldn’t, or executing destructive commands outside the sandbox. Same agent, safer runtime.


The Hardware Picture

| Setup | Model | Hardware Required | Cost |
|---|---|---|---|
| Ollama cloud (free) | nemotron-3-super:cloud | None | Free |
| Ollama cloud (Pro/Max) | Any, multi-agent | None | Subscription |
| Local full | nemotron-3-super (full) | 96GB VRAM | Hardware |
| Local small | Nemotron 3 Nano 4B | GeForce RTX | Hardware |

Nemotron 3 Nano 4B — if you’re on a standard GeForce RTX without 96GB VRAM, NVIDIA shipped Nano 4B as a compact option for the same agent workflows on constrained hardware. Lower capability ceiling, but runs on consumer GPUs.

DGX Spark — NVIDIA’s desktop AI supercomputer with 128GB unified memory is the natural home for Nemotron 3 Super locally. At 120B parameters (quantized, per the 96GB VRAM requirement), the model fits within the memory envelope with room left for context.
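The arithmetic behind that fit, assuming roughly 6-bit quantized weights (an assumption consistent with the 96GB figure, not something the announcement states):

```shell
# 120B parameters at ~6 bits each; bytes = params * bits / 8
awk 'BEGIN { printf "%.0f GB of weights\n", 120e9 * 6 / 8 / 1e9 }'
# → 90 GB of weights
```

That leaves roughly 38GB of DGX Spark's 128GB unified memory for KV cache and context.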


What This Means for the Agent Stack

The picture that’s emerging from GTC 2026 and these follow-on updates:

NVIDIA is positioning itself as the infrastructure layer for agents. Not just GPU hardware — the full stack from model (Nemotron) to runtime (OpenShell) to security framework (NemoClaw) to hardware (DGX Spark, RTX PRO). The Ollama partnership plugs the distribution gap: Ollama handles discovery and delivery; NVIDIA provides the optimized model and security layer.

The cloud fallback is free. Nemotron 3 Super on Ollama’s cloud costs nothing at the base tier. This removes the barrier that previously made local-quality agents expensive — you get the PinchBench #1 model without paying per-token to Anthropic or OpenAI.

OpenShell extending to all coding agents is the move that matters for enterprise adoption. CISOs don’t block specific agents — they block the security risk that agents represent. OpenShell gives them a sandboxed runtime answer to that concern, regardless of which agent is running inside it.


Quick Start

# Full NemoClaw setup with Ollama (recommended)
curl -fsSL https://nvidia.com/nemoclaw.sh | bash
# → Select 2 (Ollama) → Select nemotron-3-super:cloud
nemoclaw my-assistant connect

# OpenShell sandbox for existing agents (Claude Code, Codex, etc.)
openshell sandbox create --from ollama

Full documentation: docs.nvidia.com/nemoclaw


Sources: NVIDIA Blog — GTC 2026 NemoClaw · Ollama announcement email, March 19 2026 · PinchBench

Related: NVIDIA NemoClaw: Jensen Huang Says Every Company Needs an OpenClaw Strategy · Unsloth Studio: Fine-Tune 500+ LLMs Without Code