Most AI tools run a simple loop: get your prompt, call an LLM, return text. If you’re lucky, it uses a tool. That’s it. NIOM is different. Every message flows through a reasoning pipeline that adapts its effort based on how complex your request is — from instant responses for quick questions to multi-step planning with quality checks for serious work.

How it works

Your message → Analyze → Route → Execute → Evaluate → Respond (or refine and try again)
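The flow above can be sketched as a small loop. This is an illustrative sketch, not NIOM's actual code: the `analyze`, `route`, `execute`, and `evaluate` callables stand in for the real pipeline stages.

```python
def run_pipeline(message, analyze, route, execute, evaluate, max_rounds=3):
    """The flow above: analyze -> route -> execute -> evaluate, then respond or refine."""
    intent = analyze(message)               # 1. What's the goal? How complex?
    strategy = route(intent)                # 2. Pick effort level / step budget
    result = execute(message, strategy)     # 3. Do the work with tools
    for _ in range(max_rounds):
        achieved, feedback = evaluate(message, result)  # 4. Check the work
        if achieved:
            break                           # Quality bar met: respond
        result = execute(feedback, strategy)  # Refine and try again
    return result
```

The key structural point is that evaluation can route back into execution, rather than the model unilaterally deciding it is done.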

1. Analyze your intent

A fast model (~200ms) reads your message and figures out: What’s the goal? How complex? Which tools are needed? Should this be a background task? Simple messages like “hi” or “what time is it?” skip this step entirely — zero added latency.
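A minimal sketch of this step, assuming a trivial-message fast path and structured output fields; the `TRIVIAL` set, field names, and prompt wording are illustrative assumptions, not NIOM internals.

```python
# Messages that skip analysis entirely (assumed fast path, not NIOM's real list)
TRIVIAL = {"hi", "hello", "thanks", "what time is it?"}

def analyze_intent(message: str, fast_llm) -> dict:
    """Classify a message with one cheap (~200ms) fast-model call."""
    if message.strip().lower() in TRIVIAL:
        # Zero added latency for simple messages
        return {"complexity": "simple", "tools": [], "background": False}
    return fast_llm(
        "Classify this request. Return goal, complexity "
        "(simple|standard|complex|long-running), tools needed, and "
        f"whether it should run as a background task:\n{message}"
    )
```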

2. Pick the right strategy

Based on complexity, NIOM chooses how much effort to invest:
| Complexity | Example | What NIOM does |
| --- | --- | --- |
| Simple | “What’s my CPU usage?” | Quick response, 3 tool calls max |
| Standard | “List my project structure and explain it” | Executes with light evaluation, up to 10 steps |
| Complex | “Refactor this module into smaller files” | Full pipeline with quality evaluation loop, up to 25 steps |
| Long-running | “Write a summary every 2 days” | Creates a background task with scheduling |
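The strategy choice amounts to a lookup from detected complexity to an effort budget. A sketch under assumptions: the dict keys mirror the table, but the exact structure (and the step budget for long-running tasks, which the table leaves unspecified) is invented for illustration.

```python
STRATEGIES = {
    "simple":   {"max_steps": 3,  "evaluation": "none",  "background": False},
    "standard": {"max_steps": 10, "evaluation": "light", "background": False},
    "complex":  {"max_steps": 25, "evaluation": "full",  "background": False},
    # Step budget here is an assumption; the table only says long-running
    # requests become scheduled background tasks.
    "long-running": {"max_steps": 25, "evaluation": "full", "background": True},
}

def pick_strategy(complexity: str) -> dict:
    """Map detected complexity to an effort budget (default: standard)."""
    return STRATEGIES.get(complexity, STRATEGIES["standard"])
```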

3. Execute with tools

The agent gets to work — reading files, running commands, searching the web, taking screenshots, calling MCP tools — whatever your request needs.
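Under the hood, agent frameworks typically route model-requested tool calls through a registry. A hedged sketch of that pattern; the tool names and dispatch shape are illustrative, not NIOM's actual tool set or API.

```python
import subprocess

# Illustrative tool registry: name -> implementation
TOOLS = {
    "read_file": lambda args: open(args["path"], encoding="utf-8").read(),
    "run_command": lambda args: subprocess.run(
        args["cmd"], shell=True, capture_output=True, text=True
    ).stdout,
}

def call_tool(name: str, args: dict) -> str:
    """Dispatch one model-requested tool call to its implementation."""
    if name not in TOOLS:
        return f"unknown tool: {name}"  # surfaced to the model, not raised
    return TOOLS[name](args)
```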

4. Check its own work

After execution, a separate evaluation step asks: Did we actually achieve the goal? What’s the quality? For complex tasks, this creates a refinement loop: execute → evaluate → improve → re-execute (up to 3 rounds) until the quality bar is met.
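The evaluation step can be sketched as one cheap model call that scores the result against the original goal. The rubric prompt, score threshold, and `verdict` field names are assumptions for illustration.

```python
def evaluate_result(goal: str, result: str, fast_llm, bar: float = 0.8):
    """Ask a cheap fast model: did we achieve the goal, and how well?"""
    verdict = fast_llm(
        "Score from 0 to 1 how well the result achieves the goal, "
        f"and suggest improvements.\nGoal: {goal}\nResult: {result}"
    )
    achieved = verdict["score"] >= bar  # quality bar met -> respond
    return achieved, verdict.get("feedback", "")
```

When `achieved` is false, the feedback string feeds the next refinement round rather than being discarded.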

Smart model routing

NIOM doesn’t burn expensive models on cheap tasks. It uses the right model for each phase:
| Role | What it does | Default | Why this model |
| --- | --- | --- | --- |
| Fast | Analyzes intent, evaluates quality | Groq Llama 3.3 70B | Sub-second, nearly free |
| Capable | Does the actual work | Your selected model | The workhorse — Claude, GPT-4o, etc. |
| Vision | Understands screenshots | GPT-4o or Claude | Needed for computer use tasks |
This means analysis happens in ~200ms on a free-tier model, while your premium model is reserved for the work that actually matters.
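Routing by role reduces to a small mapping from pipeline phase to model. A sketch under assumptions: the model identifiers and phase names here are illustrative placeholders, not NIOM's configuration.

```python
# Role -> model (IDs are example placeholders, not NIOM's real config)
MODEL_ROLES = {
    "fast":    "llama-3.3-70b-versatile",  # intent analysis, quality checks
    "capable": "user-selected-model",      # the workhorse doing the task
    "vision":  "gpt-4o",                   # screenshot understanding
}

# Which role each pipeline phase uses
PHASE_ROLES = {
    "analyze": "fast", "evaluate": "fast",
    "execute": "capable", "screenshot": "vision",
}

def model_for(phase: str) -> str:
    """Resolve a pipeline phase to the model that should handle it."""
    return MODEL_ROLES[PHASE_ROLES[phase]]
```

The design choice is that the premium model never sees the analysis or evaluation prompts at all; those phases resolve to the cheap tier.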

Why this matters

| What most AI agents do | What NIOM does |
| --- | --- |
| LLM decides when to stop (usually too early) | Quality criteria defined upfront, checked after |
| No quality assessment | Explicit evaluation with refinement loops |
| Same effort for “hi” and “refactor my codebase” | Adaptive depth based on detected complexity |
| Can’t loop back to improve | Execute → evaluate → refine → re-execute |
| One model for everything | Right model for each phase = faster + cheaper |
This pipeline is what makes background tasks, MCP integration, and computer use reliable — every tool call goes through structured reasoning, not hope.