Adding a safety and decision layer to robot AI

Most robotics engineers eventually encounter the same frustrating situation. A robot behaves unexpectedly during a test run. The team begins the usual investigation: logs are replayed, controller outputs are inspected, trajectories are plotted and reviewed frame by frame. And yet, even after all that effort, one fundamental question often remains unanswered.

Why did the robot make that particular decision at that precise moment?

In a laboratory environment this uncertainty is merely inconvenient. In a warehouse, factory, or logistics facility where robots operate near people, it becomes far more serious. If a system behaves unexpectedly, engineers need to understand not only what happened, but why the system considered that action acceptable in the first place.

Modern robotics stacks provide remarkable tools for building intelligent machines. We have powerful motion planners, sophisticated perception pipelines, mature simulation environments, and entire ecosystems built around ROS2. More recently, teams have begun experimenting with learned controllers and even LLM-driven robotic agents. These technologies make it possible to design robots that can make complex decisions and adapt to dynamic environments.

What most robotics stacks still lack, however, is a systematic way to answer three basic questions:

Was this action safe before it executed?
Which rule allowed or modified that action?
Can we reproduce this exact decision months later, when the system has already evolved?

Partenit was created to address precisely this gap.

What Partenit actually is

Repository:
https://github.com/GradeBuilderSL/partenit

Partenit is an open-source middleware layer designed to sit between a robot’s decision-making system and the robot itself. It does not attempt to replace existing planners or controllers. Instead, it introduces a validation and auditing step immediately before any command is executed.

In a typical robotics pipeline, commands flow directly from the controller to the robot:

controller / planner / LLM
            ↓
         robot API
            ↓
         hardware

With Partenit in place, every command is evaluated first:

controller / planner / LLM
            ↓
        AgentGuard
            ↓
   adapter (ROS2 / Isaac Sim / HTTP / Mock)
            ↓
      robot or simulator

This guard layer performs three tasks.

First, it validates each requested action against a set of safety policies before the command is executed. Second, if a command violates those policies, it can either modify the parameters—reducing speed, for example—or block the action entirely. Third, every decision is recorded in a structured log entry known as a DecisionPacket, which can later be inspected, replayed, or verified.

Importantly, the system does not interfere with the architecture of the robot controller. The planner continues to operate exactly as before. The guard simply evaluates the outcome before it reaches the robot.
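The guard pattern itself can be illustrated without reference to Partenit's internals. The sketch below is a hypothetical, simplified version of the idea — the names `SimpleGuard`, `Decision`, and the policy-callable convention are invented for illustration and are not Partenit's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """Toy stand-in for a structured decision record."""
    allowed: bool
    params: dict
    applied_rules: list = field(default_factory=list)

class SimpleGuard:
    """Evaluates a command against policy callables that may modify or block it."""
    def __init__(self, policies, robot_api):
        self.policies = policies      # list of (name, check) pairs
        self.robot_api = robot_api    # the downstream adapter / robot interface

    def execute(self, command, params, observations):
        decision = Decision(allowed=True, params=dict(params))
        for name, check in self.policies:
            verdict = check(command, decision.params, observations)
            if verdict is None:
                continue              # policy did not trigger
            decision.applied_rules.append(name)
            if verdict == "block":
                decision.allowed = False
                break
            decision.params.update(verdict)  # policy modified the parameters
        if decision.allowed:
            self.robot_api(command, decision.params)
        return decision
```

The controller never sees this layer: it still emits the same commands, and the guard decides, per command, whether to forward them unchanged, forward them modified, or refuse them.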

Integrating the guard layer

In many cases, integrating Partenit requires surprisingly little code. Consider a simple Python environment in which a robot controller sends commands directly to a robot or simulator.

from partenit.adapters import MockRobotAdapter
from partenit.agent_guard import GuardedRobot

adapter = MockRobotAdapter()
adapter.add_human("worker-1", x=1.2, y=0.0)

robot = GuardedRobot(
    adapter=adapter,
    policy_path="examples/warehouse/policies.yaml",
    session_name="my_test",
)

decision = robot.navigate_to(zone="shipping", speed=2.0)

In this example the adapter provides environmental observations—in this case, the presence of a human worker 1.2 meters away. The guard layer evaluates the command against the active safety policies before allowing it to proceed.

The resulting decision object contains a detailed explanation of what happened.

print(decision.allowed)
print(decision.modified_params)
print(decision.risk_score.value)
print(decision.applied_policies)

A typical output might show that the command was allowed but automatically modified. Because a person was detected within a specified safety distance, the system reduced the requested velocity and recorded the rule responsible for the change.

This evaluation occurs automatically, without modifying the controller itself.

Consistent behaviour across platforms

One advantage of separating the guard logic from the robot interface is that the same safety layer can operate across multiple environments. The only component that changes is the adapter.

For example:

adapter = ROS2Adapter(node_name="partenit_guard")

or

adapter = IsaacSimAdapter(base_url="http://localhost:8000")

or

adapter = HTTPRobotAdapter(base_url="http://192.168.1.100")

Because the safety policies and decision engine remain unchanged, engineers can write their rules once, test them in simulation, and then apply the same guard logic to physical robots.

This consistency significantly reduces the friction between simulated experiments and real-world deployment.
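This portability follows from all adapters exposing a common interface to the guard. The reduction below is a hypothetical sketch of that idea — the `RobotAdapter` protocol and `InMemoryAdapter` class are illustrative, not Partenit's actual adapter contract:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class RobotAdapter(Protocol):
    """Minimal contract a guard layer needs from any backend."""

    def observe(self) -> dict:
        """Return the current environment state (e.g. human distances)."""
        ...

    def send(self, command: str, params: dict) -> None:
        """Forward an approved, possibly modified command to the backend."""
        ...

class InMemoryAdapter:
    """Toy backend: records commands instead of moving hardware."""

    def __init__(self):
        self.sent = []

    def observe(self) -> dict:
        return {"human_distance": 3.0}

    def send(self, command: str, params: dict) -> None:
        self.sent.append((command, params))
```

Guard code written against such an interface works unchanged whether the backend is a mock, a ROS2 node, a simulator bridge, or an HTTP client: only the adapter implementation differs.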

Safety policies that humans can read

Safety policies are defined using a YAML-based configuration format designed to be understandable outside the software engineering team.

rule_id: human_proximity_slowdown
name: Human Proximity Speed Limit
priority: safety_critical
provenance: ISO 3691-4 section 5.2

condition:
  type: threshold
  metric: human.distance
  operator: less_than
  value: 1.5
  unit: meters

action:
  type: clamp
  parameter: max_velocity
  value: 0.3
  unit: m/s

Because the policies are expressed declaratively, safety engineers, QA specialists, and auditors can review them without needing to understand the underlying codebase. This separation between safety specification and implementation can simplify both internal reviews and regulatory discussions.
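To make the semantics of such a rule concrete, here is a hypothetical evaluator for a threshold condition paired with a clamp action — a minimal sketch of the behaviour the YAML above describes, not Partenit's actual policy engine:

```python
def apply_clamp_rule(rule: dict, params: dict, observations: dict) -> dict:
    """Apply a threshold condition with a clamp action to command parameters.

    Returns a (possibly modified) copy of params; the original is untouched.
    """
    cond, action = rule["condition"], rule["action"]
    value = observations.get(cond["metric"])
    if value is None:
        return dict(params)  # metric unavailable: rule stays silent
    triggered = cond["operator"] == "less_than" and value < cond["value"]
    if not triggered:
        return dict(params)
    out = dict(params)
    p = action["parameter"]
    # Clamp the parameter down to the rule's ceiling, never up.
    out[p] = min(out.get(p, action["value"]), action["value"])
    return out
```

With the rule from the YAML above, a requested `max_velocity` of 2.0 m/s would be clamped to 0.3 m/s whenever `human.distance` drops below 1.5 meters, and left untouched otherwise.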

Testing policies without running a robot

Partenit also allows policies to be evaluated independently of any simulation or hardware system. For example, a developer can run a policy check directly from the command line:

partenit-policy sim \
  --action navigate_to \
  --speed 2.0 \
  --human-distance 1.2 \
  --policy-path examples/warehouse/policies.yaml

The command returns a detailed explanation of which policies triggered and how the parameters were modified. This makes it possible to iterate on safety rules quickly, without repeatedly restarting a simulator.

Evaluating controller safety

Another useful capability is scenario-based benchmarking. Partenit includes tools for evaluating controllers using reproducible scenarios.

partenit-eval run examples/benchmarks/human_crossing_path.yaml \
  --report eval.html

The resulting report compares the behaviour of different controller configurations. Metrics such as collision rate, minimum human distance, near-miss frequency, and task completion rate are aggregated into a single score.

This makes it possible to quantify improvements in safety behaviour when introducing new policies or controller changes. Because the reports can be generated automatically, many teams integrate them into continuous integration pipelines to detect regressions early.

Understanding what actually happened

When debugging robotic behaviour, raw logs are rarely sufficient. Engineers typically need to reconstruct the sequence of decisions that led to a particular outcome.

Every action processed by Partenit generates a structured DecisionPacket that records the request, the evaluated policies, the resulting decision, and a cryptographic fingerprint for verification.

These packets can be replayed using a command-line tool:

partenit-log replay decisions/my_test/

The resulting timeline provides a clear view of how the system responded at each step. Because the packets are immutable and signed, they also provide a reliable audit trail for later analysis.
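The fingerprinting idea can be sketched in a few lines. This toy packet sealer (illustrative only; Partenit's actual DecisionPacket format and signing scheme are not shown here) hashes a canonical JSON serialization so that any later tampering is detectable:

```python
import hashlib
import json

def fingerprint(packet: dict) -> str:
    """SHA-256 over a canonical JSON serialization of the packet."""
    canonical = json.dumps(packet, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def seal(packet: dict) -> dict:
    """Return a copy of the packet with its fingerprint attached."""
    return {**packet, "fingerprint": fingerprint(packet)}

def verify(sealed: dict) -> bool:
    """Recompute the fingerprint and compare it with the stored one."""
    body = {k: v for k, v in sealed.items() if k != "fingerprint"}
    return fingerprint(body) == sealed["fingerprint"]
```

Sorting the keys before hashing matters: two logically identical packets must serialize to the same bytes, or verification would fail for reasons unrelated to tampering.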

Supported environments

Partenit is designed to integrate with the environments robotics engineers already use.

Platform                 Adapter
----------------------   ----------------
Pure Python simulation   MockRobotAdapter
HTTP-controlled robots   HTTPRobotAdapter
ROS2 systems             ROS2Adapter
NVIDIA Isaac Sim         IsaacSimAdapter
Unitree robots           UnitreeAdapter
Gazebo                   GazeboAdapter

For example, when working in Isaac Sim, a minimal demonstration can be launched using:

python examples/isaac_sim/minimal_guard_demo.py

The robot in the simulation will immediately begin executing commands through the guard layer.

Sensor trust and uncertainty

One challenge often overlooked in safety systems is the gradual degradation of sensor reliability. Lighting conditions change, cameras become partially occluded, and perception models occasionally produce uncertain classifications.

Partenit incorporates a trust model that tracks sensor confidence over time. When confidence falls below a threshold, the system adopts more conservative assumptions. For example, an object that might be a person will be treated as a person until confidence improves.

This mechanism allows the guard layer to respond dynamically to uncertain perception data rather than assuming that every detection is equally reliable.
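One simple way to express that behaviour is to track a per-sensor confidence score and fall back to the most conservative interpretation whenever it drops. The sketch below is hypothetical — the smoothing factor, threshold, and label names are made up for illustration and do not describe Partenit's trust model:

```python
class SensorTrust:
    """Tracks confidence in a sensor via exponential smoothing."""

    def __init__(self, threshold: float = 0.7, smoothing: float = 0.8):
        self.confidence = 1.0        # start by trusting the sensor fully
        self.threshold = threshold
        self.smoothing = smoothing

    def update(self, detection_confidence: float) -> None:
        """Blend the latest per-detection confidence into the running score."""
        self.confidence = (self.smoothing * self.confidence
                           + (1 - self.smoothing) * detection_confidence)

    def classify(self, label: str) -> str:
        """Below the trust threshold, an ambiguous object is treated as a person."""
        if self.confidence < self.threshold and label == "unknown_object":
            return "person"          # conservative assumption under uncertainty
        return label
```

A sustained run of low-confidence detections gradually pulls the score below the threshold, at which point ambiguous classifications are promoted to "person" until confidence recovers.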

Getting started

Partenit can be installed directly from PyPI:

pip install partenit-core partenit-agent-guard \
            partenit-policy-dsl partenit-safety-bench \
            partenit-decision-log partenit-adapters

Alternatively, the source code can be obtained from GitHub:

https://github.com/GradeBuilderSL/partenit

After installation, the quickest way to understand how the guard layer works is to run the included examples.

python examples/robot_with_guard.py

and compare it with the unguarded version:

python examples/robot_without_guard.py

The difference in behaviour is immediately visible.

Why this matters

Robotics systems are becoming increasingly sophisticated. Learned controllers, reinforcement learning policies, and LLM-driven planning systems introduce powerful new capabilities—but they also make the reasoning behind decisions more opaque.

A guard layer such as Partenit does not attempt to constrain innovation in robot intelligence. Instead, it provides a structured way to understand and audit the decisions that result from those systems.

For teams building robots that operate near people, the ability to explain and reproduce decisions is not merely convenient. Over time, it becomes essential.


Repository

https://github.com/GradeBuilderSL/partenit

Open source under Apache 2.0.

Designed for robotics engineers who want to understand how their robots behave—not simply observe the results.
