No, our robot will not coo at your dog. But the analogy is more engineering than poetry.
We build a self-control layer for robots. Not the brain — the thing that keeps the brain from doing something stupid.
Partenit sits between the robot’s decision-making (whether that’s a classical planner, an RL policy, or an LLM agent) and the actual execution. Every action gets validated against safety policies, scored for risk, and logged with a cryptographic fingerprint before anything moves. Think of it as pre-execution auditing for autonomous systems.
That’s the product. But this post is about the idea behind the product — and why we ended up reading endocrinology papers while building robotics infrastructure.
The Perfectly Rational Robot That Can’t Keep Its Cool
Here’s a pattern every robotics engineer has seen. You build a solid stack: perception → world model → planner → execution. The planner is rational — it evaluates risk, picks optimal actions, re-plans when conditions change.
And the robot behaves like it’s had four espressos.
A person appears at the end of a corridor — full stop, replan. They leave — resume, accelerate. A door opens somewhere — brake, replan. Door closes — resume. Plot the velocity profile and it looks like an EKG during a panic attack.
Every individual decision is correct. The trajectory as a whole is a disaster.
The standard fix: low-pass filter on the risk score, hysteresis, tuned re-planning frequency. These help the way noise-cancelling headphones help when someone is yelling at you — the signal gets quieter, but you’re losing real information.
We kept asking: why is this so hard? Humans navigate chaotic environments all day without jittering. What’s structurally different?
Your Endocrine System Is a Better Engineer Than You Think
Right now your sensory system is processing thousands of signals per second. People in peripheral vision, sounds, light changes, temperature shifts. Objectively noisier than anything a LiDAR sees.
Yet you don’t re-plan your route every 100ms.
“The brain filters noise” — true, but that explains perception, not decision-making. You can notice a sudden movement and still not change your plan. That’s a different mechanism.
That mechanism is endocrine. Cortisol, adrenaline, serotonin — these molecules don’t make decisions. They configure the parameters of decision-making. Cortisol doesn’t say “run.” It shifts threat detection thresholds, attention allocation, and motor readiness so that downstream decisions happen with different trade-offs.
The engineering insight: hormones operate on a fundamentally different timescale than reflexes. They integrate patterns over minutes, not milliseconds. They create a slowly-evolving regulatory state that acts as a prior for every fast decision.
A reflex says: “loud noise → flinch.” An endocrine state says: “the last ten minutes have been increasingly tense → lower the flinch threshold, widen attention, bias toward escape routes.”
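That two-timescale split can be sketched in a few lines. This is an illustrative toy, not biology and not our implementation; the class name, constants, and decay dynamics are all assumptions chosen to show the shape of the mechanism:

```python
class TensionState:
    """Slow leaky integrator over startle events.

    It never decides anything itself; it only re-tunes the threshold
    that the fast reflex compares against.
    """
    def __init__(self, decay=0.99, base_threshold=0.8):
        self.tension = 0.0
        self.decay = decay
        self.base_threshold = base_threshold

    def update(self, startle_events_this_cycle):
        # Integrates over many control cycles: decays slowly,
        # accumulates under sustained load, saturates at 1.0.
        self.tension = min(1.0, self.decay * self.tension
                           + 0.05 * startle_events_this_cycle)

    def flinch_threshold(self):
        # A tense recent history lowers the bar for the fast reflex.
        return self.base_threshold * (1.0 - 0.5 * self.tension)
```

A calm history leaves the threshold at 0.8; a few minutes of repeated startle events drive it toward 0.4. The fast loop stays exactly as fast as before — only its operating point moves.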
This isn’t philosophy. This is control theory with a biological implementation.
What Most Robot Architectures Are Missing
We looked at how most autonomous systems work and saw a structural gap: everything operates on a single timescale — the control loop.
Risk score? Computed fresh every cycle. Planning horizon? Fixed. Safety margins? Fixed or hysteresis-smoothed (which is just a lazy integral with no semantics). Confidence in the world model? Binary — “sensor works” or “sensor doesn’t.”
No state variable accumulates experience across cycles. Nothing encodes “the last 90 seconds have been calm” versus “the last 90 seconds have been chaotic.” The planner treats every cycle as if the robot just booted up.
In biological terms: the field has built sophisticated nervous systems with zero endocrine regulation. Fast reflexes, no regulatory context.
What We Actually Built (And What It Has to Do With Hormones)
Partenit’s self-control layer does something that maps surprisingly well onto the endocrine analogy — though we arrived at the architecture from an engineering direction, not a biological one.
When AgentGuard, the validation component of Partenit’s self-control layer, checks a robot’s action, it doesn’t just evaluate the current state against a policy. It evaluates the action within an accumulated context — a regulatory state that evolves slower than the control loop.
Concretely, this means tracking variables like:
Predictability — how well the world model’s predictions have matched observations over the last N seconds. Sustained prediction error means the environment is behaving in ways the model doesn’t capture.
Interaction load — not “is there a person nearby right now” but “how many collision-avoidance events has the system handled recently.” A robot that’s been dodging people for two minutes is in a different operational reality than one that dodged someone once.
Model trust — an aggregate confidence. When sensors disagree, when localisation jumps, when objects appear that aren’t on the semantic map — individually, each might be noise. Accumulated, they signal model degradation.
None of these make decisions. They modulate the parameters under which decisions get validated. Same action, same instantaneous risk score — but different regulatory context leads to a different validation outcome.
A speed of 1.5 m/s with risk score 0.4 in a high-trust regulatory state? Allowed. The same speed and risk score after two minutes of escalating unpredictability? Clamped to 0.8 m/s, wider margins enforced, re-check interval shortened.
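The numbers above can be turned into a runnable sketch. The field names mirror this post, but the dynamics, thresholds, and the validation rule are illustrative assumptions, not Partenit's actual schema:

```python
from dataclasses import dataclass

@dataclass
class RegulatoryState:
    """Slow state, updated once per cycle by leaky integration (not shown)."""
    predictability: float = 1.0    # how well predictions matched lately
    interaction_load: float = 0.0  # recent density of avoidance events
    model_trust: float = 1.0       # aggregate world-model confidence

    def trust(self):
        # Illustrative aggregate: high only when all three agree.
        return self.model_trust * self.predictability / (1.0 + self.interaction_load)

def validate_speed(requested, risk, state, base_limit=1.5):
    """Same action, same risk score; the regulatory context sets the clamp."""
    if risk >= 0.5:
        return 0.0                       # hard stop regardless of context
    limit = base_limit if state.trust() > 0.7 else 0.8
    return min(requested, limit)

calm  = RegulatoryState()
tense = RegulatoryState(predictability=0.5, interaction_load=1.0, model_trust=0.8)
print(validate_speed(1.5, 0.4, calm))   # 1.5 -- allowed
print(validate_speed(1.5, 0.4, tense))  # 0.8 -- clamped
```

The key property: `risk` is identical in both calls. Only the slowly accumulated state differs, and that is what flips the outcome.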
“So You Built a Low-Pass Filter With Extra Steps?”
Fair question. Here’s the difference.
A low-pass filter smooths the output. It trades latency for stability — you react slower to everything, including real threats.
A regulatory state doesn’t smooth anything. The risk score stays fast and reactive. What changes is how that score gets interpreted by the validation layer. A risk spike of 0.7 in a high-trust context means “something unusual in an otherwise calm environment — brief caution.” The same 0.7 in a low-trust context means “latest in a series of escalating anomalies — change the operating regime.”
Same signal, different response — not because the signal was filtered, but because the accumulated context shapes interpretation. This is closer to Bayesian inference with an evolving prior than to signal smoothing.
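The Bayesian framing can be made literal with a toy update. The likelihoods below are made-up numbers for illustration; the point is that one observation (a 0.7 spike) produces very different posteriors under different accumulated priors:

```python
def interpret_spike(prior_anomaly):
    """Posterior probability that a risk spike reflects a real regime change,
    given the accumulated prior from the regulatory context.
    Likelihoods are illustrative assumptions."""
    p_spike_given_anomaly = 0.9   # spikes are common if something is wrong
    p_spike_given_noise   = 0.3   # but also happen in benign environments
    num = p_spike_given_anomaly * prior_anomaly
    den = num + p_spike_given_noise * (1.0 - prior_anomaly)
    return num / den

print(interpret_spike(0.05))  # calm context: ~0.14 -- brief caution
print(interpret_spike(0.5))   # escalating context: 0.75 -- change regime
```

A low-pass filter would have changed the 0.7 itself. Here the 0.7 arrives intact, and only its interpretation moves.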
No, the Robot Will Not Get Sad
Precision matters here. This is not artificial emotions. The robot doesn’t “feel cautious.” It has a state vector that parameterises the validation pipeline in a way that produces behaviour a human would describe as cautious.
The distinction is critical because it keeps everything debuggable. Every value in the regulatory state traces back to specific sensor events and specific integration dynamics. No black box, no learned “mood” — leaky integrators with interpretable semantics.
And this is where it connects to what we think matters most: explainability.
A system with a persistent regulatory state produces better audit trails. Instead of “action blocked because risk exceeded 0.7,” a DecisionPacket can say: “action blocked because risk exceeded 0.7 within a low-trust regulatory context that accumulated over the last 94 seconds due to 3 prediction failures, 2 unmodelled obstacles, and 1 localisation discontinuity.”
That’s the difference between “the AI said no” and a forensic record a safety engineer can actually work with.
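As a sketch of what such a record might look like — the field names and hashing scheme here are assumptions for illustration, not Partenit's actual DecisionPacket format:

```python
from dataclasses import dataclass, field
import hashlib, json, time

@dataclass
class DecisionPacket:
    """Hypothetical audit record: verdict plus the regulatory context
    and the specific events that produced it."""
    action: str
    risk: float
    verdict: str
    regulatory_context: dict
    contributing_events: list
    timestamp: float = field(default_factory=time.time)

    def fingerprint(self):
        # Cryptographic fingerprint over the canonicalised record,
        # so the trail is tamper-evident.
        payload = json.dumps(self.__dict__, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

packet = DecisionPacket(
    action="move(v=1.5)",
    risk=0.72,
    verdict="blocked",
    regulatory_context={"trust": "low", "window_s": 94},
    contributing_events=["3x prediction failure", "2x unmodelled obstacle",
                         "1x localisation discontinuity"],
)
print(packet.fingerprint()[:16])
```

Every field is a plain value a safety engineer can read, and the hash pins the record down.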
A Concrete Scenario
Hospital corridor. The robot has been operating for three minutes in a calm, predictable environment. Regulatory state: high trust. Speed allowances are generous, planning horizon is long, re-check intervals are relaxed.
A group of visitors appears. Instantaneous risk spikes. But the regulatory state is high-trust, so AgentGuard allows a proportional response — minor speed reduction, slightly wider margins.
The group passes. Behind them: equipment that’s not on the map. Localisation wobbles. Two more unexpected obstacles. The regulatory state starts shifting — not because any single event was critical, but because the rate of surprises is climbing.
By the time a nurse steps out from a room, the system is already in a more conservative regime. It stops earlier, requests a wider corridor, flags the area for re-mapping. Not because this nurse was more dangerous than the visitors — but because the accumulated evidence says the world model in this area is currently unreliable.
This is the behaviour you want from a self-control layer. And it’s extremely hard to get from instantaneous risk scoring alone.
The Multi-Timescale Problem
Almost all autonomy software is architected around a single timescale — the control loop. Everything optimised to be as fast and reactive as possible.
But robots in human spaces face a fundamentally multi-timescale problem. Events happen in milliseconds. Situations evolve over seconds. The character of an environment shifts over minutes. A system that only operates on the fastest timescale will always be reactive and never truly adaptive.
Biology solved this with layered architecture: fast reflexes for immediate threats, hormonal regulation for medium-term adaptation, circadian rhythms for long-term adjustment. Each layer runs at its natural timescale and modulates the layers below it.
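The layering reduces to a simple scheduling pattern: two loops sharing one parameter, with the slow loop writing it and the fast loop reading it. This sketch is purely illustrative (the rates, constants, and event model are invented):

```python
import random

def multi_timescale(ticks=1000, slow_every=100):
    """Fast loop runs every tick; slow loop integrates surprises and
    re-tunes the fast loop's safety margin every `slow_every` ticks."""
    margin, surprises = 0.3, 0
    for t in range(ticks):
        surprised = random.random() < 0.02     # fast: per-tick event
        surprises += surprised
        # fast_control(margin) would act here, reading the current margin
        if t % slow_every == slow_every - 1:   # slow: integrate and adapt
            margin = min(1.0, 0.3 + 0.1 * surprises)
            surprises = 0
    return margin
```

The fast loop never waits on the slow one; the slow one never reacts to a single tick. Each layer runs at its natural rate, which is the property the biological hierarchy has and single-loop stacks lack.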
Brooks’ subsumption architecture was circling related ideas decades ago. But the field has under-explored the slow end of the hierarchy. We’ve gotten very good at fast loops. The slow loop — the one that gives a robot something like a persistent sense of its own operational situation — is still mostly missing.
That’s what we’re building into Partenit. Not just a validator that checks each action in isolation, but a self-control layer that remembers what kind of day the robot is having and adjusts its standards accordingly.
The Question
Maybe truly reliable autonomous systems won’t be defined by better perception, faster planners, or smarter LLM agents.
Maybe what they need is a metabolism — a slow, persistent process that digests experience and shapes the conditions under which every decision gets validated.
Nature figured this out a few hundred million years ago. We’re taking notes.
Building the self-control layer for autonomous robots at Partenit. Open-source, Apache 2.0. If you’ve hit similar walls with reactive architectures — let’s compare notes. https://github.com/GradeBuilderSL/partenit

