Qubanta

Signals

Evaluator signals as bounded control inputs

Qubanta computes confidence from observable, bounded signals. Each evaluator produces a 0–1 value plus structured flags. Policies operate on these signals deterministically and emit an action directive.

Signal contract
Every evaluator emits: value (0–1), flags, and optional evidence pointers. Values are bounded; flags are auditable reasons.
Weakest-link posture
Collapse signals (constraint integrity) override all others. When a collapse occurs, confidence becomes 0 and the action becomes blocked. The remaining signals are never averaged in to soften the result.
Provider independence
Evaluators operate on envelopes and outputs, not provider internals. The signal surface remains stable across model churn.
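
The contract above can be sketched as a small value type. This is a minimal illustration, not Qubanta's actual types: the class name Signal, the helper collapsed, and the example flag strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One evaluator result: a bounded value plus auditable reasons."""
    name: str
    value: float                     # must lie in [0, 1]
    flags: tuple[str, ...] = ()      # auditable reasons
    evidence: tuple[str, ...] = ()   # optional evidence pointers

    def __post_init__(self) -> None:
        # Reject out-of-range values at construction so that policies
        # only ever see bounded inputs.
        if not 0.0 <= self.value <= 1.0:
            raise ValueError(f"{self.name}: value {self.value} outside [0, 1]")


def collapsed(signals: list[Signal]) -> bool:
    """True when the collapse gate (constraint_integrity) reports 0."""
    return any(s.name == "constraint_integrity" and s.value == 0 for s in signals)
```

Rejecting an out-of-range value at construction keeps the bound a property of the data itself rather than something each policy must re-check.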

Core evaluator set (v1)

v1 prioritizes a small set of signals with stable semantics. Additional signals are introduced only when they remain auditable and policy-stable.

constraint_integrity  ∈ {0,1}  (collapse gate)
stability_index        ∈ [0,1]
grounding_strength     ∈ [0,1]
consistency_index      ∈ [0,1]
system_health          ∈ [0,1]
anomaly_penalty        ∈ [0,1]  (applied as 1 - penalty)
Constraint integrity (collapse gate)
Value is 1 when output satisfies schema / required fields / policy invariants. Value is 0 on any violation. A zero value forces confidence=0 and action=blocked.
Stability index (sensitivity)
Measures output sensitivity under perturbation: requery variance, cross-run divergence, structured-field drift, format instability. Lower stability suppresses confidence and triggers review/fallback routing.
Grounding strength (support)
Measures support from provided context or retrieval evidence when grounding is required. Weak grounding emits explicit flags and prevents silent automation in governed modes.
Consistency index (self-agreement)
Measures internal agreement across representation forms: summary vs details, JSON vs narrative, or multi-pass checks. Contradictions reduce consistency and trigger review recommendations.
System health (ops posture)
Represents dependency posture: provider error rate, latency, rate limiting, retrieval availability, circuit breaker state. Degraded health lowers the confidence baseline and raises drift/incident flags.
Anomaly penalty (risk pressure)
Represents detected risk patterns: out-of-distribution responses, unsafe tool intent, hallucination indicators when citations are mandatory, or schema gaming. The penalty reduces confidence multiplicatively: confidence is scaled by (1 - penalty).
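
One way the six signals might combine into a single confidence value is sketched below. Only two parts come from the text: the collapse gate forces 0, and the anomaly penalty is applied as a factor of (1 - penalty). Taking the minimum of the four graded signals is an assumption made here to illustrate a weakest-link combination; the document does not specify how those four are combined, and the function name combine_confidence is hypothetical.

```python
def combine_confidence(
    constraint_integrity: int,
    stability_index: float,
    grounding_strength: float,
    consistency_index: float,
    system_health: float,
    anomaly_penalty: float,
) -> float:
    """Fold the six v1 signals into one confidence value in [0, 1]."""
    # Collapse gate: any constraint violation forces confidence to 0.
    if constraint_integrity == 0:
        return 0.0
    # ASSUMPTION: weakest-link combination of the graded signals.
    base = min(stability_index, grounding_strength, consistency_index, system_health)
    # From the text: the penalty is applied multiplicatively as (1 - penalty).
    return base * (1.0 - anomaly_penalty)


# Weakest graded signal is 0.5; a penalty of 0.5 halves it.
print(combine_confidence(1, 0.9, 0.5, 0.75, 1.0, 0.5))  # 0.25
```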

Signal definitions (deterministic examples)

constraint_integrity:
  if schema_invalid OR missing_required_field OR policy_violation: 0 else 1

stability_index:
  1 - normalized(variance(requery_outputs) + drift(structured_fields))

grounding_strength:
  if grounding_required:
    support_score(context, citations, retrieval_evidence)
  else:
    1.0

consistency_index:
  1 - normalized(contradiction_score(output, self_checks))

system_health:
  clamp01(1 - (error_rate + latency_penalty + dependency_degraded_flag))

anomaly_penalty:
  clamp01(anomaly_score(output) + unsafe_intent_score + hallucination_risk)

These sketches show semantics, not final math. What stays invariant is the meaning: bounded values, explicit collapse, auditable flags.
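
Three of the definitions above translate directly into runnable Python. As with the sketches, this shows semantics only. The inputs (error_rate, latency_penalty, support_score and so on) are assumed to be computed elsewhere and passed in as plain values; the scoring functions themselves are not defined in this document and are not implemented here.

```python
def clamp01(x: float) -> float:
    """Clamp x into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def constraint_integrity(schema_invalid: bool,
                         missing_required_field: bool,
                         policy_violation: bool) -> int:
    """Collapse gate: 0 on any violation, otherwise 1."""
    return 0 if (schema_invalid or missing_required_field or policy_violation) else 1


def grounding_strength(grounding_required: bool, support_score: float) -> float:
    """Support score when grounding is required, otherwise 1.0."""
    return clamp01(support_score) if grounding_required else 1.0


def system_health(error_rate: float,
                  latency_penalty: float,
                  dependency_degraded: bool) -> float:
    """1 minus the summed degradation terms, clamped to [0, 1]."""
    # The degraded flag counts as a full unit, so it alone drives health to 0.
    degradation = error_rate + latency_penalty + float(dependency_degraded)
    return clamp01(1.0 - degradation)


print(constraint_integrity(False, True, False))  # 0
print(system_health(0.25, 0.25, False))          # 0.5
print(system_health(0.1, 0.0, True))             # 0.0
```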

Governed response schema (logging surface)

This is the minimum structured record for audit and incident reconstruction.

{
  "request_id": "uuid",
  "timestamp": "iso8601",
  "model": { "provider": "…", "name": "…", "version": "…" },
  "task": { "type": "extraction|qa|summarization|…", "mode": "governed|observe" },

  "signals": {
    "constraint_integrity": { "value": 0|1, "flags": [], "evidence": [] },
    "stability_index":      { "value": 0.00-1.00, "flags": [], "evidence": [] },
    "grounding_strength":   { "value": 0.00-1.00, "flags": [], "evidence": [] },
    "consistency_index":    { "value": 0.00-1.00, "flags": [], "evidence": [] },
    "system_health":        { "value": 0.00-1.00, "flags": [], "evidence": [] },
    "anomaly_penalty":      { "value": 0.00-1.00, "flags": [], "evidence": [] }
  },

  "confidence": 0.00-1.00,
  "reliability_band": "high|moderate|low|critical",
  "state": "stable|unstable|drift_suspected|constraint_invalid|weak_grounding",
  "action": "accept|warn|review_recommended|fallback|blocked",
  "flags": ["…"],

  "output": "…"
}
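
A record of this shape can be checked before it is written to the log. The validator below is a hedged sketch: the function name validate_record is hypothetical, the enumerated values are copied from the template above, and the open-ended fields (such as task.type, which the template leaves elided) are deliberately not checked.

```python
SIGNAL_NAMES = (
    "constraint_integrity", "stability_index", "grounding_strength",
    "consistency_index", "system_health", "anomaly_penalty",
)
ALLOWED = {
    "reliability_band": {"high", "moderate", "low", "critical"},
    "state": {"stable", "unstable", "drift_suspected",
              "constraint_invalid", "weak_grounding"},
    "action": {"accept", "warn", "review_recommended", "fallback", "blocked"},
}


def _in_unit_interval(value) -> bool:
    """True for a real number in [0, 1]; booleans are rejected."""
    return (isinstance(value, (int, float))
            and not isinstance(value, bool)
            and 0 <= value <= 1)


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is well-formed."""
    problems = []
    signals = record.get("signals", {})
    for name in SIGNAL_NAMES:
        entry = signals.get(name)
        if entry is None:
            problems.append(f"missing signal: {name}")
        elif not _in_unit_interval(entry.get("value")):
            problems.append(f"{name}.value is not in [0, 1]")
        elif name == "constraint_integrity" and entry["value"] not in (0, 1):
            problems.append("constraint_integrity.value must be 0 or 1")
    if not _in_unit_interval(record.get("confidence")):
        problems.append("confidence is not in [0, 1]")
    for key, allowed in ALLOWED.items():
        if record.get(key) not in allowed:
            problems.append(f"{key}: unexpected value {record.get(key)!r}")
    return problems
```

Returning a list of problems rather than raising on the first one keeps every violation visible in a single pass, which suits an audit surface.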

Routing invariants (non-negotiable)

1) If constraint_integrity == 0 → confidence=0 and action=blocked (always)
2) If state in {"unstable","drift_suspected"} and confidence is not high → action=review_recommended
3) If grounding_required and grounding_strength < threshold → action=fallback or review_recommended
4) Any non-empty set of critical flags must be preserved in the record; it is never replaced by a generic “low confidence” label
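
The four invariants can be expressed as one deterministic routing function. This is a sketch under stated assumptions, not the policy engine itself. "Confidence is not high" is read here as reliability_band != "high". The grounding threshold of 0.5 is a placeholder. Where invariant 3 allows either fallback or review_recommended, the sketch picks fallback. Rules are checked in the listed order, and accept is the default when none fires. The function name route is hypothetical.

```python
def route(
    constraint_integrity: int,
    confidence: float,
    reliability_band: str,
    state: str,
    grounding_required: bool,
    grounding_strength: float,
    flags: list[str],
    grounding_threshold: float = 0.5,  # ASSUMPTION: placeholder threshold
) -> dict:
    """Apply the routing invariants in order and return the decision."""
    # Invariant 4: flags pass through unchanged on every path.
    result = {"confidence": confidence, "flags": list(flags)}

    # Invariant 1: collapse always wins.
    if constraint_integrity == 0:
        result["confidence"] = 0.0
        result["action"] = "blocked"
    # Invariant 2: unstable or drifting output without high confidence.
    elif state in {"unstable", "drift_suspected"} and reliability_band != "high":
        result["action"] = "review_recommended"
    # Invariant 3: required grounding is too weak (ASSUMPTION: choose fallback).
    elif grounding_required and grounding_strength < grounding_threshold:
        result["action"] = "fallback"
    else:
        result["action"] = "accept"
    return result


decision = route(0, 0.9, "high", "stable", False, 1.0, ["schema_invalid"])
print(decision["action"], decision["confidence"], decision["flags"])
# blocked 0.0 ['schema_invalid']
```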