Weight & Zone Cross-Validation: Implementation Guide for Freight Audit Pipelines
Weight & Zone Cross-Validation functions as the deterministic reconciliation layer within modern freight audit architectures. Positioned strictly downstream of raw invoice normalization and upstream of dispute routing, this stage isolates pricing anomalies before they reach payment workflows. By independently computing expected zones, resolving billable weight brackets, and querying contracted rate tables, the engine enforces mathematical consistency between carrier billing logic and shipper agreements. This validation layer operates as a foundational component of the broader Rule-Based Rate Validation & Accessorial Auditing framework, ensuring that base freight charges are mathematically sound before downstream modules evaluate surcharges or penalties.
Canonical Schema & Contract Table Preparation
The validation engine requires strictly typed, normalized shipment records. Upstream ETL processes must flatten heterogeneous carrier payloads (EDI 210 segments, carrier API JSON, or OCR-extracted PDFs) into a unified schema. The following Pydantic model enforces type safety and nullability constraints at the ingestion boundary:
from pydantic import BaseModel, Field, field_validator
from typing import Optional
import re


class CanonicalShipment(BaseModel):
    """Normalized shipment record consumed by the validation engine."""

    shipment_id: str
    carrier_scac: str
    origin_zip: str = Field(pattern=r"^\d{5}$")
    dest_zip: str = Field(pattern=r"^\d{5}$")
    billed_weight_lbs: float
    actual_weight_lbs: Optional[float] = None
    dim_length_in: Optional[float] = None
    dim_width_in: Optional[float] = None
    dim_height_in: Optional[float] = None
    service_level: str
    billed_zone: Optional[int] = None
    billed_freight_charge: float
    contract_id: str

    @field_validator("origin_zip", "dest_zip")
    @classmethod
    def validate_zip(cls, v: str) -> str:
        # fullmatch rejects a trailing newline that re.match with "$" would accept
        if not re.fullmatch(r"\d{5}", v):
            raise ValueError("ZIP code must be exactly 5 digits")
        return v
Contract rate tables must be pre-loaded into a columnar, query-optimized store. DuckDB or Parquet-backed data lakes are recommended for sub-second lookups across millions of rate combinations. Each table must map contract_id, weight_bracket, zone, and service_level to a base rate, alongside a fuel surcharge percentage and a minimum charge floor. All monetary values should be stored as DECIMAL(19,4) in the rate store and handled as decimal.Decimal in Python, which prevents floating-point drift during aggregation (see Python's decimal documentation).
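As a brief illustration of the drift concern, the sketch below sums three identical charges once as binary floats and once as Decimal values. The charge amount is made up for the example and does not come from any contract table.

```python
from decimal import Decimal

# Three hypothetical line charges of 0.10 each (illustrative values only).
float_total = sum([0.10, 0.10, 0.10])
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

# Binary floats cannot represent 0.10 exactly, so the float sum misses 0.30.
print(float_total == 0.30)               # False
# Decimal values built from strings keep the exact cents.
print(decimal_total == Decimal("0.30"))  # True
```

Constructing Decimal values from strings (or from DECIMAL columns) rather than from floats is what preserves exactness; Decimal(0.10) would inherit the float's representation error.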
Deterministic Zone Resolution
Carrier zone assignments are rarely static. Parcel networks rely on annual zip-to-zone grid updates, while LTL carriers utilize distance-based matrices and freight class routing rules. The pipeline must independently derive the expected zone before comparing it to the carrier’s billed value.
Zone resolution begins with a direct lookup against the carrier’s published zip-pair table. When a direct mapping fails, the engine falls back to Lane Matching Algorithms that compute zones via centroid distance, regional grouping, or state-to-state routing tables. The resolved zone must then be validated against service-level constraints (e.g., Ground services capped at Zone 8, Express services capped at Zone 10).
import duckdb
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZoneResolutionResult:
    resolved_zone: int
    resolution_method: str  # "direct", "centroid_fallback", "state_group"
    is_valid_for_service: bool


class ZoneResolver:
    def __init__(self, duckdb_conn: duckdb.DuckDBPyConnection):
        self.conn = duckdb_conn

    def resolve(self, origin_zip: str, dest_zip: str, service_level: str) -> ZoneResolutionResult:
        # Direct lookup
        query = """
            SELECT zone FROM carrier_zone_grid
            WHERE origin_zip = ? AND dest_zip = ?
        """
        result = self.conn.execute(query, [origin_zip, dest_zip]).fetchone()
        if result:
            return self._validate_service(result[0], service_level, "direct")

        # Fallback to centroid/state logic (delegated to lane matching module)
        fallback_zone = self._compute_fallback_zone(origin_zip, dest_zip)
        return self._validate_service(fallback_zone, service_level, "centroid_fallback")

    def _validate_service(self, zone: int, service: str, method: str) -> ZoneResolutionResult:
        service_caps = {"GROUND": 8, "EXPRESS": 10, "FREIGHT": 12}
        cap = service_caps.get(service.upper(), 12)
        return ZoneResolutionResult(
            resolved_zone=zone,
            resolution_method=method,
            is_valid_for_service=zone <= cap
        )

    def _compute_fallback_zone(self, origin: str, dest: str) -> int:
        # Placeholder for centroid distance or state-grouping logic
        # In production, this delegates to the lane matching pipeline
        return 5  # Default safe zone for demonstration
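The placeholder fallback above always returns a zone, so an unresolved lane never surfaces. The following is a minimal sketch of a fallback chain that can report failure. It assumes in-memory dictionaries in place of the DuckDB grid, and the function name, the 3-digit ZIP prefix grouping, and the "regional_group" and "unresolved" labels are hypothetical stand-ins for whatever the lane matching module actually provides.

```python
from typing import Dict, Optional, Tuple

ZonePair = Tuple[str, str]


def resolve_zone_with_fallback(
    origin_zip: str,
    dest_zip: str,
    zip_pair_zones: Dict[ZonePair, int],
    prefix_zones: Dict[ZonePair, int],
) -> Tuple[Optional[int], str]:
    """Return (zone, method); zone is None when every lookup misses."""
    # Tier 1: exact ZIP-pair match against the published grid.
    direct = zip_pair_zones.get((origin_zip, dest_zip))
    if direct is not None:
        return direct, "direct"

    # Tier 2 (assumed grouping): match on the 3-digit ZIP prefixes.
    regional = prefix_zones.get((origin_zip[:3], dest_zip[:3]))
    if regional is not None:
        return regional, "regional_group"

    # No tier matched: let the caller tag the record as unresolved.
    return None, "unresolved"


# Illustrative tables only; real grids come from the carrier.
pairs = {("10001", "94105"): 8}
prefixes = {("100", "606"): 5}

print(resolve_zone_with_fallback("10001", "94105", pairs, prefixes))
print(resolve_zone_with_fallback("10002", "60601", pairs, prefixes))
print(resolve_zone_with_fallback("10002", "33101", pairs, prefixes))
```

Returning None instead of a default zone keeps a missing lane from being silently priced at an arbitrary zone.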
Weight Bracket & Dimensional Logic
Billable weight is rarely the raw scale weight. Carriers apply dimensional weight formulas ((L × W × H) / divisor) and snap results to predefined weight brackets. The pipeline must replicate this logic deterministically.
When dimensional data is present, the engine calculates the dimensional weight and compares it against the scale weight. The higher value becomes the billable weight. This value is then mapped to the contracted weight bracket that contains it (e.g., 1-50, 51-100, 101-150 lbs), and the bracket is identified by its upper bound. Tolerance thresholds (typically ±1.0 lb or ±2%) are applied to account for carrier scale calibration variances. Detailed methodologies for handling scale discrepancies are documented in Cross-checking billable weight against actual weight logs.
import math
from typing import Optional


def calculate_billable_weight(
    actual: Optional[float],
    dims: tuple[Optional[float], Optional[float], Optional[float]],
    dim_divisor: int = 166,
    bracket_step: int = 50,
) -> int:
    """Return the upper bound of the bracket containing the billable weight."""
    if actual is None:
        raise ValueError("Actual weight is required for billable weight calculation")

    # Dimensional weight applies only when all three dimensions are known
    dim_weight = 0.0
    if all(d is not None for d in dims):
        l, w, h = dims
        dim_weight = (l * w * h) / dim_divisor

    # The greater of scale weight and dimensional weight is billable
    raw_billable = max(actual, dim_weight)

    # Snap up to the bracket's upper bound (e.g., 50.0 -> 50, 50.2 -> 100)
    return math.ceil(raw_billable / bracket_step) * bracket_step
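The function above does not apply the tolerance thresholds described earlier. One possible way to apply them is sketched below, under the assumption that a billed weight is acceptable when it falls within either the absolute or the percentage tolerance. The function name and the either/or interpretation are assumptions, as are the carton dimensions in the worked example.

```python
def within_weight_tolerance(
    billed_lbs: float,
    expected_lbs: float,
    abs_tol_lbs: float = 1.0,
    pct_tol: float = 0.02,
) -> bool:
    """True when the billed weight is within either tolerance of the expected weight."""
    delta = abs(billed_lbs - expected_lbs)
    return delta <= abs_tol_lbs or delta <= pct_tol * expected_lbs


# Worked example: a 24 x 20 x 18 in carton with a 40 lb scale weight, divisor 166.
dim_weight = (24 * 20 * 18) / 166      # 8640 / 166, roughly 52.05 lb
expected = max(40.0, dim_weight)       # dimensional weight exceeds scale weight

print(within_weight_tolerance(52.5, expected))  # inside the 1.0 lb band
print(within_weight_tolerance(60.0, expected))  # outside both bands
```

Because the comparison runs on the raw billable weight, a small scale variance near a bracket edge can still move the shipment into the next bracket; whether tolerance is checked before or after snapping is a contract-specific decision.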
Contract Rate Reconciliation Engine
With the resolved zone and snapped weight bracket established, the engine queries the contract rate table to derive the expected base charge. This expected value is then compared to the billed_freight_charge from the canonical schema. Variance is calculated as a percentage and absolute delta.
This stage strictly validates base freight charges. It does not parse, score, or validate accessorial fees; those are routed to Accessorial Charge Scoring to maintain strict separation of concerns.
from decimal import Decimal, ROUND_HALF_UP

import duckdb


def reconcile_rate(
    conn: duckdb.DuckDBPyConnection,
    weight_bracket: int,
    zone: int,
    service_level: str,
    contract_id: str,
    billed_charge: float
) -> dict:
    query = """
        SELECT base_rate, fuel_surcharge_pct, min_charge
        FROM contract_rates
        WHERE contract_id = ?
          AND weight_bracket = ?
          AND zone = ?
          AND service_level = ?
    """
    row = conn.execute(query, [contract_id, weight_bracket, zone, service_level]).fetchone()
    if not row:
        raise LookupError(
            f"No contract rate found for {contract_id} | {weight_bracket} | {zone} | {service_level}"
        )

    base, fuel_pct, min_charge = row

    # Expected charge: base rate plus fuel surcharge, floored at the contract minimum
    expected = Decimal(str(base)) * (Decimal("1.0") + Decimal(str(fuel_pct)) / Decimal("100"))
    expected = max(expected, Decimal(str(min_charge)))
    expected = expected.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

    # Variance is reported both as an absolute delta and as a percentage of expected
    billed = Decimal(str(billed_charge))
    variance_abs = abs(billed - expected)
    variance_pct = (variance_abs / expected * Decimal("100")).quantize(Decimal("0.01"))

    return {
        "expected_charge": float(expected),
        "variance_abs": float(variance_abs),
        "variance_pct": float(variance_pct),
        "status": "PASS" if variance_abs <= Decimal("0.50") else "FLAG"
    }
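To make the arithmetic concrete, here is a worked example that mirrors the expected-charge and variance steps of reconcile_rate without touching a database. The base rate, fuel percentage, minimum charge, and billed amount are invented for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical contract row and billed amount (illustrative values only).
base_rate = Decimal("42.50")
fuel_surcharge_pct = Decimal("18.5")
min_charge = Decimal("35.00")
billed = Decimal("51.10")

# 42.50 * 1.185 = 50.3625, which is above the 35.00 floor and rounds to 50.36.
expected = base_rate * (Decimal("1.0") + fuel_surcharge_pct / Decimal("100"))
expected = max(expected, min_charge).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Absolute delta is 0.74; as a share of 50.36 that is about 1.47 percent.
variance_abs = abs(billed - expected)
variance_pct = (variance_abs / expected * Decimal("100")).quantize(Decimal("0.01"))

# Same 0.50 absolute threshold as reconcile_rate.
status = "PASS" if variance_abs <= Decimal("0.50") else "FLAG"
print(expected, variance_abs, variance_pct, status)
```

With these numbers the 0.74 delta exceeds the 0.50 threshold, so the record would be flagged even though the percentage variance is under 2%.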
Pipeline Boundaries & Error Handling Strategy
Maintaining strict stage boundaries prevents logic bleed and ensures predictable audit trails. This module explicitly excludes:
- Invoice Parsing/Normalization: Handled upstream.
- Accessorial/Stop Charge Validation: Routed downstream.
- Dispute Ticket Generation: Handled by the dispute routing engine.
Error Handling & Routing
The validation engine implements a tiered failure strategy:
- Missing Contract/Rate: Records are tagged STATUS: CONTRACT_MISSING and routed to a contract reconciliation queue. No financial variance is calculated.
- Unresolvable Zone: If both direct and fallback lookups fail, the record is tagged STATUS: ZONE_UNRESOLVED and logged with origin/dest metadata for manual review.
- Data Type/Schema Violations: Caught at the Pydantic boundary. Records are rejected immediately and routed to a dead-letter queue (DLQ) with validation error payloads.
- Tolerance Exceeded: Records passing schema and contract checks but exceeding variance thresholds are tagged STATUS: RATE_VARIANCE and passed to the dispute routing layer with full audit metadata.
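The tiers above can be sketched as a simple status-to-destination mapping. The queue names, the SCHEMA_VIOLATION and PASS tags, and the route_record helper are hypothetical; only CONTRACT_MISSING, ZONE_UNRESOLVED, and RATE_VARIANCE are tags named in this guide.

```python
from enum import Enum


class AuditStatus(str, Enum):
    CONTRACT_MISSING = "CONTRACT_MISSING"
    ZONE_UNRESOLVED = "ZONE_UNRESOLVED"
    SCHEMA_VIOLATION = "SCHEMA_VIOLATION"  # hypothetical tag for Pydantic rejections
    RATE_VARIANCE = "RATE_VARIANCE"
    PASS = "PASS"                          # hypothetical tag for clean records


# Hypothetical destination names; substitute the pipeline's real queues.
DESTINATION_BY_STATUS = {
    AuditStatus.CONTRACT_MISSING: "contract_reconciliation_queue",
    AuditStatus.ZONE_UNRESOLVED: "manual_review_queue",
    AuditStatus.SCHEMA_VIOLATION: "dead_letter_queue",
    AuditStatus.RATE_VARIANCE: "dispute_routing",
    AuditStatus.PASS: "financial_routing",
}


def route_record(status: AuditStatus) -> str:
    """Return the destination for a record with the given audit status."""
    return DESTINATION_BY_STATUS[status]


print(route_record(AuditStatus.RATE_VARIANCE))
```

Keeping the mapping in one table means a new failure tier needs a new enum member and a new entry, and a missing entry fails loudly with a KeyError rather than falling through to a default queue.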
All errors are logged with structured JSON payloads containing shipment_id, carrier_scac, failure_code, and stack_trace. Retry logic is disabled for deterministic validation failures; only transient infrastructure errors (e.g., DuckDB connection timeouts) trigger exponential backoff. The output schema is strictly flattened to ensure downstream consumers receive only validated, enriched records ready for financial routing.
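A minimal sketch of the logging and retry policy follows, assuming a hypothetical TransientInfraError exception class that wraps infrastructure failures such as connection timeouts. The helper names, attempt count, and base delay are illustrative choices, not values prescribed by this guide.

```python
import json
import time
import traceback
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientInfraError(Exception):
    """Hypothetical marker for retryable infrastructure failures."""


def build_error_payload(shipment_id: str, carrier_scac: str, failure_code: str, exc: BaseException) -> str:
    """Serialize one failure as a structured JSON log line."""
    return json.dumps({
        "shipment_id": shipment_id,
        "carrier_scac": carrier_scac,
        "failure_code": failure_code,
        "stack_trace": "".join(traceback.format_exception(type(exc), exc, exc.__traceback__)),
    })


def run_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn only on TransientInfraError, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientInfraError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted
            sleep(base_delay_s * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

Any other exception, such as a LookupError for a missing contract rate, propagates on the first attempt because it is not caught, which matches the rule that deterministic failures are never retried. The sleep parameter is injectable so the policy can be tested without waiting.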