eval-traced-swarm

Boilerplate for running isolated, exception-safe parallel agent swarms in Python eval with simulated trace contexts.

When orchestrating complex, multi-agent sweeps (like mapping repositories or bulk code reviews) inside the Python eval sandbox, a single failing subagent can crash the entire parallel() pool, losing all successful results.

To prevent that data loss, ALWAYS wrap every function you map with parallel() in a defensive try/except wrapper. The wrapper returns errors as data instead of raising them, and attaches a simulated trace context: locally generated IDs that are useful for eventual Langfuse/OTEL propagation.

Boilerplate for eval Swarms:

import uuid

def execute_with_trace(item, map_fn):
    """
    Wraps individual agent executions to catch exceptions 
    and provide isolated trace/observation IDs.
    """
    trace_id = str(uuid.uuid4())
    obs_id = str(uuid.uuid4())
    log(f"Starting traced task for item (traceId: {trace_id})")
    try:
        # map_fn receives only the item; the trace IDs are attached to the returned record
        res = map_fn(item)
        return {
            "item": item, 
            "traceId": trace_id, 
            "obsId": obs_id,
            "status": "success", 
            "data": res
        }
    except Exception as e:
        return {
            "item": item, 
            "traceId": trace_id, 
            "obsId": obs_id,
            "status": "error", 
            "error": str(e)
        }

# Example Usage:
# results = parallel([lambda i=i: execute_with_trace(i, my_mapping_func) for i in TARGETS])
# successful = [r["data"] for r in results if r["status"] == "success"]
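
The usage lines above depend on the sandbox's parallel() and log() helpers. As a minimal sketch for trying the pattern outside the sandbox, the block below substitutes stand-ins for both: the thread-pool parallel(), the print-based log(), and the flaky_review mapping function are assumptions made for illustration, not the sandbox's actual implementations.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


def log(message):
    # Stand-in for the sandbox's log() helper (assumption).
    print(message)


def parallel(thunks):
    # Stand-in for the sandbox's parallel() (assumption): runs zero-argument
    # callables concurrently and returns their results in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        return [future.result() for future in futures]


def execute_with_trace(item, map_fn):
    # Condensed version of the boilerplate wrapper shown earlier.
    trace_id = str(uuid.uuid4())
    obs_id = str(uuid.uuid4())
    log(f"Starting traced task for item (traceId: {trace_id})")
    base = {"item": item, "traceId": trace_id, "obsId": obs_id}
    try:
        return {**base, "status": "success", "data": map_fn(item)}
    except Exception as e:
        return {**base, "status": "error", "error": str(e)}


def flaky_review(path):
    # Hypothetical mapping function that fails on purpose for one input.
    if path.endswith(".bin"):
        raise ValueError(f"cannot review binary file: {path}")
    return f"reviewed {path}"


TARGETS = ["a.py", "blob.bin", "c.py"]
results = parallel([lambda i=i: execute_with_trace(i, flaky_review) for i in TARGETS])

# Split the records so failures can be logged and successes processed.
successful = [r["data"] for r in results if r["status"] == "success"]
failed = [(r["item"], r["error"]) for r in results if r["status"] == "error"]

for item, error in failed:
    log(f"FAILED {item}: {error}")
```

Every target yields one record, so the failing item shows up in failed while the other two results remain available in successful.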

This ensures that a swarm in which some agents fail still returns one record per item, so the orchestrator can log the failures and process the successful payloads.
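
The wrapper's comment mentions passing the tracing context down when the mapping function supports it, but the boilerplate only calls map_fn(item). One possible way to do that is sketched below. The helper name call_with_optional_trace and the trace_context keyword are hypothetical conventions chosen for this example, not part of the sandbox or of any tracing library.

```python
import inspect
import uuid


def call_with_optional_trace(map_fn, item, trace_id, obs_id):
    """Call map_fn(item), passing trace_context only if map_fn can accept it."""
    params = inspect.signature(map_fn).parameters
    # A **kwargs parameter also accepts the hypothetical trace_context keyword.
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if "trace_context" in params or accepts_kwargs:
        context = {"traceId": trace_id, "obsId": obs_id}
        return map_fn(item, trace_context=context)
    return map_fn(item)


def plain(item):
    # Mapping function that knows nothing about tracing.
    return item.upper()


def traced(item, trace_context=None):
    # Mapping function that opts in by declaring trace_context.
    return f"{item}:{trace_context['traceId']}"


tid, oid = str(uuid.uuid4()), str(uuid.uuid4())
plain_result = call_with_optional_trace(plain, "a", tid, oid)
traced_result = call_with_optional_trace(traced, "a", tid, oid)
```

Inside execute_with_trace, the line res = map_fn(item) could then become res = call_with_optional_trace(map_fn, item, trace_id, obs_id), keeping the call inside the existing try block so that signature-inspection errors are captured like any other failure.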