register_enabled_extensions_for_agent has no per-extension error isolation: one failing extension silently drops the rest #2950

@PascalThuet

Summary

ExtensionManager.register_enabled_extensions_for_agent iterates over enabled extensions with no per-extension error isolation. If registering one extension raises, the loop aborts and the exception propagates, so every subsequent enabled extension is skipped — the agent ends up with only a prefix of its extensions.

The callers wrap the whole call in a single best-effort try/except (_register_extensions_for_agent in src/specify_cli/integrations/_helpers.py), so the wholesale abort surfaces as one warning while the command still exits 0. The partial state is silent.

Affected paths

Root cause

src/specify_cli/extensions.py, register_enabled_extensions_for_agent:

for ext_id, metadata in self.registry.list().items():
    if not metadata.get("enabled", True):
        continue
    manifest = self.get_extension(ext_id)
    if manifest is None:
        continue
    ext_dir = self.extensions_dir / ext_id
    updates = {}
    if agent_config and not skills_mode_active:
        registered = registrar.register_commands_for_agent(agent_name, manifest, ext_dir, self.project_root)
        ...
    registered_skills = self._register_extension_skills(manifest, ext_dir)
    ...
    self.registry.update(ext_id, updates)

There is no try/except around the per-extension body, so an exception from register_commands_for_agent / register_commands (e.g. an OSError writing a command file, or a path collision where a command directory is expected) or from _register_extension_skills propagates out of the loop and skips the remaining extensions.

Note the loop already handles the non-raising degraded cases, so those are not triggers: a manifest that fails to load is skipped (get_extension(...) is None → continue), and an empty registration result is handled by clearing stale state. The gap is specifically genuine exceptions raised mid-loop.
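The failure mode can be modelled outside the codebase. The sketch below is a toy stand-in, not the project's code: `register_all`, `best_effort`, `flaky_register` and the extension ids are hypothetical names chosen for illustration. It shows how an un-isolated loop combined with a single caller-level catch leaves only a prefix registered.

```python
import warnings


def register_all(ext_ids, register_one, registered):
    """Toy version of the un-isolated loop: no try/except per extension."""
    for ext_id in ext_ids:
        register_one(ext_id)          # an exception here aborts the whole loop
        registered.append(ext_id)     # only reached when registration succeeded


def flaky_register(ext_id):
    """Hypothetical registration step that fails for one extension."""
    if ext_id == "ext-b":
        raise OSError(f"cannot write command file for {ext_id}")


def best_effort(ext_ids, register_one):
    """Toy version of the caller-level catch: one warning, no re-raise."""
    registered = []
    try:
        register_all(ext_ids, register_one, registered)
    except Exception as exc:
        warnings.warn(f"extension registration failed: {exc}")
    return registered


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = best_effort(["ext-a", "ext-b", "ext-c"], flaky_register)

print(result)       # extensions that were registered before the abort
print(len(caught))  # number of warnings emitted
```

In this model `ext-c` is healthy but never attempted, and the only signal is one warning about `ext-b`.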

Reproduction

With two or more enabled extensions, force the first one in iteration order to raise during registration (e.g. an OSError from a command-file write). Then trigger registration for an agent (integration install <agent> / switch / upgrade): only a single warning is printed, the later healthy extensions are not registered for that agent, and the command exits 0.

A deterministic version is in the fix's regression test (test_one_failing_extension_does_not_abort_the_rest): it patches the first extension's registration to raise and asserts the later one still registers.

Expected

A failure registering one extension should warn and continue to the rest (per-extension isolation), so a single bad extension cannot silently drop the others.

Fix

PR #2951 wraps the per-extension loop body in try/except, warns per failing extension, and continues; the caller-level best-effort catch remains as a backstop.
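As a rough illustration of per-extension isolation (a self-contained sketch, not the code in the PR), the loop body can be guarded so that each failure produces its own warning and the loop moves on. `register_all_isolated`, `flaky_register` and the extension ids are hypothetical names used only for this example.

```python
import warnings


def register_all_isolated(ext_ids, register_one):
    """Register each extension independently; one failure does not stop the rest."""
    registered, failed = [], []
    for ext_id in ext_ids:
        try:
            register_one(ext_id)
        except Exception as exc:
            # Warn for this extension only, then carry on with the next one.
            warnings.warn(f"failed to register extension {ext_id!r}: {exc}")
            failed.append(ext_id)
            continue
        registered.append(ext_id)
    return registered, failed


def flaky_register(ext_id):
    """Hypothetical registration step where the first extension fails."""
    if ext_id == "ext-a":
        raise OSError(f"cannot write command file for {ext_id}")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    registered, failed = register_all_isolated(
        ["ext-a", "ext-b", "ext-c"], flaky_register
    )

print(registered)  # extensions registered despite the earlier failure
print(failed)      # extensions that raised
```

Catching `Exception` rather than `BaseException` lets `KeyboardInterrupt` and `SystemExit` still propagate. Returning the failed ids is a choice made for this sketch so that a caller could report them. The issue does not say whether the real method does so.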

Surfaced by review of #2949 (follow-up to #2886). Distinct from #2948, which is about where extension artifacts render (skills vs commands) for non-active skills-mode agents, not error isolation.
