fix(deps): support multi-host dependency identity#1735
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses multi-host dependency correctness and UX by making lockfile identity host-aware (while keeping github.com keys stable), aligning single-file download auth boundaries with clone auth, adding an explicit type: gitlab host signal for bespoke GitLab instances, and improving download error clarity (only 404 falls through; non-404/network errors surface with endpoint context).
Changes:
- Introduces a shared host-aware dependency unique key (
build_dependency_unique_key) and threadshost_typethrough dependency parsing, lockfile round-trips, host classification, and backend selection. - Updates generic-host file download behavior to use per-dependency auth resolution (avoiding managed PAT headers on generic HTTP downloads) and improves verbose/error diagnostics for raw/API attempts.
- Adds regression tests plus documentation updates for
type: gitlaband the lockfile key migration rule.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/models/dependency/reference.py |
Adds shared host-aware unique-key builder and host_type parsing for object-form deps. |
src/apm_cli/deps/lockfile.py |
Persists host_type and uses shared unique-key logic for lockfile indexing. |
src/apm_cli/core/auth.py |
Extends host classification + auth resolution cache keys to incorporate explicit host_type. |
src/apm_cli/deps/host_backends.py |
Threads host_type into backend selection/classification to route GitLab correctly. |
src/apm_cli/deps/github_downloader.py |
Ensures GitLab routing decisions consider host_type in host classification. |
src/apm_cli/deps/github_downloader_validation.py |
Aligns validation host classification with host_type. |
src/apm_cli/deps/download_strategies.py |
Uses per-dependency auth boundary for file downloads; improves fallback/error surfacing; adds helper error builders. |
src/apm_cli/install/validation.py |
Makes GitLab detection in validation respect explicit host_type. |
tests/test_lockfile.py |
Adds regression tests for host-aware lockfile keys and host_type round-trip. |
tests/test_github_downloader.py |
Updates/extends tests for token boundaries, GitLab routing via type: gitlab, and non-404 error surfacing. |
tests/unit/deps/test_download_strategies_selection.py |
Updates host mock to reflect generic-host token boundary behavior. |
tests/unit/deps/test_download_strategies_phase3.py |
Same as above for phase3 coverage. |
packages/apm-guide/.apm/skills/apm-usage/dependencies.md |
Documents type: gitlab and lockfile key migration behavior. |
packages/apm-guide/.apm/skills/apm-usage/authentication.md |
Documents GitLab host classification options including type: gitlab. |
docs/src/content/docs/reference/lockfile-spec.md |
Documents host_type and the new host-aware lockfile key rule. |
docs/src/content/docs/getting-started/authentication.md |
Adds type: gitlab as an explicit GitLab classification mechanism. |
docs/src/content/docs/consumer/manage-dependencies.md |
Documents object-form type: gitlab and provides examples/usage guidance. |
CHANGELOG.md |
Adds an Unreleased “Fixed” entry describing the multi-host lockfile + auth + error improvements. |
Copilot's findings
- Files reviewed: 18/18 changed files
- Comments generated: 3
| host_value = (host or "").strip() | ||
| if host_value and host_value.lower() != "github.com": | ||
| return f"{host_value}/{key}" | ||
| return key |
| | `repo_url` | string | yes | Canonical repo URL (e.g. `github.com/owner/repo`). Unique key for the entry, except for virtual and local entries (see below). | | ||
| | `host` | string | no | FQDN when not inferable from `repo_url` (e.g. for registry proxies or non-GitHub hosts). | | ||
| | `host_type` | string | no | Explicit host-kind hint, currently `gitlab`, copied from object-form `type: gitlab`. | |
|
|
||
| ### Fixed | ||
|
|
||
| - `apm install` now keeps same-path dependencies from different git hosts distinct in `apm.lock.yaml`, aligns generic-host file-download auth with clone auth, and surfaces non-404 download failures with host and endpoint context. (#773) |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | Shared key helper removes duplication; host-blind vs host-qualified split is clean and documented. |
| Auth Expert | 0 | 1 | 0 | host_type threaded everywhere and folded into the resolve cache key; GitLab token precedence correct; unknown types fail closed. |
| Supply Chain Security | 0 | 1 | 0 | Host-qualified identity closes a cross-host overwrite vector; generic-host downloads stop leaking managed/credential-fill tokens. |
| CLI Logging Expert | 0 | 1 | 1 | Download errors now name host + endpoint + status/exception; verbose breadcrumbs unified across hosts. |
| DevX UX Expert | 0 | 1 | 0 | Migration-safe and additive; determinism preserved; orphan/install UX unaffected by the new key. |
| Test Coverage Expert | 0 | 0 | 1 | TDD-first; every behavior trapped. Error-message URL parsing in tests is format-brittle. |
| Performance Expert | 0 | 0 | 0 | No added round-trips (resolve is cached); fail-fast on hard download errors trims wasted attempts. |
| OSS Growth Hacker | 0 | 0 | 0 | type: gitlab opens self-managed GitLab/Gitea adoption; amplify in release notes. |
| Doc Writer | 0 | 0 | 0 | lockfile-spec migration section, type: gitlab, CHANGELOG, and apm-usage guide all updated; no drift. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 4 follow-ups
- [Supply Chain Security] Keep the credential-fill posture shift visible -- private generic-host (Gitea/Gogs) file reads via the HTTP download path now require a clone path or a host backend signal. -- The trade-off is documented, but it is the one behavior change an existing user could feel; a prominent migration line reduces surprise 401/empty-read reports.
- [Python Architect] The two
get_canonical_dependency_string()bodies (lockfile.py and reference.py) are now near-identical. -- Consider sharing the host-blind branch the same waybuild_dependency_unique_key()was extracted, before it drifts or trips the R0801 duplication gate later. - [Test Coverage / CLI Logging] Download-error tests parse URLs out of the message via
split()+startswith("https://")+rstrip("."). -- Asserting on structured context (host, endpoint, status) instead of message scraping would survive future wording tweaks. - [Auth Expert]
typecurrently accepts onlygitlaband raises on anything else. -- Track the extension path (e.g. futuregitea/bitbuckethints) so the fail-closedValueErroris a conscious gate rather than a future foot-gun.
Architecture
classDiagram
class DependencyReference {
+str repo_url
+str host
+str host_type
+int port
+get_unique_key() str
+get_canonical_dependency_string() str
}
class LockedDependency {
+str repo_url
+str host
+str host_type
+get_unique_key() str
+get_canonical_dependency_string() str
}
class build_dependency_unique_key {
<<function>>
+repo_url
+host
+source
+virtual_path
+registry_prefix
}
class AuthResolver {
+classify_host(host, port, host_type) HostInfo
+resolve(host, org, port, host_type) AuthContext
+resolve_for_dep(dep_ref) AuthContext
}
DependencyReference --> build_dependency_unique_key : host-qualified key
LockedDependency --> build_dependency_unique_key : host-qualified key
AuthResolver ..> DependencyReference : reads host_type
class DependencyReference:::core
class build_dependency_unique_key:::core
flowchart TD
manifest[apm.yml dependency] --> ref[DependencyReference]
ref --> typehint{explicit type gitlab?}
typehint -->|yes| gitlab[GitLab REST + GITLAB_APM_PAT chain]
typehint -->|no| classify[AuthResolver.classify_host]
ref --> keyfn[build_dependency_unique_key]
keyfn --> hostq{host == github.com or unset?}
hostq -->|yes| barekey[bare owner/repo lock key]
hostq -->|no| qualkey[host/owner/repo lock key]
barekey --> lock[apm.lock.yaml]
qualkey --> lock
ref --> orphan[get_canonical_dependency_string]
orphan --> match[host-blind orphan / Source match]
classify --> token[_resolve_dep_token / _resolve_dep_auth_ctx]
token --> dl[single-file download = clone auth boundary]
Recommendation
Merge once you are comfortable with the documented credential-fill posture shift (follow-up 1) -- it is the only change an existing user could feel, and it is intentional and secure-by-default. Everything else is polish: track the shared-helper duplication (follow-up 2) and the test message-scraping brittleness (follow-up 3) as low-priority cleanups. CI is green, tests are TDD-first and comprehensive, and the migration is additive. Note this PR shares lockfile.py with #1738 (#1209); whichever lands second should rebase.
Full per-persona findings
Python Architect
- [recommended] Shared
build_dependency_unique_key()is consumed by bothDependencyReference.get_unique_key()andLockedDependency.get_unique_key()atsrc/apm_cli/models/dependency/reference.pyandsrc/apm_cli/deps/lockfile.py.
Good extraction: it removes the prior copy-paste of the key logic and makes the host-qualification rule single-sourced. The registry-prefix carve-out (proxy host is transport, not identity) is correctly documented and preserves the manifest/lockfile key correspondence.
Suggested: none -- this is the right shape. - [nit] The two
get_canonical_dependency_string()implementations are now near-identical (local_path / virtual_path / repo_url) acrosslockfile.pyandreference.py.
Below the R0801 10-line threshold today, but a candidate for the same extraction treatment before it drifts.
Auth Expert
- [recommended]
host_typeis threaded throughclassify_host,resolve,resolve_for_dep,backend_for, and validation, and is folded into theresolve()cache key tuple atsrc/apm_cli/core/auth.py.
Includinghost_typein the cache key is the correct call -- it prevents a hinted classification from poisoning an unhinted sibling entry for the same (host, port, org). GitLab token precedence (GITLAB_APM_PAT->GITLAB_TOKEN-> credential fill) is preserved, and an unsupportedtyperaisesValueError(fail closed).
Supply Chain Security
- [recommended] Host-qualified lockfile identity (
host/owner/repofor non-default hosts) closes a cross-host key-overwrite vector, anddownload_github_filenow resolves tokens via_resolve_dep_token/_resolve_dep_auth_ctxatsrc/apm_cli/deps/download_strategies.py.
Net secure-by-default tightening: generic hosts no longer receive APM-managed PATs, and the previously-forwarded host-scoped credential-fill token is no longer attached on the HTTP file-download path. The trade-off (private generic-host file reads must use clone or a host backend signal) is documented in the PR body and docs. No blocking concern; flagged as a follow-up for user-facing migration visibility.
CLI Logging Expert
- [recommended] New
_build_download_http_error/_build_download_network_errorhelpers surface host + endpoint + status (or exception type), and verbose breadcrumbs now log Contents API attempts uniformly (previously gated onnot is_github_host).
This directly satisfies the download-error-clarity item: a non-404 failure now names the failing endpoint instead of degrading to a generic missing-file message. ASCII-clean. - [nit] Error strings end with a trailing period that tests strip (
rstrip(".")); harmless but couples message punctuation to test parsing.
DevX UX Expert
- [recommended] Migration is additive and deterministic: existing
github.comkeys stay byte-stable, non-default hosts become visible only in their own key, and host casing is normalized so it cannot mint duplicate keys (tests/test_lockfile.py::test_get_unique_key_lowercases_non_default_host).
Because orphan/Source matching uses the host-blind key, the install/orphan UX is unchanged for existing users -- the new key surface is invisible unless a non-default host is actually in play.
Test Coverage Expert
- [nit] Coverage is TDD-first and traps every claimed behavior: collision (
test_add_dependency_keeps_same_repo_from_different_hosts), github.com preservation, host lowercasing, host_type round-trip, bespoke GitLab routing (test_object_form_type_gitlab_routes_bespoke_host_to_gitlab_api), non-404 surfacing, and the generic token boundary.
The only soft spot: error-path tests reconstruct URLs by scraping the message string; asserting on structured fields would be more robust. (The ~16 localtest_download_*failures are pre-existing network/env artifacts, green in CI -- not attributable to this PR.)
Performance Expert
No findings. The file-download path swaps a direct resolve() call for the cached _resolve_dep_token / _resolve_dep_auth_ctx boundary (no added round-trips), and narrowing raw-URL fallback to 404-only trims wasted attempts on hard failures. classify_host with host_type stays pure/cheap.
OSS Growth Hacker
No findings. Adoption note only: type: gitlab plus host-qualified identity makes APM viable for self-managed GitLab and multi-host Gitea/Gogs estates without hostname heuristics -- a concrete contributor/adopter segment worth amplifying in the release announcement.
Doc Writer
No findings. The lockfile-spec gains a "Lockfile identity keys" migration section, type: gitlab is documented across manage-dependencies.md, authentication.md, and the apm-usage guide, and the CHANGELOG carries a user-facing "Fixed" entry. No code/doc drift detected.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Folds copilot-pull-request-reviewer follow-ups: case-insensitive host key normalization, lockfile-spec doc update, changelog PR number. Refs #773 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The initial #773 implementation conflated two distinct identity functions, shipping the PR with failing CI: - get_canonical_dependency_string() (host-blind canonical for the apm_modules/ filesystem layout + orphan detection) had been made to delegate to the host-qualified get_unique_key(), breaking deps-list orphan/Source matching for non-default-host deps. Restored its host-blind body. - get_unique_key() gated "local" detection on source=="local" while every other method on DependencyReference gates on is_local; a local ref with is_local=True but unset source returned repo_url instead of local_path. Now derives the local signal from is_local, consistent with get_identity()/get_canonical_dependency_string(). - Registry-proxy deps (registry_prefix set, e.g. an Artifactory mirror) were host-qualified, breaking the manifest/lockfile key correspondence used by reinstall + orphan detection. build_dependency_unique_key() now short-circuits to the bare logical key when registry_prefix is set; LockedDependency/DependencyReference pass their prefix through. Added LockedDependency.get_canonical_dependency_string() (host-blind, mirrors DependencyReference) and switched deps-list orphan/Source matching to it so non-default-host deps match their installed dir. Tests: updated lockfile-roundtrip lookups for non-default hosts to the host-qualified key (intended #773 behavior, asserted by the author's own lockfile tests); refreshed deps-list/insecure mocks to stub the new host-blind accessor; fixed auth-phase3/validation dep stubs to set host_type. Full unit suite green (16856 passed); the only remaining local integration failures are the pre-existing test_download_* network artifacts that pass in CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e gate Addresses apm-review-panel follow-ups 2 and 4. Extracts the near-identical host-blind canonical-string body shared by DependencyReference and LockedDependency into build_canonical_dependency_string(), mirroring the build_dependency_unique_key() extraction. Each caller passes its own is_local signal (DependencyReference.is_local vs LockedDependency.source == "local") so the two identity models keep their distinct local-detection semantics and the Phase-6 regression fixes are preserved. Also documents the deliberate fail-closed gate in _parse_host_type so the future gitea/bitbucket extension path is a conscious choice rather than a foot-gun. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses apm-review-panel follow-up 3. Replaces the brittle
split()/startswith("https://")/rstrip(".") URL scraping in the
generic-host download-error tests with a shared _error_url_components()
helper that returns parsed (scheme, hostname, path) tuples, per the
tests/ urllib.parse convention. The assertions now survive future
error-message wording or punctuation tweaks.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d410382 to
ac5a4c6
Compare
Shepherd-driver: conflict resolved + panel follow-ups foldedThis PR was Conflict resolutionRebased onto current Folded in this run
DeferredNone within scope. The Lint contract
CIAll required checks green at Mergeability
|
fix(deps): support multi-host dependency identity
TL;DR
This PR implements the board-approved #773 items 1-4: host-aware lockfile identity, consistent per-dependency token resolution for file downloads, explicit GitLab host signaling, and clearer download-attempt errors. Existing
github.comlockfile keys remain byte-stable as bareowner/repo; only non-default hosts add the host segment. Closes #773.Important
Lockfile migration note: existing
github.comkeys are preserved as-is. New non-default host entries usehost/owner/repo, sogithub.com/team/skillsandgitea.myorg.com/team/skillscan coexist without colliding.Problem (WHY)
get_unique_key()returned the bare repo path, so two hosts with the same path could "collide silently inapm.lock"._download_github_fileresolved tokens differently from clone, causing "misleading 401 errors when a GitHub PAT is sent to a Gitea / GitLab server".code.acme.com.Approach (WHAT)
build_dependency_unique_key()keepsgithub.comimplicit and prefixes only non-default hosts._resolve_dep_token()/_resolve_dep_auth_ctx(), matching clone boundaries.type: gitlab; the hint flows through auth, backends, validation, and lockfile replay.apm-usageguide documenttype: gitlaband the lockfile key migration rule.Implementation (HOW)
src/apm_cli/models/dependency/reference.py,src/apm_cli/deps/lockfile.pyhost_typeparse/round-trip support.src/apm_cli/core/auth.py,src/apm_cli/deps/host_backends.pysrc/apm_cli/deps/download_strategies.py,src/apm_cli/deps/github_downloader.pytype: gitlabto GitLab REST, and surfaced endpoint-specific errors.src/apm_cli/install/validation.py,src/apm_cli/deps/github_downloader_validation.pytests/**CHANGELOG.mdDiagram
The diagram shows the new dependency flow from manifest identity to host-aware lockfile key and download backend.
flowchart LR manifest[Manifest dependency] --> ref[DependencyReference] ref --> key{Host value} key -->|github.com or unset| ghkey[Lock key owner/repo] key -->|non-default host| hostkey[Lock key host/owner/repo] ref --> type{Explicit type gitlab} type -->|yes| gitlab[GitLab REST and GitLab token chain] type -->|no| generic[Generic raw/API attempts without managed PAT]Trade-offs
github.comentries, but non-default hosts now become visible in the in-memory key.type: gitlabis explicit instead of heuristic or probe-based, avoiding network capability detection in manifest parsing.Benefits
owner/repoon two hosts no longer overwrites a lockfile entry.Validation
Scenario evidence
github.comremains implicittests/test_lockfile.pytests/test_github_downloader.py::TestGiteaRawUrlDownload,tests/unit/deps/test_download_strategies_*tests/test_github_downloader.py::TestGiteaGogsApiVersionNegotiation::test_object_form_type_gitlab_routes_bespoke_host_to_gitlab_apitests/test_github_downloader.py::TestGiteaRawUrlDownloadCommands run
How to test
apm.ymlwithgithub.com/team/skillsandgitea.myorg.com/team/skills, runapm install, and confirm both lockfile entries remain present.apm install --verbose; confirm raw/API attempts are logged and no GitHub PAT header is sent.type: gitlabfor a bespoke hostname and confirm GitLab REST URLs are used.Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com