Use cluster endpoint at startup #90

Open
piceri wants to merge 11 commits into main from piceri/startup-cluster-endpoint

@piceri piceri commented May 29, 2026

Contributor

This change allows deployment tracker to use the cluster endpoint at startup to send the current state of the cluster in a single request. This reduces the load caused at startup, when deployment tracker currently sends the state of the cluster one container at a time.

Change details:

  • While the informers sync, add events are no longer added to the work queue
  • Once the informers have synced, use the pod informer to get the current list of running pods
  • Any new events are then added to the work queue
  • Current pod list is processed and deduped by deployment name + digest
  • Send the list to the cluster endpoint
    • If this fails, deployment tracker does not continue to run
  • Use the response to fill observed and unknown caches
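
The dedupe step in the list above can be illustrated with a minimal sketch. The `podRecord` and `recordKey` types and the `dedupeRecords` function are hypothetical names chosen for illustration, not the PR's actual types; the sketch only assumes that each record carries a deployment name and an image digest, and that the first record seen for a pair is the one kept.

```go
package main

import "fmt"

// podRecord is a hypothetical, simplified stand-in for the record the
// controller builds from each running pod's containers.
type podRecord struct {
	DeploymentName string
	Digest         string
	PodName        string
}

// recordKey is the dedupe key: deployment name + digest.
type recordKey struct {
	deploymentName string
	digest         string
}

// dedupeRecords keeps the first record seen for each
// (deployment name, digest) pair and preserves input order.
func dedupeRecords(records []podRecord) []podRecord {
	seen := make(map[recordKey]struct{}, len(records))
	out := make([]podRecord, 0, len(records))
	for _, r := range records {
		k := recordKey{deploymentName: r.DeploymentName, digest: r.Digest}
		if _, ok := seen[k]; ok {
			continue // another replica of the same deployment + image
		}
		seen[k] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	pods := []podRecord{
		{DeploymentName: "web", Digest: "sha256:aaa", PodName: "web-1"},
		{DeploymentName: "web", Digest: "sha256:aaa", PodName: "web-2"}, // replica, dropped
		{DeploymentName: "web", Digest: "sha256:bbb", PodName: "web-3"},
		{DeploymentName: "api", Digest: "sha256:aaa", PodName: "api-1"},
	}
	for _, r := range dedupeRecords(pods) {
		fmt.Println(r.DeploymentName, r.Digest)
	}
}
```

With many replicas per deployment, a key of this kind means the cluster request grows with the number of distinct deployment and image pairs rather than with the number of pods.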

piceri added 8 commits May 22, 2026 14:44
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>

@ajbeattie ajbeattie left a comment

Contributor

Nice 💯, looking great on initial pass. Still working through some of the changes but wanted to go ahead and raise the Job/CronJob suggestion ⬇️

Comment thread internal/controller/controller.go Outdated
Comment thread internal/controller/reporting.go Outdated
piceri added 2 commits June 10, 2026 14:22
Signed-off-by: Eric Pickard <piceri@github.com>
Signed-off-by: Eric Pickard <piceri@github.com>
GitHub Advanced Security started work on behalf of piceri June 11, 2026 19:08 View session
GitHub Advanced Security finished work on behalf of piceri June 11, 2026 19:09
@piceri piceri marked this pull request as ready for review June 11, 2026 19:10
@piceri piceri requested a review from a team as a code owner June 11, 2026 19:10
Copilot AI review requested due to automatic review settings June 11, 2026 19:10

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a startup “cluster sync” path so deployment-tracker can post the current cluster state in a single request to the deployment-record cluster endpoint, reducing startup load versus posting one container at a time.

Changes:

  • Introduces new deployment record data models to support the cluster endpoint request/response payloads.
  • Adds PostCluster (and refactors PostOne to share retry logic) in the deployment record client, with accompanying unit tests.
  • Updates the controller startup flow to (a) suppress enqueueing during informer sync, (b) list current pods, (c) post cluster state, then (d) seed observed/unknown caches before starting workers.
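
The startup flow described in steps (a) through (d) can be sketched roughly as follows. Everything here is a simplified assumption: the `controller` struct, its slice-based queue, the callback parameters, and the point at which the gate is lifted are illustrative stand-ins, not the PR's code. Only the use of an `atomic.Bool` gate and the ordering of the steps come from the review summary.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errSyncFailed is a hypothetical error used to demonstrate the failure path.
var errSyncFailed = errors.New("cluster endpoint unavailable")

// controller is a hypothetical, stripped-down stand-in for the real controller.
type controller struct {
	syncing  atomic.Bool
	queue    []string            // stand-in for the work queue (not concurrency-safe)
	observed map[string]struct{} // stand-in for the observed cache
}

func newController() *controller {
	return &controller{observed: make(map[string]struct{})}
}

// onAdd mirrors the gated AddFunc: add events are skipped while syncing.
func (c *controller) onAdd(key string) {
	if c.syncing.Load() {
		return
	}
	c.queue = append(c.queue, key)
}

// startup runs the steps in order and returns an error if the cluster
// post fails, so the caller can refuse to start workers.
func (c *controller) startup(
	syncInformers func(),
	listPods func() []string,
	postCluster func([]string) ([]string, error),
) error {
	c.syncing.Store(true)
	syncInformers()    // (a) initial add events are suppressed
	pods := listPods() // (b) snapshot of currently running pods
	c.syncing.Store(false)

	observed, err := postCluster(pods) // (c) one request for the whole cluster
	if err != nil {
		return fmt.Errorf("cluster sync failed: %w", err)
	}
	for _, key := range observed { // (d) seed the cache before workers start
		c.observed[key] = struct{}{}
	}
	return nil
}

func main() {
	c := newController()
	err := c.startup(
		func() { c.onAdd("pod-a") }, // replayed by the informer, skipped
		func() []string { return []string{"pod-a", "pod-b"} },
		func(pods []string) ([]string, error) { return pods, nil },
	)
	c.onAdd("pod-c") // arrives after sync, enqueued
	fmt.Println(err, len(c.queue), len(c.observed))

	err = newController().startup(
		func() {},
		func() []string { return nil },
		func([]string) ([]string, error) { return nil, errSyncFailed },
	)
	fmt.Println(err)
}
```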
Summary per file

| File | Description |
| --- | --- |
| pkg/deploymentrecord/record.go | Refactors record types into base/request/response structs for cluster endpoint support. |
| pkg/deploymentrecord/client.go | Adds cluster endpoint posting and shared retry helper; updates request body builders. |
| pkg/deploymentrecord/client_test.go | Adds PostCluster and URL-escaping tests; updates record test helpers. |
| internal/controller/reporting.go | Implements building sync records, posting cluster sync, and filling caches; refactors record building. |
| internal/controller/reporting_test.go | Adds tests for sync processing, dedupe, and cache fill behavior; updates poster call assertions. |
| internal/controller/controller.go | Adds startup sync gating via atomic.Bool and calls sync processing after informer sync. |
| internal/controller/controller_test.go | Extends mock poster to support PostCluster; adds mock metadata aggregator for new code paths. |
| internal/controller/controller_integration_test.go | Updates integration test mocks/types for renamed record type and new interface method. |
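
The overview mentions that PostOne and PostCluster now share retry logic. The page does not show that helper, so the following is only a sketch of what a shared retry function could look like. The name `doWithRetry`, the exponential backoff, and the attempt limit are all assumptions, not the client's actual policy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical error used to simulate a failed attempt.
var errTransient = errors.New("transient failure")

// doWithRetry calls fn until it succeeds or maxAttempts is reached,
// doubling the delay after each failed attempt. Both a single-record
// post and a cluster post could pass their request as fn.
func doWithRetry(maxAttempts int, baseDelay time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			time.Sleep(baseDelay << attempt) // baseDelay, 2x, 4x, ...
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := doWithRetry(3, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

A real client would likely also distinguish retryable responses from permanent ones and honor context cancellation; both are left out here to keep the sketch short.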

Copilot's findings

  • Files reviewed: 8/8 changed files
  • Comments generated: 4

Comment on lines 158 to +162
AddFunc: func(obj any) {
	// Skip adding sync events
	if cntrl.syncing.Load() {
		return
	}

Contributor Author

I think these events should still be captured. I will adjust the PR description.

Comment on lines +142 to 150
var deploymentRecords deploymentrecord.RecordsClusterResp
err = json.Unmarshal(respBody, &deploymentRecords)
if err != nil {
	slog.Error("Failed to unmarshall response",
		"error", err,
		"record_count", len(syncRecords),
	)
	return nil
}

Contributor Author

I believe this would only happen for a 2xx response, so in this scenario we would only miss out on filling the local caches. I think it would be okay to continue without failure.
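
The behavior described in this comment (log the decode failure, skip cache seeding, keep running) can be sketched as below. The `clusterResp` shape with `observed` and `unknown` string lists and the `seedCaches` function are hypothetical; the actual `RecordsClusterResp` fields are not shown on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/slog"
)

// clusterResp is a hypothetical response shape, not the PR's RecordsClusterResp.
type clusterResp struct {
	Observed []string `json:"observed"`
	Unknown  []string `json:"unknown"`
}

// seedCaches fills both caches from a 2xx response body and returns the
// number of entries seeded. A body that cannot be decoded is logged and
// treated as "nothing to seed" rather than as a startup failure.
func seedCaches(body []byte, observed, unknown map[string]struct{}) int {
	var resp clusterResp
	if err := json.Unmarshal(body, &resp); err != nil {
		slog.Error("Failed to unmarshal response", "error", err)
		return 0
	}
	for _, key := range resp.Observed {
		observed[key] = struct{}{}
	}
	for _, key := range resp.Unknown {
		unknown[key] = struct{}{}
	}
	return len(resp.Observed) + len(resp.Unknown)
}

func main() {
	observed := map[string]struct{}{}
	unknown := map[string]struct{}{}

	// Undecodable body: logged, caches stay empty, startup continues.
	fmt.Println(seedCaches([]byte("not json"), observed, unknown))

	// Valid body: both caches are seeded.
	body := []byte(`{"observed":["web@sha256:aaa"],"unknown":["api@sha256:bbb"]}`)
	fmt.Println(seedCaches(body, observed, unknown), len(observed), len(unknown))
}
```

The trade-off is that after an undecodable response the caches start cold, so entries that would have been seeded are presumably handled through the normal per-event path.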

Comment on lines +346 to 348
// Drain and close response body to enable connection reuse
respBody, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
Comment thread pkg/deploymentrecord/client.go
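
The drain-and-close pattern in the excerpt above matters because Go's HTTP transport can generally reuse a keep-alive connection only after the response body has been read to the end and closed. The following is a minimal sketch of that pattern as a helper. `drainAndClose` and the `trackedBody` test double are hypothetical names, and unlike the excerpt this version returns the read error instead of discarding it.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drainAndClose reads the body to EOF and always closes it, returning
// whatever was read along with any read error.
func drainAndClose(body io.ReadCloser) ([]byte, error) {
	defer body.Close()
	return io.ReadAll(body)
}

// trackedBody is a hypothetical test double that records whether Close
// was called, standing in for an http.Response body.
type trackedBody struct {
	io.Reader
	closed bool
}

func (b *trackedBody) Close() error {
	b.closed = true
	return nil
}

func newTrackedBody(content string) *trackedBody {
	return &trackedBody{Reader: strings.NewReader(content)}
}

func main() {
	body := newTrackedBody(`{"status":"ok"}`)
	data, err := drainAndClose(body)
	fmt.Println(string(data), err, body.closed)
}
```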
Signed-off-by: Eric Pickard <piceri@github.com>
GitHub Advanced Security started work on behalf of piceri June 11, 2026 19:41 View session
GitHub Advanced Security finished work on behalf of piceri June 11, 2026 19:42
3 participants