Architecture¶

This document describes the internal architecture of the git-hubby operator.

Overview¶

git-hubby follows the standard Kubebuilder architecture with additional patterns for GitHub API integration, rate limiting, and high availability.

The operator uses a factory-based reconciliation pattern:

Controller receives event → checks predicates (generation/annotation changes)
Spreading Check evaluates if reconciliation should be delayed during startup window
Factory creates reconciler → fetches CR, builds GitHub client, checks rate limits
Reconciler executes reconciliation groups in sequence, with parallel execution within each group
Mapper produces GitHub API request objects with opinionated defaults
GitHub Client applies changes via GitHub API
Conditions updated to reflect sync status for each reconciliation task
Status written back to resource, including sub-resource generation tracking
Requeue scheduled after configurable interval for continuous drift detection

To prevent API rate limit exhaustion during pod restarts (e.g., rolling deployments), the operator implements a startup spreading mechanism:

Spread Period (default 5 minutes): Window after startup during which reconciliations may be delayed
Spread Interval (default 180 minutes): Time window across which reconciliations are distributed
Smart Detection: Only spreads warm-start reconciliations (healthy resources with unchanged specs)
Immediate Processing: Changed resources, unhealthy resources, and deletions bypass spreading

Control via environment variables:

Variable	Default	Description
`ENABLE_STARTUP_SPREADING`	`true`	Enable/disable spreading
`STARTUP_SPREAD_PERIOD_MINUTES`	`5`	Window after startup for spreading
`SPREAD_INTERVAL_MINUTES`	`180`	Time window for distribution

Reconciliation logic is organized into sequential groups, with tasks within each group executing concurrently. For example:

Group 1: Independent tasks that can run in parallel (e.g., org settings, custom properties, rulesets)
Group 2: Dependent tasks that require Group 1 completion
Additional groups: Can be added as needed based on dependencies

Common patterns:

Timeout Protection: Each reconciliation task has a 5-minute timeout
Error Handling: All errors collected and reported; execution stops at first failed group

The operator manages reconciliation timing to conserve GitHub API quota:

Checks remaining quota before each reconciliation (threshold: 100 requests)
Delays reconciliations until rate limit resets when quota is low
Global limiter synchronizes delays across all controller instances
Priority queue ensures new resources reconcile first when quota becomes available

This protects against self-inflicted rate limit exhaustion but does not prevent exhaustion from external sources (CI/CD, other tools).

The operator implements safe deletion semantics to prevent accidental data loss:

Organizations: The GitHub organization is never deleted. The Kubernetes CR can only be removed when no Repository or Team CRs reference it (enforced via finalizer). This ensures the organization remains intact on GitHub while allowing cleanup of Kubernetes resources.
Repositories: Behavior depends on the REPOSITORY_FINALIZER_MODE environment variable:
- ignore or unset (default): Repository remains unchanged on GitHub, only the Kubernetes CR is removed
- archive: Repository is archived on GitHub before the Kubernetes CR is removed, preserving all data while marking it as read-only
- delete: Repository is permanently deleted from GitHub (use with caution)

The GitHubCachingClientFactory maintains a per-process cache of authenticated GitHub clients: