This repository hosts the deployment and migration scripts for LightningIQ, served from deploy.lightningiq.io. There are three main scripts:
| Script | Purpose |
|---|---|
| `run` | Deploy or upgrade LightningIQ v1.7+ on a Kubernetes cluster |
| `v1.6/run` | Deploy or upgrade LightningIQ v1.6 on a Kubernetes cluster |
| `migrate-1.6-to-1.7.sh` | Migrate an existing 1.6 install to 1.7 |
## Contents

- `run` — LightningIQ v1.7+ Deployer
- `v1.6/run` — LightningIQ v1.6 Deployer
- `migrate-1.6-to-1.7.sh` — Migration Script

## Prerequisites

All scripts require the following tools installed and available in `$PATH`:
| Tool | Purpose |
|---|---|
| `kubectl` | Kubernetes cluster management |
| `helm` | Kubernetes package manager (used for operator deployment) |
| `aws` CLI | ECR registry authentication |
| `curl` | Fetching remote imports and GitHub files |
| `git` | Cloning the deployer repository |
| `base64` | Decoding Kubernetes secrets (migration script only) |
For Kubernetes-level deployments (`k`/`k8s`, `f`/`full`), Ansible is used under the hood via the `k8s-deployer` child script. The `req` action installs all required tools (kubectl, helm, Ansible, and others) via ASDF.
## `run` — LightningIQ v1.7+ Deployer

The `run` script is the primary entry point for deploying and managing LightningIQ v1.7+. It can be run directly from a local clone or piped from the remote host.
Remote (recommended):
```bash
export GH_TOKEN=ghp_xxx
export SSH_PASSWORD=<ssh_password>
export VAULT_PASSWORD=<vault_password>
export ENV_REPO_NAME=<client-repo-name>

curl -s https://deploy.lightningiq.io/run | bash -s -- \
  -r "v1.7.145" \
  -e <environment_name> \
  -d <action>
```
With all arguments inline:
```bash
curl -s https://deploy.lightningiq.io/run | bash -s -- \
  -r "v1.7.145" \
  -e prod-usw2 \
  -er client-repo \
  -d k8s \
  -t $GH_TOKEN \
  -s "<ssh_password>" \
  -vp "<vault_password>"
```
Local (for development/debugging):
```bash
export USE_LOCAL_IMPORTS=true
./run -r "v1.7.145" -e qa -er client-repo -d sync
```
### Arguments

| Flag | Long form | Required | Description |
|---|---|---|---|
| `-r` | `--release` | Yes | Release version tag, e.g. `v1.7.145` |
| `-e` | `--env` | Yes | Environment name, e.g. `prod-usw2` (or `ENV_NAME` env var) |
| `-er` | `--env-repo` | Yes | Client environment repository name (or `ENV_REPO_NAME` env var) |
| `-d` | `--deploy` | Yes | Action to perform (see Actions) |
| `-t` | `--token` | No\* | GitHub token (or `GH_TOKEN` / `GITHUB_TOKEN` env var) |
| `-s` | `--ssh-password` | No\* | SSH password for Ansible (or `SSH_PASSWORD` env var) |
| `-vp` | `--vault-password` | No\* | Ansible vault password (or `VAULT_PASSWORD` env var) |
| `-v` | `--verbose` | No | Enable verbose mode, show Ansible task progress |

\* If not provided as a flag or environment variable, the script will interactively prompt for the value.
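The flag → environment variable → interactive prompt fallback for the starred credentials can be sketched as a small helper. This is a simplified illustration; `resolve_secret` is a hypothetical name, not the script's actual function.

```shell
# Hypothetical sketch of the credential fallback order described above:
# 1) flag value, 2) environment variable, 3) interactive prompt on the TTY.
resolve_secret() {
  local flag_val="$1" env_val="$2" prompt="$3"
  if [ -n "$flag_val" ]; then
    echo "$flag_val"              # flag wins when provided
  elif [ -n "$env_val" ]; then
    echo "$env_val"               # then the environment variable
  else
    # last resort: prompt on the controlling terminal
    read -r -p "$prompt: " v < /dev/tty
    echo "$v"
  fi
}
```

For example, `resolve_secret "$token_flag" "${GH_TOKEN:-}" "GitHub token"` would return the flag value when given, fall back to `GH_TOKEN`, and prompt only when both are empty.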
### Actions

| Action | Short | Description |
|---|---|---|
| `req` | `r` | Check and install prerequisites (runs the Ansible `req` playbook) |
| `k8s` | `k` | Deploy the Kubernetes cluster (runs `req` first) |
| `sync` | `s` | Deploy/update LightningIQ applications via Helm (runs `req` first) |
| `full` | `f` | Full deployment: `req` → `k8s` → `sync` |
| `upgrade` | `u` | Upgrade the Kubernetes cluster version (requires `UPGRADE_VERSION`) |
### How it works

1. Sources `variables.sh` and `functions.sh` from deploy.lightningiq.io (or locally).
2. Runs `update_variables.sh` to export runtime variables (inventory paths, ECR registry, etc.).
3. Clones `OneDiscovery/k8s-deployer` from the `main` branch into `/tmp/k8s-deployer`.
4. Clones the client environment repository into `/tmp/<ENV_REPO_NAME>`.
5. Reads `ansible_become_method` from the inventory's `connection_settings.yml` to set `SUDO`.
6. Delegates to `/tmp/k8s-deployer/run.sh`.

### `sync` Action — Helm Deployment Detail

The `sync` (and `full`) action runs `start_apps_deployment`, which:
1. Authenticates Helm against ECR via `aws ecr get-login-password | helm registry login`.
2. Uses the environment values from `/tmp/<ENV_REPO_NAME>/deploy/<environment>/values.yml`.
3. If the `lightningiq-operator` Helm release does not exist, bootstraps the operator with `cr.enabled=false` first (registers the CRD), waits for the operator to be ready, then applies the full values with `cr.spec.keycloak.initialAdminPassword=oned-pass`.
4. Otherwise, runs `helm upgrade --install` with the environment values directly.

### `upgrade` Action

Requires `UPGRADE_VERSION` to be set (e.g. `1.28`). The script will interactively prompt if it is not set. Delegates to the child script's Ansible playbook with `-t upgrade -e common_k8s_ver=<version>`.
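The `sync` bootstrap-then-apply flow described above can be sketched roughly as follows. This is a simplified illustration, not the script's actual code; `sync_apps`, `ECR_REGISTRY`, and the deployment wait are assumptions.

```shell
# Hypothetical sketch of the sync flow: bootstrap the operator with the CR
# disabled when the Helm release is missing, then apply the full values.
sync_apps() {
  local chart="$1" values="$2"

  # Authenticate Helm against the ECR registry.
  aws ecr get-login-password \
    | helm registry login --username AWS --password-stdin "$ECR_REGISTRY"

  local extra=""
  if ! helm status lightningiq-operator >/dev/null 2>&1; then
    # Fresh install: register the CRD first, with the CR disabled.
    helm upgrade --install lightningiq-operator "$chart" \
      -f "$values" --set cr.enabled=false
    kubectl -n lightning-iq wait deployment/lightningiq-operator \
      --for=condition=Available --timeout=300s
    # Set the initial Keycloak admin password only on fresh installs.
    extra="--set cr.spec.keycloak.initialAdminPassword=oned-pass"
  fi

  # Apply the full environment values.
  helm upgrade --install lightningiq-operator "$chart" -f "$values" $extra
}
```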
### Logging

Logs are written to `/tmp/onediscovery-init-logs/init.log`. The log directory is recreated fresh on each run.
## `v1.6/run` — LightningIQ v1.6 Deployer

The `v1.6/run` script is the deployer for LightningIQ v1.6. It follows the same interface as the v1.7 deployer but has key behavioral differences (see Differences below).
Remote:
```bash
export GH_TOKEN=ghp_xxx
export SSH_PASSWORD=<ssh_password>
export VAULT_PASSWORD=<vault_password>
export ENV_REPO_NAME=<client-repo-name>

curl -s https://deploy.lightningiq.io/v1.6/run | bash -s -- \
  -r "v1.6.x" \
  -e <environment_name> \
  -d <action>
```
Local:
```bash
export USE_LOCAL_IMPORTS=true
./v1.6/run -r "v1.6.55" -e qa -er client-repo -d sync
```
### Arguments

Identical to the v1.7 `run` script; see Arguments above.
### Actions

| Action | Short | Description |
|---|---|---|
| `req` | `r` | Check and install prerequisites |
| `k8s` | `k` | Deploy the Kubernetes cluster |
| `test` | `t` | Run test suite |
| `sync` | `s` | Deploy/update LightningIQ applications |
| `full` | `f` | Full deployment: `req` → `k8s` → `sync` |
| `upgrade` | `u` | Upgrade Kubernetes cluster version |
### How it works

1. Sources `variables.sh` and `functions.sh` from deploy.lightningiq.io/v1.6/.
2. Runs `update_variables.sh` for runtime variables.
3. If `BRANCH` is not set, fetches release info from the GitHub API (latest or specified tag), downloads the release tarball, and extracts it to `/tmp/k8s-deployer`. If `BRANCH` is set, clones that branch directly.
4. Writes credential files (`.netrc`, SSH password file, Vault password file).
5. Reads `ansible_become_method` from `connection_settings.yml` (no file-existence guard — the file is expected to always exist in v1.6).

### Differences

| Behavior | `v1.6/run` | `run` (v1.7+) |
|---|---|---|
| Imports URL | `.../v1.6/imports/...` | `.../imports/...` (no version prefix) |
| Deployer fetch | Release tarball or branch clone | Always clones `main` branch |
| Release handling | `fetch_release_info` → `get_release_tarball` (or `get_deployer_branch`) | `get_deployer_main` only |
| `test` action | Supported (`t`\|`test`) | Not supported |
| `SUDO` extraction | Always reads `connection_settings.yml` (no guard) | Only reads if file exists; falls back to `${SUDO:-sudo}` |
| Default admin password | N/A (not set) | Set to `oned-pass` on fresh install |
| `OPERATOR_VERSION` | Derived from `RELEASE` in `update_variables.sh` | Derived from `RELEASE` after `get_deployer_main` |
## `migrate-1.6-to-1.7.sh` — Migration Script

`migrate-1.6-to-1.7.sh` automates the full migration of an existing LightningIQ v1.6 installation to v1.7. It is destructive: the old cluster is wiped and rebuilt. The migration is organized into 8 sequential phases with a persistent state file, so interrupted runs can be safely resumed.

> **Warning:** This script wipes the entire v1.6 cluster. The backups created in Phase 2 are the only recovery path once Phase 3 begins.
Remote (recommended):
```bash
curl -s https://deploy.lightningiq.io/migrate-1.6-to-1.7.sh | bash -s -- \
  --version 1.7.145 \
  --env-name qa \
  --env-repo client-repo \
  --token ghp_xxx
```
Local:
```bash
./migrate-1.6-to-1.7.sh \
  --version 1.7.145 \
  --env-name qa \
  --env-repo client-repo \
  --token $GH_TOKEN
```
With environment variables:
```bash
export GH_TOKEN=ghp_xxx
export ENV_NAME=qa
export ENV_REPO_NAME=client-repo

./migrate-1.6-to-1.7.sh --version 1.7.145
```
### Arguments

| Flag | Required | Default | Description |
|---|---|---|---|
| `--version VERSION` | Yes | — | Operator version in `X.Y.Z` format, e.g. `1.7.145` |
| `--env-name NAME` | Yes | `ENV_NAME` env var | Environment name, e.g. `qa`, `prod-usw2` |
| `--env-repo REPO` | Yes | `ENV_REPO_NAME` env var | Client environment repository name |
| `--token TOKEN` | Yes | `GH_TOKEN` env var | GitHub personal access token |
| `--branch BRANCH` | No | `main` | Branch to fetch env values from |
| `--backup-dir DIR` | No | `~/liq-backup` | Directory for backups and state file |
| `-h`, `--help` | No | — | Show usage |
### Preconditions

Before running, ensure:

- `kubectl`, `helm`, `aws`, and `base64` are installed.
- `kubectl` is configured with the correct cluster context. The script will display the current context and ask for confirmation before proceeding.

### Phase 1 — Capture Configuration

Surveys the running cluster and captures all secrets required for migration:
The captured values are written to `$BACKUP_DIR/migration-config.txt` (mode 600):
- `REMOTE_PG_PASSWORD` — external PostgreSQL password
- `STORAGE_TARGET_ACCESS_KEY_ID` / `STORAGE_TARGET_SECRET_ACCESS_KEY` — object storage credentials
- `OVERLORD_PG_PASSWORD` — overlord PostgreSQL password
- `REDIS_PASSWORD` — Redis password
- `REDIS_TLS` flag (detected from TLS secret presence)

> **Security note:** `migration-config.txt` contains secrets in plaintext. Delete it after the migration is complete.
### Phase 2 — Backups

Creates off-cluster backups of all stateful data:
- **PostgreSQL**: runs `pg_dump --clean --if-exists` on the overlord database inside the `postgresql-0` pod. Post-processes the dump to rewrite `data_centers` URLs by stripping the legacy `:32443` port. Records row counts for the `templates`, `credentials`, `destinations`, `search_patterns`, and `jobs` tables.
- **Redis**: triggers `BGSAVE`, waits up to 60 seconds for completion (warns but continues if the timeout is exceeded), then copies `/data/dump.rdb` off-cluster. Records `DBSIZE`.

Backup artifacts:
```text
$BACKUP_DIR/
  overlord_full.sql        # PostgreSQL dump
  overlord_row_counts.txt  # Row count snapshot
  redis_dump.rdb           # Redis RDB file
  redis_dbsize.txt         # Redis key count
  migration-config.txt     # Captured secrets (delete after migration!)
```
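The bounded `BGSAVE` wait described above can be sketched as a polling loop on Redis's `LASTSAVE` timestamp. This is a hypothetical helper under assumed names; the actual script may detect completion differently.

```shell
# Hypothetical sketch: start a background save, then poll LASTSAVE until it
# advances past its pre-BGSAVE value, for up to ~60 seconds. On timeout,
# warn but continue, as described above.
wait_for_bgsave() {
  local redis_cli="$1"    # e.g. "kubectl exec redis-0 -- redis-cli"
  local before now waited=0
  before=$($redis_cli LASTSAVE)
  $redis_cli BGSAVE >/dev/null
  while [ "$waited" -lt 60 ]; do
    now=$($redis_cli LASTSAVE)
    [ "$now" != "$before" ] && return 0   # save completed
    sleep 1
    waited=$((waited + 1))
  done
  echo "WARN: BGSAVE did not complete within 60s; continuing" >&2
  return 0
}
```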
### Phase 3 — Wipe the v1.6 Cluster

**DESTRUCTIVE.** Requires explicit `yes` confirmation. Deletes all v1.6 resources:
- The namespaces `overlord`, `kafka`, `kong`, `prometheus`, `logging`, `core`, `configs`, `kubernetes-dashboard`, and `local-path-provisioner`.
- v1.6 resources in the `default` namespace.
- PersistentVolumes (removing `pv-protection` finalizers for any stuck in `Terminating`).

### Phase 4 — Install v1.7 Infrastructure

Installs LightningIQ v1.7 with all application services explicitly disabled, so only PostgreSQL, Redis, and Kafka are provisioned first:
Runs `helm upgrade --install lightningiq-operator` with:

- `cr.enabled=true` and `cr.spec.storageBasePath=/mnt/nfs`
- All application services (`overlordApi`, `scanner`, `adminApi`, `metrics`, `metadataAnalyzer`, `unpack`, `fileProcessor`, `cleanup`, `fileRouter`, `datastore`, `ui`) set to `enabled=false`
- The environment values (`deploy/$ENV_NAME/values.yml`)

Then waits for the `PostgreSQLReady`, `RedisReady`, and `KafkaReady` conditions on the `liq` CR. Fails if not all three are `True`.

### Phase 5 — Restore Data

Restores the backups created in Phase 2 into the new v1.7 infrastructure:
- **PostgreSQL**: locates the new primary pod (label `cnpg.io/cluster=postgresql,role=primary`), reads the new `postgresql-superuser` secret, runs `psql` with the SQL dump piped in, and verifies that row counts match.
- **Redis**: checks connectivity with `PING`, copies `redis_dump.rdb` into the pod at `/data/dump.rdb`, then shuts down Redis with `SHUTDOWN NOSAVE` (the pod restarts and loads the RDB automatically). Waits for the pod to return to Ready. Verifies `DBSIZE > 0` and lists RediSearch indexes with `FT._LIST`.

### Phase 6 — Recreate Secrets

Presents a menu to selectively create Kubernetes secrets in the `lightning-iq` namespace from the values captured in Phase 1:
| Secret | Contains |
|---|---|
| `pgsql-ddc-lan` | `password` (external PostgreSQL) |
| `garage-ddc-lan` | `accessKeyId`, `secretAccessKey` (object storage) |
Input options: individual numbers (`1`), comma-separated (`1,2`), `all`, or `none`. Selecting `none` skips secret creation, with a warning to create the secrets manually before starting services.
Uses `--dry-run=client -o yaml | kubectl apply -f -` for idempotent creation.
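As a sketch, the pattern for one secret from the table above might look like this. `create_or_update_secret` is a hypothetical helper, not the script's actual function.

```shell
# Hypothetical helper mirroring the idempotent pattern: render the Secret
# manifest client-side with --dry-run=client, then pipe it to kubectl apply,
# so a re-run updates the secret instead of failing with "already exists".
create_or_update_secret() {
  local name="$1" key="$2" value="$3"
  kubectl -n lightning-iq create secret generic "$name" \
    --from-literal="$key=$value" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

For example, `create_or_update_secret pgsql-ddc-lan password "$REMOTE_PG_PASSWORD"` would create or refresh the external-PostgreSQL secret.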
### Phase 7 — Enable All Services

Requires explicit `yes` confirmation.

Runs `helm upgrade -i lightningiq-operator` without `--reuse-values`, intentionally picking up chart defaults (all services enabled), applying only:

- `cr.enabled=true`

Waits up to 15 minutes for the `liq` CR to reach `phase: Ready`. Issues a warning (not a failure) if the timeout is exceeded, since services may still be starting.
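The bounded wait for `phase: Ready` can be sketched as a polling loop. `wait_for_ready` is a hypothetical helper, and the jsonpath expression in the comment is an assumption about the CR's status layout.

```shell
# Hypothetical sketch: poll the liq CR phase for up to 15 minutes, warning
# (not failing) on timeout since services may still be starting.
# get_phase would be something like:
#   kubectl -n lightning-iq get liq liq -o jsonpath='{.status.phase}'
wait_for_ready() {
  local get_phase="$1" waited=0
  while [ "$waited" -lt 900 ]; do
    [ "$($get_phase)" = "Ready" ] && return 0
    sleep 10
    waited=$((waited + 10))
  done
  echo "WARN: liq CR not Ready after 15 minutes; services may still be starting" >&2
  return 0
}
```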
### Phase 8 — Final Verification

Final health check:

- Lists all pods in the `lightning-iq` namespace.
- Shows the `liq` CR status.
- Lists topics with `kafka-topics.sh` inside the Kafka broker pod.
- Reports `DBSIZE` and `FT._LIST` from the new Redis instance to confirm data is loaded.

### State File and Resumption

The migration tracks progress in `$BACKUP_DIR/.migration-state`. Each phase appends `PHASE_N_COMPLETE` to this file when it finishes. On re-run, completed phases are automatically skipped.
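The marker mechanism can be sketched as follows (hypothetical names; the script's real implementation may differ):

```shell
# Hypothetical sketch of the resume mechanism: a phase runs only when its
# marker is absent from the state file, and records the marker on success.
STATE_FILE="${BACKUP_DIR:-$HOME/liq-backup}/.migration-state"

run_phase() {
  local n="$1"; shift
  if grep -q "PHASE_${n}_COMPLETE" "$STATE_FILE" 2>/dev/null; then
    echo "Phase $n already complete, skipping."
    return 0
  fi
  "$@"                                        # execute the phase function
  echo "PHASE_${n}_COMPLETE" >> "$STATE_FILE" # recorded only on success
}
```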
To re-run a specific phase, remove its marker from the state file:
```bash
# Example: re-run Phase 5
sed -i '/PHASE_5_COMPLETE/d' ~/liq-backup/.migration-state
./migrate-1.6-to-1.7.sh --version 1.7.145 --env-name qa --env-repo client-repo
```
To start over completely:
```bash
rm ~/liq-backup/.migration-state
```
If the script is interrupted (Ctrl+C), it prints: `Interrupted.` Re-run the script to resume from the last completed phase.
### Logging

All output is tee’d to `migration_1.6-to-1.7.log` in the current working directory. To use a different filename, set `OUTPUT`:
```bash
export OUTPUT=/var/log/migration.log
```
### Post-Migration Checklist

- Compare row counts against `$BACKUP_DIR/overlord_row_counts.txt`.
- Compare the Redis key count against `$BACKUP_DIR/redis_dbsize.txt`.
- Delete `$BACKUP_DIR/migration-config.txt` — it contains plaintext secrets.
- Remove `$BACKUP_DIR` entirely once the migration is confirmed stable.

## Environment Variable Reference

| Variable | Used By | Description |
|---|---|---|
| `GH_TOKEN` / `GITHUB_TOKEN` | All scripts | GitHub personal access token |
| `ENV_NAME` | `run`, `v1.6/run`, migration | Environment name (alternative to `-e` flag) |
| `ENV_REPO_NAME` | `run`, `v1.6/run`, migration | Client environment repo name |
| `SSH_PASSWORD` | `run`, `v1.6/run` | SSH password for Ansible become |
| `VAULT_PASSWORD` | `run`, `v1.6/run` | Ansible vault password |
| `UPGRADE_VERSION` | `run`, `v1.6/run` | Target Kubernetes version for the `upgrade` action |
| `SUDO` | `run`, `v1.6/run` | Override become method (default: `sudo`) |
| `USE_LOCAL_IMPORTS` | All scripts | Set to `true` to source `imports/` from local files instead of deploy.lightningiq.io |
| `SHOW_PROGRESS` | `run`, `v1.6/run` | Set to `true` to show Ansible task progress (same as `-v`) |
| `BRANCH` | `v1.6/run`, migration | Branch to use for deployer or env values checkout |
| `OUTPUT` | Migration | Log file path (default: `migration_1.6-to-1.7.log`) |
| `ORG` | Migration | GitHub org (default: `OneDiscovery`) |
## Troubleshooting

### `GH_TOKEN` / credentials not found

If any required credential is missing, the deployer scripts will prompt interactively via `/dev/tty`. Alternatively, set the corresponding environment variable before running.
### `kubectl` context points at the wrong cluster

The migration script displays the current context and asks for confirmation. For the deployer scripts, ensure the correct context is active:
```bash
kubectl config use-context <cluster-name>
kubectl config current-context
```
### Namespaces stuck in `Terminating` (Phase 3)

Stuck namespaces usually have finalizers blocking deletion. To force-remove them:
```bash
kubectl get namespace <ns> -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/<ns>/finalize" -f -
```
Then remove PHASE_3_COMPLETE from the state file and re-run.
### `liq` CR not reaching Ready

Check the operator and CR status:
```bash
kubectl -n lightning-iq get liq liq -o yaml
kubectl -n lightning-iq get pods
kubectl -n lightning-iq logs deployment/lightning-iq-operator
```
### Redis data missing after restore

The RDB may not have loaded if Redis restarted before the file was fully copied, or if the copied file was corrupt. Check:
```bash
kubectl -n lightning-iq exec <redis-pod> -c redis -- redis-cli -a <pass> DEBUG RELOAD
```
If necessary, repeat Phase 5 by removing PHASE_5_COMPLETE from the state file.
### ECR authentication failures

Ensure your AWS credentials are valid and have `ecr:GetAuthorizationToken` and `ecr:BatchGetImage` permissions for the `060147281721.dkr.ecr.us-east-1.amazonaws.com` registry.
### `USE_LOCAL_IMPORTS=true`

This bypasses the remote fetch of `imports/` files and uses the local copies in `./imports/` (or `./v1.6/imports/` for the v1.6 script). Useful for testing changes to import files without publishing them:
```bash
export USE_LOCAL_IMPORTS=true
./run -r "v1.7.145" -e qa -er client-repo -d sync
```