ONE Discovery, Inc.

LightningIQ Deployment Scripts

This repository hosts the deployment and migration scripts for LightningIQ, served from deploy.lightningiq.io. There are three main scripts:

| Script | Purpose |
|---|---|
| run | Deploy or upgrade LightningIQ v1.7+ on a Kubernetes cluster |
| v1.6/run | Deploy or upgrade LightningIQ v1.6 on a Kubernetes cluster |
| migrate-1.6-to-1.7.sh | Migrate an existing 1.6 install to 1.7 |

Table of Contents

- Prerequisites
- run — LightningIQ v1.7+ Deployer
- v1.6/run — LightningIQ v1.6 Deployer
- Differences Between v1.6 and v1.7 Deployers
- migrate-1.6-to-1.7.sh — Migration Script
- Environment Variables Reference
- Troubleshooting

Prerequisites

All scripts require the following tools installed and available in $PATH:

| Tool | Purpose |
|---|---|
| kubectl | Kubernetes cluster management |
| helm | Kubernetes package manager (used for operator deployment) |
| aws (AWS CLI) | ECR registry authentication |
| curl | Fetching remote imports and GitHub files |
| git | Cloning the deployer repository |
| base64 | Decoding Kubernetes secrets (migration script only) |

For Kubernetes-level deployments (k/k8s, f/full), Ansible is used under the hood via the k8s-deployer child script. The req action installs all required tools (kubectl, helm, Ansible, and others) via ASDF.


run — LightningIQ v1.7+ Deployer

Overview

The run script is the primary entry point for deploying and managing LightningIQ v1.7+. It can be run directly from a local clone, or piped from deploy.lightningiq.io via curl.

Usage

Remote (recommended):

export GH_TOKEN=ghp_xxx
export SSH_PASSWORD=<ssh_password>
export VAULT_PASSWORD=<vault_password>
export ENV_REPO_NAME=<client-repo-name>

curl -s https://deploy.lightningiq.io/run | bash -s -- \
  -r "v1.7.145" \
  -e <environment_name> \
  -d <action>

With all arguments inline:

curl -s https://deploy.lightningiq.io/run | bash -s -- \
  -r "v1.7.145" \
  -e prod-usw2 \
  -er client-repo \
  -d k8s \
  -t $GH_TOKEN \
  -s "<ssh_password>" \
  -vp "<vault_password>"

Local (for development/debugging):

export USE_LOCAL_IMPORTS=true
./run -r "v1.7.145" -e qa -er client-repo -d sync

Arguments

| Flag | Long form | Required | Description |
|---|---|---|---|
| -r | --release | Yes | Release version tag, e.g. v1.7.145 |
| -e | --env | Yes | Environment name, e.g. prod-usw2 (or ENV_NAME env var) |
| -er | --env-repo | Yes | Client environment repository name (or ENV_REPO_NAME env var) |
| -d | --deploy | Yes | Action to perform (see Actions) |
| -t | --token | No* | GitHub token (or GH_TOKEN / GITHUB_TOKEN env var) |
| -s | --ssh-password | No* | SSH password for Ansible (or SSH_PASSWORD env var) |
| -vp | --vault-password | No* | Ansible vault password (or VAULT_PASSWORD env var) |
| -v | --verbose | No | Enable verbose mode; show Ansible task progress |
* If not provided as a flag or environment variable, the script will interactively prompt for the value.
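The prompt-or-env fallback can be sketched as follows; `require_value` is an illustrative name, not necessarily the script's actual helper:

```shell
# Hypothetical sketch of the "flag or env var, else prompt" behavior.
# require_value is an illustrative helper name.
require_value() {
  local name=$1
  local val=${!name:-}          # indirect expansion: read the env var by name
  if [ -z "$val" ]; then
    # Prompt on the controlling terminal so this works even when stdin is a
    # pipe, as in `curl ... | bash`.
    read -rsp "Enter ${name}: " val < /dev/tty
    echo >&2
  fi
  printf '%s' "$val"
}
```

Usage: `GH_TOKEN=$(require_value GH_TOKEN)` returns the existing value when the variable is already exported, and only prompts otherwise.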

Actions

| Action | Short | Description |
|---|---|---|
| req | r | Check and install prerequisites (runs Ansible req playbook) |
| k8s | k | Deploy the Kubernetes cluster (runs req first) |
| sync | s | Deploy/update LightningIQ applications via Helm (runs req first) |
| full | f | Full deployment: req → k8s → sync |
| upgrade | u | Upgrade the Kubernetes cluster version (requires UPGRADE_VERSION) |

How It Works

  1. Imports variables.sh and functions.sh from deploy.lightningiq.io (or locally).
  2. Parses and validates arguments. Prompts interactively for any missing credentials.
  3. Sources update_variables.sh to export runtime variables (inventory paths, ECR registry, etc.).
  4. Clones OneDiscovery/k8s-deployer from the main branch into /tmp/k8s-deployer.
  5. Clones the client environment repository into /tmp/<ENV_REPO_NAME>.
  6. Detects ansible_become_method from the inventory’s connection_settings.yml to set SUDO.
  7. Executes the selected action by delegating to the child script /tmp/k8s-deployer/run.sh.
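Step 6 above can be sketched as follows. This assumes connection_settings.yml carries a plain `ansible_become_method: <value>` line; the fixture path below is illustrative, and the real script's parsing may differ:

```shell
# Sketch of step 6 (SUDO detection). The path and fixture file are stand-ins
# for the inventory's connection_settings.yml.
settings=/tmp/demo-inventory/connection_settings.yml
mkdir -p "$(dirname "$settings")"
printf 'ansible_become_method: sudo\n' > "$settings"   # demo fixture

if [ -f "$settings" ]; then
  # Take the value after the colon, trimming surrounding whitespace.
  SUDO=$(awk -F': *' '/^ansible_become_method:/ {print $2}' "$settings")
fi
SUDO=${SUDO:-sudo}   # v1.7 behavior: fall back to sudo if the file is absent
echo "become method: $SUDO"
```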

The sync Action — Helm Deployment Detail

The sync (and full) action runs start_apps_deployment, which:

  1. Prompts for AWS credentials if not already configured.
  2. Authenticates with ECR: aws ecr get-login-password | helm registry login.
  3. Reads values from /tmp/<ENV_REPO_NAME>/deploy/<environment>/values.yml.
  4. Fresh install path: If lightningiq-operator Helm release does not exist, bootstraps the operator with cr.enabled=false first (registers the CRD), waits for the operator to be ready, then applies the full values with cr.spec.keycloak.initialAdminPassword=oned-pass.
  5. Upgrade path: If the release already exists, runs helm upgrade --install with the environment values directly.
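The branching between the fresh-install and upgrade paths can be sketched as below. This is a hedged sketch, not the script's actual code: `RUN=echo` turns it into a dry run, and the `oci://REGISTRY/...` chart reference and values path are illustrative assumptions.

```shell
# Dry-run sketch of the sync branching. RUN=echo prints commands instead of
# executing them; REGISTRY and the values path are placeholders.
RUN=${RUN:-echo}
NS=lightning-iq
VALUES=/tmp/client-repo/deploy/qa/values.yml   # example path

if ! helm status -n "$NS" lightningiq-operator >/dev/null 2>&1; then
  # Fresh install: bootstrap with the CR disabled so the CRD registers first,
  # then apply the full values with the initial admin password.
  $RUN helm install lightningiq-operator oci://REGISTRY/lightningiq-operator \
    -n "$NS" --set cr.enabled=false
  $RUN helm upgrade lightningiq-operator oci://REGISTRY/lightningiq-operator \
    -n "$NS" -f "$VALUES" \
    --set cr.spec.keycloak.initialAdminPassword=oned-pass
else
  # Upgrade path: apply the environment values directly.
  $RUN helm upgrade --install lightningiq-operator \
    oci://REGISTRY/lightningiq-operator -n "$NS" -f "$VALUES"
fi
```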

The upgrade Action

Requires UPGRADE_VERSION to be set (e.g. 1.28). The script will interactively prompt if it is not set. Delegates to the child script’s Ansible playbook with -t upgrade -e common_k8s_ver=<version>.
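As a rough sketch of that delegation (the playbook name `site.yml` is an illustrative assumption; only the `-t upgrade -e common_k8s_ver=...` flags come from the text above):

```shell
# Dry-run sketch of the upgrade delegation. RUN=echo prints the command.
RUN=${RUN:-echo}
UPGRADE_VERSION=${UPGRADE_VERSION:-1.28}
$RUN ansible-playbook site.yml -t upgrade -e "common_k8s_ver=$UPGRADE_VERSION"
```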

Logs

Logs are written to /tmp/onediscovery-init-logs/init.log. The log directory is recreated fresh on each run.
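A minimal sketch of that log setup (the exact bootstrap code may differ):

```shell
# Recreate the log directory fresh on each run, as described above.
LOG_DIR=/tmp/onediscovery-init-logs
rm -rf "$LOG_DIR" && mkdir -p "$LOG_DIR"
echo "run started $(date -u +%FT%TZ)" >> "$LOG_DIR/init.log"
```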


v1.6/run — LightningIQ v1.6 Deployer

Overview

The v1.6/run script is the deployer for LightningIQ v1.6. It follows the same interface as the v1.7 deployer but has key behavioral differences (see Differences).

Usage

Remote:

export GH_TOKEN=ghp_xxx
export SSH_PASSWORD=<ssh_password>
export VAULT_PASSWORD=<vault_password>
export ENV_REPO_NAME=<client-repo-name>

curl -s https://deploy.lightningiq.io/v1.6/run | bash -s -- \
  -r "v1.6.x" \
  -e <environment_name> \
  -d <action>

Local:

export USE_LOCAL_IMPORTS=true
./v1.6/run -r "v1.6.55" -e qa -er client-repo -d sync

Arguments

Identical to the v1.7 run script. See Arguments above.

Actions

| Action | Short | Description |
|---|---|---|
| req | r | Check and install prerequisites |
| k8s | k | Deploy the Kubernetes cluster |
| test | t | Run test suite |
| sync | s | Deploy/update LightningIQ applications |
| full | f | Full deployment: req → k8s → sync |
| upgrade | u | Upgrade Kubernetes cluster version |

How It Works

  1. Imports variables.sh and functions.sh from deploy.lightningiq.io/v1.6/.
  2. Parses and validates arguments.
  3. Sources update_variables.sh for runtime variables.
  4. Release resolution: If BRANCH is not set, fetches release info from GitHub API (latest or specified tag), downloads the release tarball, and extracts it to /tmp/k8s-deployer. If BRANCH is set, clones that branch directly.
  5. Sets up authentications (.netrc, SSH password file, Vault password file).
  6. Clones the client environment repository.
  7. Extracts ansible_become_method from connection_settings.yml (no file-existence guard — the file is expected to always exist in v1.6).
  8. Delegates to the child script.
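Step 4's release resolution can be sketched with the network call stubbed out; `resolve_release` is an illustrative helper name, and the URLs follow GitHub's public releases API:

```shell
# Sketch of v1.6 release resolution. Prints what would be cloned or fetched
# instead of performing the network call.
resolve_release() {   # resolve_release [tag]
  local org=${ORG:-OneDiscovery} repo=k8s-deployer
  if [ -n "${BRANCH:-}" ]; then
    echo "git clone -b $BRANCH https://github.com/$org/$repo"
  elif [ -n "${1:-}" ]; then
    echo "https://api.github.com/repos/$org/$repo/releases/tags/$1"
  else
    echo "https://api.github.com/repos/$org/$repo/releases/latest"
  fi
}

BRANCH= resolve_release v1.6.55   # prints the tagged-release API URL
```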

Differences Between v1.6 and v1.7 Deployers

| Behavior | v1.6/run | run (v1.7+) |
|---|---|---|
| Imports URL | .../v1.6/imports/... | .../imports/... (no version prefix) |
| Deployer fetch | Release tarball or branch clone | Always clones main branch |
| Release handling | fetch_release_info → get_release_tarball (or get_deployer_branch) | get_deployer_main only |
| test action | Supported (t or test) | Not supported |
| SUDO extraction | Always reads connection_settings.yml (no guard) | Only reads if file exists; falls back to ${SUDO:-sudo} |
| Default admin password | N/A (not set) | Set to oned-pass on fresh install |
| OPERATOR_VERSION | Derived from RELEASE in update_variables.sh | Derived from RELEASE after get_deployer_main |

migrate-1.6-to-1.7.sh — Migration Script

Overview

migrate-1.6-to-1.7.sh automates the full migration of an existing LightningIQ v1.6 installation to v1.7. It is destructive: the old cluster is wiped and rebuilt. The migration is organized into 8 sequential phases with a persistent state file so interrupted runs can be safely resumed.

Warning: This script wipes the entire v1.6 cluster. The backups created in Phase 2 are the only recovery path once Phase 3 begins.

Usage

Remote (recommended):

curl -s https://deploy.lightningiq.io/migrate-1.6-to-1.7.sh | bash -s -- \
  --version 1.7.145 \
  --env-name qa \
  --env-repo client-repo \
  --token ghp_xxx

Local:

./migrate-1.6-to-1.7.sh \
  --version 1.7.145 \
  --env-name qa \
  --env-repo client-repo \
  --token $GH_TOKEN

With environment variables:

export GH_TOKEN=ghp_xxx
export ENV_NAME=qa
export ENV_REPO_NAME=client-repo

./migrate-1.6-to-1.7.sh --version 1.7.145

Arguments

| Flag | Required | Default | Description |
|---|---|---|---|
| --version VERSION | Yes | — | Operator version in X.Y.Z format, e.g. 1.7.145 |
| --env-name NAME | Yes | ENV_NAME env var | Environment name, e.g. qa, prod-usw2 |
| --env-repo REPO | Yes | ENV_REPO_NAME env var | Client environment repository name |
| --token TOKEN | Yes | GH_TOKEN env var | GitHub personal access token |
| --branch BRANCH | No | main | Branch to fetch env values from |
| --backup-dir DIR | No | ~/liq-backup | Directory for backups and state file |
| -h, --help | No | — | Show usage |

Prerequisites

Before running, ensure:

Migration Phases

Phase 1 — Discover 1.6 Landscape

Surveys the running cluster and captures all secrets required for migration:

Security note: migration-config.txt contains secrets in plaintext. Delete it after the migration is complete.

Phase 2 — Back Up PostgreSQL and Redis

Creates off-cluster backups of all stateful data:

Backup artifacts:

$BACKUP_DIR/
  overlord_full.sql        # PostgreSQL dump
  overlord_row_counts.txt  # Row count snapshot
  redis_dump.rdb           # Redis RDB file
  redis_dbsize.txt         # Redis key count
  migration-config.txt     # Captured secrets (delete after migration!)

Phase 3 — Wipe the Cluster

DESTRUCTIVE. Requires explicit yes confirmation.

Deletes all v1.6 resources:

Phase 4 — Install Fresh 1.7 (Infrastructure Only)

Installs LightningIQ v1.7 with all application services explicitly disabled so only PostgreSQL, Redis, and Kafka are provisioned first:

  1. Authenticates with ECR.
  2. Runs helm upgrade --install lightningiq-operator with:
    • cr.enabled=true, cr.spec.storageBasePath=/mnt/nfs
    • All services (overlordApi, scanner, adminApi, metrics, metadataAnalyzer, unpack, fileProcessor, cleanup, fileRouter, datastore, ui) set to enabled=false
    • Environment values from the client repo (deploy/$ENV_NAME/values.yml)
  3. Waits up to 10 minutes for PostgreSQLReady, RedisReady, and KafkaReady conditions on the liq CR. Fails if not all three are True.
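The wait in step 3 could be expressed with `kubectl wait`, though whether the script uses `kubectl wait` or polls the CR status directly is an assumption; `RUN=echo` keeps this a dry run:

```shell
# Dry-run sketch of the Phase 4 readiness wait on the liq custom resource.
# RUN=echo prints the commands instead of running them against a cluster.
RUN=${RUN:-echo}
for cond in PostgreSQLReady RedisReady KafkaReady; do
  $RUN kubectl -n lightning-iq wait liq/liq \
    --for=condition="$cond" --timeout=600s
done
```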

Phase 5 — Restore PostgreSQL and Redis

Restores the backups created in Phase 2 into the new v1.7 infrastructure:

Phase 6 — Create External Secrets

Presents a menu to selectively create Kubernetes secrets in the lightning-iq namespace from the values captured in Phase 1:

| Secret | Contains |
|---|---|
| pgsql-ddc-lan | password (external PostgreSQL) |
| garage-ddc-lan | accessKeyId, secretAccessKey (object storage) |
Input options: individual numbers (1), comma-separated (1,2), all, or none. Selecting none skips secret creation with a warning to create them manually before starting services.

Uses --dry-run=client -o yaml | kubectl apply -f - for idempotent creation.
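The idempotent pattern can be sketched as below. `create_secret` is an illustrative helper and `RUN=echo` prints the command instead of executing it; the real script pipes the generated YAML to `kubectl apply -f -`, so re-runs update rather than fail on "already exists":

```shell
# Sketch of Phase 6's idempotent secret creation. RUN=echo makes this a dry
# run; the secret values below are placeholders.
RUN=${RUN:-echo}
create_secret() {   # create_secret <name> <key=value>...
  local name=$1; shift
  local args=()
  for kv in "$@"; do args+=(--from-literal="$kv"); done
  $RUN kubectl -n lightning-iq create secret generic "$name" "${args[@]}" \
    --dry-run=client -o yaml
  # The real script pipes this YAML into `kubectl apply -f -`.
}

create_secret pgsql-ddc-lan password=PLACEHOLDER
```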

Phase 7 — Enable All Services

Requires explicit yes confirmation.

Runs helm upgrade -i lightningiq-operator without --reuse-values, intentionally picking up chart defaults (all services enabled), applying only:

Waits up to 15 minutes for the liq CR to reach phase: Ready. Issues a warning (not a failure) if the timeout is exceeded, since services may still be starting.

Phase 8 — Verify

Final health check:

Idempotency and Resuming

The migration tracks progress in $BACKUP_DIR/.migration-state. Each phase appends PHASE_N_COMPLETE to this file when it finishes. On re-run, completed phases are automatically skipped.
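A minimal sketch of this mechanism, with illustrative helper names:

```shell
# Resume mechanism: one marker line per completed phase in the state file.
# The marker format matches the PHASE_N_COMPLETE entries described above;
# the helper names are illustrative.
BACKUP_DIR=${BACKUP_DIR:-$HOME/liq-backup}
STATE_FILE="$BACKUP_DIR/.migration-state"
mkdir -p "$BACKUP_DIR"

phase_done() { grep -qx "PHASE_${1}_COMPLETE" "$STATE_FILE" 2>/dev/null; }
mark_done()  { echo "PHASE_${1}_COMPLETE" >> "$STATE_FILE"; }

run_phase() {        # run_phase <n> <command...>
  local n=$1; shift
  if phase_done "$n"; then
    echo "Phase $n already complete, skipping"
  else
    "$@" && mark_done "$n"
  fi
}
```

On re-run, `run_phase` finds the marker and skips the phase; deleting a marker (as shown below with sed) forces that phase to run again.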

To re-run a specific phase, remove its marker from the state file:

# Example: re-run Phase 5
sed -i '/PHASE_5_COMPLETE/d' ~/liq-backup/.migration-state
./migrate-1.6-to-1.7.sh --version 1.7.145 --env-name qa --env-repo client-repo

To start over completely:

rm ~/liq-backup/.migration-state

If the script is interrupted (Ctrl+C), it prints: Interrupted. Re-run the script to resume from the last completed phase.

Output Log

All output is tee’d to migration_1.6-to-1.7.log in the current working directory. To use a different filename, set OUTPUT:

export OUTPUT=/var/log/migration.log

Post-Migration Checklist


Environment Variables Reference

| Variable | Used By | Description |
|---|---|---|
| GH_TOKEN / GITHUB_TOKEN | All scripts | GitHub personal access token |
| ENV_NAME | run, v1.6/run, migration | Environment name (alternative to -e flag) |
| ENV_REPO_NAME | run, v1.6/run, migration | Client environment repo name |
| SSH_PASSWORD | run, v1.6/run | SSH password for Ansible become |
| VAULT_PASSWORD | run, v1.6/run | Ansible vault password |
| UPGRADE_VERSION | run, v1.6/run | Target Kubernetes version for upgrade action |
| SUDO | run, v1.6/run | Override become method (default: sudo) |
| USE_LOCAL_IMPORTS | All scripts | Set to true to source imports/ from local files instead of deploy.lightningiq.io |
| SHOW_PROGRESS | run, v1.6/run | Set to true to show Ansible task progress (same as -v) |
| BRANCH | v1.6/run, migration | Branch to use for deployer or env values checkout |
| OUTPUT | Migration | Log file path (default: migration_1.6-to-1.7.log) |
| ORG | Migration | GitHub org (default: OneDiscovery) |

Troubleshooting

GH_TOKEN / credentials not found

If any required credential is missing, the deployer scripts will prompt interactively via /dev/tty. Alternatively, set the corresponding environment variable before running.

kubectl context wrong cluster

The migration script displays the current context and asks for confirmation. For the deployer scripts, ensure the correct context is active:

kubectl config use-context <cluster-name>
kubectl config current-context

Namespace stuck in Terminating (Phase 3)

Stuck namespaces usually have finalizers blocking deletion. To force-remove them:

kubectl get namespace <ns> -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/<ns>/finalize" -f -

Then remove PHASE_3_COMPLETE from the state file and re-run.

Infrastructure not ready after Phase 4

Check the operator and CR status:

kubectl -n lightning-iq get liq liq -o yaml
kubectl -n lightning-iq get pods
kubectl -n lightning-iq logs deployment/lightning-iq-operator

Redis DBSIZE is 0 after Phase 5

The RDB may not have loaded if Redis restarted before the file was fully copied, or if the copied file was corrupt. Check:

kubectl -n lightning-iq exec <redis-pod> -c redis -- redis-cli -a <pass> DEBUG RELOAD

If necessary, repeat Phase 5 by removing PHASE_5_COMPLETE from the state file.

AWS ECR authentication failure

Ensure your AWS credentials are valid and have ecr:GetAuthorizationToken and ecr:BatchGetImage permissions for the 060147281721.dkr.ecr.us-east-1.amazonaws.com registry.

Local development with USE_LOCAL_IMPORTS=true

This bypasses the remote fetch of imports/ files and uses the local copies in ./imports/ (or ./v1.6/imports/ for the v1.6 script). Useful for testing changes to import files without publishing them:

export USE_LOCAL_IMPORTS=true
./run -r "v1.7.145" -e qa -er client-repo -d sync