Command cheat sheet for day-to-day operations against an Azure LangSmith deployment provisioned with the Azure Terraform modules. All make targets run from modules/azure/. Run make help for an inline summary.
For the full deployment walkthrough, see the Azure deployment guide.
Deployment overview
| Stage | What gets deployed | Command |
|---|---|---|
| Infrastructure | AKS + Postgres + Redis + Blob + Key Vault + cert-manager + KEDA + ingress | make apply |
| Cluster credentials | Kubeconfig + Kubernetes Secrets from Key Vault | make kubeconfig && make k8s-secrets |
| LangSmith (Helm path) | LangSmith Helm (~17 pods) via shell scripts | make init-values && make deploy |
| LangSmith (Terraform path) | Secrets + SA + Helm release managed in Terraform state | make init-app && make apply-app |
| LangSmith Deployment add-on | host-backend, listener, operator. Bump default_node_pool_min_count to 5 first | make apply && make init-values && make deploy |
| Agent Builder add-on | tool-server, trigger-server, agent-builder LGP | make init-values && make deploy |
| Insights + Polly add-on | Clio analytics, Polly eval agent | make init-values && make deploy |
First-time setup
cd terraform/modules/azure
# 1. Generate terraform.tfvars (interactive wizard)
make quickstart
# 2. Bootstrap secrets (prompts on first run, reads from Key Vault on repeat)
make setup-env
# 3. Preflight (Azure CLI, RBAC, providers, quotas)
make preflight
# 4. Deploy infrastructure (~15 to 20 min)
# Skip `make plan` on a fresh deploy — kubernetes_manifest needs a live cluster
make init
make apply
# 5. Cluster credentials + Kubernetes Secrets
make kubeconfig
make k8s-secrets
# 6. Generate Helm values from Terraform outputs
make init-values
# 7. Deploy LangSmith (~10 min)
make deploy
# 8. Health check
make status
Or run everything after make apply in one shot:
make deploy-all # kubeconfig → k8s-secrets → init-values → deploy
make deploy-all-tf # apply → init-values → init-app → apply-app (Terraform path)
Day-2 operations
make status # 10-section health check
make status-quick # skip Key Vault + K8s secret queries (faster)
make deploy # re-deploy after any Helm value changes
make init-values # re-generate values after Terraform changes
make kubeconfig # refresh cluster credentials
make k8s-secrets # re-create langsmith-config-secret from Key Vault
# Manage Key Vault secrets interactively
make keyvault # interactive menu
make keyvault list # all secrets with timestamps
make keyvault get <secret> # read a secret
make keyvault set <key> <val> # update a secret
make keyvault validate # check all required secrets exist
make keyvault diff # compare KV vs K8s secret
make keyvault delete <key> # soft-delete (recoverable 90 days)
Add-ons
Add-on passes (3 to 5) are controlled by flags in infra/terraform.tfvars. Set the flags, then re-run make init-values && make deploy. init-values.sh copies the matching example file into helm/values/ automatically.
# infra/terraform.tfvars
sizing_profile = "production" # minimum | dev | production | production-large
enable_deployments = true # LangSmith Deployment add-on (listener + operator + host-backend)
enable_agent_builder = true # Agent Builder add-on (requires enable_deployments)
enable_insights = true # Insights / Clio analytics add-on
enable_polly = true # Polly AI eval add-on (requires enable_deployments)
Set default_node_pool_min_count = 5 before enabling the LangSmith Deployment add-on. Operator-spawned pods need node headroom; without it, agent pods stay in Pending indefinitely.
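The flag dependencies above can be checked before running make init-values. The following is a minimal sketch, not part of the Terraform modules: tfvar and check_addon_flags are hypothetical helper names, the parser assumes simple one-line `key = value` assignments with spaces around `=`, and a missing default_node_pool_min_count is treated as below 5.

```shell
# Hypothetical helper: print the value of a one-line `key = value` assignment.
# Assumes spaces around `=`; strips double quotes; ignores trailing comments.
tfvar() {
  # $1 = key, $2 = tfvars file
  awk -v key="$1" '
    $1 == key && $2 == "=" {
      val = $3
      gsub(/"/, "", val)
      print val
      exit
    }
  ' "$2"
}

# Hypothetical check: report add-on flags whose prerequisites are not met.
# Returns 0 when the flags are consistent, 1 otherwise.
check_addon_flags() {
  file="$1"
  rc=0
  deployments="$(tfvar enable_deployments "$file")"
  min_count="$(tfvar default_node_pool_min_count "$file")"

  # Agent Builder and Polly both require the LangSmith Deployment add-on.
  for flag in enable_agent_builder enable_polly; do
    if [ "$(tfvar "$flag" "$file")" = "true" ] && [ "$deployments" != "true" ]; then
      echo "ERROR: $flag requires enable_deployments = true"
      rc=1
    fi
  done

  # Assumption: an unset min count is treated as too low.
  if [ "$deployments" = "true" ] && [ "${min_count:-0}" -lt 5 ]; then
    echo "ERROR: enable_deployments needs default_node_pool_min_count >= 5"
    rc=1
  fi
  return "$rc"
}
```

One possible use is to gate the redeploy on the check: check_addon_flags infra/terraform.tfvars && make init-values && make deploy.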
Sizing profiles
Set sizing_profile in terraform.tfvars, then re-run make init-values && make deploy.
| Profile | When to use |
|---|---|
| minimum | Cost parking, CI smoke tests, single-user demos. Expect OOM under real traffic. |
| dev | Light non-production for local dev, CI pipelines, integration tests, short-lived POCs. |
| production | Recommended for production. Multi-replica with HPA on all stateless components. |
| production-large | High-volume starting point based on the scale guide (~50 concurrent users, ~1000 traces/sec). |
kubectl
# Pod health
kubectl get pods -n langsmith
kubectl get pods -n langsmith -w
kubectl describe pod <pod-name> -n langsmith
kubectl logs <pod-name> -n langsmith --tail=100 -f
kubectl logs <pod-name> -n langsmith --previous --tail=50
# Backend logs (live)
kubectl logs -n langsmith deploy/langsmith-backend --tail=100 -f
# Ingress
kubectl get ingress -n langsmith
kubectl describe ingress -n langsmith
# NGINX LoadBalancer external IP
kubectl get svc ingress-nginx-controller -n ingress-nginx
# TLS
kubectl get certificate -n langsmith
kubectl get challenges -n langsmith
kubectl describe certificate <cert-name> -n langsmith
kubectl get clusterissuer
# Workload Identity
kubectl get serviceaccount langsmith-ksa -n langsmith -o yaml | grep annotation -A5
# Helm
helm status langsmith -n langsmith
helm history langsmith -n langsmith
helm get values langsmith -n langsmith
# LangSmith Deployment
kubectl get lgp -n langsmith
kubectl get crd | grep langchain
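When agent pods stay in Pending (the symptom of too little node headroom), field selectors narrow the output to the relevant pods and scheduler events. This is a small sketch: pending_pods and failed_scheduling_events are hypothetical wrapper names, and the langsmith default namespace is taken from the commands above.

```shell
# Print the names of pods stuck in Pending (namespace defaults to langsmith).
pending_pods() {
  kubectl get pods -n "${1:-langsmith}" \
    --field-selector=status.phase=Pending \
    -o name
}

# Show scheduler events explaining why pods could not be placed on a node.
failed_scheduling_events() {
  kubectl get events -n "${1:-langsmith}" \
    --field-selector reason=FailedScheduling
}
```

If pending_pods prints anything, failed_scheduling_events usually shows whether the cause is insufficient CPU or memory on the node pool.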
Azure CLI
# Re-auth
az login
az account set --subscription <subscription-id>
az account show
# AKS
az aks list
az aks show --name <cluster> --resource-group <rg>
az aks get-credentials --name <cluster> --resource-group <rg>
# PostgreSQL
az postgres flexible-server list
az postgres flexible-server show --name <server> --resource-group <rg>
# Redis
az redis list
az redis show --name <cache> --resource-group <rg>
# Blob Storage
az storage account list
az storage container list --account-name <account>
# Key Vault
az keyvault list
az keyvault secret list --vault-name <vault>
az keyvault secret show --vault-name <vault> --name <secret> --query value -o tsv
# Application Gateway (AGIC)
az network application-gateway list
Terraform
cd modules/azure/infra
terraform init
terraform plan # skip on first run, see deploy notes
terraform apply
terraform apply -target=module.aks
terraform output
terraform output -raw aks_cluster_name
terraform output -raw keyvault_name
terraform output -raw storage_account_name
terraform state list
Key watchouts
- Skip make plan on a fresh deploy. kubernetes_manifest resources need a live cluster API. Use make apply directly.
- Uninstall Helm before terraform destroy. The Azure Load Balancer holds a subnet reference; leaving it blocks VNet deletion. Run make uninstall first.
- config.deployment.url must include https://. Without it, operator-spawned agents stay stuck in DEPLOYING.
- config.deployment.enabled: true is required for the LangSmith Deployment add-on. Setting only the URL without enabled: true silently skips listener and operator.
- Encryption keys must never change after first enable. Rotating insights_encryption_key or polly_encryption_key permanently breaks existing encrypted data.
- Roll the frontend after first Polly enable. agentBootstrap creates langsmith-polly-config after registering; frontend pods started earlier do not pick it up.
- letsencrypt (HTTP-01) only works with nginx, istio (self-managed), and envoy-gateway. For istio-addon or agic, use dns01 with a custom domain, or none for HTTP-only.
- Key Vault enters 90-day soft-delete after destroy. With keyvault_purge_protection = false, run az keyvault purge to reclaim the name immediately.
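The config.deployment.url requirement is easy to verify before deploying. A minimal sketch: has_https_scheme is a hypothetical helper name, and it checks only the scheme prefix, not whether the host resolves or serves TLS.

```shell
# Return 0 if the value starts with https:// and has something after it,
# 1 otherwise. Only the prefix is checked.
has_https_scheme() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, has_https_scheme "langsmith.example.com" fails because the scheme is missing, while has_https_scheme "https://langsmith.example.com" succeeds (the hostname is a placeholder).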
Teardown
make uninstall # removes Helm releases + LGP CRD + namespaces
make destroy # destroys all Azure infrastructure via terraform destroy
make clean # removes generated secrets, helm values, tfstate lock
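Because the Helm release must be removed before the infrastructure is destroyed, the first two targets can be chained so that destroy never runs after a failed uninstall. A sketch only: teardown is a hypothetical wrapper name, and make clean is left as a separate manual step.

```shell
# Run the teardown in the required order from modules/azure/.
# `&&` stops the chain if `make uninstall` fails, so `make destroy` does not
# run while the Helm-managed load balancer may still exist.
teardown() {
  make uninstall && make destroy
}
```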