Command cheat sheet for day-to-day operations against an Azure LangSmith deployment provisioned with the Azure Terraform modules. All make targets run from modules/azure/. Run make help for an inline summary. For the full deployment walkthrough, see the Azure deployment guide.

Deployment overview

| Stage | What gets deployed | Command |
| --- | --- | --- |
| Infrastructure | AKS + Postgres + Redis + Blob + Key Vault + cert-manager + KEDA + ingress | make apply |
| Cluster credentials | Kubeconfig + Kubernetes Secrets from Key Vault | make kubeconfig && make k8s-secrets |
| LangSmith (Helm path) | LangSmith Helm (~17 pods) via shell scripts | make init-values && make deploy |
| LangSmith (Terraform path) | Secrets + SA + Helm release managed in Terraform state | make init-app && make apply-app |
| LangSmith Deployment add-on | host-backend, listener, operator. Bump default_node_pool_min_count to 5 first | make apply && make init-values && make deploy |
| Agent Builder add-on | tool-server, trigger-server, agent-builder LGP | make init-values && make deploy |
| Insights + Polly add-on | Clio analytics, Polly eval agent | make init-values && make deploy |

First-time setup

cd terraform/modules/azure

# 1. Generate terraform.tfvars (interactive wizard)
make quickstart

# 2. Bootstrap secrets (prompts on first run, reads from Key Vault on repeat)
make setup-env

# 3. Preflight (Azure CLI, RBAC, providers, quotas)
make preflight

# 4. Deploy infrastructure (~15 to 20 min)
#    Skip `make plan` on a fresh deploy — kubernetes_manifest needs a live cluster
make init
make apply

# 5. Cluster credentials + Kubernetes Secrets
make kubeconfig
make k8s-secrets

# 6. Generate Helm values from Terraform outputs
make init-values

# 7. Deploy LangSmith (~10 min)
make deploy

# 8. Health check
make status
Or run everything after make apply in one shot:
make deploy-all      # kubeconfig → k8s-secrets → init-values → deploy
make deploy-all-tf   # apply → init-values → init-app → apply-app (Terraform path)
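
After a deploy it can help to block until the backend has finished rolling out before running the health check. The sketch below is an assumption, not part of the Makefile: `wait_for_backend` is a hypothetical helper, and the `deploy/langsmith-backend` name is taken from the kubectl section of this page.

```shell
# Hypothetical helper: wait until the backend Deployment has rolled out.
# usage: wait_for_backend [namespace] [timeout]
wait_for_backend() (
  ns=${1:-langsmith}
  timeout=${2:-10m}
  kubectl rollout status deploy/langsmith-backend -n "$ns" --timeout="$timeout"
)
```

A possible use is `make deploy && wait_for_backend && make status`, so the status report runs only once the backend pods are ready.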

Day-2 operations

make status         # 10-section health check
make status-quick   # skip Key Vault + K8s secret queries (faster)
make deploy         # re-deploy after any Helm value changes
make init-values    # re-generate values after Terraform changes
make kubeconfig     # refresh cluster credentials
make k8s-secrets    # re-create langsmith-config-secret from Key Vault

# Manage Key Vault secrets interactively
make keyvault                  # interactive menu
make keyvault list             # all secrets with timestamps
make keyvault get <secret>     # read a secret
make keyvault set <key> <val>  # update a secret
make keyvault validate         # check all required secrets exist
make keyvault diff             # compare KV vs K8s secret
make keyvault delete <key>     # soft-delete (recoverable 90 days)
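
The Key Vault targets above can be chained into a single update flow: write the new value, re-create the Kubernetes Secret, then confirm both stores agree. The following is a minimal sketch under that assumption; `rotate_secret` is a hypothetical name, and whether running pods need a restart to pick up the new value is not covered here. Do not use it for insights_encryption_key or polly_encryption_key, which must never change after first enable.

```shell
# Hypothetical helper: update one Key Vault secret, re-sync the Kubernetes
# Secret, then compare the two stores. Run from modules/azure/.
# usage: rotate_secret <key> <val>
rotate_secret() (
  key=${1:-}
  val=${2:-}
  if [ -z "$key" ] || [ -z "$val" ]; then
    echo "usage: rotate_secret <key> <val>" >&2
    exit 2
  fi
  make keyvault set "$key" "$val" &&   # write the new value to Key Vault
    make k8s-secrets &&                # re-create langsmith-config-secret
    make keyvault diff                 # compare KV vs K8s secret
)
```

Each step runs only if the previous one succeeded, so a failed Key Vault write never triggers a re-sync.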

Add-ons

Add-on passes (3 to 5) are controlled by flags in infra/terraform.tfvars. Set the flags, then re-run make init-values && make deploy. init-values.sh copies the matching example file into helm/values/ automatically.
# infra/terraform.tfvars
sizing_profile       = "production"   # minimum | dev | production | production-large
enable_deployments   = true           # LangSmith Deployment add-on (listener + operator + host-backend)
enable_agent_builder = true           # Agent Builder add-on (requires enable_deployments)
enable_insights      = true           # Insights / Clio analytics add-on
enable_polly         = true           # Polly AI eval add-on (requires enable_deployments)
The LangSmith Deployment add-on requires default_node_pool_min_count = 5. Set it and run make apply before enabling the add-on. Operator-spawned pods need node headroom; without it, agent pods stay in Pending indefinitely.
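
The node headroom requirement can be checked before deploying. The sketch below is illustrative only: `tfvar` and `check_node_headroom` are hypothetical helpers, the parser assumes simple `key = value` lines, and it treats a missing default_node_pool_min_count as a failure even though the module default (not stated here) might already be high enough.

```shell
# Hypothetical helper: print the value of a key from a tfvars file,
# without quotes or trailing comments. usage: tfvar <file> <key>
tfvar() {
  awk -F= -v key="$2" '
    { k = $1; gsub(/[ \t]/, "", k) }
    k == key {
      v = $2
      sub(/#.*/, "", v)        # drop a trailing comment
      gsub(/[ \t"]/, "", v)    # drop whitespace and quotes
      print v
    }' "$1" | tail -n 1
}

# Hypothetical pre-check: fail when enable_deployments = true but
# default_node_pool_min_count is below 5 (or not set in the file).
check_node_headroom() (
  file=${1:-infra/terraform.tfvars}
  enabled=$(tfvar "$file" enable_deployments)
  min=$(tfvar "$file" default_node_pool_min_count)
  [ "$enabled" = "true" ] || exit 0   # add-on disabled: nothing to check
  if [ "${min:-0}" -ge 5 ] 2>/dev/null; then
    exit 0
  fi
  echo "default_node_pool_min_count must be >= 5 (found: ${min:-unset})" >&2
  exit 1
)
```

Run from modules/azure/, a call such as `check_node_headroom && make apply` stops before Terraform runs if the pool is too small.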

Sizing profiles

Set sizing_profile in terraform.tfvars, then re-run make init-values && make deploy.
| Profile | When to use |
| --- | --- |
| minimum | Cost parking, CI smoke tests, single-user demos. Expect OOM under real traffic. |
| dev | Light non-production for local dev, CI pipelines, integration tests, short-lived POCs. |
| production | Recommended for production. Multi-replica with HPA on all stateless components. |
| production-large | High-volume starting point based on the scale guide (~50 concurrent users, ~1000 traces/sec). |
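
Switching profiles is a one-line change, so it is easy to script. The sketch below assumes the tfvars file already contains a sizing_profile line; `set_sizing_profile` is a hypothetical helper, not a Makefile target. It rejects names outside the four profiles listed above and keeps a .bak copy of the file.

```shell
# Hypothetical helper: validate a profile name, then rewrite the
# sizing_profile line in a tfvars file (the original is kept as <file>.bak).
# usage: set_sizing_profile <profile> [tfvars-file]
set_sizing_profile() (
  profile=${1:-}
  file=${2:-infra/terraform.tfvars}
  case "$profile" in
    minimum|dev|production|production-large) ;;
    *) echo "unknown profile: '$profile'" >&2; exit 2 ;;
  esac
  cp "$file" "$file.bak" &&
    sed "s/^sizing_profile[[:space:]]*=.*/sizing_profile = \"$profile\"/" \
      "$file.bak" > "$file"
)
```

A typical sequence would be `set_sizing_profile production && make init-values && make deploy`. Note that the rewrite drops any trailing comment on the sizing_profile line.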

kubectl

# Pod health
kubectl get pods -n langsmith
kubectl get pods -n langsmith -w
kubectl describe pod <pod-name> -n langsmith
kubectl logs <pod-name> -n langsmith --tail=100 -f
kubectl logs <pod-name> -n langsmith --previous --tail=50

# Backend logs (live)
kubectl logs -n langsmith deploy/langsmith-backend --tail=100 -f

# Ingress
kubectl get ingress -n langsmith
kubectl describe ingress -n langsmith

# NGINX LoadBalancer external IP
kubectl get svc ingress-nginx-controller -n ingress-nginx

# TLS
kubectl get certificate -n langsmith
kubectl get challenges -n langsmith
kubectl describe certificate <cert-name> -n langsmith
kubectl get clusterissuer

# Workload Identity
kubectl get serviceaccount langsmith-ksa -n langsmith -o yaml | grep -A5 annotations

# Helm
helm status langsmith -n langsmith
helm history langsmith -n langsmith
helm get values langsmith -n langsmith

# LangSmith Deployment
kubectl get lgp -n langsmith
kubectl get crd | grep langchain
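
For quick triage, the pod listing can be filtered down to the pods that need attention. This is a sketch rather than a replacement for make status: `unhealthy_pods` is a hypothetical helper that relies on the default `kubectl get pods` column order (NAME, READY, STATUS) and treats Completed pods as healthy.

```shell
# Hypothetical triage helper: print pods that are not Running, or that are
# Running with fewer ready containers than expected.
# Returns non-zero if any such pod exists. usage: unhealthy_pods [namespace]
unhealthy_pods() (
  ns=${1:-langsmith}
  out=$(kubectl get pods -n "$ns" --no-headers) || exit 2
  printf '%s\n' "$out" | awk '
    NF == 0 { next }                 # no pods listed
    $3 == "Completed" { next }       # finished jobs are fine
    {
      split($2, ready, "/")          # READY column looks like 1/1
      if ($3 != "Running" || ready[1] != ready[2]) {
        print $1, $2, $3
        bad = 1
      }
    }
    END { exit bad + 0 }'
)
```

Any pod it prints is a candidate for the `kubectl describe pod` and `kubectl logs --previous` commands above.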

Azure CLI

# Re-auth
az login
az account set --subscription <subscription-id>
az account show

# AKS
az aks list
az aks show --name <cluster> --resource-group <rg>
az aks get-credentials --name <cluster> --resource-group <rg>

# PostgreSQL
az postgres flexible-server list
az postgres flexible-server show --name <server> --resource-group <rg>

# Redis
az redis list
az redis show --name <cache> --resource-group <rg>

# Blob Storage
az storage account list
az storage container list --account-name <account>

# Key Vault
az keyvault list
az keyvault secret list --vault-name <vault>
az keyvault secret show --vault-name <vault> --name <secret> --query value -o tsv

# Application Gateway (AGIC)
az network application-gateway list
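
Because every make target acts on whichever subscription is currently active, a guard that compares the active subscription with the intended one can prevent applying to the wrong account. The sketch below uses `az account show --query id -o tsv`; `require_subscription` is a hypothetical helper name.

```shell
# Hypothetical guard: succeed only when the active Azure subscription ID
# matches the expected one. usage: require_subscription <subscription-id>
require_subscription() (
  want=${1:-}
  if [ -z "$want" ]; then
    echo "usage: require_subscription <subscription-id>" >&2
    exit 2
  fi
  have=$(az account show --query id -o tsv) || exit 1
  if [ "$have" != "$want" ]; then
    echo "active subscription is $have, expected $want" >&2
    exit 1
  fi
)
```

For example, `require_subscription <subscription-id> && make apply` refuses to run Terraform until `az account set` points at the right subscription.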

Terraform

cd modules/azure/infra

terraform init
terraform plan        # skip on first run, see deploy notes
terraform apply
terraform apply -target=module.aks

terraform output
terraform output -raw aks_cluster_name
terraform output -raw keyvault_name
terraform output -raw storage_account_name

terraform state list
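
Terraform outputs can feed the Azure CLI directly, which avoids copying resource names by hand. The following sketch combines `terraform output -raw keyvault_name` with `az keyvault secret list`; `list_vault_secret_names` is a hypothetical helper, and it assumes it is run from modules/azure/infra so that `terraform output` can read the state.

```shell
# Hypothetical helper: list the secret names in the Key Vault created by
# Terraform. Run from modules/azure/infra.
list_vault_secret_names() (
  vault=$(terraform output -raw keyvault_name) || exit 1
  az keyvault secret list --vault-name "$vault" --query "[].name" -o tsv
)
```

The same pattern should work for the other outputs listed above, such as aks_cluster_name and storage_account_name.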

Key watchouts

  • Skip make plan on a fresh deploy. kubernetes_manifest resources need a live cluster API. Use make apply directly.
  • Uninstall Helm before terraform destroy. The Azure Load Balancer holds a subnet reference; leaving it blocks VNet deletion. Run make uninstall first.
  • config.deployment.url must include https://. Without it, operator-spawned agents stay stuck in DEPLOYING.
  • config.deployment.enabled: true is required for the LangSmith Deployment add-on. Setting only the URL without enabled: true silently skips listener and operator.
  • Encryption keys must never change after first enable. Rotating insights_encryption_key or polly_encryption_key permanently breaks existing encrypted data.
  • Roll the frontend after first Polly enable. agentBootstrap creates langsmith-polly-config after registering; frontend pods started earlier do not pick it up.
  • letsencrypt (HTTP-01) only works with nginx, istio (self-managed), and envoy-gateway. For istio-addon or agic, use dns01 with a custom domain, or none for HTTP-only.
  • Key Vault enters 90-day soft-delete after destroy. With keyvault_purge_protection = false, run az keyvault purge to reclaim the name immediately.
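
The config.deployment.url watchout is easy to catch before a deploy. The sketch below is an illustration, not part of the deployment scripts: `check_deployment_url` is a hypothetical helper that only checks the scheme prefix and that something follows it, and the example hostname is a placeholder.

```shell
# Hypothetical check: accept only values that start with https:// and
# have something after the scheme. usage: check_deployment_url <url>
check_deployment_url() {
  case "${1:-}" in
    https://?*)
      return 0 ;;
    *)
      echo "config.deployment.url must start with https:// (got: '${1:-}')" >&2
      return 1 ;;
  esac
}

# Example with a placeholder hostname:
#   check_deployment_url "https://langsmith.example.com" && make deploy
```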

Teardown

make uninstall   # removes Helm releases + LGP CRD + namespaces
make destroy     # destroys all Azure infrastructure via terraform destroy
make clean       # removes generated secrets, helm values, tfstate lock
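
Since the order matters (Helm must be uninstalled before the infrastructure is destroyed), the three targets can be wrapped so that a failure stops the sequence. This is a minimal sketch; `teardown` is a hypothetical wrapper, not an existing make target.

```shell
# Hypothetical wrapper: run teardown in the documented order and stop at the
# first failure, so make destroy never runs while Helm releases remain.
teardown() {
  make uninstall &&   # Helm releases + LGP CRD + namespaces
    make destroy &&   # all Azure infrastructure
    make clean        # generated secrets, helm values, tfstate lock
}
```

If `make uninstall` fails, the later steps are skipped, which leaves the cluster in place for debugging instead of blocking VNet deletion mid-destroy.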