Provision the AWS cloud foundation and install LangSmith with the public Terraform modules at github.com/langchain-ai/terraform/tree/main/modules/aws. Plan for 30 to 40 minutes end to end on a clean account. The deployment runs in two stages: infrastructure (Terraform provisions VPC, EKS, RDS, ElastiCache, S3, IAM) and application (Helm installs the LangSmith chart against the cluster). Add-ons are enabled with a flag and a redeploy.

Prerequisites

Required tools

| Tool | Version | Purpose |
| --- | --- | --- |
| AWS CLI | v2 | Authenticate, query AWS resources, manage EKS kubeconfig |
| Terraform | 1.5 | Run the infrastructure modules |
| kubectl | 1.28 | Inspect the EKS cluster |
| Helm | 3.12 | Install and manage the LangSmith chart |
| eksctl | latest | Optional, handy for kubeconfig and debugging |
Install on macOS:
brew install awscli kubectl helm eksctl
brew tap hashicorp/tap && brew install hashicorp/tap/terraform
Verify each tool is on PATH:
aws --version
terraform version
kubectl version --client
helm version
For Linux, follow the AWS CLI install guide and use your distribution’s package manager for the remaining tools.
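The per-tool checks above can be folded into one helper that reports every missing tool at once. This is a minimal sketch: the function name check_tools is hypothetical, and it only confirms that each tool is on PATH, not that it meets the version in the table.

```shell
# Report every required tool that is missing from PATH.
# Returns 0 when all tools are found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (tool names taken from the table above):
#   check_tools aws terraform kubectl helm
```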

Required AWS IAM permissions

The IAM user or role running Terraform needs permission to create and manage the cloud foundation. The following managed policies cover the full surface area. Use them as a starting point and trim down to least-privilege once the deployment is stable.
| Policy | Purpose |
| --- | --- |
| AmazonEKSClusterPolicy | Create and manage EKS clusters |
| AmazonVPCFullAccess | Create VPC, subnets, route tables, and NAT |
| AmazonRDSFullAccess | Create and manage RDS PostgreSQL instances |
| AmazonElastiCacheFullAccess | Create ElastiCache Redis clusters |
| AmazonS3FullAccess | Create S3 buckets and VPC endpoints |
| IAMFullAccess | Create IRSA roles and policies |
Run make preflight from modules/aws/ after authenticating. The preflight script confirms that the active credentials can perform each required action and reports the first missing permission, which is faster than discovering gaps mid-terraform apply.

Authenticate

Configure AWS credentials with the CLI:
aws configure
Or export environment variables:
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-west-2"
Confirm the credentials work and the target region is enabled in the account:
aws sts get-caller-identity
aws ec2 describe-availability-zones --query 'AvailabilityZones[].ZoneName' --output table
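If you use the environment-variable route, a small guard can catch an unset variable before any AWS call is made. This is a sketch only: require_env is a hypothetical helper, and it does not apply when credentials come from aws configure profiles.

```shell
# Fail fast when any named environment variable is unset or empty.
require_env() {
  rc=0
  for name in "$@"; do
    # printenv prints nothing (and exits non-zero) for unset variables.
    if [ -z "$(printenv "$name")" ]; then
      echo "not set: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION \
#     && aws sts get-caller-identity
```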

License key and domain

Two non-AWS items must be ready before terraform apply:
  • LangSmith license key. Contact sales to request one. The key is stored in AWS SSM Parameter Store by the setup script, not in tfvars.
  • Domain or subdomain that you control and can point at the deployment, plus an ACM certificate covering it (or set the tls_certificate_source variable to letsencrypt or none).

Cluster sizing reference

The Terraform modules pick instance types and node counts based on sizing_profile. Plan capacity for the target tier before deploying.
| Profile | EKS nodes | RDS instance | ElastiCache | Use case |
| --- | --- | --- | --- | --- |
| dev | 2 × m5.xlarge | db.t4g.medium | cache.t4g.small | Demos, CI, short-lived POCs |
| production | 3 × m5.2xlarge (HPA on) | db.m6g.large | cache.m6g.large | Standard production |
| production-large | 6 × m5.4xlarge (HPA on) | db.m6g.2xlarge | cache.m6g.xlarge | High-volume, multi-tenant |
For production and production-large, also plan for an external ClickHouse: either LangChain Managed ClickHouse or a self-managed ClickHouse cluster. In-cluster ClickHouse is supported for dev only.

Rapid path

For the fastest path from zero to a running LangSmith instance, run these commands in order:
# 1. Clone the public modules
git clone https://github.com/langchain-ai/terraform.git
cd terraform/modules/aws

# 2. Generate terraform.tfvars interactively (Enter accepts current values)
make quickstart

# 3. Load secrets into SSM Parameter Store
#    Must be sourced, not executed
source infra/scripts/setup-env.sh

# 4. Provision infrastructure (~20 to 25 min)
make init
make plan
make apply

# 5. Configure kubectl
make kubeconfig
kubectl get nodes

# 6. Deploy LangSmith via Helm (~5 to 10 min)
make init-values
make deploy

# 7. Confirm
kubectl get pods -n langsmith
kubectl get ingress -n langsmith
To chain infrastructure and application in one command:
make quickdeploy          # interactive, prompts before terraform apply
make quickdeploy-auto     # non-interactive, auto-approves terraform
make quickdeploy runs terraform apply, kubeconfig, init-values, and helm deploy in sequence. If any step fails, the command exits with instructions for resuming from that step. The sections below cover each phase in detail.
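The run-in-sequence, stop-and-report behavior described here can be illustrated with a small step runner. This is a sketch of the pattern, not the actual quickdeploy implementation; run_steps and the step_* wrappers are hypothetical names.

```shell
# Run each named step in order; stop at the first failure and say where to resume.
# Each argument must be the name of a shell function or command.
run_steps() {
  for step in "$@"; do
    echo "==> $step"
    if ! "$step"; then
      echo "step '$step' failed; fix the problem and resume from '$step'" >&2
      return 1
    fi
  done
}

# Hypothetical wrappers around the make targets used on this page.
step_apply()       { make apply; }
step_kubeconfig()  { make kubeconfig; }
step_init_values() { make init-values; }
step_deploy()      { make deploy; }

# Usage:
#   run_steps step_apply step_kubeconfig step_init_values step_deploy
```

Because the runner returns at the first failing step, later steps never run against a half-provisioned environment.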

Provision infrastructure

Provisioning the AWS cloud foundation takes 20 to 25 minutes on a clean account. Do not interrupt the apply.

What gets provisioned

| Resource | Purpose |
| --- | --- |
| VPC + subnets + NAT | Private network for the cluster and managed services |
| EKS cluster + node groups | Kubernetes compute |
| RDS PostgreSQL | LangSmith operational data |
| ElastiCache Redis | Queue and cache |
| S3 bucket + VPC endpoint | Trace payload blob storage |
| ALB + listeners | Public ingress with TLS |
| SSM Parameter Store entries | Application secrets, synced into the cluster by External Secrets Operator |
| IRSA roles + IAM policies | Per-service AWS access |
| KEDA, cert-manager, ESO | Bootstrap workloads installed alongside infrastructure |

Clone and configure

git clone https://github.com/langchain-ai/terraform.git
cd terraform/modules/aws
All subsequent commands run from modules/aws/. Run make help for the full target list.
Generate terraform.tfvars with the interactive wizard:
make quickstart
The wizard prompts for naming prefix, region, EKS sizing, TLS source, external vs in-cluster services, and the optional add-on flags. It writes infra/terraform.tfvars. Re-running the wizard pre-selects existing values; press Enter at each prompt to keep the current config.
Prefer to edit by hand? Copy the example and fill in the required fields:
cp infra/terraform.tfvars.example infra/terraform.tfvars
vi infra/terraform.tfvars
The minimum required variables:
name_prefix = "acme"
environment = "prod"
region      = "us-west-2"

eks_cluster_version = "1.31"
eks_managed_node_groups = {
  default = {
    name           = "node-group-default"
    instance_types = ["m5.4xlarge"]
    min_size       = 3
    max_size       = 10
  }
}

postgres_source = "external"
redis_source    = "external"

tls_certificate_source = "acm"
acm_certificate_arn    = "arn:aws:acm:us-west-2:<account-id>:certificate/<cert-id>"
langsmith_domain       = "langsmith.example.com"
See the AWS variables reference for every input variable.
Configure a remote state backend before applying. Edit infra/backend.tf to point at an S3 bucket and DynamoDB lock table you control. The Terraform repo ships a local backend by default for first-time evaluations.
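As a starting point for infra/backend.tf, the helper below prints a standard Terraform S3 backend block with a DynamoDB lock table. It is a sketch under stated assumptions: render_backend is a hypothetical helper, the bucket, key, region, and table are values you supply, and the state key layout is not something the modules mandate.

```shell
# Print an S3 backend block for Terraform.
# Arguments: bucket, state key, region, DynamoDB lock table (all placeholders you supply).
render_backend() {
  cat <<EOF
terraform {
  backend "s3" {
    bucket         = "$1"
    key            = "$2"
    region         = "$3"
    dynamodb_table = "$4"
    encrypt        = true
  }
}
EOF
}

# Usage: review the output, then place it in infra/backend.tf.
#   render_backend my-tf-state langsmith/infra/terraform.tfstate us-west-2 my-tf-locks
```

After changing the backend, Terraform must be re-initialized before the next plan.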

Load secrets into SSM Parameter Store

source infra/scripts/setup-env.sh
The script reads terraform.tfvars and derives the SSM path /langsmith/{name_prefix}-{environment}/. For each secret it then either reuses an exported value, reads the existing SSM parameter, auto-generates a value (for salts and tokens), or prompts you. The values marked Prompt in the table below are the ones you supply interactively.
Source the script rather than executing it or wrapping it in a make target: a child process cannot export environment variables back to the parent shell, and the TF_VAR_* variables it sets must be visible to the later make commands.
The script manages the following SSM parameters:
| SSM key | How it is set | Notes |
| --- | --- | --- |
| postgres-password | Prompt | RDS uses this password |
| redis-auth-token | Auto-generated (openssl rand -hex 32) | ElastiCache requires hex |
| langsmith-api-key-salt | Auto-generated (openssl rand -base64 32) | Never rotate, breaks all API keys |
| langsmith-jwt-secret | Auto-generated (openssl rand -base64 32) | Never rotate, invalidates all sessions |
| langsmith-license-key | Prompt | From your LangChain account team |
| langsmith-admin-password | Prompt | Must contain a symbol |
| deployments-encryption-key | Auto-generated Fernet key | LangSmith Deployment add-on |
| agent-builder-encryption-key | Auto-generated Fernet key | Agent Builder add-on |
| insights-encryption-key | Auto-generated Fernet key | Insights add-on |
| polly-encryption-key | Auto-generated Fernet key | Polly add-on |
Verify the secrets are present and the TF_VAR_* environment variables are exported:
make secrets
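To inspect the parameters by hand, you can derive the same SSM path prefix from terraform.tfvars. This is a sketch and does not mirror setup-env.sh internals: tfvars_string and ssm_prefix are hypothetical helpers that assume simple key = "value" lines.

```shell
# Read a quoted string variable (for example name_prefix = "acme") from a tfvars file.
tfvars_string() (
  file="$1"; key="$2"
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
)

# Build the SSM path prefix /langsmith/{name_prefix}-{environment}/.
ssm_prefix() (
  file="$1"
  prefix="$(tfvars_string "$file" name_prefix)"
  env="$(tfvars_string "$file" environment)"
  [ -n "$prefix" ] && [ -n "$env" ] || {
    echo "name_prefix or environment missing in $file" >&2
    exit 1
  }
  printf '/langsmith/%s-%s/\n' "$prefix" "$env"
)

# Usage: list parameter names under the prefix (values are not printed).
#   aws ssm get-parameters-by-path --path "$(ssm_prefix infra/terraform.tfvars)" \
#     --query 'Parameters[].Name' --output table
```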

Apply

make init
make plan
make apply
make plan shows the proposed diff. Review the output before applying. make apply provisions in dependency order: VPC and security groups, then EKS (about 12 minutes) and RDS (about 8 minutes, in parallel), then node groups, ElastiCache, S3, and the ALB.

Configure kubectl

make kubeconfig
kubectl get nodes
kubectl get pods -n kube-system
All nodes should report Ready and the core add-ons (CoreDNS, kube-proxy, VPC CNI, KEDA, cert-manager, ESO) should be Running.
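For scripted checks, the Ready condition can be tested from the kubectl output instead of read by eye. This is a minimal sketch: all_nodes_ready is a hypothetical helper that parses the default columns of kubectl get nodes --no-headers.

```shell
# Read `kubectl get nodes --no-headers` output on stdin.
# Succeeds only if at least one node is listed and every STATUS is exactly "Ready".
all_nodes_ready() {
  awk '
    $2 != "Ready" { print "not ready: " $1 " (" $2 ")"; bad = 1 }
    END { if (NR == 0 || bad) exit 1 }
  '
}

# Usage:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "all nodes Ready"
```

A cordoned node (status Ready,SchedulingDisabled) is reported as not ready by this check, which is the conservative choice before a deploy.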

Deploy LangSmith

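Two deployment paths are supported; pick one. The first, a script-driven Helm deploy, is best for most deployments: interactive prompts guide you through sizing and product choices. The second, a Terraform-managed Helm deploy, is covered in its own section later on this page.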
cd modules/aws

make init-values
make deploy
init-values.sh prompts for the admin email, then reads sizing_profile and the enable_* flags from terraform.tfvars and copies the matching values files from helm/values/examples/ into helm/values/. On re-runs it preserves your choices and refreshes Terraform outputs.
make deploy runs helm/scripts/deploy.sh, which:
  1. Refreshes the kubeconfig.
  2. Runs preflight checks (AWS credentials, cluster reachability, the langchain Helm repo).
  3. Applies the External Secrets Operator ClusterSecretStore and ExternalSecret so the cluster reads secrets directly from SSM.
  4. Installs the LangSmith Helm chart with the layered values files.
Expect 5 to 10 minutes for the chart to install and pods to become ready.

Verify

kubectl get pods -n langsmith
kubectl get ingress -n langsmith
When all pods are Running and the ingress shows the ALB DNS name, the deployment is ready. Use the domain you configured in langsmith_domain (or the ALB DNS name) to reach the UI.
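While waiting for the chart to settle, it can help to list only the pods that are not yet healthy. This is a sketch: pending_pods is a hypothetical helper that parses the default columns of kubectl get pods --no-headers and treats Completed pods as healthy.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print pods that are
# neither fully ready (READY n/n with STATUS Running) nor Completed.
# Exits 0 when nothing is pending, 1 otherwise.
pending_pods() {
  awk '
    $3 == "Completed" { next }
    {
      split($2, r, "/")
      if ($3 != "Running" || r[1] != r[2]) { print $1, $2, $3; bad = 1 }
    }
    END { if (bad) exit 1 }
  '
}

# Usage:
#   kubectl get pods -n langsmith --no-headers | pending_pods && echo "all pods ready"
```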

Terraform-managed Helm deploy

Best for teams that want the full deployment in Terraform state, or for “bring your own infrastructure” scenarios. The app/ module manages the External Secrets Operator wiring, the helm_release, and feature toggles directly.
cd modules/aws

# Generate Helm values files from templates (required, the app module reads these)
make init-values

# Pull infra outputs into app/infra.auto.tfvars.json
make init-app

# Configure app-specific settings
cp app/terraform.tfvars.example app/terraform.tfvars
# Edit app/terraform.tfvars, set admin_email, sizing, and feature toggles

# Deploy
make plan-app
make apply-app
The app/terraform.tfvars file controls the application configuration:
admin_email          = "admin@example.com"
sizing               = "production"   # production | production-large | dev | none
enable_agent_deploys = true
enable_agent_builder = true
enable_insights      = true
enable_polly         = true
clickhouse_host      = "clickhouse.example.com"
make init-values is required before make plan-app. The app module reads the values files from helm/values/ and init-values populates them from helm/values/examples/ based on the sizing and add-on choices in infra/terraform.tfvars.
For “bring your own infrastructure”, skip make init-app and set all variables manually in app/terraform.tfvars.

Enable add-ons

Each add-on is gated by a flag in infra/terraform.tfvars. Set the flag, re-run make init-values to copy the matching values file, then re-run make deploy.
enable_deployments     = true   # LangGraph Platform (required for Agent Builder and Polly)
enable_agent_builder   = true   # Agent Builder UI
enable_insights        = true   # ClickHouse-backed analytics
enable_polly           = true   # Polly AI eval and monitoring
enable_usage_telemetry = false  # Extended usage telemetry
Then regenerate the values files and redeploy:
make init-values
make deploy
For details on each add-on, see LangSmith Deployment.
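Because Agent Builder and Polly both depend on enable_deployments, a quick check of the flags before redeploying can save a failed rollout. This is a sketch: flag_enabled and check_addon_deps are hypothetical helpers that assume one flag = true or false assignment per line.

```shell
# Succeed if the given flag is set to true (uncommented) in the tfvars file.
# Arguments: file, flag name.
flag_enabled() {
  grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*true([[:space:]]|#|\$)" "$1"
}

# Report any add-on that is enabled without enable_deployments.
check_addon_deps() {
  file="$1"; rc=0
  for addon in enable_agent_builder enable_polly; do
    if flag_enabled "$file" "$addon" && ! flag_enabled "$file" enable_deployments; then
      echo "$addon = true requires enable_deployments = true" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   check_addon_deps infra/terraform.tfvars && make init-values && make deploy
```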

Optional: private EKS cluster with bastion

For deployments that must run a fully private EKS API endpoint, the modules ship a bastion host pattern:
  1. First, run from your workstation with create_bastion = true and enable_public_eks_cluster = true so the bastion can be created.
  2. After the initial deployment, set enable_public_eks_cluster = false and re-apply. The EKS API endpoint becomes private only.
  3. All subsequent Terraform work happens on the bastion. SSM into it, clone the repo, copy your terraform.tfvars and SSM secrets, then run the deployment from there.
enable_public_eks_cluster = false
create_bastion            = true

# Optional SSH access (SSM is the default and requires no key):
# bastion_key_name          = "my-keypair"
# bastion_enable_ssh        = true
# bastion_ssh_allowed_cidrs = ["203.0.113.0/24"]
Connect via SSM Session Manager:
terraform output bastion_ssm_command
aws ssm start-session --target <instance-id> --region us-west-2
The bastion lives in a public subnet for SSM agent connectivity, but it does not need a public IP if your VPC has the SSM, SSMMessages, and EC2Messages VPC endpoints. It comes preinstalled with kubectl, helm, terraform, git, and jq, with kubeconfig already configured for the EKS cluster.
aws ssm start-session requires the Session Manager plugin for the AWS CLI on your workstation; install it before connecting.

Optional: Envoy Gateway ingress

The default ingress is the AWS Load Balancer Controller (ALB). Set enable_envoy_gateway = true in terraform.tfvars to install Envoy Gateway instead. Envoy Gateway is required for multi-namespace dataplane deployments where the langgraph-dataplane chart runs in its own namespace.
# infra/terraform.tfvars
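enable_envoy_gateway = true
Re-source the secrets and re-apply the infrastructure, then regenerate the values files, copy the Envoy Gateway example, and redeploy: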
source infra/scripts/setup-env.sh
make apply

make init-values
cp helm/values/examples/langsmith-values-ingress-envoy-gateway.yaml helm/values/
make deploy
The deploy script annotates the Envoy Gateway NLB service with the ACM certificate ARN automatically when tls_certificate_source = "acm". TLS terminates at the NLB; Envoy sees plain HTTP internally.
When running the dataplane chart in a separate namespace, apply the RBAC manifest once per dataplane namespace:
kubectl apply -f helm/values/dataplane-rbac.yaml
This grants the langsmith-host-backend ServiceAccount read access to pods, pod logs, deployments, and ReplicaSets in the dataplane namespace. Without it, agent run logs do not stream in the LangSmith UI.
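One way to confirm the grant took effect is kubectl auth can-i with impersonation. The helper below only prints the commands, so you can review them first. It is a sketch under two assumptions that are not stated on this page: the ServiceAccount lives in the langsmith namespace, and "read access" means the get and list verbs. rbac_checks is a hypothetical name.

```shell
# Print one `kubectl auth can-i` command per permission the manifest is described as granting.
# Arguments: dataplane namespace, optional ServiceAccount namespace (assumed default: langsmith).
rbac_checks() {
  ns="$1"
  sa="system:serviceaccount:${2:-langsmith}:langsmith-host-backend"
  for verb in get list; do
    for resource in pods deployments.apps replicasets.apps; do
      echo "kubectl auth can-i $verb $resource --as=$sa -n $ns"
    done
  done
  # Pod logs are a subresource of pods.
  echo "kubectl auth can-i get pods --subresource=log --as=$sa -n $ns"
}

# Usage: print the commands, or pipe them to sh to run them against the cluster.
#   rbac_checks my-dataplane-ns | sh
```

Each command prints yes or no, so a single no points at the missing rule.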

Next steps