Packer Build: Timeout waiting for SSH

Back to basics. Building an AMI from the official Amazon Linux 2023 base AMI should be as easy as it gets: Packer 1.9.4 on a Mac, installed with brew, and a simple build script. Nothing complicated. Yet roughly 80% of the time the build failed with this error: Timeout waiting for SSH.

> packer -version
1.9.4
> packer build .
aws-nat-gateway.amazon-ebs.al2023: output will be in this color.

==> aws-nat-gateway.amazon-ebs.al2023: Force Deregister flag found, skipping prevalidating AMI Name
    aws-nat-gateway.amazon-ebs.al2023: Found Image ID: ami-03455155bfe406fa1
==> aws-nat-gateway.amazon-ebs.al2023: Creating temporary keypair: packer_66f61989-f2e2-a39d-6ab7-4f1b7c000d74
==> aws-nat-gateway.amazon-ebs.al2023: Creating temporary security group for this instance: packer_66f6198b-e38d-210f-4aaa-275a0f9b9d04
==> aws-nat-gateway.amazon-ebs.al2023: Creating temporary instance profile for this instance: packer-66f6198b-2fdc-bac5-fbab-8b482c03e797
==> aws-nat-gateway.amazon-ebs.al2023: Creating temporary role for this instance: packer-66f6198b-2fdc-bac5-fbab-8b482c03e797
==> aws-nat-gateway.amazon-ebs.al2023: Attaching policy to the temporary role: packer-66f6198b-2fdc-bac5-fbab-8b482c03e797
==> aws-nat-gateway.amazon-ebs.al2023: Launching a source AWS instance...
    aws-nat-gateway.amazon-ebs.al2023: Instance ID: i-0657925af4d91eea1
==> aws-nat-gateway.amazon-ebs.al2023: Waiting for instance (i-0657925af4d91eea1) to become ready...
==> aws-nat-gateway.amazon-ebs.al2023: Using SSH communicator to connect: localhost
==> aws-nat-gateway.amazon-ebs.al2023: Waiting for SSH to become available...
==> aws-nat-gateway.amazon-ebs.al2023: Timeout waiting for SSH.
==> aws-nat-gateway.amazon-ebs.al2023: Terminating the source AWS instance...
==> aws-nat-gateway.amazon-ebs.al2023: Cleaning up any extra volumes...
==> aws-nat-gateway.amazon-ebs.al2023: No volumes to clean up, skipping
==> aws-nat-gateway.amazon-ebs.al2023: Detaching temporary role from instance profile...
==> aws-nat-gateway.amazon-ebs.al2023: Removing policy from temporary role...
==> aws-nat-gateway.amazon-ebs.al2023: Deleting temporary role...
==> aws-nat-gateway.amazon-ebs.al2023: Deleting temporary instance profile...
==> aws-nat-gateway.amazon-ebs.al2023: Deleting temporary security group...
==> aws-nat-gateway.amazon-ebs.al2023: Deleting temporary keypair...
Build 'aws-nat-gateway.amazon-ebs.al2023' errored after 6 minutes 385 milliseconds: Timeout waiting for SSH.

==> Wait completed after 6 minutes 385 milliseconds

==> Some builds didn't complete successfully and had errors:
--> aws-nat-gateway.amazon-ebs.al2023: Timeout waiting for SSH.

==> Builds finished but no artifacts were created.

Further investigation showed that the AWS Session Manager client wasn't starting. On successful runs, Fleet Manager showed the EC2 instance within seconds. On failed runs, Fleet Manager never showed the EC2 instance ("never" here meaning not before the timeout expired).

After a bit of trial and error, I found that setting pause_before_ssm to any value of 1 minute or more resolved the issue.

source "amazon-ebs" "al2023" {
  ...
  pause_before_ssm = "1m"
  ...
}
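
For context, here is a fuller sketch of what a source block using Session Manager with this setting might look like. It is not the actual build script from this post: the region, instance type, AMI name, AMI filter, plugin version, and IAM actions are all assumptions chosen for illustration. The log lines about a temporary role and "Using SSH communicator to connect: localhost" suggest a setup along these lines.

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.2.0" # assumed version constraint
    }
  }
}

source "amazon-ebs" "al2023" {
  region           = "us-east-1"      # assumption
  instance_type    = "t3.micro"       # assumption
  ami_name         = "example-al2023" # hypothetical name
  force_deregister = true             # matches the "Force Deregister flag" log line

  source_ami_filter {
    filters = {
      name                = "al2023-ami-2023.*-x86_64" # assumed name pattern
      virtualization-type = "hvm"
      root-device-type    = "ebs"
    }
    owners      = ["amazon"]
    most_recent = true
  }

  # Tunnel SSH through Session Manager instead of connecting directly.
  communicator  = "ssh"
  ssh_username  = "ec2-user"
  ssh_interface = "session_manager"

  # Wait before opening the Session Manager session.
  pause_before_ssm = "1m"

  # Temporary role so the instance can register with Systems Manager.
  # The action list is an assumption; adjust it to your own policy.
  temporary_iam_instance_profile_policy_document {
    Version = "2012-10-17"
    Statement {
      Effect = "Allow"
      Action = [
        "ssm:UpdateInstanceInformation",
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel",
      ]
      Resource = ["*"]
    }
  }
}

build {
  sources = ["source.amazon-ebs.al2023"]
}
```

The only line that matters for the fix is pause_before_ssm; everything else is scaffolding to make the sketch complete.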

If you pass custom user data to the build, increase the pause to allow for the time the user data script takes to run.
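
As a rough illustration of that adjustment, the fragment below pairs a user data script with a longer pause. The file name bootstrap.sh and the 3 minute value are hypothetical; the right value depends on how long your script actually takes.

```hcl
source "amazon-ebs" "al2023" {
  # ... other settings omitted ...

  # Hypothetical user data script that runs at first boot.
  user_data_file = "bootstrap.sh"

  # Assumed value: 1m baseline plus extra time for the script to finish.
  pause_before_ssm = "3m"
}
```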

