How to run an AWS X-Ray sidecar container in a private subnet

We run cloud-native applications in a private VPC, meaning the elastic network interfaces (ENIs) have no direct internet access, not even through a NAT Gateway. Instead, the AWS Lambda functions and ECS Fargate tasks placed in these subnets have to use a custom outbound proxy.

When we implemented X-Ray on AWS Lambda, it went wonderfully well. It seems that Lambda functions can connect to X-Ray through an internal AWS mechanism, since we did not have to whitelist anything in our custom outbound proxy and there is (as yet) no such thing as a VPC endpoint (PrivateLink) for AWS X-Ray.

However, with ECS and EKS the challenge is that the X-Ray sidecar container can't connect to the public internet endpoint xray.eu-west-1.amazonaws.com:

[Error] Sending segment batch failed with: RequestError: send request failed caused by: Post https://xray.eu-west-1.amazonaws.com/TraceSegments: read tcp 10.0.10.201:52280->52.17.107.19:443: read: connection reset by peer

And as mentioned, there is no VPC PrivateLink endpoint for X-Ray yet. There is, however, a thread on the AWS forum here where you can ask for it. But the response from AWS and the timeline suggest it does not have high priority on their backlog:
VPC endpoint support has been a continuous ask for X-Ray customers, and we absolutely plan on delivering this. However, at the moment, we don’t have exact ETA that we can share. If you’d like to open up a deeper conversation with us, please reach out to your Account team or AWS Support and we’d be happy to have a chat.

So for now we just had to configure the X-Ray sidecar container with a proxy. This took a bit of trial and error, so we wanted to share our experience.
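Before building the sidecar, it can help to confirm that the proxy actually lets traffic through to the X-Ray endpoint. The sketch below is a minimal check in Python, not part of our setup: the function names `parse_proxy` and `can_reach_xray` are made up for illustration, and the fallback port 3128 is an assumption for proxy URLs that don't specify one. It asks the proxy for a CONNECT tunnel to the endpoint and completes a TLS handshake.

```python
import http.client
from urllib.parse import urlsplit


def parse_proxy(proxy_url):
    """Split a proxy URL into (host, port)."""
    parts = urlsplit(proxy_url)
    # Assumption: fall back to 3128 when the URL carries no explicit port.
    return parts.hostname, parts.port or 3128


def can_reach_xray(proxy_url, endpoint_host="xray.eu-west-1.amazonaws.com", timeout=5):
    """Return True if the proxy tunnels a TLS connection to the X-Ray endpoint."""
    proxy_host, proxy_port = parse_proxy(proxy_url)
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=timeout)
    # Ask the proxy for a CONNECT tunnel to the real endpoint on port 443.
    conn.set_tunnel(endpoint_host, 443)
    try:
        conn.connect()  # sends CONNECT, then does the TLS handshake with the endpoint
        return True
    except (OSError, http.client.HTTPException) as exc:
        print("Proxy check failed:", exc)
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print(can_reach_xray("http://proxy.int.terra10.nl:3128"))
```

If this returns False from inside the subnet, the daemon will most likely fail in the same way, so it is worth fixing the proxy whitelist first.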

We first created our own custom container image:

FROM amazonlinux
RUN yum install -y unzip
RUN curl -o daemon.zip https://s3.us-east-2.amazonaws.com/aws-xray-assets.us-east-2/xray-daemon/aws-xray-daemon-linux-3.x.zip
RUN unzip daemon.zip && cp xray /usr/bin/xray
ENTRYPOINT ["/usr/bin/xray", "-t", "0.0.0.0:2000", "-b", "0.0.0.0:2000", "-l", "info", "-p", "http://proxy.int.terra10.nl:3128", "-o"]
EXPOSE 2000/udp
EXPOSE 2000/tcp

Where we used the following custom configuration (more info here):

  • -l sets the log level to info. The daemon's default is prod, so a more verbose level is really handy while setting things up; you might even want dev initially. Allowed values are: dev, debug, info, warn, error, prod.
  • -p sets our custom proxy address
  • -o runs the daemon in local mode, which skips the failing and pointless attempt to retrieve EC2 instance metadata (the default behavior)

We pushed this image to our image repository and referenced it as a sidecar in the ECS task definition. Since our containers run in a VPC, the task definition uses networkMode 'awsvpc'. This means we don't need links, since all containers within the task can reach each other over localhost.

"containerDefinitions":[
{
	"name":"xray-daemon",
	"image":"ourcustomimage",
	"portMappings":[
	{
		"containerPort":2000,
		"protocol":"udp"
	}
	]
},

That should do the trick.
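To show how the sidecar fits into a complete task definition, here is a minimal Python sketch that builds one as a dictionary. The function name `build_task_definition`, the family name, the `app` container and the CPU/memory sizes are placeholders chosen for illustration. A real Fargate task also needs an execution role and a task role with permission to write to X-Ray, which are left out here. `AWS_XRAY_DAEMON_ADDRESS` is the environment variable the X-Ray SDKs read; 127.0.0.1:2000 is already their default, so setting it only makes the wiring explicit.

```python
import json


def build_task_definition(app_image, xray_image, family="my-app"):
    """Return a minimal awsvpc task definition with an X-Ray daemon sidecar."""
    return {
        "family": family,  # placeholder name
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "256",  # placeholder sizing
        "memory": "512",
        "containerDefinitions": [
            {
                "name": "xray-daemon",
                "image": xray_image,
                "essential": False,
                "portMappings": [{"containerPort": 2000, "protocol": "udp"}],
            },
            {
                "name": "app",  # hypothetical application container
                "image": app_image,
                "essential": True,
                # Containers in an awsvpc task share localhost, so the SDK
                # reaches the daemon without any links.
                "environment": [
                    {"name": "AWS_XRAY_DAEMON_ADDRESS", "value": "127.0.0.1:2000"}
                ],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_task_definition("ourappimage", "ourcustomimage"), indent=2))
```

The printed JSON can be used as a starting point for a task definition file once the roles and real image names are filled in.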

Logging

The logging is a bit confusing, especially when you hit an error. But this is what you should see after a successful start:

[Info] Initializing AWS X-Ray daemon 3.2.0
[Debug] Listening on UDP 0.0.0.0:2000
[Info] Using buffer memory limit of 39 MB
[Info] 624 segment buffers allocated
[Debug] Using proxy address: http://proxy.int.terra10.nl:3128
[Debug] Fetch region eu-central-1 from commandline/config file
[Info] Using region: eu-central-1
[Debug] ARN of the AWS resource running the daemon:
[Debug] No Metadata set for telemetry records
[Debug] Using Endpoint: https://xray.eu-west-1.amazonaws.com
[Debug] Telemetry initiated
[Info] HTTP Proxy server using X-Ray Endpoint : https://xray.eu-west-1.amazonaws.com
[Debug] Using Endpoint: https://xray.eu-west-1.amazonaws.com
[Debug] Batch size: 50
[Info] Starting proxy http server on 0.0.0.0:2000
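A clean startup log only shows that the daemon is listening, not that segments actually make it through the proxy. As a rough smoke test, the Python sketch below sends one hand-made segment to the daemon over UDP, using the daemon's datagram format (a JSON header line followed by the segment document). The helper names and the segment name `proxy-smoke-test` are made up for illustration, and the script assumes it runs in a container of the same task so the daemon is reachable on 127.0.0.1:2000.

```python
import json
import os
import socket
import time


def build_segment(name="proxy-smoke-test"):
    """Build a minimal, completed X-Ray segment document."""
    now = time.time()
    # Trace ID: version 1, 8 hex digits of epoch seconds, 24 random hex digits.
    trace_id = "1-{:08x}-{}".format(int(now), os.urandom(12).hex())
    return {
        "name": name,
        "id": os.urandom(8).hex(),  # 16 hex digits
        "trace_id": trace_id,
        "start_time": now - 0.1,
        "end_time": now,
    }


def encode_datagram(segment):
    """Prefix the segment with the header line the daemon expects."""
    header = json.dumps({"format": "json", "version": 1})
    return (header + "\n" + json.dumps(segment)).encode("utf-8")


def send_segment(segment, host="127.0.0.1", port=2000):
    """Fire the segment at the daemon's UDP listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_datagram(segment), (host, port))


if __name__ == "__main__":
    send_segment(build_segment())
```

UDP gives no acknowledgement, so the result has to be read from the daemon's log: either a batch is sent successfully, or an error like the "Sending segment batch failed" message shown earlier appears.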

Hope it helps!