Over the past quarter I have been helping our platform team refactor a configuration service that several delivery pipelines depend on. During that review I spotted an XML External Entity (XXE) injection vector that could have exposed environment variables and IAM credentials to any engineer with access to the internal UI. This post documents how we found the issue, why the existing pipeline tests missed it, and how we closed the gap without blocking deploy velocity.
We began by mapping every place that accepted XML uploads. The configuration service relied on the Go encoding/xml package with custom entity expansion enabled for convenience. That meant a crafted payload could read arbitrary files on the server. Our first step was to reproduce the bug using an internal integration test harness—fuel for a quick proof-of-concept exploit that fetched /etc/passwd and the pod’s AWS metadata credentials.
Detection was equally important. We added an Open Policy Agent (OPA) rule that rejects XML uploads containing DOCTYPE declarations and layered it with a Falco policy that alerts on unusual access to the instance metadata URL. These controls gave us confidence that even if a similar bug reappears, we would see it instantly.
Finally, we hardened the service by switching to encoding/xml with Strict mode enabled, verifying the incoming requests with JSON schema, and migrating the UI to send JSON instead of XML. The refactor shipped behind a feature flag, so partner teams could test without downtime. If you maintain internal tools that still rely on XML, schedule an audit—XXE is alive and well in legacy automation.