Introduction

Base images are a fundamental component of containerized applications. They provide the foundation on which applications are built and play a crucial role in defining the security posture of every derived container. Misunderstanding how base images work, or failing to properly manage them, can lead to gaps in vulnerability detection and compliance reporting. This paper explains what base images are, how Aqua identifies and handles them, and the best practices organizations should adopt to ensure security and traceability.


What is a Base Image?

In container technology, a base image is the starting point of a container build. It contains the initial filesystem, libraries, and runtime environment. All other layers defined in the Dockerfile are stacked on top of this foundation.

For example:

FROM ubuntu:22.04

In this case, the ubuntu:22.04 image acts as the base, and any subsequent commands in the Dockerfile add layers above it.

Base images may be:

  • Operating system images (e.g., Ubuntu, Debian, Alpine).

  • Runtime images (e.g., Python, Node.js, Java).

  • Minimal or empty images (e.g., scratch).


How Aqua Recognizes Base Images

A Dockerfile declares its base in the FROM line, but Aqua does not depend on that declaration. Instead, Aqua takes a content-driven approach and requires the base image itself to be scanned with Aqua so it can:

  • Understand the exact contents of the base image.

  • Attribute vulnerabilities and licenses inherited by child images.

  • Establish accurate lineage for compliance and reporting.

Aqua relies on the immutable digest (SHA256) rather than just the name:tag reference. This prevents ambiguity because tags can change, while digests uniquely represent the precise image content.
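
The difference between a tag reference and a digest reference can be checked mechanically. The following is a minimal sketch, not part of Aqua: the function names is_digest_pinned and unpinned_from_lines are hypothetical, and the parsing covers only simple FROM lines (it treats references built from ARG variables as unpinned).

```python
import re
from typing import List

# Matches a full digest suffix such as "@sha256:<64 hex characters>".
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(reference: str) -> bool:
    """Return True if the image reference ends with a full SHA256 digest."""
    return DIGEST_RE.search(reference) is not None


def unpinned_from_lines(dockerfile_text: str) -> List[str]:
    """Return FROM references in a Dockerfile that are not digest-pinned."""
    unpinned = []
    stage_names = set()  # aliases introduced with "FROM ... AS <name>"
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Drop optional flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        reference = args[0]
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2].lower())
        # "scratch" and earlier build stages are not registry images.
        if reference == "scratch" or reference.lower() in stage_names:
            continue
        if not is_digest_pinned(reference):
            unpinned.append(reference)
    return unpinned
```

A pipeline step could call unpinned_from_lines on each Dockerfile and report any reference it returns, so that tag-only bases are reviewed before the build runs.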

If a base image is not scanned, Aqua has no reliable way to identify it and therefore cannot display it as the parent of a child image.


Digest Consistency and Lineage Tracking

Base image recognition in Aqua is strictly tied to the digest. If the digest of a base image changes, Aqua no longer associates the original base with its children. This ensures accuracy: from a content perspective, a child image built on a new digest is fundamentally different.

For example:

  • A child image is built from ubuntu:22.04 at digest sha256:abcd.

  • When Aqua scans this image, it records the digest and establishes the relationship between the child and its base image. 

  • Later, ubuntu:22.04 is updated and resolves to digest sha256:efgh.

  • Aqua does not associate the new base image with the existing child image, because the child was never built from the updated digest. Maintaining this distinction keeps lineage tracking accurate.

  • For the new base image to be recognized, the updated digest must also be scanned. After that scan, Aqua shows only the child images built on the updated base image; earlier child images lose their base image association.

This approach avoids incorrect mappings and prevents misleading vulnerability inheritance.
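
The sequence above can be traced with a small toy model. This is only a sketch of the behavior as this paper describes it, not Aqua's implementation or data model: the LineageIndex class and its methods are hypothetical, and the short digests are the placeholder values from the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LineageIndex:
    """Toy model of digest-based lineage; not Aqua's real data model."""

    # Base reference (e.g. "ubuntu:22.04") -> most recently scanned digest.
    scanned_bases: Dict[str, str] = field(default_factory=dict)
    # Child image name -> (base reference, base digest) recorded at scan time.
    children: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def scan_base(self, reference: str, digest: str) -> None:
        """Record a base image scan; a new digest replaces the old one."""
        self.scanned_bases[reference] = digest

    def scan_child(self, name: str, base_reference: str, base_digest: str) -> None:
        """Record a child image scan and the base digest it was built on."""
        self.children[name] = (base_reference, base_digest)

    def base_of(self, child: str) -> Optional[str]:
        """Return the associated base, or None if the digests do not match."""
        reference, digest = self.children[child]
        if self.scanned_bases.get(reference) == digest:
            return f"{reference}@{digest}"
        return None


index = LineageIndex()
index.scan_base("ubuntu:22.04", "sha256:abcd")
index.scan_child("app:1.0", "ubuntu:22.04", "sha256:abcd")
before_update = index.base_of("app:1.0")  # associated with sha256:abcd

# The tag moves to sha256:efgh and a new child is built on it.
index.scan_child("app:2.0", "ubuntu:22.04", "sha256:efgh")
unscanned = index.base_of("app:2.0")      # None until the new digest is scanned

index.scan_base("ubuntu:22.04", "sha256:efgh")
after_new = index.base_of("app:2.0")      # associated with sha256:efgh
after_old = index.base_of("app:1.0")      # None: the earlier association is lost
```

The model keys every association on the digest, so a tag that moves never silently re-parents an existing child image.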


Why This Approach Matters

  • Accuracy: Ensures vulnerabilities are tied to the actual content used in production, not a floating tag.

  • Transparency: Gives security teams clear visibility into which base each image depends on.

  • Reliability: Prevents accidental or incorrect attribution of issues when base images are updated.

  • Compliance: Helps maintain evidence that the exact artifact deployed has been scanned and approved.


Best Practices for Customers

  • Always scan base images first
    Ensure that the base image (by digest) is scanned before scanning any child images. This allows Aqua to correctly establish the parent-child relationship.

  • Rescan when base images update
    If a base image is updated and resolves to a new digest, scan the new digest in Aqua. Then rebuild and rescan the child images that depend on it to restore lineage visibility.

  • Monitor and control digest drift
    Implement controls in CI/CD pipelines to detect when a tag points to a new digest. Review and validate changes before allowing them into production.

  • Use consistent build practices
    Standardize how base images are pulled and referenced across teams and pipelines. Avoid mixing tag-only references with digest-pinned references, as this may lead to inconsistent Aqua mappings.

  • Scan images consistently
    Different scan methods can produce different results, so choose one method and apply it consistently. Image scans can be initiated from CI/CD pipelines, triggered manually from the Aqua UI, or executed through registry auto-pull.
    Note: If a filesystem scan of an image is initiated, Aqua cannot correlate the results back to the original base image data.

  • Integrate Aqua scanning early in the pipeline
    Incorporate scanning of base and child images into your CI/CD flow to catch digest changes early, rather than discovering them later in runtime or reporting.

  • Communicate lineage changes
    Educate teams that when a base image digest changes, previously built child images will lose their base association in Aqua. This is expected behavior and ensures accuracy in vulnerability attribution.
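
As an illustration of the digest drift control recommended above, the following minimal sketch compares the digests a team has approved with the digests that tags currently resolve to. The function name detect_digest_drift is hypothetical, and both mappings are assumed inputs: in a real pipeline the resolved digests would come from a registry query, which is not shown here.

```python
from typing import Dict, Tuple


def detect_digest_drift(
    approved: Dict[str, str], resolved: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {tag: (approved_digest, resolved_digest)} for tags that moved."""
    drift = {}
    for tag, approved_digest in approved.items():
        current = resolved.get(tag)
        # Tags missing from the resolved mapping are skipped, not reported.
        if current is not None and current != approved_digest:
            drift[tag] = (approved_digest, current)
    return drift


# Placeholder digests in the style of the example earlier in this paper.
approved = {"ubuntu:22.04": "sha256:abcd", "alpine:3.19": "sha256:1111"}
resolved = {"ubuntu:22.04": "sha256:efgh", "alpine:3.19": "sha256:1111"}
drifted = detect_digest_drift(approved, resolved)
```

A pipeline could fail the build, or open a review task, whenever the returned mapping is not empty, so that the new digest is scanned and validated before child images are rebuilt on it.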


Conclusion

Base images form the backbone of containerized applications, but managing them securely requires precision. Aqua’s digest-based scanning approach ensures that only accurately identified base images are associated with child images. By scanning base images, pinning them to digests, and monitoring updates, organizations can maintain a clear, verifiable chain of trust and ensure vulnerabilities are tracked correctly across the container lifecycle.