AIP-4290

Docker interface

A consequence of using individual generators for API client library generation is that each generator has its own set of dependencies and requirements in order to run.

This is reasonable for a user who wishes to generate many libraries for a single environment, but presents challenges for a user wishing to generate a single API for many languages or environments. Users need a way to generate libraries easily and quickly, with minimal ramp-up per language.

Guidance

Client library generators should ship Docker images providing the generator and exposing a common interface, so that generating the same API in multiple languages usually only requires substituting in the appropriate Docker image. Docker images for Google-authored generators will follow a consistent scheme.

CLI usage

The expected user command to invoke the code generator in a Docker image (from the proto import root, on a POSIX machine):

$ docker run --rm --user $UID \
  --mount type=bind,source=`pwd`/a/b/c/v1/,destination=/in/a/b/c/v1/,readonly \
  --mount type=bind,source=/path/to/dest/,destination=/out/ \
  gcr.io/gapic-images/{GENERATOR} \
  [-- additional options...]

Note: Even though each component of this is standard in the Docker ecosystem (other than the destination path issue, which is a result of how protoc handles imports), this is still a rather long command. We can provide a shortcut script to further simplify this, but such a script would be for convenience and not a replacement for this interface.

Container composition

Containers must include:

  • A current version of protoc, the protocol buffer compiler.
  • Common protos permitted to be used by all APIs (api-common-protos).
  • The applicable code generator plugin, as well as any dependencies it requires.
    • The code generator plugin itself should be added using an ADD or COPY statement from the host machine at build time and installed locally; it should not pull from a package manager. (This leads to catch-22 situations when cutting releases.)
    • Images may include either a pre-compiled binary of the plugin, or the installed source code, depending on the needs of the applicable ecosystem.
    • Installation of dependencies should use appropriate package managers.

The common protos and the protoc compiler are supplied by an independent image (gcr.io/gapic-images/api-common-protos). Both protoc and the common protos can be retrieved from this image into a generator’s image using the COPY --from syntax (see multi-stage builds). This is the preferred approach as it follows Docker conventions, and allows the protos to be versioned independently.

Base images

TL;DR: Each language probably wants language:x.y-alpine or language:x.y-slim. For example, ruby:2.5-alpine or python:3.7-slim. (Alpine images are smaller but idiosyncratic.)

The following guidelines apply to selecting base images (sorted roughly from most important to least important):

  • Images should generally be based off an official image for the latest stable version of the language in which the generator is implemented.
  • Images should be able to install required system dependencies from a well-understood package manager.
  • Images should be ultimately based off of Alpine, Debian, or Ubuntu. This is to ensure we benefit from GCR’s vulnerability scanning.
  • Images should endeavor to be as small as possible, in line with the general expectations of the Docker community:
    • Use the smallest base image you can. Alpine-based images are great if possible, but may not always be reasonable. “Slim” Debian images are usually the next best (and probably significantly more feasible in many situations).

Mount points

protoc must read protos (representing the API to be generated) from disk, and must write the final output (the client library) to disk. Because the user has the API protos on the host machine, and will ultimately need the output to go to said host machine, Docker images should use two mount points. This creates a hole in the abstraction layer: the user must mount the appropriate locations on the host machine to the appropriate locations in the container.

The expected locations in the container must be constant, and consistent between all generator container images:

  • /in/: The location of the protos to be generated. This must be the import root.
    • Example: If generating protos for the Language API, the protos in the Docker image must live in /in/google/cloud/language/v1/.
  • /out/: The location to which the client library shall be written.

Plugin options

Some micro-generators support configuration provided via protoc plugin options. In such cases, the options must be routed from the CLI input to the protoc command.

The ultimate protoc invocation could look like the following:

protoc --proto_path {path/to/common/protos} --proto_path /in/ \
       --{LANG}_gapic_out /out/ \
       --{LANG}_gapic_opt "go-gapic-package=GO_PACKAGE_VALUE" \
       `find /in/ -name *.proto`

A resulting invocation of the Docker image would be as follows:

$ docker run --rm --user $UID \
      --mount type=bind,source=`pwd`/a/b/c/v1/,destination=/in/a/b/c/v1/,readonly \
      --mount type=bind,source=/path/to/dest/,destination=/out/ \
      gcr.io/gapic-images/{GENERATOR} \
      --go-gapic-package GO_PACKAGE_VALUE

Thus shortcut scripts written to wrap the Docker image invocation must pass all options occurring after -- to the underlying docker run command. The internal Docker image must provide the conversion from usual shell syntax to the protoc option syntax.

Client library generators that make use of plugin options must accept those options as either flags or key=value pairs. (If a generator receives a string without an = character, that is a flag, and the implied value is true.) If multiple options are provided, they are comma-separated, to conform with the protoc behavior if multiple --opt flags are specified.

Additionally, generators should prefix all understood option keys with the target language for that generator (e.g. go-gapic-package, java-gapic-package), and should use kebab-case for keys (in order to match Docker, since protoc is inconsistent).

Microgenerators must not error on option keys that they do not recognize, although they may issue a warning.

Publishing images

Images for Google-created generators should be published in gcr.io/gapic-images, a dedicated project in Google Container Registry.

Images should follow the naming scheme:

gcr.io/gapic-images/gapic-generator-{lang}

CI should be configured to push a new Docker image to the registry when releases are made. When a release is tagged in GitHub (with a version number, such as 1.0.3), the CI service should build an image based on the code at that tag.

The resulting image should be tagged with each component of the version number, as well as latest, and the resulting tags pushed the registry. (This is in addition to pushing to a package manager if appropriate, which is outside the scope of this AIP.)

This means that a release tag of 1.0.3 in GitHub would result in pushing the following four tags to GCR:

  • gcr.io/gapic-images/gapic-generator-{lang}:1
  • gcr.io/gapic-images/gapic-generator-{lang}:1.0
  • gcr.io/gapic-images/gapic-generator-{lang}:1.0.3
  • gcr.io/gapic-images/gapic-generator-{lang}:latest

Note: These rules assumes that releases have ever-increasing version numbers; this process will need to be amended slightly if a generator needs to maintain multiple version streams simultaneously.