How to share Protobuf definitions for gRPC Services?

How should .proto files be shared across the different teams and services? Or should they not be shared at all, with each service generating and packaging the corresponding client libraries instead?


In a recent client project, we wanted to introduce gRPC to simplify interprocess communication between microservices.

gRPC is a fantastic technology that improves developer productivity. In this case, it would replace the current approach of implementing internal REST services with Akka HTTP and a semi-automatic JSON encoder/decoder. That approach requires a fair amount of custom code, which the gRPC compiler will generate in the future.

Since all services are written in Scala, the ScalaPB project is a perfect fit for generating Scala code from the .proto definition files.

The main question is:

How should the .proto files be shared across the different teams and services? Or should they not be shared at all, letting each service generate and package the corresponding client libraries?

Going forward without a clear convention would hurt the developer experience ("is the service offering a client library or do they want me to copy the .proto file?"). It might even lead to runtime errors caused by wrong or outdated definitions.

A lot has been written about this topic. I've linked some of the most useful articles below, where the authors did a wonderful job of describing their experiences and approaches. There is no consensus on how to share the .proto files across services, though.

In the following section, I will look into the different ways of sharing .proto definitions across services and try to discuss the pros and cons.

Ways to share protobuf definitions

Single repository for .proto files

Put all .proto definition files in a single repository. Define a folder structure to organize the files, in most cases one folder per service.

A build pipeline can use this repository to generate client libraries, or services can link it as a git submodule and generate the code on the fly.
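As a rough sketch of what such a build pipeline step could look like, the following shell function walks a one-folder-per-service layout and prints one protoc invocation per service. The function name plan_codegen, the directory layout, and the use of --java_out are assumptions for illustration only; a Scala project would more likely plug in the ScalaPB generator. The function only prints the commands (a dry run), so the plan can be inspected before anything is generated.

```shell
#!/bin/sh
# Sketch under assumptions: <root>/<service>/*.proto, one folder per service.
# Prints (does not run) one protoc command per service folder.
plan_codegen() {
  root="$1"   # directory that contains the service folders
  out="$2"    # base directory for the generated code
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    service=$(basename "$dir")
    # Collect the .proto files of this service; skip folders without any.
    set -- "$dir"*.proto
    [ -e "$1" ] || continue
    echo "protoc --proto_path=$root --java_out=$out/$service $*"
  done
}
```

Calling plan_codegen with the repository's proto folder and an output folder lists the commands a pipeline would run. Note that protoc expects the output directories to exist, so a real pipeline would create them first.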

Summary

  • organize all .proto files in a single repository
  • can be linked as a git submodule into other projects
  • each service has its own folder

Advantages

  • All service definitions in a single place
  • Definition of all internal services and models

Disadvantages

  • The repository might become big

Separate repository for each service

Instead of putting all .proto files in a single repository, you create a dedicated repository for each service. Put your service in a {service} repository and the corresponding .proto files in a repository called {service}-proto.

Compared to the single repository, you gain more independence. Also, instead of building a single library containing all generated code, your builds are split per service, which leads to better-scoped dependencies. You only link the proto libraries for the services you are implementing or consuming.

Summary

  • each service has a separate repository like {service}-proto
  • Build client libraries from the repository or link using git submodule

Advantages

  • reduce build time since client libraries are only built once
  • better scoping since a library contains code for a single service only

Disadvantages

  • Lots of small repositories and build pipelines to manage

Copy and Paste .proto files

For completeness, let's mention the option of copying and pasting the .proto files into every repository that implements or consumes a service. This might be suitable for quick prototypes or scenarios where you have only a single server and client, but it should be avoided in general.

You will lose track of the different versions of a file. For common definitions, you'll lose track of ownership easily.

Summary

  • Copy and paste the .proto files from the server repository into the client repository
  • generate code locally

Advantages

  • Easy to use, no additional build setup required.

Disadvantages

  • multiple copies of the same .proto definition
  • copies become outdated
  • might become unclear which service "owns" the .proto definition

gRPC Server Reflection Protocol

Summary

  • The server exposes the Server Reflection Protocol
  • The client uses the reflection service to get definitions during build or runtime

Advantages

Disadvantages

  • Not (yet) supported by all target languages

Using Libraries vs git submodules

Regardless of whether you use a single repository or dedicated repositories for your .proto files, the question remains whether you build and publish the generated code as libraries or use git submodules to link the .proto files.

Let's look into both scenarios.

Publishing generated code as a library

In that case, you generate your code and package it using the package manager for your target language (e.g. as a JAR, a Gem, or an npm package).

The generated library can then be used by the server and clients to implement or consume the services. With this approach, it is important to look into the development cycle.

Let's imagine the use case where you have to add a field to a message, something like this:

message Person {
  optional int32 id = 1;
  optional string name = 2;
  optional string email = 3;
}

For this simple change, you have to build a new library with a new version tag. Let's say your library is at version 1.0.0: you give it version 1.1.0 for the additional field and publish it to your central repository (e.g. Nexus).
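The version bump for such a backward-compatible addition could be scripted as part of the release step. The following is a minimal sketch, assuming versions follow a plain MAJOR.MINOR.PATCH scheme; the function name next_minor_version is hypothetical.

```shell
#!/bin/sh
# Sketch: compute the next minor version for a MAJOR.MINOR.PATCH string.
# Assumes three purely numeric components separated by dots.
next_minor_version() {
  old_ifs=$IFS
  IFS=.
  # Split the argument on dots into $1 (major), $2 (minor), $3 (patch).
  set -- $1
  IFS=$old_ifs
  # Increment the minor component and reset the patch component.
  echo "$1.$(( $2 + 1 )).0"
}
```

In a release script, the result could feed the tag or the published artifact version, for example via "$(next_minor_version 1.0.0)".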

The service that implements the server needs to be updated as well. Update the version of the dependency from 1.0.0 to 1.1.0. Then you can apply any required modifications to the code base (e.g. mapping the new field) and build and deploy the new version of the service.

What I like about this approach is that it almost forces a developer to plan the required data structures and (gRPC) services. Each change requires a new version of the library. On the other hand, this can become cumbersome for rapid prototyping.

Linking .proto files as git submodule

Your .proto files live in a dedicated git repository; let's call it my-proto here. You then link that repository into your service's repository as a git submodule.

To add the my-proto repository to your project, use the following command:

$ cd your-service-repository
$ git submodule add https://github.com/myorg/my-proto
Cloning into 'my-proto'...
remote: Counting objects: 11, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 11 (delta 0), reused 11 (delta 0)
Unpacking objects: 100% (11/11), done.
Checking connectivity... done.

You'll find a new folder called my-proto in your project. You can now configure your build to generate the code.

When someone clones the service repository, submodules are not initialized by default. You have to run the following command manually after cloning:

$ git submodule update --init
Submodule 'my-proto' (https://github.com/myorg/my-proto) registered for 
Cloning into 'my-proto'...
remote: Counting objects: 11, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 11 (delta 0), reused 11 (delta 0)
Unpacking objects: 100% (11/11), done.
Checking connectivity... done.

You can work with the submodule like a normal git repository. This means you can make changes inside it, then commit and push them to make them available to others.

With this approach, the initial development of a new service might be easier, since you can modify and extend the .proto definitions while implementing the service. You're not required to publish each change and modify the dependency version.

On the other hand, working with git submodules needs some care to make sure you're building against the right version. But this can be enforced in your CD pipeline.
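One way to enforce this is a pipeline step that inspects the output of git submodule status. That command prefixes each line with - when the submodule is not initialized, + when the checked-out commit differs from the one recorded in the containing repository, and U when the submodule has merge conflicts. The sketch below fails in any of those cases; the function name check_submodule_status is hypothetical, and your pipeline may need stricter or looser rules.

```shell
#!/bin/sh
# Sketch: reads `git submodule status` output on stdin and returns non-zero
# when any submodule is not at the commit recorded in the parent repository.
check_submodule_status() {
  rc=0
  while IFS= read -r line; do
    case "$line" in
      -*) echo "not initialized: ${line#?}" >&2; rc=1 ;;
      +*) echo "differs from recorded commit: ${line#?}" >&2; rc=1 ;;
      U*) echo "merge conflict: ${line#?}" >&2; rc=1 ;;
    esac
  done
  return "$rc"
}
```

In a pipeline, it would be used as git submodule status | check_submodule_status, placed before the code generation step so a mismatched my-proto checkout stops the build early.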

Conclusion

We looked into the different ways of sharing .proto files across services and weighed their pros and cons. Then we compared sharing generated gRPC code via your package manager with linking the .proto files as git submodules into your project.

All approaches have their pros and cons. Define your approach and make sure it is followed in your organization. Linking .proto files as git submodules might be the better option for rapid prototyping if you expect frequent changes to your proto files.

References