What is Protobuf?

By yuseferi, 12 July, 2022
Golang protobuf, how to use protobuf in Go

In a large-scale microservice architecture there is a lot of communication between services (both sync and async).
After several days of discussion and investigation, we finally decided to use "Google Protocol Buffers" as the serializer in our infra.

I would like to share my experience with you.

What is protobuf? 

In Google’s own words: “Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.”

You can use it as a binary interchange format to send data over the wire, to communicate with 3rd party applications that support Protocol Buffers, or even to create a lightweight but extensible file format.

Protocol Buffers are:

  • Flexible and extensible: You can update your data structure (or protocol) without breaking deployed programs that use an “older” format.
  • Compact: The serialized format is compact, making it ideal for transfer over (wireless) networks. The output is 3 to 10 times smaller than corresponding data in XML or JSON format.
  • Fast to parse and generate: The binary format can be generated and parsed very quickly. According to Google, it can be 20 to 100 times faster to parse compared to XML.

Of course, there are other binary serialization formats out there, each with its own strengths and weaknesses. For example, BSON, which can be used (among other things) to convert JSON to BSON (binary JSON) and vice versa. BSON is good for representing JSON data in a fast-to-parse format, but it is not designed to be compact. Also, formats like JSON and XML are good for representing unstructured data. Protocol Buffers require that your data is structured (although it is flexible enough to update the structure in the future).

According to Google, Protobuf is 3 to 10 times smaller and 20 to 100 times faster than XML.

I'm one of those people who understand things better through examples, so I would like to explain it with one.

Now, let’s take a football game and represent the data in XML, JSON, and Protobuf.

XML:

<game>
  <home>Real Madrid</home>
  <away>Liverpool</away>
  <venue>Paris Sant Germain Stadium</venue>
  <date>22-04-2022</date>
</game>

JSON:

{
  "home": "Real Madrid",
  "away": "Liverpool",
  "venue": "Paris Sant Germain Stadium",
  "date": "22-04-2022"
}

Protobuf:

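[10 11 82 101 97 108 32 77 97 100 114 105 100 18 9 76 105 118 101 114 112 111 111 108 26 26 ...]

The bytes shown above are the encoded message (truncated here). Each field starts with a tag byte and a length: 10 is the tag for field 1 (home) and 11 is the length of its value. The string “Real Madrid” itself is spelled out starting from position 2 in the array: R = 82, e = 101 and so on. Then 18 is the tag for field 2 (away), 9 is its length, and the bytes of “Liverpool” follow.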

To learn more about this encoding, you can check out Google’s official documentation here: Protocol Buffers Encoding.
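To make the byte layout concrete, here is a minimal sketch that encodes a single string field by hand using only the standard library. The helper names appendVarint and appendStringField are my own, not part of any protobuf package, and the sketch only covers string fields; in real code you would let the proto library do this.

```go
package main

import "fmt"

// appendVarint appends v in protobuf varint form: 7 bits per byte,
// with the high bit set on every byte except the last one.
func appendVarint(buf []byte, v uint64) []byte {
	for v >= 0x80 {
		buf = append(buf, byte(v)|0x80)
		v >>= 7
	}
	return append(buf, byte(v))
}

// appendStringField appends one string field: a tag, the length of the
// value, and then the raw bytes of the value.
func appendStringField(buf []byte, fieldNumber int, value string) []byte {
	const lengthDelimited = 2 // wire type used for strings
	tag := uint64(fieldNumber)<<3 | lengthDelimited
	buf = appendVarint(buf, tag)
	buf = appendVarint(buf, uint64(len(value)))
	return append(buf, value...)
}

func main() {
	buf := appendStringField(nil, 1, "Real Madrid")
	fmt.Println(buf) // [10 11 82 101 97 108 32 77 97 100 114 105 100]
}
```

The tag 10 comes from (1 << 3) | 2, that is field number 1 combined with the length-delimited wire type.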

At this scale, the size is mostly similar. But when we scale it to larger data, the size starts to show a huge difference.
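If you want to check the sizes yourself, the following rough sketch compares the compact JSON size of the sample game with the size the protobuf encoding would have. It uses only the standard library; the game struct and the protoSize helper are hypothetical names of mine, and the size formula assumes string fields with field numbers below 16 (so the tag fits in one byte).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// game mirrors the sample data; it is not the generated protobuf type.
type game struct {
	Home  string `json:"home"`
	Away  string `json:"away"`
	Venue string `json:"venue"`
	Date  string `json:"date"`
}

// varintLen returns how many bytes n occupies when written as a varint.
func varintLen(n int) int {
	size := 1
	for n >= 0x80 {
		n >>= 7
		size++
	}
	return size
}

// stringFieldSize is one tag byte plus the length varint plus the payload.
// proto3 does not write empty strings at all, so they cost zero bytes.
func stringFieldSize(s string) int {
	if s == "" {
		return 0
	}
	return 1 + varintLen(len(s)) + len(s)
}

// protoSize adds up the encoded size of the four string fields.
func protoSize(g game) int {
	return stringFieldSize(g.Home) + stringFieldSize(g.Away) +
		stringFieldSize(g.Venue) + stringFieldSize(g.Date)
}

// jsonSize returns the length of the compact JSON encoding, or -1 on error.
func jsonSize(g game) int {
	b, err := json.Marshal(g)
	if err != nil {
		return -1
	}
	return len(b)
}

func main() {
	g := game{"Real Madrid", "Liverpool", "Paris Sant Germain Stadium", "22-04-2022"}
	fmt.Println("JSON bytes:    ", jsonSize(g))
	fmt.Println("Protobuf bytes:", protoSize(g))
}
```

Most of the saving in this small example comes from field names: JSON repeats them in every message, while protobuf only writes the field number.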

So, now let’s bring Protobuf to our Go code.

Installing Packages

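go_workspace go get github.com/golang/protobuf/proto
go_workspace go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go_workspace export PATH=$PATH:$(go env GOPATH)/bin

This installs the proto library and the protoc-gen-go plugin. Since Go 1.18, go get no longer installs binaries, so the plugin is installed with go install from the google.golang.org/protobuf module, where it is now maintained. In module mode, run go get inside a module created with go mod init. You also need the protoc compiler itself, which is distributed separately from the Go packages.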

Now, let’s go and define the protobuf for the game object.

game.proto

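syntax = "proto3";

package main;

// Tells protoc-gen-go to generate Go package main in the current directory.
option go_package = "./;main";

message Game {
  string home = 1;
  string away = 2;
  string venue = 3;
  string date = 4;
}

We start by specifying the syntax version of the proto file. Here we set syntax to ‘proto3’. Next, we define the package in which this object is to be used; the go_package option does the same for the generated Go code, and recent versions of protoc-gen-go refuse to generate code without it. After that, we define the format of our Game object: a message of type Game with the fields home, away, venue and date. The number after each field is its field number, which identifies the field in the encoded bytes.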

Now that we have defined the proto file, we compile it with protoc.

go_workspace/src protoc --go_out=. *.proto

This will generate a game.pb.go file, which has the auto-generated code. For a detailed understanding of this code, check out Google’s documentation: Go generated code.
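To give a feeling for what ends up in game.pb.go, here is a hand-written and heavily simplified approximation of the generated type. This is only a sketch: the real generated code contains additional internal fields, reflection metadata and a getter for every field, so do not treat it as the actual output of protoc.

```go
package main

import "fmt"

// Game is a simplified stand-in for the struct that protoc-gen-go generates
// from the Game message. Each proto field becomes an exported Go field.
type Game struct {
	Home  string
	Away  string
	Venue string
	Date  string
}

// GetHome mirrors the shape of a generated getter: it is safe to call on a
// nil pointer and returns the zero value in that case.
func (g *Game) GetHome() string {
	if g != nil {
		return g.Home
	}
	return ""
}

func main() {
	var missing *Game
	fmt.Printf("%q\n", missing.GetHome()) // no panic on a nil message

	final := &Game{Home: "Real Madrid"}
	fmt.Println(final.GetHome())
}
```

The nil check is the main reason to prefer the getters over direct field access when a message might be missing.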

main.go

package main

import (
  "fmt"
  "log"

  "github.com/golang/protobuf/proto"
)

func main() {
  // Build a Game message using the struct generated in game.pb.go.
  championsLeaugeFinal := &Game{
    Home:  "Real Madrid",
    Away:  "Liverpool",
    Venue: "Paris Sant Germain Stadium",
    Date:  "22-04-2022",
  }

  // Serialize the message to the protobuf binary format.
  data, err := proto.Marshal(championsLeaugeFinal)
  if err != nil {
    log.Fatal("marshaling error: ", err)
  }
  fmt.Println(data)

  // Decode the bytes back into a fresh Game message.
  newChampionsLeaugeFinal := &Game{}
  err = proto.Unmarshal(data, newChampionsLeaugeFinal)
  if err != nil {
    log.Fatal("unmarshaling error: ", err)
  }

  // Read the decoded values through the generated getters.
  fmt.Println("Home: ", newChampionsLeaugeFinal.GetHome())
  fmt.Println("Away: ", newChampionsLeaugeFinal.GetAway())
  fmt.Println("Venue: ", newChampionsLeaugeFinal.GetVenue())
  fmt.Println("Date: ", newChampionsLeaugeFinal.GetDate())
}

In the above code, we use the Game struct defined in game.pb.go to fill in the details of the championsLeaugeFinal object. We use the Marshal function of the proto library to convert the object to the Protobuf format. The encoded bytes can then be decoded with the Unmarshal function. Finally, we use the getters generated in game.pb.go (GetHome, GetAway, GetVenue and GetDate) to read the values from the decoded object.

go_workspace/src go run main.go game.pb.go
[10 11 82 101 97 108 32 77 97 100 114 105 100 18 9 76 105 118 101 114 112 111 111 108 26 26 80 97 114 105 115 32 83 97 110 116 32 71 101 114 109 97 105 110 32 83 116 97 100 105 117 109 34 10 50 50 45 48 52 45 50 48 50 50]
Home:  Real Madrid
Away:  Liverpool
Venue:  Paris Sant Germain Stadium
Date:  22-04-2022
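To make the Unmarshal step less of a black box, here is a rough sketch of reading such bytes by hand with only the standard library. decodeStringFields is a hypothetical helper of mine, not part of the proto package, and it only understands string fields; the real Unmarshal handles every field type.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// decodeStringFields walks the buffer field by field and returns the string
// values keyed by field number. Anything that is not a length-delimited
// field is rejected to keep the sketch short.
func decodeStringFields(data []byte) (map[int]string, error) {
	fields := make(map[int]string)
	for len(data) > 0 {
		// The tag holds the field number (upper bits) and wire type (low 3 bits).
		tag, n := binary.Uvarint(data)
		if n <= 0 {
			return nil, errors.New("invalid tag")
		}
		data = data[n:]
		if tag&7 != 2 {
			return nil, fmt.Errorf("unsupported wire type %d", tag&7)
		}

		// The length tells us how many payload bytes belong to this field.
		length, n := binary.Uvarint(data)
		if n <= 0 || uint64(len(data)-n) < length {
			return nil, errors.New("truncated field")
		}
		data = data[n:]

		fields[int(tag>>3)] = string(data[:length])
		data = data[length:]
	}
	return fields, nil
}

func main() {
	// Fields 1 and 2 of the Game message, written out by hand.
	data := append([]byte{10, 11}, "Real Madrid"...)
	data = append(data, 18, 9)
	data = append(data, "Liverpool"...)

	fields, err := decodeStringFields(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("Home: ", fields[1])
	fmt.Println("Away: ", fields[2])
}
```

Notice that the decoder never sees the names home or away. It only sees field numbers, which is why the .proto file is needed on both sides.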

Don’t forget to include game.pb.go in the go run command, since it contains the generated Game type. Now we have a small example up and running. In the real world, though, the data is rarely this simple, so let’s look at nested fields next.

We will use the same Game object, but this time we move the home and away fields into a separate Team message and embed it in Game.

game.proto:

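syntax = "proto3";

package main;

// Tells protoc-gen-go to generate Go package main in the current directory.
option go_package = "./;main";

message Team {
  string home = 1;
  string away = 2;
}

message Game {
  Team team = 1;
  string venue = 2;
  string date = 3;
}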

So again, we create the auto-generated code.

go_workspace/src protoc --go_out=. *.proto

And we update the main.go file correspondingly.

main.go

package main

import (
  "fmt"
  "log"

  "github.com/golang/protobuf/proto"
)

func main() {
  // The nested Team message is set as a pointer to the generated Team struct.
  championsLeaugeFinal := &Game{
    Venue: "Paris Sant Germain Stadium",
    Date:  "22-04-2022",
    Team: &Team{
      Home: "Real Madrid",
      Away: "Liverpool",
    },
  }

  // Serialize the message, including the nested Team.
  data, err := proto.Marshal(championsLeaugeFinal)
  if err != nil {
    log.Fatal("marshaling error: ", err)
  }
  fmt.Println(data)

  // Decode the bytes back into a fresh Game message.
  newChampionsLeaugeFinal := &Game{}
  err = proto.Unmarshal(data, newChampionsLeaugeFinal)
  if err != nil {
    log.Fatal("unmarshaling error: ", err)
  }

  // GetTeam is nil-safe, so the chained getters work even if Team is unset.
  fmt.Println("Home: ", newChampionsLeaugeFinal.GetTeam().GetHome())
  fmt.Println("Away: ", newChampionsLeaugeFinal.GetTeam().GetAway())
  fmt.Println("Venue: ", newChampionsLeaugeFinal.GetVenue())
  fmt.Println("Date: ", newChampionsLeaugeFinal.GetDate())
}

Now we run the program again:

go_workspace/src go run main.go game.pb.go
[10 24 10 11 82 101 97 108 32 77 97 100 114 105 100 18 9 76 105 118 101 114 112 111 111 108 18 26 80 97 114 105 115 32 83 97 110 116 32 71 101 114 109 97 105 110 32 83 116 97 100 105 117 109 26 10 50 50 45 48 52 45 50 48 50 50]
Home:  Real Madrid
Away:  Liverpool
Venue:  Paris Sant Germain Stadium
Date:  22-04-2022
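A nested message is encoded just like a string: the inner message is serialized first, and the resulting bytes are written as a length-delimited field of the outer message. The sketch below rebuilds the nested Game by hand with the standard library only. The helpers appendVarint, appendBytesField and encodeGame are hypothetical names of mine, and the sketch assumes the Team and Game layout from the game.proto above.

```go
package main

import "fmt"

// appendVarint appends v in protobuf varint form.
func appendVarint(buf []byte, v uint64) []byte {
	for v >= 0x80 {
		buf = append(buf, byte(v)|0x80)
		v >>= 7
	}
	return append(buf, byte(v))
}

// appendBytesField appends a length-delimited field: tag, length, payload.
// Strings and embedded messages both use this layout.
func appendBytesField(buf []byte, fieldNumber int, payload []byte) []byte {
	buf = appendVarint(buf, uint64(fieldNumber)<<3|2)
	buf = appendVarint(buf, uint64(len(payload)))
	return append(buf, payload...)
}

// encodeGame encodes Team first, then embeds it as field 1 of Game.
func encodeGame(home, away, venue, date string) []byte {
	var team []byte
	team = appendBytesField(team, 1, []byte(home))
	team = appendBytesField(team, 2, []byte(away))

	var game []byte
	game = appendBytesField(game, 1, team) // nested message as raw bytes
	game = appendBytesField(game, 2, []byte(venue))
	game = appendBytesField(game, 3, []byte(date))
	return game
}

func main() {
	data := encodeGame("Real Madrid", "Liverpool", "Paris Sant Germain Stadium", "22-04-2022")
	fmt.Println(data[:4]) // [10 24 10 11]
}
```

The first two bytes belong to Game (field 1, 24 bytes of Team follow), and the next two belong to Team (field 1, 11 bytes of home follow).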




Conclusion: if you have a lot of communication between services, and some of the payloads are large, using protobuf can help decrease network traffic and deliver higher performance.

References and additional reads:
https://auth0.com/blog/beating-json-performance-with-protobuf/

https://developers.google.com/protocol-buffers/docs/gotutorial

https://blog.grijjy.com/2017/04/25/binary-serialization-with-google-protocol-buffers/