Chapter 6: Interacting with Remote Data Sources

In the last chapter, we talked about dealing with common data formats and showed how we can read and write data in those formats. But in that chapter, we were simply dealing with data that was accessible through a filesystem.

While a filesystem may expose files that actually live on remote devices through services such as the Network File System (NFS) or Server Message Block (SMB), many other remote data sources exist.

In this chapter, we will look at some common ways to send and receive data in remote data sources. This will focus on accessing data on remote systems using the Structured Query Language (SQL), REpresentational State Transfer (REST), and Google Remote Procedure Call (gRPC). You will learn how to access common SQL data stores, with a focus on PostgreSQL. We will also explore how Remote Procedure Call (RPC) services are created and queried using REST- and gRPC-style RPC methodologies.

With the skills you gain here, you will be able to connect and query data in a SQL database, add new entries to the database, request a remote action from a service, and gather information from a remote service.

We will cover the following topics in this chapter:

  • Accessing SQL databases
  • Developing REST services and clients
  • Developing gRPC services and clients

In the next section, we will dive into accessing data stored in SQL databases, with a focus on PostgreSQL.

Let's get started!

Technical requirements

The code files for this chapter can be downloaded from https://github.com/PacktPublishing/Go-for-DevOps/tree/rev0/chapter/6/grpc

Accessing SQL databases

DevOps engineers commonly need to access data stored in database systems. SQL is the standard language for communicating with the database systems that a DevOps engineer will encounter in their day-to-day work.

Go provides a standard library for interacting with SQL-based systems called database/sql. The interfaces provided by that package, with the addition of a database driver, allow a user to work with several different SQL databases.

In this section, we will look at how we can access a Postgres database to perform basic SQL operations using Go.

Important Note

Examples in this section require you to set up a Postgres database; doing so is beyond the scope of this book. This section is also not a guide to SQL, so some basic SQL knowledge is assumed.

You can find information regarding how to install Postgres for your OS at https://www.postgresql.org/download/. If you prefer to run Postgres in a local Docker container, you can find that information at https://hub.docker.com/_/postgres.

Connecting to a Postgres database

Connecting to a Postgres database requires a Postgres database driver. The currently recommended third-party package is github.com/jackc/pgx. This package implements a SQL driver for database/sql and also provides its own methods and types for Postgres-specific features.

The choice between database/sql and the Postgres-specific types depends on whether you need compatibility with other databases. Using database/sql allows you to write functions that work against any SQL database, while using Postgres-specific features sacrifices that compatibility and makes migration to another database more difficult. We will show how to perform our examples using both methods.

Here is how to connect using a standard SQL package without extra Postgres features:

/*

dbURL might look like:

"postgres://username:password@localhost:5432/database_name"

*/

conn, err := sql.Open("pgx", dbURL)
if err != nil {
     return fmt.Errorf("connect to db error: %s", err)
}
defer conn.Close()

ctx, cancel := context.WithTimeout(
     context.Background(),
     2*time.Second,
)
defer cancel()

if err := conn.PingContext(ctx); err != nil {
     return err
}

Here, we open a connection to Postgres using the pgx driver that will be registered when you import the following package:

_ "github.com/jackc/pgx/v4/stdlib"

This is an anonymous import, meaning we are not using stdlib directly. This is done when we want a side effect, such as when registering a driver with the database/sql package.

The Open() call doesn't test our connection, so we call conn.PingContext() to verify that we will be able to make calls to the database.

When you want to use pgx-specific types for Postgres, the setup is slightly different, starting with a different package import:

"github.com/jackc/pgx/v4/pgxpool"

To create that connection, type the following:

conn, err := pgxpool.Connect(ctx, dbURL)
if err != nil {
     return fmt.Errorf("connect to db error: %s", err)
}
defer conn.Close()

This uses a connection pool to connect to the database for performance. You will notice that we don't have a PingContext() call, as the native connection tests the connection as part of Connect().
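
If you need more control over the pool itself, pgxpool also lets you parse the connection string into a config and adjust it before connecting. Here is a minimal sketch; the MaxConns value is only an illustrative number, not a recommendation:

cfg, err := pgxpool.ParseConfig(dbURL)
if err != nil {
     return fmt.Errorf("parse db config error: %s", err)
}
// Cap the number of connections the pool may open (illustrative value).
cfg.MaxConns = 10

conn, err := pgxpool.ConnectConfig(ctx, cfg)
if err != nil {
     return fmt.Errorf("connect to db error: %s", err)
}
defer conn.Close()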

Now that you know how to connect to Postgres, let's look at how we can make queries.

Querying a Postgres database

Let's consider making a call to your SQL database to fetch some information about a user that is held in a table.

Using the standard library, type the following:

type UserRec struct {
     User string
     DisplayName string
     ID int
}

func GetUser(ctx context.Context, conn *sql.DB, id int) (UserRec, error) {
     const query = `SELECT "User","DisplayName" FROM users WHERE "ID" = $1`

     u := UserRec{ID: id}
     // Scan() takes a destination for each selected column.
     err := conn.QueryRowContext(ctx, query, id).Scan(&u.User, &u.DisplayName)
     return u, err
}

This example does the following:

  • Creates UserRec to store SQL data for a user
  • Creates a query statement called query
  • Queries our database for a user with the requested ID
  • Returns UserRec and an error if we had one

We can increase the efficiency of this example by using a prepared statement in an object instead of just a function:

type Storage struct {
     conn *sql.DB
     getUserStmt *sql.Stmt
}

func NewStorage(ctx context.Context, conn *sql.DB) (*Storage, error) {
     getUserStmt, err := conn.PrepareContext(
          ctx,
          `SELECT "User","DisplayName" FROM users WHERE "ID" = $1`,
     )
     if err != nil {
          return nil, err
     }
     return &Storage{conn: conn, getUserStmt: getUserStmt}, nil
}

func (s *Storage) GetUser(ctx context.Context, id int) (UserRec, error) {
     u := UserRec{ID: id}
     err := s.getUserStmt.QueryRowContext(ctx, id).Scan(&u.User, &u.DisplayName)
     return u, err
}

This example does the following:

  • Creates a reusable object
  • Stores *sql.Stmt, which increases the efficiency when doing repeated queries
  • Defines a NewStorage constructor that creates our object

Because these examples use only the generic standard library types, any database driver that works with database/sql could be used. Switching Postgres for MariaDB would work as long as MariaDB had the same table names and schema.
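
As a sketch of what that swap might look like, only the driver registration and the connection string change (the driver package and DSN below are for MySQL/MariaDB; the credentials and host are placeholders):

// Anonymous import registers the "mysql" driver with database/sql.
_ "github.com/go-sql-driver/mysql"

conn, err := sql.Open("mysql", "username:password@tcp(localhost:3306)/database_name")
if err != nil {
     return fmt.Errorf("connect to db error: %s", err)
}
defer conn.Close()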

If we use the Postgres-specific library, the same code is written like so:

err = conn.QueryRow(ctx, query, id).Scan(&u.User, &u.DisplayName)

return u, err

This implementation looks and works in a similar way to the standard library version. However, the conn object here is the concrete *pgxpool.Pool type rather than the driver-agnostic *sql.DB. And while the functionality looks similar, the pgx connection supports queries with Postgres-specific types and syntax, such as jsonb, that database/sql does not.

There is no need to use a prepared statement for non-transactions when using Postgres-specific calls. The call information is automatically cached.

The preceding example was simplistic in that we were pulling a specific entry. What if we wanted to also have a method to retrieve all users with IDs between two numbers? We could define this using the standard library:

/*

stmt contains `SELECT "User","DisplayName","ID" FROM users

WHERE "ID" >= $1 AND "ID" < $2`

*/

func (s *Storage) UsersBetween(ctx context.Context, start, end int) ([]UserRec, error) {
     recs := []UserRec{}

     rows, err := s.usersBetweenStmt.QueryContext(ctx, start, end)
     if err != nil {
          return nil, err
     }
     defer rows.Close()

     for rows.Next() {
          rec := UserRec{}
          if err := rows.Scan(&rec.User, &rec.DisplayName, &rec.ID); err != nil {
               return nil, err
          }
          recs = append(recs, rec)
     }
     return recs, rows.Err()
}

The Postgres-specific version is almost identical; it simply uses conn.Query() in place of the prepared s.usersBetweenStmt.

Null values

SQL has a concept of null values for basic types such as Booleans, strings, and int32. Go doesn't have this concept; instead, it provides zero values for those types.

When SQL allows a column to have a null value, the standard library provides special null types in database/sql:

  • sql.NullBool
  • sql.NullByte
  • sql.NullFloat64
  • sql.NullInt16
  • sql.NullInt32
  • sql.NullInt64
  • sql.NullString
  • sql.NullTime

When you design your schema, it is better to use zero values instead of null values. But sometimes, you need to tell the difference between a value being set and the zero value. In those cases, you can use these special types in place of the standard type.

For example, if our UserRec could have a null DisplayName, we can change the string type to sql.NullString:

type UserRec struct {

     User string

     DisplayName sql.NullString

     ID int

}

You can see an example of how the server sets these values depending on the value that the column holds for DisplayName here: https://go.dev/play/p/KOkYdhcjhdf.
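
When scanning a row, you then check the Valid field before trusting the value. A minimal sketch, assuming the same users table and query as before:

u := UserRec{ID: id}
err := conn.QueryRowContext(ctx, query, id).Scan(&u.User, &u.DisplayName)
if err != nil {
     return UserRec{}, err
}
// The String field is only meaningful when Valid is true.
if u.DisplayName.Valid {
     fmt.Println("display name:", u.DisplayName.String)
} else {
     fmt.Println("display name was NULL")
}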

Writing data to Postgres

Writing data into a database is simple but requires some consideration of the syntax. The two major operations that a user wants when writing data are as follows:

  • Updating an existing entry
  • Inserting a new entry

In standard SQL, there is no single statement that means update the entry if it exists; insert it if not. As this is a common operation, each database offers some way to do this with its own special syntax. When using the standard library, you must choose between doing an update or an insert. If you do not know whether the entry exists, you will need to use a transaction, which we will detail in a bit.
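
For reference, Postgres's own special syntax for this is the ON CONFLICT clause on INSERT. A minimal sketch, assuming the "ID" column carries a unique constraint:

const upsert = `INSERT INTO users ("User","DisplayName","ID")
     VALUES ($1, $2, $3)
     ON CONFLICT ("ID") DO UPDATE
     SET "User" = EXCLUDED."User", "DisplayName" = EXCLUDED."DisplayName"`

_, err := conn.ExecContext(ctx, upsert, u.User, u.DisplayName, u.ID)
return err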

Doing an update or insert is simply using a different SQL syntax and the ExecContext() call:

func (s *Storage) AddUser(ctx context.Context, u UserRec) error {

     _, err := s.addUserStmt.ExecContext(

          ctx,

          u.User,

          u.DisplayName,

          u.ID,

     )

     return err

}

func (s *Storage) UpdateDisplayName(ctx context.Context, id int, name string) error {

     _, err := s.updateDisplayName.ExecContext(

          ctx,

          name,

          id,

     )

     return err

}

In this example, we have added two methods:

  • AddUser() adds a new user into the system.
  • UpdateDisplayName() updates the display name of a user with a specific ID.

Both use prepared *sql.Stmt values, which would be fields in the object, similar to getUserStmt.

The major difference when implementing using the Postgres-native package is the method name that is called and the lack of a prepared statement. Implementing AddUser() would look like the following:

func (s *Storage) AddUser(ctx context.Context, u UserRec) error {

     const stmt = `INSERT INTO users (User,DisplayName,ID)

     VALUES ($1, $2, $3)`

     _, err := s.conn.Exec(

          ctx,

          stmt,

          u.User,

          u.DisplayName,

          u.ID,

     )

    return err

}

Sometimes, it is not enough to just do a read or a write to the database. Sometimes, we need to do multiple actions atomically and treat them as a single action. So, in the next section, we will talk about how to do this with transactions.

Transactions

Transactions provide a sequence of SQL operations that are executed on the server as one piece of work. This is commonly used to provide some type of atomic operation where a read and a write are required or to extract data on a read before doing a write.

Transactions are easy to create in Go. Let's create an AddOrUpdateUser() call that will look to see whether a user exists before adding or updating our data:

func (s *Storage) AddOrUpdateUser(ctx context.Context, u UserRec) (err error) {
     const (
          getStmt = `SELECT "ID" FROM users WHERE "User" = $1`
          insertStmt = `INSERT INTO users (User,DisplayName,ID)
          VALUES ($1, $2, $3)`
          updateStmt = `UPDATE "users" SET "User" = $1,
          "DisplayName" = $2 WHERE "ID" = $3`
     )

     tx, err := s.conn.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
     if err != nil {
          return err
     }
     // Commit if err is nil when we exit; roll back otherwise.
     defer func() {
          if err != nil {
               tx.Rollback()
               return
          }
          err = tx.Commit()
     }()

     var id int
     err = tx.QueryRowContext(ctx, getStmt, u.User).Scan(&id)
     if err != nil {
          // The user does not exist, so insert it.
          if err == sql.ErrNoRows {
               _, err = tx.ExecContext(ctx, insertStmt, u.User, u.DisplayName, u.ID)
               return err
          }
          return err
     }

     // The user exists, so update it.
     _, err = tx.ExecContext(ctx, updateStmt, u.User, u.DisplayName, u.ID)
     return err
}

This code does the following:

  • Creates a transaction with an isolation level of LevelSerializable
  • Uses a defer statement to determine whether we had an error:
    • If we did, we roll back the entire transaction.
    • If not, we attempt to commit the transaction.
  • Queries to find whether the user exists:
    • It determines this by checking the error type.
    • If the error is sql.ErrNoRows, we did not find the user.
    • If the error is anything else, it was a system error.
  • Executes an insert statement if we didn't find the user
  • Executes an update statement if we did find the user

The keys to a transaction are the following:

  • conn.BeginTx, which starts the transaction
  • tx.Commit(), which commits our changes
  • tx.Rollback(), which reverts our changes

A defer statement is an excellent way to handle either Commit() or Rollback() once the transaction has been created. It ensures that when the function ends, either one or the other is executed.

The isolation level is important for a transaction as it affects the performance and reliability of your system. Go provides multiple levels of isolation; however, not all database systems will support all levels of isolation.

You can read more about isolation levels here: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Isolation_levels.

Postgres-specific types

So far in our examples, we have shown you how to use both the standard library and Postgres-specific objects to interact with Postgres. But we haven't really shown a compelling reason to use Postgres objects.

Postgres objects shine when you need to use types or capabilities that aren't a part of the SQL standard. Let's rewrite our transaction example, but instead of storing data across standard columns, let's have our Postgres database only have two columns:

  • An ID of the int type
  • Data of the jsonb type

jsonb is not part of the SQL standard and cannot be implemented with the standard SQL library. jsonb can greatly simplify your life, as it allows you to store JSON data while querying using JSON fields:

func (s *Storage) AddOrUpdateUser(ctx context.Context, u UserRec) (err error) {
     const (
          getStmt = `SELECT "ID" FROM "users" WHERE "ID" = $1`
          updateStmt = `UPDATE "users" SET "Data" = $1 WHERE "ID" = $2`
          addStmt = `INSERT INTO "users" (ID,Data) VALUES ($1, $2)`
     )

     tx, err := s.conn.BeginTx(
          ctx,
          pgx.TxOptions{
               IsoLevel: pgx.Serializable,
               AccessMode: pgx.ReadWrite,
               DeferrableMode: pgx.NotDeferrable,
          },
     )
     if err != nil {
          return err
     }
     defer func() {
          if err != nil {
               tx.Rollback(ctx)
               return
          }
          err = tx.Commit(ctx)
     }()

     var id int
     err = tx.QueryRow(ctx, getStmt, u.ID).Scan(&id)
     if err != nil {
          // pgx returns pgx.ErrNoRows when no row was found.
          if err == pgx.ErrNoRows {
               _, err = tx.Exec(ctx, addStmt, u.ID, u)
               return err
          }
          return err
     }

     _, err = tx.Exec(ctx, updateStmt, u, u.ID)
     return err
}

This example is different in a few ways:

  • It has additional AccessMode and DeferrableMode parameters.
  • We can pass our object, UserRec, as our Data jsonb column.

The access and deferrable modes add extra constraints that are not available directly with the standard library.

Using jsonb is a boon. Now, we can do searches on our tables with WHERE clauses that can filter on the jsonb field values.

You will also notice that pgx is smart enough to know our column type and automatically convert our UserRec into JSON.
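
As a sketch of what such a filter might look like, the ->> operator extracts a JSON field as text; the value being matched here is just a placeholder:

const byUser = `SELECT "ID","Data" FROM "users" WHERE "Data"->>'User' = $1`

rows, err := s.conn.Query(ctx, byUser, "someuser")
if err != nil {
     return err
}
defer rows.Close()
for rows.Next() {
     var id int
     u := UserRec{}
     // pgx decodes the jsonb column back into our struct.
     if err := rows.Scan(&id, &u); err != nil {
          return err
     }
     fmt.Println(id, u.DisplayName)
}
return rows.Err()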

If you'd like to know more about Postgres value types, you can visit https://www.postgresql.org/docs/9.5/datatype.html.

If you'd like to know more about jsonb and functions to access its values, visit https://www.postgresql.org/docs/9.5/functions-json.html.

Other options

Besides the standard library and database-specific packages, there are also Object-Relational Mappers (ORMs). ORMs are a popular model for managing data between your services and data storage.

Go's most popular ORM is called GORM, which can be found here: https://gorm.io/index.html.
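
As a rough sketch of the ORM style (GORM v2 syntax; the DSN, model, and values here are only for illustration):

import (
     "gorm.io/driver/postgres"
     "gorm.io/gorm"
)

type User struct {
     ID          int
     User        string
     DisplayName string
}

func example(dsn string) error {
     db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
     if err != nil {
          return err
     }
     // Insert a row from a struct.
     if err := db.Create(&User{ID: 1, User: "someuser"}).Error; err != nil {
          return err
     }
     // Read it back by primary key.
     u := User{}
     return db.First(&u, 1).Error
}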

Another popular framework that also includes support for REST and web services is Beego, which you can find here: https://github.com/beego/beego.

Storage abstractions

Many developers are tempted to use storage systems directly in their code, passing around a connection to a database. This is not optimal in that it can cause problems when you need to do the following:

  • Add caching layers before storage access.
  • Migrate to a new storage system for your service.

Abstracting storage behind an internal Application Programming Interface (API) of interfaces will allow you to change storage layers later by simply implementing the interfaces with the new backend. You can then plug in the new backend at any time.

A simple example of this might be adding an interface for getting user data:

type UserStorage interface {

     User(ctx context.Context, id string) (UserRec, error)

     AddUser(ctx context.Context, u UserRec) error

     UpdateDisplayName(ctx context.Context, id string, name string) error

}

This interface allows you to implement your storage backend using Postgres, local files, SQLite, Azure Cosmos DB, in-memory data structures, or any other storage medium.

This has the benefit of allowing migration from one storage medium to another by plugging in a new implementation. As a side benefit, you can decouple tests from using a database. Instead, most tests can use an in-memory data structure. This allows you to test your functionality without bringing up and tearing down infrastructure, which would be necessary with a real database.

Adding a cache layer becomes a simple exercise of writing a UserStorage implementation that checks the cache on reads and, on a miss, falls back to your data store implementation, as sketched below. You can then swap it in for the original and everything keeps working.
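
A minimal sketch of such a wrapper, assuming a simple map as the cache (a real implementation would also need locking and an eviction policy):

type cachedUserStorage struct {
     store UserStorage        // the real backend (Postgres, SQLite, and so on)
     cache map[string]UserRec // stand-in for a real cache
}

func (c *cachedUserStorage) User(ctx context.Context, id string) (UserRec, error) {
     if u, ok := c.cache[id]; ok {
          return u, nil
     }
     u, err := c.store.User(ctx, id)
     if err != nil {
          return UserRec{}, err
     }
     c.cache[id] = u
     return u, nil
}

func (c *cachedUserStorage) AddUser(ctx context.Context, u UserRec) error {
     return c.store.AddUser(ctx, u)
}

func (c *cachedUserStorage) UpdateDisplayName(ctx context.Context, id string, name string) error {
     // Drop any cached copy so the next read sees the new name.
     delete(c.cache, id)
     return c.store.UpdateDisplayName(ctx, id, name)
}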

Note that everything described here about abstracting storage behind an interface also applies to how other services access your data. A SQL API should only be used by your own application to store and read data; other services should go through a stable RPC interface. This provides the same type of abstraction, allowing you to move data backends without migrating users.

Case study – data migration of an orchestration system – Google

One of the systems I was involved with during my tenure at Google was an orchestration system for automating network changes. The system received automation instructions and executed them against various targets. These operations might involve pushing files via Secure File Transfer Protocol (SFTP), interacting with network routers, updating authoritative data stores, or running state verifications.

With operations, it is critical that data representing the state of a workflow is always up to date. This includes not only the currently running workflows but also the states of previous workflows, which are used to create new workflows.

To ease our operational burden, we wanted to move the storage system for workflows from Bigtable to Spanner. Bigtable required a more complicated setup to handle failover to a backup cell when problems occurred, while Spanner was designed to handle this as part of the system design. This removed the need for us to intervene when cells had problems.

The storage layer was hidden behind a storage interface. Storage was initialized in our main() and passed around to other modules that required it. This meant that we could replace the storage layer with a new implementation.

We implemented a new storage interface that wrote data to both Bigtable and Spanner while reading from them both, using the latest data stamp and updating the records if needed.

This allowed us to operate using both data stores while our historical data was being transferred. Once synchronization was complete, we moved our binaries to a version that only had a Spanner implementation. Our migration was complete with no service downtime while thousands of critical operations were running.

So far in this chapter, we have learned about how to use database/sql to access generic data stores and Postgres specifically. We learned how to read and write to Postgres and implement transactions. The benefits of using database/sql versus a database-specific library such as pgx were discussed. And finally, we showed how hiding your implementations behind interface abstractions can allow you to change storage backends more easily and test code relying on storage hermetically.

Next, we will look into accessing RPC services using REST or gRPC.

Developing REST services and clients

Before the web and distributed systems that now permeate the cloud space, standards for communicating between systems were not in widespread use. This communication is often called an RPC. This simply means that a program on one machine has a method to call a function running on a different machine and receive any output.

Monolithic applications were the norm, and servers tended to either be siloed per application and vertically scaled or run as jobs on larger, more specialized hardware from companies such as IBM, Sun, SGI, or Cray. When systems did need to communicate with each other, they tended to use their own custom wire formats, such as what you would see with Microsoft SQL Server.

With the web defining the internet of the 2000s, large monolithic systems could not provide the compute power behind services such as Google Search or Facebook at any reasonable cost point. To power these services, companies needed to treat large collections of standard PCs as a single system. Where a single system could communicate between processes using Unix sockets or shared memory calls, companies needed common and secure ways to communicate between processes running on different machines.

As HTTP became the de facto standard for communication between systems, RPC mechanisms of today use some form of HTTP for data transport. This allows the RPC to transit systems more easily, such as load balancers, and easily utilize security standards, such as Transport Layer Security (TLS). It also means that as the HTTP transport is upgraded, these RPC frameworks can leverage the hard work of hundreds if not thousands of engineers.

In this section, we are going to talk about one of the most popular RPC mechanisms, REST. REST uses HTTP calls and whatever messaging format you want, although the majority of cases use JSON for messaging.

REST for RPCs

Writing REST clients in Go is fairly simple. Chances are that if you have been developing applications in the last 10 years, you have either used a REST client or written one. Cloud APIs for services such as Google Cloud Platform's Cloud Spanner, Microsoft's Azure Data Explorer, or Amazon DynamoDB use REST to communicate with the services via their client libraries.

REST clients can do the following:

  • Use GET, POST, PATCH, or any other type of HTTP method.
  • Support any serialization format (although this is normally JSON).
  • Allow for data streaming.
  • Support query variables.
  • Support multiple versions of an API using URL standards.

REST in Go also has the luxury of not requiring any framework to implement on the server side. Everything that is required lives in the standard library.

Writing a REST client

Let's write a simple REST client that accesses a server and receives a Quote of the Day (QOTD). To do this, the server exposes the following endpoint, which is called with the POST method: /v1/qotd.

First, let's define the message we need to send to the server:

type getReq struct {

     Author string `json:"author"`

}

type getResp struct {

     Quote string `json:"quote"`

     Error *Error `json:"error"`

}

Let's talk about what each of these does:

  • getReq details the arguments to the server's /v1/qotd function call.
  • getResp is what we expect as a return from the server's function call.

We are using field tags to allow conversion from lowercase keys into our public variables that are capitalized. For the encoding/json package to see these values for serialization, they must be public. Private fields will not be serializable:

type Error struct {

     Code ErrCode

     Msg string

}

func (e *Error) Error() string {
     return fmt.Sprintf("(code %v): %s", e.Code, e.Msg)
}

This defines a custom error type. This way, we can store error codes to return to the user. This code is defined next to our response object, but it isn't used until much later in the code we are defining.

Let's now define a QOTD client and a constructor that does some basic checks on the address and creates an HTTP client to allow us to send data to the server:

type QOTD struct {

     addr string

     client *http.Client

}

func New(addr string) (*QOTD, error) {
     if _, _, err := net.SplitHostPort(addr); err != nil {
          return nil, err
     }
     return &QOTD{addr: addr, client: &http.Client{}}, nil
}

The next step is to make a generic function for making REST calls. Because REST is so open-ended, it is hard to write one that can handle every type of REST call. A good practice when writing REST servers is to support only the POST method, avoid query variables, and keep URLs simple. However, in practice, you will deal with a wide variety of REST call styles if you don't control the service:

func (q *QOTD) restCall(ctx context.Context, endpoint string, req, resp interface{}) error {
     if _, ok := ctx.Deadline(); !ok {
          var cancel context.CancelFunc
          ctx, cancel = context.WithTimeout(ctx, 2*time.Second)
          defer cancel()
     }

     b, err := json.Marshal(req)
     if err != nil {
          return err
     }

     hReq, err := http.NewRequestWithContext(
          ctx,
          http.MethodPost,
          endpoint,
          bytes.NewBuffer(b),
     )
     if err != nil {
          return err
     }

     hResp, err := q.client.Do(hReq)
     if err != nil {
          return err
     }
     defer hResp.Body.Close()

     b, err = io.ReadAll(hResp.Body)
     if err != nil {
          return err
     }

     return json.Unmarshal(b, resp)
}

This code does the following:

  • Checks our context for a deadline:
    • If it has one, it is honored
    • If not, a default one is set
    • cancel() is called after the call is done
  • Marshals a request into JSON.
  • Creates a new *http.Request that does the following:
    • Uses the POST method
    • Talks to an endpoint
    • Has io.Reader storing the JSON request
  • Uses the client to send a request and get a response.
  • Retrieves the response from the body of http.Response.
  • Unmarshals JSON into the response object.

You will notice that req and resp are both interface{}. This allows us to use this routine with any struct that will represent a JSON request or response.

Now, we will use that in a method that gets a QOTD by an author:

func (q *QOTD) Get(ctx context.Context, author string) (string, error) {
     const endpoint = `/v1/qotd`

     resp := getResp{}
     err := q.restCall(ctx, path.Join(q.addr, endpoint), getReq{Author: author}, &resp)
     switch {
     case err != nil:
          return "", err
     case resp.Error != nil:
          return "", resp.Error
     }
     return resp.Quote, nil
}

This code does the following:

  • Defines an endpoint for our get function on the server.
  • Calls our restCall() method, which does the following:
    • Uses path.Join() to unite our server address and URL endpoint.
    • Creates a getReq object as the req argument of restCall().
    • Reads the response into our resp response object.
    • If *http.Client returns an error, we return that error.
    • If resp.Error is set, we return it.
  • Returns the response's quote.

To see this running now, you can go here: https://play.golang.org/p/Th0PxpglnXw.

We have shown how to make a basic REST client here using HTTP POST calls and JSON. However, we have only scratched the surface of making a REST client. You may need to add authentication to the header in the form of a JSON Web Token (JWT). Our example used HTTP and not HTTPS, so there was no transport security. We also did not use compression such as Deflate or Gzip.
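
For example, attaching a JWT (or any bearer token) is just a matter of setting a header on the request before calling Do(); the token variable here is a placeholder for a real credential obtained elsewhere:

hReq, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewBuffer(b))
if err != nil {
     return err
}
// token is assumed to hold a JWT issued by your auth system.
hReq.Header.Set("Authorization", "Bearer "+token)
hResp, err := q.client.Do(hReq)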

While using http.Client is easy to do, you may want a more intelligent wrapper that handles many of these features for you. One that is worth looking at would be resty, which can be found here: https://github.com/go-resty/resty.

Writing a REST service

Now that we have a client written, let's write a REST service endpoint that can receive the request and send the user the output:

type server struct {

     serv *http.Server

     quotes map[string][]string

}

This code does the following:

  • Creates the server struct, which will act as our server
  • Uses *http.Server to serve HTTP content
  • Has quotes, a map whose keys are author names and whose values are slices of quotes

Now, we need a constructor:

func newServer(port int) (*server, error) {

     s := &server{

          serv: &http.Server{

               Addr: ":" + strconv.Itoa(port),

          },

          quotes: map[string][]string{

               // Add quotes here

          },

     }

     mux := http.NewServeMux()

mux.HandleFunc(`/v1/qotd`, s.qotdGet)

     // The muxer implements http.Handler

     // and we assign it for our server’s URL handling.

     s.serv.Handler = mux

     return s, nil

}

func (s *server) start() error {

     return s.serv.ListenAndServe()

}

This code does the following:

  • Creates a newServer constructor:
    • This has an argument of port, which is the port to run the server on.
  • Creates a server instance:
    • Makes an instance of *http.Server running at :[port]
    • Populates our quotes map
  • Adds *http.ServeMux to map URLs to methods.

    Note

    We will create the qotdGet method in a moment.

  • Creates a method called start() that will start our HTTP server.

*http.ServeMux implements the http.Handler interface that is used by *http.Server. ServeMux uses pattern matching to determine which method is called for which URL. You can read about pattern-matching syntax here: https://pkg.go.dev/net/http#ServeMux.

Now, let's create the method to answer our REST endpoint:

func (s *server) qotdGet(w http.ResponseWriter, r *http.Request) {

     req := getReq{}

     if err := req.fromReader(r.Body); err != nil {

          http.Error(w, err.Error(), http.StatusBadRequest)

          return

     }

     var quotes []string

     if req.Author == "" {

          // Map iteration order is random, so this will randomly
          // choose a set of quotes from an author.

          for _, quotes = range s.quotes {

               break

          }

     } else {

          var ok bool

          quotes, ok = s.quotes[req.Author]

          if !ok {

               b, err := json.Marshal(

                    getResp{

                         Error: &Error{

                              Code: UnknownAuthor,

                              Msg:  fmt.Sprintf("Author %q was not found", req.Author),

                        },

                    },

               )

               if err != nil {

                    http.Error(w, err.Error(), http.StatusBadRequest)

                    return

               }

               w.Write(b)

               return

          }

     }

     i := rand.Intn(len(quotes))

     b, err := json.Marshal(getResp{Quote: quotes[i]})

     if err != nil {

          http.Error(w, err.Error(), http.StatusBadRequest)

          return

     }

     w.Write(b)
}

This code does the following:

  • Implements the http.Handler interface.
  • Reads the HTTP request body and marshals it to our getReq:
    • This uses HTTP error codes with http.Error() if the request was bad
  • If the request did not contain an "author," randomly chooses an author's quotes.
  • Otherwise, finds the author and retrieves their quotes:
    • If that author did not exist, responds with getResp containing an error
  • Randomly chooses a quote and returns it to the client.

Now, we have a REST endpoint that can answer our client's RPCs. You can see this code running here: https://play.golang.org/p/Th0PxpglnXw.

This just scratches the surface of building a REST service. You can build authentication, compression, performance tracing, and so on, on top of this.

To help with bootstrapping features and removing some boilerplate, there are several third-party web frameworks and routers worth exploring.

Now that we have talked about using REST for RPCs, let's take a look at the faster alternative that is being adopted by large companies everywhere, gRPC.

Developing gRPC services and clients

gRPC provides an entire framework for RPCs based on HTTP and utilizing Google's protocol buffer format, a binary format that can convert into JSON but provides both a schema and, in many cases, a 10x performance improvement over JSON.

There are other formats in this space, such as Apache Thrift, Cap'n Proto, and Google's FlatBuffers. However, these are either not as popular and well supported, satisfy only a particular niche, or are harder to use.

gRPC, like REST, is a client/server framework for making RPC calls. Where gRPC differs is that it prefers a binary message format called protocol buffers (proto for short).

This format has a schema stored in a .proto file that is used to generate the client, server, and messages in a native library for the language of your choice using a compiler. When a proto message is marshaled for transport on the wire, the binary representation will be the same for all languages.

Let's talk more about protocol buffers, gRPC's message format of choice.

Protocol buffers

Protocol buffers define RPC messages and services in one location and can generate a library for every language with the proto compiler. Protocol buffers have the following advantages:

  • They write once and generate for every language.
  • Messages can be converted to JSON as well as binary.
  • gRPC can use a reverse proxy to provide REST endpoints, which is great for web apps.
  • Binary protocol buffers are smaller and can encode/decode at 10x the rate of JSON.

However, protocol buffers do have some negatives:

  • You must regenerate the messages on any change to the .proto file to get the changes.
  • Google's standard proto compiler is painful and confusing to use.
  • JavaScript does not have native support for gRPC, even though it supports protocol buffers.

Tooling can help with some of the negatives, and we will be using the new Buf tools, https://buf.build, to help with proto generation.

Let's take a look at what a protocol buffer .proto file looks like for a QOTD service:

syntax = "proto3";

package qotd;

option go_package = "github.com/[repo]/proto/qotd";

message GetReq {

        string author = 1;

}

message GetResp {

        string author = 1;

        string quote = 2;

}

service QOTD {

   rpc GetQOTD(GetReq) returns (GetResp) {};

}

The syntax keyword defines which version of the proto language we are using. The most common version is proto3, the third iteration of the language. All versions share the same wire format but have different feature sets and generate different language packages.

package defines the proto package name, which allows this protocol buffer to be imported by another package. We have put [repo] as a placeholder to represent the GitHub repository.

go_package defines the package path used when generating Go files. While this is declared with the option keyword, it is not optional when compiling for Go.

message defines a new message type, which in Go is generated as a struct. Entries inside message detail the fields. string author = 1 creates a field called Author of the string type in the GetReq struct, and 1 is the field's number in the proto. You cannot repeat field numbers within a message, a field's number should never change, and a field should not be removed (although it can be deprecated).

service defines a gRPC service with one RPC endpoint, GetQOTD. This call receives GetReq and returns GetResp.

Now that we have defined this protocol buffer file, we can use a proto compiler to generate packages for languages we are interested in. This will include all of our messages and the code needed to use the gRPC client and server.

Let's look at generating the Go packages from the protocol buffer file.

Stating the prerequisites

To use protocol buffers in this tutorial, you will need to install the protocol buffer compiler (protoc), the Buf tool (buf), and the Go plugins for it, protoc-gen-go and protoc-gen-go-grpc.

With these installed, you will be able to generate code for C++ and Go. Other languages require additional plugins.

Generating your packages

The first file we need to create is the buf.yaml file. We can generate the buf.yaml file inside the proto directory by entering it and issuing the following command:

buf config init

This should generate a file that has the following content:

version: v1

lint:

  use:

    - DEFAULT

breaking:

  use:

    - FILE

Next, we need a file that tells us what output to generate. Create a file called buf.gen.yaml and give it the following contents:

version: v1

plugins:

  - name: go

    out: ./

    opt:

      - paths=source_relative

  - name: go-grpc

    out: ./

    opt:

      - paths=source_relative

This indicates that we should generate our go and go-grpc files in the same directory as our .proto file.

Now, we should test that our proto will build. We can do this by issuing the following command:

buf build

If there is no output, then our proto file should compile. Otherwise, we will get a list of errors that we need to fix.

Finally, let's generate our proto files:

buf generate

If you named the proto file qotd.proto, this should generate the following:

  • qotd.pb.go, which will contain all your messages
  • qotd_grpc.pb.go, which will contain all the gRPC stubs

Now that we have our proto package, let's build a client.

Writing a gRPC client

In the root folder of your repository, let's create two directories:

  • client/, which will hold our client code
  • internal/server/, which will hold our server code

Now, let's create a client/client.go file with the following:

package client

import (

        "context"

        "time"

        "google.golang.org/grpc"

        pb "[repo]/grpc/proto"

)

type Client struct {

        client pb.QOTDClient

        conn   *grpc.ClientConn

}

func New(addr string) (*Client, error) {

        conn, err := grpc.Dial(addr, grpc.WithInsecure())

        if err != nil {

                return nil, err

        }

        return &Client{

                client: pb.NewQOTDClient(conn),

                conn: conn,

        }, nil

}

func (c *Client) QOTD(ctx context.Context, wantAuthor string) (author, quote string, err error) {

        if _, ok := ctx.Deadline(); !ok {

                var cancel context.CancelFunc

                ctx, cancel = context.WithTimeout(ctx, 2 * time.Second)

                defer cancel()

        }

        resp, err := c.client.GetQOTD(ctx, &pb.GetReq{Author: wantAuthor})

        if err != nil {

                return "", "", err

        }

        return resp.Author, resp.Quote, nil

}

This is a simple wrapper around the generated client with our connection to the server established in our New() constructor:

  • grpc.Dial() connects to the server's address:
    • grpc.WithInsecure() allows us to not use TLS. (In real services, you need to use TLS!)
  • pb.NewQOTDClient() takes a gRPC connection and returns our generated client.
  • QOTD() uses the client to make a call defined in our GetQOTD() proto:
    • This defines a timeout if one was not defined. The server receives this timeout.
    • This uses the generated client to call the server.

Creating a wrapper to use as a client isn't strictly required. Many developers prefer to have the user directly interact with the service using the generated client.

In our opinion, this is fine for simple clients. More complicated clients generally should ease the burden by either moving logic to the server or having custom client wrappers that are more language-friendly.

Now that we have defined a client, let's create our server package.

Writing a gRPC server

Let's create a server file at internal/server/server.go.

Now, let's add the following content:

package server

import (

        "context"

        "fmt"

        "math/rand"

        "net"

        "sync"

        "google.golang.org/grpc"

        "google.golang.org/grpc/codes"

        "google.golang.org/grpc/status"

        

        pb "[repo]/grpc/proto"

)

type API struct {

     pb.UnimplementedQOTDServer

     addr string

     quotes map[string][]string

     mu sync.Mutex

     grpcServer *grpc.Server

}

func New(addr string) (*API, error) {

     var opts []grpc.ServerOption

     a := &API{

          addr: addr,

          quotes: map[string][]string{

               // Insert your quote mappings here

          },

          grpcServer: grpc.NewServer(opts...),

     }

     a.grpcServer.RegisterService(&pb.QOTD_ServiceDesc, a)

     return a, nil

}

This code does the following:

  • Defines our API server:
    • pb.UnimplementedQOTDServer is a generated type that our server must embed; it provides default implementations of every RPC defined in the service, which keeps the server forward compatible. This is required.
    • addr is the address our server will run on.
    • quotes contains quotes the server is storing.
  • Defines a New() constructor:
    • This creates an instance of our API server.
    • This registers the instance with our grpcServer.

Now, let's add methods to start and stop our API server:

func (a *API) Start() error {

     a.mu.Lock()

     defer a.mu.Unlock()

     lis, err := net.Listen("tcp", a.addr)

     if err != nil {

          return err

     }

     return a.grpcServer.Serve(lis)

}

func (a *API) Stop() {

     a.mu.Lock()

     defer a.mu.Unlock()

     a.grpcServer.Stop()

}

This code does the following:

  • Defines Start() to start our server, which does the following:
    • Uses Mutex to prevent stops and starts concurrently
    • Creates a TCP listener on the address passed in New()
    • Starts the gRPC server using our listener
  • Defines Stop() to stop our server, which does the following:
    • Uses Mutex to prevent stops and starts concurrently
    • Tells the gRPC server to stop (GracefulStop() can be used instead to let in-flight RPCs finish)

Now, let's implement the GetQOTD() method:

func (a *API) GetQOTD(ctx context.Context, req *pb.GetReq) (*pb.GetResp, error) {
     var (
          author string
          quotes []string
     )

     if req.Author == "" {
          // Map iteration order is random, so this picks a random author.
          for author, quotes = range a.quotes {
               break
          }
     } else {
          author = req.Author
          var ok bool
          quotes, ok = a.quotes[req.Author]
          if !ok {
               return nil, status.Error(
                    codes.NotFound,
                    fmt.Sprintf("author %q not found", req.Author),
               )
          }
     }

     return &pb.GetResp{
          Author: author,
          Quote: quotes[rand.Intn(len(quotes))],
     }, nil
}

This code does the following:

  • Defines the GetQOTD() method that the client will call
  • Includes similar logic to our REST server
  • Uses gRPC's error type defined in the google.golang.org/grpc/status package to return gRPC error codes (a client-side sketch of checking these codes follows this list)
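
On the caller's side, these codes can be recovered from the returned error using the same status package. A minimal sketch, assuming we want to treat a missing author differently from other failures:

a, q, err := c.QOTD(ctx, "some author")
if err != nil {
     if st, ok := status.FromError(err); ok && st.Code() == codes.NotFound {
          fmt.Println("no quotes for that author")
          return nil
     }
     return err
}
fmt.Println(a, q)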

Now that we have our client and server packages, let's create a server binary to run our service.

Creating a server binary

Create a file called qotd.go that will hold our server's main() function:

package main

import (
     "flag"
     "log"

     "github.com/[repo]/internal/server"
)

var addr = flag.String("addr", "127.0.0.1:80", "The address to run on.")

func main() {

     flag.Parse()

     s, err := server.New(*addr)

     if err != nil {

          panic(err)

     }

     done := make(chan error, 1)

     log.Println("Starting server at: ", *addr)

     go func() {
          defer close(done)
          done <- s.Start()
     }()

     err = <-done
     log.Println("Server exited with error: ", err)

}

This code does the following:

  • Creates a flag, addr, that the caller passes to set the address that the server runs on.
  • Creates an instance of our server.
  • Writes that we are starting the server.
  • Starts the server.
  • If the server exits, the error is printed to the screen:
    • This might be something saying the port is already in use.

You can run this binary by using this command:

go run qotd.go --addr="127.0.0.1:2562"

If you do not pass the --addr flag, this will default to 127.0.0.1:80.

You should see the following on your screen:

Starting server at: 127.0.0.1:2562

Now, let's create a binary that uses the client to fetch a QOTD.

Creating a client binary

Create a file called client/bin/qotd.go. Then, add the following:

package main

import (

        "context"

        "flag"

        "fmt"

        "github.com/devopsforgo/book/book/code/1/4/grpc/client"

)

var (

        addr   = flag.String("addr", "127.0.0.1:80", "The address of the server.")

        author = flag.String("author", "", "The author whose quote to get")

)

func main() {

        flag.Parse()

        c, err := client.New(*addr)

        if err != nil {

                panic(err)

        }

        a, q, err := c.QOTD(context.Background(), *author)

        if err != nil {

                panic(err)

        }

        fmt.Println("Author: ", a)

        fmt.Printf("Quote of the Day: %q ", q)

}

This code does the following:

  • Sets up a flag for the address of the server
  • Sets up a flag for the author of the quote you want
  • Creates a new client instance with client.New()
  • Calls the server using the client's QOTD() method
  • Prints the results or an error to the terminal

You can run this binary by using this command:

go run qotd.go --addr="127.0.0.1:2562"

This will contact the server running at this address. If you are running the server at a different address, you will need to change this to match.

If you do not pass the --author flag, this randomly chooses an author.

You should see the following on your screen:

Author: [some author]

Quote of the Day: [some quote]

Now we've seen how to use gRPC to make a simple client and server application. But this is just the beginning of the features available to you in gRPC.

We are just scratching the surface

gRPC is a key piece of infrastructure for cloud technology such as Kubernetes. It was built after years of experience with Stubby, Google's internal predecessor. We have only scratched the surface of what gRPC can do. Here are some additional features:

  • Running a gRPC gateway to export REST endpoints
  • Providing interceptors that can deal with security and other needs (a minimal logging interceptor is sketched after this list)
  • Providing streaming data
  • TLS support
  • Metadata and trailers for extra information
  • Client-side server load balancing
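
As a taste of one of these features, a unary server interceptor wraps every RPC on the server. A minimal logging sketch, registered when the server is constructed, might look like this:

func logging(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
     start := time.Now()
     resp, err := handler(ctx, req)
     // info.FullMethod is the full RPC name, such as /qotd.QOTD/GetQOTD.
     log.Printf("method=%s duration=%s err=%v", info.FullMethod, time.Since(start), err)
     return resp, err
}

// Passed as a server option when constructing the gRPC server:
grpcServer := grpc.NewServer(grpc.UnaryInterceptor(logging))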

Here are just a few of the big companies that have made the switch:

  • Square
  • Netflix
  • IBM
  • CoreOS
  • Docker
  • CockroachDB
  • Cisco
  • Juniper Networks
  • Spotify
  • Zalando
  • Dropbox

Let's talk a little about how best to provide REST or gRPC services inside your company.

Company-standard RPC clients and servers

One of the keys to Google's tech stack success has been a consolidation around technologies. While there is certainly a lot of duplication in technology, Google standardizes on certain software and infrastructure components. Inside Google, it is rare to see a client/server not using Stubby (Google's internal gRPC).

The libraries that engineers use for RPC are written to work the same in every language. In recent years, there have been pushes by Site Reliability Engineering (SRE) organizations to have wrappers around Stubby that offer a breadth of features and best practices to prevent every team from reinventing the wheel. This includes features such as the following:

  • Authentication
  • Compression handling
  • Distributed service rate limiting
  • Retries with backoff (or circuit breaking)

This removes a lot of threats to infrastructure, such as clients retrying without any backoff, removes the cost of every team figuring out its own security model, and allows fixes to these items to be made by experts. Changes to these libraries benefit everyone and lower the cost of discovering already-made services.

As a DevOps engineer or SRE who likely carries a pager, pushing for standardization in your RPC layer can provide innumerable benefits, such as not being paged!

While choice is often seen as a good thing, having limited choices can allow development teams and operators to continue to focus on their product and not infrastructure, which is key in having robust products.

If you decide on providing a REST framework, here are a few recommended practices:

  • Only use POST.
  • Do not use query variables.
  • Use JSON only.
  • Have all arguments inside your request.

This will greatly reduce the needed code within your framework.

In this section, we learned what RPC services are and how to write clients using two popular methods, REST and gRPC. You also learned how REST has a looser set of guidelines while gRPC prefers schema types and generates the components required to use the system.

Summary

This ends our chapter on interacting with remote data sources. We looked at how to connect to SQL databases with examples using Postgres. We looked at what RPCs are and talked about the two most popular types of RPC services, REST and gRPC. Finally, we have written servers and clients for both frameworks.

This chapter has given you the ability to connect to the most popular databases and cloud services to get and retrieve data. Now you can write your own RPC services to develop cloud applications.

In the next chapter, we will utilize this knowledge to build tooling that controls jobs on remote machines.

So, without further ado, let's jump into how to write command-line tools.
