Friday, May 7, 2021

Google cloud APIs: Design guide | my 60 minutes study

Here is the article. 

API Design Guide

Published on 2017-02-21 | Changelog

This is a general design guide for networked APIs. It has been used inside Google since 2014 and is the guide that Google follows when designing Cloud APIs and other Google APIs. This design guide is shared here to inform outside developers and to make it easier for us all to work together.

Cloud Endpoints developers may find this guide particularly useful when designing gRPC APIs, and we strongly recommend such developers use these design principles. However, we don't mandate its use. You can use Cloud Endpoints and gRPC without following the guide.

This guide applies to both REST APIs and RPC APIs, with specific focus on gRPC APIs. gRPC APIs use Protocol Buffers to define their API surface and API Service Configuration to configure their API services, including HTTP mapping, logging, and monitoring. HTTP mapping features are used by Google APIs and Cloud Endpoints gRPC APIs for JSON/HTTP to Protocol Buffers/RPC transcoding.

This guide is a living document and additions to it will be made over time as new style and design patterns are adopted and approved. In that spirit, it is never going to be complete and there will always be ample room for the art and craft of API design.

The requirement level keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" used in this document are to be interpreted as described in RFC 2119.

In this document, such keywords are highlighted using bold font.

Resource Oriented Design

The goal of this Design Guide is to help developers design simple, consistent, and easy-to-use networked APIs. At the same time, it also helps converge the designs of socket-based RPC APIs and HTTP-based REST APIs.

RPC APIs are often designed in terms of interfaces and methods. As more and more of these are added over time, the end result can be an overwhelming and confusing API surface, because developers must learn each method individually. This is both time-consuming and error-prone.

The REST architectural style was introduced primarily to work well with HTTP/1.1, but also to help tackle this problem. Its core principle is to define named resources that can be manipulated using a small number of methods. The resources and methods are known as the nouns and verbs of APIs. With the HTTP protocol, the resource names naturally map to URLs, and methods naturally map to the HTTP methods POST, GET, PUT, PATCH, and DELETE. This results in far fewer things to learn, since developers can focus on the resources and their relationships, and assume that they have the same small number of standard methods.

On the Internet, HTTP REST APIs have been hugely successful. In 2010, about 74% of public network APIs were HTTP REST (or REST-like) APIs, most using JSON as the wire format.

While HTTP/JSON APIs are very popular on the Internet, the amount of traffic they carry is smaller than traditional RPC APIs. For example, about half of Internet traffic in America at peak time is video content, and few people would consider using HTTP/JSON APIs to deliver such content for obvious performance reasons. Inside data centers, many companies use socket-based RPC APIs to carry most network traffic, which can involve orders of magnitude more data (measured in bytes) than public HTTP/JSON APIs.

In reality, both RPC APIs and HTTP/JSON APIs are needed for various reasons, and ideally, an API platform should provide best support for all types of APIs. This Design Guide helps you design and build APIs that conform to this principle. It does so by applying resource-oriented design principles to general API design and defines many common design patterns to improve usability and reduce complexity.

A REST API is modeled as collections of individually-addressable resources (the nouns of the API). Resources are referenced with their resource names and manipulated via a small set of methods (also known as verbs or operations).

Standard methods for REST Google APIs (also known as REST methods) are List, Get, Create, Update, and Delete. Custom methods (also known as custom verbs or custom operations) are also available to API designers for functionality that doesn't easily map to one of the standard methods, such as database transactions.

The Design Guide suggests taking the following steps when designing resource-oriented APIs (more details are covered in specific sections below):

  • Determine what types of resources an API provides.
  • Determine the relationships between resources.
  • Decide the resource name schemes based on types and relationships.
  • Decide the resource schemas.
  • Attach the minimum set of methods to resources.

A resource-oriented API is generally modeled as a resource hierarchy, where each node is either a simple resource or a collection resource. For convenience, they are often called a resource and a collection, respectively.

  • A collection contains a list of resources of the same type. For example, a user has a collection of contacts.
  • A resource has some state and zero or more sub-resources. Each sub-resource can be either a simple resource or a collection resource.

For example, the Gmail API has a collection of users; each user has a collection of messages, a collection of threads, a collection of labels, a profile resource, and several setting resources.

While there is some conceptual alignment between storage systems and REST APIs, a service with a resource-oriented API is not necessarily a database, and has enormous flexibility in how it interprets resources and methods. For example, creating a calendar event (resource) may create additional events for attendees, send email invitations to attendees, reserve conference rooms, and update video conference schedules.

The key characteristic of a resource-oriented API is that it emphasizes resources (data model) over the methods performed on the resources (functionality). A typical resource-oriented API exposes a large number of resources with a small number of methods. The methods can be either the standard methods or custom methods. For this guide, the standard methods are: List, Get, Create, Update, and Delete.

Where API functionality naturally maps to one of the standard methods, that method should be used in the API design. For functionality that does not naturally map to one of the standard methods, custom methods may be used. Custom methods offer the same design freedom as traditional RPC APIs, which can be used to implement common programming patterns, such as database transactions or data analysis.

The following sections present a few real-world examples of how to apply resource-oriented API design to large-scale services. You can find more examples in the Google APIs repository.

In these examples, the asterisk indicates one specific resource out of the list.

The Gmail API service implements the Gmail API and exposes most of Gmail functionality. It has the following resource model:

  • API service: gmail.googleapis.com
    • A collection of users: users/*. Each user has the following resources.
      • A collection of messages: users/*/messages/*.
      • A collection of threads: users/*/threads/*.
      • A collection of labels: users/*/labels/*.
      • A collection of change history: users/*/history/*.
      • A resource representing the user profile: users/*/profile.
      • A resource representing user settings: users/*/settings.

The pubsub.googleapis.com service implements the Cloud Pub/Sub API, which defines the following resource model:

  • API service: pubsub.googleapis.com
    • A collection of topics: projects/*/topics/*.
    • A collection of subscriptions: projects/*/subscriptions/*.

The spanner.googleapis.com service implements the Cloud Spanner API, which defines the following resource model:

  • API service: spanner.googleapis.com
    • A collection of instances: projects/*/instances/*.
      • A collection of instance operations: projects/*/instances/*/operations/*.
      • A collection of databases: projects/*/instances/*/databases/*.
      • A collection of database operations: projects/*/instances/*/databases/*/operations/*.
      • A collection of database sessions: projects/*/instances/*/databases/*/sessions/*.

Resource Names

In resource-oriented APIs, resources are named entities, and resource names are their identifiers. Each resource must have its own unique resource name. The resource name is made up of the ID of the resource itself, the IDs of any parent resources, and its API service name. We'll look at resource IDs and how a resource name is constructed below.

gRPC APIs should use scheme-less URIs for resource names. They generally follow the REST URL conventions and behave much like network file paths. They can be easily mapped to REST URLs: see the Standard Methods section for details.

A collection is a special kind of resource that contains a list of sub-resources of identical type. For example, a directory is a collection of file resources. The resource ID for a collection is called a collection ID.

The resource name is organized hierarchically using collection IDs and resource IDs, separated by forward slashes. If a resource contains a sub-resource, the sub-resource's name is formed by specifying the parent resource name followed by the sub-resource's ID - again, separated by forward slashes.

Example 1: A storage service has a collection of buckets, where each bucket has a collection of objects:

API Service Name         | Collection ID | Resource ID | Collection ID | Resource ID
//storage.googleapis.com | /buckets      | /bucket-id  | /objects      | /object-id

Example 2: An email service has a collection of users. Each user has a settings sub-resource, and the settings sub-resource has a number of other sub-resources, including customFrom:

API Service Name      | Collection ID | Resource ID       | Resource ID | Resource ID
//mail.googleapis.com | /users        | /name@example.com | /settings   | /customFrom

An API producer can choose any acceptable value for resource and collection IDs as long as they are unique within the resource hierarchy. You can find more guidelines for choosing appropriate resource and collection IDs below.

By splitting the resource name, such as name.split("/")[n], one can obtain the individual collection IDs and resource IDs, assuming none of the segments contains any forward slash.
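As a rough illustration of the splitting approach just described, the following Python sketch pairs up the segments of a relative resource name. The helper name split_resource_name is hypothetical, and the sketch assumes that the name strictly alternates collection IDs and single-segment resource IDs (so it would not handle a singleton such as users/*/settings).

```python
def split_resource_name(name: str) -> list[tuple[str, str]]:
    """Split a relative resource name into (collection ID, resource ID) pairs.

    Assumes every resource ID is a single segment with no forward slash.
    """
    segments = name.split("/")
    # An alternating name has an even number of non-empty segments.
    if len(segments) % 2 != 0 or not all(segments):
        raise ValueError(f"not an alternating collection/resource name: {name!r}")
    return list(zip(segments[0::2], segments[1::2]))


pairs = split_resource_name("shelves/shelf1/books/book2")
print(pairs)  # [('shelves', 'shelf1'), ('books', 'book2')]
```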

A full resource name is a scheme-less URI consisting of a DNS-compatible API service name and a resource path. The resource path is also known as the relative resource name. For example:

"//library.googleapis.com/shelves/shelf1/books/book2"

The API service name is for clients to locate the API service endpoint; it may be a fake DNS name for internal-only services. If the API service name is obvious from the context, relative resource names are often used.

A relative resource name is a URI path (path-noscheme) without the leading "/". It identifies a resource within the API service. For example:

"shelves/shelf1/books/book2"

A resource ID typically consists of one or more non-empty URI segments (segment-nz-nc) that identify the resource within its parent resource (see the examples above). A non-trailing resource ID in a resource name must have exactly one URI segment, while the trailing resource ID in a resource name may have more than one URI segment. For example:

Collection ID | Resource ID
files         | source/py/parser.py

API services should use URL-friendly resource IDs when feasible. The documentation must clearly state whether resource IDs are assigned by the client, the server, or either. For example, file names are typically assigned by clients, while email message IDs are typically assigned by servers.

A collection ID is a non-empty URI segment (segment-nz-nc) identifying the collection resource within its parent resource (see the examples above).

Because collection IDs often appear in the generated client libraries, they must conform to the following requirements:

  • Must be valid C/C++ identifiers.
  • Must be in plural form with lowerCamel case. If the term doesn't have suitable plural form, such as "evidence" and "weather", the singular form should be used.
  • Must use clear and concise English terms.
  • Overly general terms should be avoided or qualified. For example, rowValues is preferred to values. The following terms should be avoided without qualification:
    • elements
    • entries
    • instances
    • items
    • objects
    • resources
    • types
    • values
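Some of the requirements above can be checked mechanically. The following Python sketch is one possible lint helper; the function name check_collection_id and its messages are hypothetical. It only checks the lowerCamel shape and the list of overly general terms. It does not check plural form, C/C++ reserved words, or whether the term is clear English.

```python
import re

# Terms the guide says to avoid unless qualified (for example, "rowValues").
OVERLY_GENERAL = {
    "elements", "entries", "instances", "items",
    "objects", "resources", "types", "values",
}

# lowerCamel shape: a lowercase letter followed by letters and digits only.
LOWER_CAMEL = re.compile(r"[a-z][a-zA-Z0-9]*")


def check_collection_id(collection_id: str) -> list[str]:
    """Return a list of problems found; an empty list means no problem found."""
    problems = []
    if not LOWER_CAMEL.fullmatch(collection_id):
        problems.append("not a lowerCamel identifier")
    if collection_id in OVERLY_GENERAL:
        problems.append("overly general term")
    return problems
```

For instance, check_collection_id("rowValues") reports nothing, while check_collection_id("values") flags the overly general term.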

While full resource names resemble normal URLs, they are not the same thing. A single resource can be exposed by different API versions, API protocols, or API network endpoints. The full resource name does not specify such information, so it must be mapped to a specific API version and API protocol for actual use.

To use a full resource name via REST APIs, it must be converted to a REST URL by adding the HTTPS scheme before the service name, adding the API major version before the resource path, and URL-escaping the resource path. For example:

// This is a calendar event resource name.
"//calendar.googleapis.com/users/john smith/events/123"

// This is the corresponding HTTP URL.
"https://calendar.googleapis.com/v3/users/john%20smith/events/123"
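The conversion described above can be sketched in a few lines of Python. The function name to_rest_url is hypothetical; the sketch assumes the caller supplies the API major version, since the full resource name does not carry it.

```python
from urllib.parse import quote


def to_rest_url(full_name: str, major_version: str) -> str:
    """Convert a full resource name ("//service/path") into an HTTPS REST URL."""
    if not full_name.startswith("//"):
        raise ValueError("a full resource name starts with '//'")
    # Everything up to the first "/" after the prefix is the API service name.
    service, _, path = full_name[2:].partition("/")
    if not service or not path:
        raise ValueError("expected '//<service>/<resource path>'")
    # URL-escape the resource path but keep the "/" separators intact.
    return f"https://{service}/{major_version}/{quote(path, safe='/')}"


url = to_rest_url("//calendar.googleapis.com/users/john smith/events/123", "v3")
print(url)  # https://calendar.googleapis.com/v3/users/john%20smith/events/123
```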

Google APIs must represent resource names using plain strings, unless backward compatibility is an issue. Resource names should be handled like normal file paths. When a resource name is passed between different components, it must be treated as an atomic value and must not have any data loss.

For resource definitions, the first field should be a string field for the resource name, and it should be called name.

For example:

service LibraryService {
  rpc GetBook(GetBookRequest) returns (Book) {
    option (google.api.http) = {
      get: "/v1/{name=shelves/*/books/*}"
    };
  };

  rpc CreateBook(CreateBookRequest) returns (Book) {
    option (google.api.http) = {
      post: "/v1/{parent=shelves/*}/books"
      body: "book"
    };
  };
}

message Book {
  // Resource name of the book. It must have the format of "shelves/*/books/*".
  // For example: "shelves/shelf1/books/book2".
  string name = 1;

  // ... other properties
}

message GetBookRequest {
  // Resource name of a book. For example: "shelves/shelf1/books/book2".
  string name = 1;
}

message CreateBookRequest {
  // Resource name of the parent resource where to create the book.
  // For example: "shelves/shelf1".
  string parent = 1;
  // The Book resource to be created. Client must not set the `Book.name` field.
  Book book = 2;
}

Note: For consistency of resource names, the leading forward slash must not be captured by any URL template variable. For example, URL template "/v1/{name=shelves/*/books/*}" must be used instead of "/v1{name=/shelves/*/books/*}".

For any large system, there are many kinds of resources. Identifying a resource by resource IDs alone means, in practice, using a resource-specific tuple such as (bucket, object) or (user, album, photo). This creates several major problems:

  • Developers have to understand and remember such anonymous tuples.
  • Passing tuples is generally harder than passing strings.
  • Centralized infrastructures, such as logging and access control systems, don't understand specialized tuples.
  • Specialized tuples limit API design flexibility, such as providing reusable API interfaces. For example, Long Running Operations can work with many other API interfaces because they use flexible resource names.

The resource name field is named after the concept of resource "name". In general, we find that the concept of a name is confusing to developers. For example, is filename really just the name, or is it the full path? Reserving name as the standard field forces developers to choose a more precise term for other fields, such as display_name, title, or full_name.

Resource names behave like file paths. You can use printf() to generate resource names from resource IDs, and split() to parse resource names into resource IDs. If a resource ID contains "/", as can happen when the path template uses **, you need to handle it manually.
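One way to handle both cases is to drive generation and parsing from the path template itself. The Python sketch below is an illustration under stated assumptions, not a reference implementation: the helper names format_name and parse_name are hypothetical, "*" is taken to match exactly one segment, and "**" is taken to match one or more segments at the end of the template.

```python
import re


def format_name(template: str, *ids: str) -> str:
    """Fill the wildcards of a template with resource IDs, left to right."""
    remaining = list(ids)
    out = []
    for segment in template.split("/"):
        if segment in ("*", "**"):
            if not remaining:
                raise ValueError("too few resource IDs for template")
            out.append(remaining.pop(0))
        else:
            out.append(segment)
    if remaining:
        raise ValueError("too many resource IDs for template")
    return "/".join(out)


def parse_name(template: str, name: str) -> tuple:
    """Extract the resource IDs from a name that matches the template."""
    parts = []
    for segment in template.split("/"):
        if segment == "**":
            parts.append("(.+)")      # may span several segments
        elif segment == "*":
            parts.append("([^/]+)")   # exactly one segment
        else:
            parts.append(re.escape(segment))
    match = re.fullmatch("/".join(parts), name)
    if match is None:
        raise ValueError(f"{name!r} does not match {template!r}")
    return match.groups()


print(format_name("shelves/*/books/*", "shelf1", "book2"))
# shelves/shelf1/books/book2
print(parse_name("files/**", "files/source/py/parser.py"))
# ('source/py/parser.py',)
```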


Standard Methods

This chapter defines the concept of standard methods, which are List, Get, Create, Update, and Delete. Standard methods reduce complexity and increase consistency. Over 70% of API methods in the Google APIs repository are standard methods, which makes them much easier to learn and use.

The following table describes how to map standard methods to HTTP methods:

Standard Method | HTTP Mapping                | HTTP Request Body | HTTP Response Body
List            | GET <collection URL>        | N/A               | Resource* list
Get             | GET <resource URL>          | N/A               | Resource*
Create          | POST <collection URL>       | Resource          | Resource*
Update          | PUT or PATCH <resource URL> | Resource          | Resource*
Delete          | DELETE <resource URL>       | N/A               | google.protobuf.Empty**

*The resource returned from List, Get, Create, and Update methods may contain partial data if the methods support response field masks, which specify a subset of fields to be returned. In some cases, the API platform natively supports field masks for all methods.

**The response returned from a Delete method that doesn't immediately remove the resource (such as updating a flag or creating a long-running delete operation) should contain either the long-running operation or the modified resource.

A standard method may also return a long-running operation for requests that do not complete within the time span of a single API call.

The following sections describe each of the standard methods in detail. The examples show the methods defined in .proto files with special annotations for the HTTP mappings. You can find many examples that use standard methods in the Google APIs repository.

The List method takes a collection name and zero or more parameters as input, and returns a list of resources that match the input.

List is commonly used to search for resources. List is suited to data from a single collection that is bounded in size and not cached. For broader cases, the custom method Search should be used.

A batch get (such as a method that takes multiple resource IDs and returns an object for each of those IDs) should be implemented as a custom BatchGet method, rather than a List. However, if you have an already-existing List method that provides the same functionality, you may reuse the List method for this purpose instead. If you are using a custom BatchGet method, it should be mapped to HTTP GET.

Applicable common patterns: pagination, result ordering.

Applicable naming conventions: filter field, results field.

HTTP mapping:

  • The List method must use an HTTP GET verb.
  • The request message field(s) receiving the name of the collection whose resources are being listed should map to the URL path. If the collection name maps to the URL path, the last segment of the URL template (the collection ID) must be literal.
  • All remaining request message fields shall map to the URL query parameters.
  • There is no request body; the API configuration must not declare a body clause.
  • The response body should contain a list of resources along with optional metadata.

Example:

// Lists books in a shelf.
rpc ListBooks(ListBooksRequest) returns (ListBooksResponse) {
  // List method maps to HTTP GET.
  option (google.api.http) = {
    // The `parent` captures the parent resource name, such as "shelves/shelf1".
    get: "/v1/{parent=shelves/*}/books"
  };
}

message ListBooksRequest {
  // The parent resource name, for example, "shelves/shelf1".
  string parent = 1;

  // The maximum number of items to return.
  int32 page_size = 2;

  // The next_page_token value returned from a previous List request, if any.
  string page_token = 3;
}

message ListBooksResponse {
  // The field name should match the noun "books" in the method name.  There
  // will be a maximum number of items returned based on the page_size field
  // in the request.
  repeated Book books = 1;

  // Token to retrieve the next page of results, or empty if there are no
  // more results in the list.
  string next_page_token = 2;
}
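To show how page_size, page_token, and next_page_token fit together, here is a small in-memory Python sketch of a List handler and a client loop. Everything beyond the three field names is an assumption made for illustration: the function names, the default page size of 2, and the use of a decimal offset as the token (a real service would normally issue an opaque token).

```python
def list_books(books, page_size=0, page_token=""):
    """Return one page of `books` and the token for the next page.

    Assumption: the token is the decimal start offset of the next page.
    """
    default_page_size = 2  # hypothetical server-side default
    size = page_size if page_size > 0 else default_page_size
    start = int(page_token) if page_token else 0
    end = start + size
    # An empty next_page_token tells the client there are no more results.
    next_token = str(end) if end < len(books) else ""
    return {"books": books[start:end], "next_page_token": next_token}


def list_all_books(books, page_size):
    """Client loop: keep requesting pages until next_page_token is empty."""
    result, token = [], ""
    while True:
        page = list_books(books, page_size, token)
        result.extend(page["books"])
        token = page["next_page_token"]
        if not token:
            return result
```

The client never interprets the token; it only passes back whatever the previous response returned.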

The Get method takes a resource name, zero or more parameters, and returns the specified resource.

HTTP mapping:

  • The Get method must use an HTTP GET verb.
  • The request message field(s) receiving the resource name should map to the URL path.
  • All remaining request message fields shall map to the URL query parameters.
  • There is no request body; the API configuration must not declare a body clause.
  • The returned resource shall map to the entire response body.

Example:

// Gets a book.
rpc GetBook(GetBookRequest) returns (Book) {
  // Get maps to HTTP GET. Resource name is mapped to the URL. No body.
  option (google.api.http) = {
    // Note the URL template variable which captures the multi-segment resource
    // name of the requested book, such as "shelves/shelf1/books/book2"
    get: "/v1/{name=shelves/*/books/*}"
  };
}

message GetBookRequest {
  // The field will contain name of the resource requested, for example:
  // "shelves/shelf1/books/book2"
  string name = 1;
}

The Create method takes a parent resource name, a resource, and zero or more parameters. It creates a new resource under the specified parent, and returns the newly created resource.

If an API supports creating resources, it should have a Create method for each type of resource that can be created.

HTTP mapping:

  • The Create method must use an HTTP POST verb.
  • The request message should have a field parent that specifies the parent resource name where the resource is to be created.
  • The request message field containing the resource must map to the HTTP request body. If the google.api.http annotation is used for the Create method, the body: "<resource_field>" form must be used.
  • The request may contain a field named <resource>_id to allow callers to select a client assigned id. This field may be inside the resource.
  • All remaining request message fields shall map to the URL query parameters.
  • The returned resource shall map to the entire HTTP response body.

If the Create method supports client-assigned resource name and the resource already exists, the request should either fail with error code ALREADY_EXISTS or use a different server-assigned resource name and the documentation should be clear that the created resource name may be different from that passed in.

The Create method must take an input resource, so that when the resource schema changes, there is no need to update both request schema and resource schema. For resource fields that cannot be set by the clients, they must be documented as "Output only" fields.

Example:

// Creates a book in a shelf.
rpc CreateBook(CreateBookRequest) returns (Book) {
  // Create maps to HTTP POST. URL path as the collection name.
  // HTTP request body contains the resource.
  option (google.api.http) = {
    // The `parent` captures the parent resource name, such as "shelves/1".
    post: "/v1/{parent=shelves/*}/books"
    body: "book"
  };
}

message CreateBookRequest {
  // The parent resource name where the book is to be created.
  string parent = 1;

  // The book id to use for this book.
  string book_id = 3;

  // The book resource to create.
  // The field name should match the Noun in the method name.
  Book book = 2;
}

rpc CreateShelf(CreateShelfRequest) returns (Shelf) {
  option (google.api.http) = {
    post: "/v1/shelves"
    body: "shelf"
  };
}

message CreateShelfRequest {
  Shelf shelf = 1;
}
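The ALREADY_EXISTS behavior for client-assigned IDs can be sketched with an in-memory store. This Python example is only an illustration: the AlreadyExists exception stands in for the ALREADY_EXISTS error code, and the function name, the dict-based store, and the choice to fail rather than assign a different name are all assumptions.

```python
class AlreadyExists(Exception):
    """Stands in for the ALREADY_EXISTS error code in this sketch."""


def create_book(store: dict, parent: str, book: dict, book_id: str) -> dict:
    """Create `book` under `parent` using the client-assigned `book_id`."""
    name = f"{parent}/books/{book_id}"
    if name in store:
        raise AlreadyExists(name)
    # The server sets the resource name; any name sent by the client is replaced.
    created = dict(book, name=name)
    store[name] = created
    return created


store = {}
book = create_book(store, "shelves/shelf1", {"title": "Example"}, "book2")
print(book["name"])  # shelves/shelf1/books/book2
```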

The Update method takes a request message containing a resource and zero or more parameters. It updates the specified resource and its properties, and returns the updated resource.

Mutable resource properties should be mutable by the Update method, except the properties that contain the resource's name or parent. Any functionality to rename or move a resource must not happen in the Update method and instead shall be handled by a custom method.

HTTP mapping:

  • The standard Update method should support partial resource update, and use HTTP verb PATCH with a FieldMask field named update_mask. Output fields that are provided by the client as inputs should be ignored.
  • An Update method that requires more advanced patching semantics, such as appending to a repeated field, should be made available by a custom method.
  • If the Update method only supports full resource update, it must use HTTP verb PUT. However, full update is highly discouraged because it has backwards compatibility issues when adding new resource fields.
  • The message field receiving the resource name must map to the URL path. The field may be in the resource message itself.
  • The request message field containing the resource must map to the request body.
  • All remaining request message fields must map to the URL query parameters.
  • The response message must be the updated resource itself.

If the API accepts client-assigned resource names, the server may allow the client to specify a non-existent resource name and create a new resource. Otherwise, the Update method should fail when the resource name does not exist. The error code NOT_FOUND should be used if it is the only error condition.

An API with an Update method that supports resource creation should also provide a Create method. The rationale is that it is not clear how to create resources if the Update method is the only way to do it.

Example:

// Updates a book.
rpc UpdateBook(UpdateBookRequest) returns (Book) {
  // Update maps to HTTP PATCH. Resource name is mapped to a URL path.
  // Resource is contained in the HTTP request body.
  option (google.api.http) = {
    // Note the URL template variable which captures the resource name of the
    // book to update.
    patch: "/v1/{book.name=shelves/*/books/*}"
    body: "book"
  };
}

message UpdateBookRequest {
  // The book resource which replaces the resource on the server.
  Book book = 1;

  // The update mask applies to the resource. For the `FieldMask` definition,
  // see https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#fieldmask
  google.protobuf.FieldMask update_mask = 2;
}
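The effect of update_mask on a partial update can be sketched as follows. This Python example models resources as flat dicts and the mask as a list of top-level field names. The function name, the OUTPUT_ONLY set, and the choice to clear a field that is masked but absent from the request are assumptions made for illustration; they are not taken from the guide.

```python
# Fields this sketch treats as not writable by clients (hypothetical choice).
OUTPUT_ONLY = {"name", "create_time"}


def update_book(stored: dict, patch: dict, update_mask: list[str]) -> dict:
    """Return a copy of `stored` with only the masked fields taken from `patch`."""
    updated = dict(stored)
    for path in update_mask:
        if path in OUTPUT_ONLY:
            continue  # output fields supplied by the client are ignored
        if path in patch:
            updated[path] = patch[path]
        else:
            updated.pop(path, None)  # masked but absent: clear the field
    return updated


stored = {"name": "shelves/shelf1/books/book2", "title": "Old", "author": "A"}
patch = {"title": "New", "author": "B"}
print(update_book(stored, patch, ["title"]))
# {'name': 'shelves/shelf1/books/book2', 'title': 'New', 'author': 'A'}
```

Because only the masked fields change, fields added to the resource later are left untouched by older clients, which is the compatibility concern that full (PUT) updates run into.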

The Delete method takes a resource name and zero or more parameters, and deletes or schedules for deletion the specified resource. The Delete method should return google.protobuf.Empty.

An API should not rely on any information returned by a Delete method, as it cannot be invoked repeatedly.

HTTP mapping:

  • The Delete method must use an HTTP DELETE verb.
  • The request message field(s) receiving the resource name should map to the URL path.
  • All remaining request message fields shall map to the URL query parameters.
  • There is no request body; the API configuration must not declare a body clause.
  • If the Delete method immediately removes the resource, it should return an empty response.
  • If the Delete method initiates a long-running operation, it should return the long-running operation.
  • If the Delete method only marks the resource as being deleted, it should return the updated resource.

Calls to the Delete method should be idempotent in effect, but do not need to yield the same response. Any number of Delete requests should result in a resource being (eventually) deleted, but only the first request should result in a success code. Subsequent requests should result in a google.rpc.Code.NOT_FOUND.

Example:

// Deletes a book.
rpc DeleteBook(DeleteBookRequest) returns (google.protobuf.Empty) {
  // Delete maps to HTTP DELETE. Resource name maps to the URL path.
  // There is no request body.
  option (google.api.http) = {
    // Note the URL template variable capturing the multi-segment name of the
    // book resource to be deleted, such as "shelves/shelf1/books/book2"
    delete: "/v1/{name=shelves/*/books/*}"
  };
}

message DeleteBookRequest {
  // The resource name of the book to be deleted, for example:
  // "shelves/shelf1/books/book2"
  string name = 1;
}
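The "idempotent in effect" behavior of Delete, where the first call succeeds and later calls report NOT_FOUND while the resource stays deleted, can be sketched with an in-memory store. In this Python example the NotFound exception stands in for google.rpc.Code.NOT_FOUND and an empty dict stands in for google.protobuf.Empty; the function name and the module-level store are hypothetical.

```python
class NotFound(Exception):
    """Stands in for google.rpc.Code.NOT_FOUND in this sketch."""


# Hypothetical in-memory store keyed by resource name.
books = {"shelves/shelf1/books/book2": {"title": "Example"}}


def delete_book(name: str) -> dict:
    """Delete the named book; a repeated call raises NotFound."""
    if name not in books:
        raise NotFound(name)
    del books[name]
    return {}  # stands in for google.protobuf.Empty
```

Calling delete_book twice with the same name leaves the store in the same state after each call; only the response differs.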


