Sunday, 31 March 2013

Things to remember when REST-ing

There are many good blogs and books on designing and developing RESTful systems. I have read some of them, and in this post I am sharing the notes I took along the way. Please let me know if I have missed anything important (it is a huge subject) and I will update this post accordingly.

Resource Identification

Every important resource in a RESTful system must have an identifier, so resource identification is a very important step in developing one. We can use a URI for this: a URI uniquely identifies a resource and distinguishes it from every other resource. A URI can identify a single resource or a collection of resources. For example:    [identify course no 123]  [identify all orders of Mar, 2013]

A resource can have more than one URI, i.e. it can be identified in more than one way, but a URI always identifies exactly one resource. Try to use simple URIs. A simple URI is easy to remember, and that helps whether it will be read by a human or processed by a machine.

Resource Representation

Support one or more representations of a resource. What is a representation?

A representation is a transformation or a view of a resource's state at an instant in time.

Each resource's identifier (for example, its URI) is associated with one or more representations. We can use XHTML, Atom, XML, JSON, plain text, CSV, MP3, or JPEG for this. These are called transferable or representation formats. On the web, systems exchange representations; they do not access the underlying resource directly. URIs relate, connect, and associate representations with their resources on the web.

Try not to make consumers terminate URIs with .json or .xml to get a resource representation in their preferred format; use content negotiation instead. Consumers can use content negotiation to ask a service for specific representation formats. They send the HTTP Accept request header with a list of media types they are prepared to process. But be careful: the service does not have to honor the consumer's request. It may send the representation as XML even though the consumer asked for JSON. So always check the Content-Type header in the response.
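To make the negotiation concrete, here is a minimal sketch in Python of how a service might pick a format from the Accept header. The function names (parse_accept, choose_representation) are my own for illustration, and the sketch makes simplifying assumptions: it only matches exact media types plus */*, and it ignores partial wildcards such as text/*.

```python
def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # q defaults to 1 when the consumer omits it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by order of appearance.
        ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def choose_representation(accept_header, supported):
    """Pick the first acceptable format that the service supports.

    Falls back to the service default (supported[0]) when nothing matches,
    which is exactly why a consumer must still check Content-Type.
    """
    for media_type in parse_accept(accept_header):
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return supported[0]
    return supported[0]


if __name__ == "__main__":
    supported = ["application/xml", "application/json"]
    print(choose_representation("application/json, application/xml;q=0.5", supported))
    # A service that only speaks XML still answers a JSON-only consumer in XML.
    print(choose_representation("application/json", ["application/xml"]))
```

In this sketch the fallback is a deliberate design choice; a stricter service could instead answer 406 Not Acceptable when no format matches.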

Utilize Links

Utilize links to drive application state. This is the core of HATEOAS (Hypermedia as the Engine of Application State). In a hypermedia system, application state is communicated through representations of uniquely identifiable resources. The client submits an initial request to the entry point of the service. The service handles the request and responds with a resource representation populated with links. The client chooses one of these links to transition to the next step in the interaction, and progresses toward its goal through several such interactions. In this process, the application's state changes. So we can say that a change of application state depends on the service, the client, the exchange of hypermedia-enabled resource representations, and the advertisement and selection of links. The link approach is beautiful because a link can also point to a resource provided by a different application, or maybe even by a different company.
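The interaction described above can be sketched in a few lines of Python. Everything here is hypothetical: the SERVICE dictionary stands in for a real HTTP service, and the URIs and link relation names ("orders", "latest", "payment") are invented for the example. The point is only that the client knows the entry point and the relation names, and discovers every other URI from the representations it receives.

```python
# Hypothetical in-memory "service": each URI maps to a representation
# whose "links" advertise the transitions available from that state.
SERVICE = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"latest": "/orders/42"}},
    "/orders/42": {
        "status": "awaiting payment",
        "links": {"payment": "/orders/42/payment"},
    },
    "/orders/42/payment": {"status": "paid", "links": {}},
}


def get(uri):
    """Stand-in for an HTTP GET that returns a parsed representation."""
    return SERVICE[uri]


def follow(entry_point, relations):
    """Start at the entry point and follow one advertised link per step."""
    representation = get(entry_point)
    visited = [entry_point]
    for relation in relations:
        links = representation["links"]
        if relation not in links:
            # The service did not offer this transition from the current state.
            raise LookupError(f"service did not advertise {relation!r}")
        uri = links[relation]
        representation = get(uri)
        visited.append(uri)
    return visited, representation


if __name__ == "__main__":
    path, final = follow("/", ["orders", "latest", "payment"])
    print(path)
    print(final["status"])
```

Because the client never builds URIs itself, the service is free to change them, or to point a link at another application entirely.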

Resource state is not the same as application state

Roy Fielding mentioned this in the comment section of one of his blog posts:

Don't confuse application state (the state of the user's application of computing to a given task) with resource state (the state of the world as exposed by a given service). They are not the same thing.

Resource state and application state are two different things and should not be confused. When the service and the consumer interact, they exchange representations of resource state, not application state. Application state is defined by a representation that the service hands to a consumer. When a consumer makes a request, it gets a small subset of the overall server state, plus certain transitions (represented as links) to other application states that the service offers. Please see another comment by Roy here in this regard.

Do not ignore or misuse HTTP status codes

When a client makes a request to the server, the server returns an HTTP status code in its response. This status code is important because it tells the client what happened to the request. There are five categories of status codes: the 1XX range for informational, 2XX for success, 3XX for redirection, 4XX for client errors, and 5XX for server errors. It is not good to mix them up, because each category helps the client deal with a different scenario. For example, it is not good to send a 200 status code with an error message in the response body when an error has happened. Used properly, these codes improve reusability and interoperability and promote loose coupling.
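As a small illustration in Python, the sketch below maps a status code to its category and shows a lookup that reports a missing resource with 404 rather than a 200 carrying an error body. The names status_category and lookup, and the shape of the error body, are assumptions made for this example; HTTPStatus comes from the Python standard library.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_category(code):
    """Return the category name for an HTTP status code."""
    try:
        return CATEGORIES[int(code) // 100]
    except KeyError:
        raise ValueError(f"not an HTTP status code: {code}") from None


def lookup(store, key):
    """Return (status, body) for a hypothetical resource lookup.

    A missing resource is reported with 404, not with 200 plus an
    error message in the body.
    """
    if key not in store:
        return HTTPStatus.NOT_FOUND, {"error": f"no resource named {key!r}"}
    return HTTPStatus.OK, store[key]


if __name__ == "__main__":
    status, body = lookup({"course-123": {"title": "REST"}}, "course-999")
    print(int(status), status_category(status), body)
```

A client can then branch on the category first and only inspect the body once it knows which kind of outcome it is handling.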

Use caching

Caching helps us increase the scalability of a RESTful system by storing copies of frequently accessed data in several places along the request-response path. It reduces bandwidth use, latency, and load on servers, and it helps to hide network failures. Using HTTP headers, an origin server indicates whether a response can be cached and, if so, by whom and for how long. Caches along the response path can take a copy of a response (provided that the caching metadata allows it) and then use these copies to satisfy subsequent requests.

There are two main HTTP response headers that can be used to control the caching behaviour:

Expires: The Expires HTTP header specifies an absolute expiry time for a cached representation. After that time, a cached representation is considered stale and must be revalidated with the origin server. A service can indicate that a representation has already expired by sending an Expires value equal to the Date header, or a value of 0. To indicate that a representation never expires, a service can send a time up to one year in the future.
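A rough sketch of the staleness check in Python follows. The function name is_stale is my own, and the code assumes the header carries either an HTTP date in GMT or an invalid value such as 0, which it treats as already expired. It relies on parsedate_to_datetime from the standard library's email.utils module.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_stale(expires_header, now):
    """Return True when a cached copy is past its Expires time.

    `now` must be a timezone-aware datetime.
    """
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Invalid dates, including the special value "0", mean "already expired".
        return True
    return now >= expires_at


if __name__ == "__main__":
    now = datetime(2013, 3, 31, 12, 0, tzinfo=timezone.utc)
    print(is_stale("Sun, 31 Mar 2013 11:00:00 GMT", now))
    print(is_stale("Mon, 31 Mar 2014 12:00:00 GMT", now))
    print(is_stale("0", now))
```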

Cache-Control: The Cache-Control header can be used in both requests and responses to control caching behaviour. The header value comprises one or more comma-separated directives that determine whether a response is cacheable, by whom, and for how long.
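To show how the directives answer the "whether, by whom, and for how long" questions, here is a simplified Python sketch. The function names are invented for this post, and the parser assumes no directive value contains a quoted comma; a production cache would also consider directives such as s-maxage and must-revalidate.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value} dictionary.

    Directives without a value (e.g. "no-store") map to True.
    Simplification: values containing quoted commas are not handled.
    """
    directives = {}
    for item in header.split(","):
        name, sep, value = item.strip().partition("=")
        if not name:
            continue
        directives[name.lower()] = value.strip('"') if sep else True
    return directives


def shared_cache_may_store(directives):
    """Whether a shared (intermediary) cache may keep the response."""
    return "no-store" not in directives and "private" not in directives


def freshness_lifetime(directives):
    """Return max-age in seconds, or None when it is absent or malformed."""
    value = directives.get("max-age")
    if not isinstance(value, str):
        return None
    try:
        return int(value)
    except ValueError:
        return None


if __name__ == "__main__":
    directives = parse_cache_control("public, max-age=3600")
    print(directives, shared_cache_may_store(directives), freshness_lifetime(directives))
```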

Cacheable responses (whether to a GET or to a POST request) should also include a validator, either an ETag or a Last-Modified header:

ETag: An ETag is used to validate the freshness of a cached representation of a resource. An ETag value is an opaque string token that a server associates with a resource to uniquely identify the state of the resource over its lifetime. When the resource changes, the ETag changes accordingly.

Last-Modified: Last-Modified header indicates when the associated resource last changed. The Last-Modified value cannot be later than the Date value.

If a consumer wants to revalidate a response, it should include a Cache-Control: no-cache directive in its request. This ensures that the conditional request travels all the way to the origin server, rather than being satisfied by a cache.

When doing validation, use conditional GETs. When the cached copy is still valid, a conditional GET exchanges only HTTP headers rather than headers and entity bodies; an entity body is transferred only when the cached resource representation is out of date. Conditional GETs are useful only when the client making the request has previously fetched and held a copy of a resource representation along with its ETag or Last-Modified value. The consumer or cache sends a previously received ETag value in an If-None-Match header, or a previously supplied Last-Modified value in an If-Modified-Since header. If the resource hasn't changed (that is, its ETag or Last-Modified value is the same as the one supplied by the consumer), the service replies with 304 Not Modified (plus any ETag or Content-Location headers). If the resource has changed, the service sends back a full representation with a 200 OK status code.
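The ETag flavour of this exchange can be sketched in Python as below. The helper names make_etag and conditional_get are hypothetical, deriving the ETag from a hash of the body is just one possible strategy, and the sketch assumes If-None-Match carries a single ETag (the real header may also carry a list or *).

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the representation bytes (one possible strategy)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body, request_headers):
    """Return (status, headers, entity_body) for a GET on the current body."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The consumer's copy is still current: headers only, no entity body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


if __name__ == "__main__":
    # First request: no validator yet, so the full representation comes back.
    status, headers, body = conditional_get(b"<order>42</order>", {})
    print(status, headers["ETag"], len(body))

    # Revalidation with the stored ETag while the resource is unchanged.
    status, _, body = conditional_get(
        b"<order>42</order>", {"If-None-Match": headers["ETag"]}
    )
    print(status, len(body))
```

If the resource changes between the two requests, the ETags no longer match and the second call returns 200 with the new body.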

Consumers can also influence cache behaviour by sending Cache-Control directives in requests: max-age, max-stale, min-fresh, only-if-cached, no-cache, no-store.

Other things to consider:
  • Use a hypermedia-aware media type such as HTML, XHTML, SVG, Atom
  • Do not tunnel updates through GET
  • Use self-descriptive messages
  • Try to avoid chattiness with many round trips
  • Try to accept and support compression as defined by the HTTP 1.1 specification
  • Try to render relative links where possible
  • Try to implement paged representation where applicable
  • Do not misuse cookies
  • Think about security