How to fix Microsoft’s Two-Tier Service Application Scenario (REST)

January 25, 2009

Microsoft has released a beta version of a guidance document on REST in two-tier applications. I’ve had many rants about Microsoft’s attitude towards REST and the marketing branding they put on it (some teams being much worse than others, arrogantly or unknowingly putting the word REST in the name of their framework).

This entry is no such rant, but an effort to reach out to the authors. The document has issues, but it is my belief that with the right corrections, it could be made accurate.

Of the importance of patterns

First things first, let’s talk about patterns. Here’s the definition of a design pattern from Wikipedia.

A design pattern [..] is a formal way of documenting a solution to a design problem in a particular field of expertise.

From this, we would expect a pattern that is referenced or talked about to have been documented, contextual to a field of expertise and used to solve a design problem.

Let’s review what patterns are referenced from the document, as it will help us later to analyze the proposed guidance. Whenever a pattern is provided without references, I either assume the first documentation of a pattern as applying, or try, as a reader would, to google it and find what it could mean.

The router pattern

I assume this means the pattern by which a URI is mapped to a component processing the request. My searching has returned an IBM article defining the router pattern as “[…] routing requests to specific pieces of business logic based on some defined criteria”. Anyone trying to search for the router pattern will be inundated with various definitions and hundreds of sub-patterns (content-based router pattern, dynamic router pattern, etc). When the document fails to say which variant of a pattern is meant, and where the documentation for that pattern is located, a reader will be none the wiser. And they are, after all, looking for guidance.

The proposed fix: either reference which pattern is being talked about, or document the usage you are referring to.

The REST Entity pattern

A quick google search for “REST Entity pattern” will return only two results, the first one being the proposed guidance, and the second one being a presentation by Ganes Gunasegaran on a site called sagework, available as a pdf. That presentation does provide one slide defining the pattern as follows.

Resource can be read with a GET operation
Resource can be changed only by PUT and DELETE operations

Now further detective work returns a MindTouch page defining the Entity pattern. Reading the rest of Ganes’ presentation, it becomes very obvious that the rest of the patterns he presents are just pulled out of the MindTouch REST patterns page. And reading the description of the patterns in the Microsoft document, you will also notice the exact same definitions.

We now have a documented pattern, within the correct field of expertise, aka MindTouch. However, a quick search for the use of this pattern being referenced outside of MindTouch’s web presence returns very little. Furthermore, it doesn’t define the problem it is designed to solve. I would question the validity of such a pattern.

The proposed fix: again, reference the correct pattern you intended to include to start with, and keep its original name. In this specific instance, also make sure that this pattern matches the definition of what a pattern is, or redefine and document such a pattern yourself (or get the original authors to do it).

The entity translator pattern

This one is defined by Microsoft themselves, in the context of web services:

Implement an entity translator that transforms message data types to business types for requests and reverses the transformation for responses.

The Facade pattern

See http://en.wikipedia.org/wiki/Facade_pattern, “A facade is an object that provides a simplified interface to a larger body of code, such as a class library.” or P&P Pattlets “Provides a unified interface to a set of interfaces in a subsystem. Facade defines a higher-level interface that makes the subsystem easier to use.”

The Repository pattern

Fowler: “A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection.”

Domain Entity pattern

A google search on Domain Entity Pattern triggers an interesting result, from the same authors as the proposed guidance: Three-Tier Web Application Scenario. If you have a look at both, you will see that a lot of recycling between the two guidance documents has happened. Now of course, neither document matches our definition of a design pattern. An obscure page on Wikipedia says that domain entities “are a super-set of Data Layer Entities or Data Transfer Objects, and may aggregate zero or more DLEs/DTOs”, which I’ll consider compatible with the P&P definition. The naming is however confusing, and not widely in use.

Quick fix: If you want to use the same definition, aka “A set of objects modeled after a domain that represents the relationship between entities in the domain but do not contain behavior or rules related to the entities”, call it what the rest of the world calls it: an Anemic Domain Model. If you are happy with the anti-pattern but are worried that it may reflect badly on the practices you suggest, call it a Data Model. If what you really mean is that it’s a DTO with the same definition as Wikipedia, call it a DTO and change your definition to match what a DTO is.

Let’s review the guidance

Now that we’ve cleared out the confusion introduced by the guidance document’s use of patterns, let’s review the various layers and how they relate to REST.

The lack of context

The first thing that strikes me is the complete lack of any references to recognized literature introducing REST. If you’re going to talk about an architecture, just like with a pattern, you *have* to provide the references.

Fix: Introduce REST and Roy Fielding’s PhD thesis. Provide links to well-known restafarian websites, such as the excellent http://restpatterns.org that provides guidance in implementing REST architectures.

The routing

In a RESTful architecture, everything is mapped as a Resource. This is the thing you want to operate upon. Anything can be a resource. For the sake of this entry, let’s imagine that I define a resource as being my computer’s hard drive, the physical hardware equipment that sits inside my laptop’s case.

To be able to operate on a resource, I need to be able to address it. And in REST, I can do so by giving it an identifier. In the case of HTTP, this is a URI. Let’s give a URI to my hard-drive: http://mysite.example/harddrive/fujitsu.

If I type the URI in my browser, said browser will send an HTTP request to my server. The server is now responsible for knowing what the heck it is that I want. This process is called URI dereferencing. It’s a big word, but it is what it is and what the common definition is. It’s the process by which a URI is matched to a Resource.

It is assumed that a handler will be responsible for doing this dereferencing process. Once the resource has been dereferenced, it is time to do something with it, and this is what an HTTP method such as GET or POST does. It defines the operation that is to be done against the Resource.

We now have the elements to understand what Microsoft talks about when they mention the Router idea. In their scenario, the router uses both the Identifier and the Operation to call some bit of code, commonly referred to as a handler. It is a sad fact that Microsoft chooses, in a REST document, to disregard completely the existing and meaningful descriptions of a web operation.
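To make the two steps concrete, here is a minimal sketch in Python, not anything taken from the guidance. The `HardDrive` class, the `RESOURCES` table and the `handle` function are all names I made up for illustration. The point is only that dereferencing (URI to resource) happens first, and the HTTP method then names the operation applied to that resource.

```python
from dataclasses import dataclass, field


@dataclass
class HardDrive:
    """Hypothetical resource: the thing a URI identifies."""
    name: str
    files: list = field(default_factory=list)


# Hypothetical table mapping URI paths to resources.
RESOURCES = {"/harddrive/fujitsu": HardDrive("fujitsu")}


def dereference(path):
    """Step 1: match the URI to a resource (None if nothing lives there)."""
    return RESOURCES.get(path)


def handle(method, path, body=None):
    """Step 2: apply the operation named by the HTTP method to the resource."""
    resource = dereference(path)
    if resource is None:
        return 404, None
    if method == "GET":
        return 200, list(resource.files)
    if method == "POST":
        resource.files.append(body)
        return 201, None
    return 405, None  # the resource exists but does not support this method
```

A "router" in this reading is nothing more than `dereference` followed by a choice based on the method, which is why I would rather see the guidance use the established vocabulary.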

Resources, representations and the “REST entity”

Let’s say that I want to add a song I just heard on the radio to my hard-drive. As you probably know, music is heard because the air between the singer’s vocal cords and my tympanic membrane vibrates. This vibration gets turned into an electrical signal and gets processed by my brain to let me make sense of the words that were transported as a vibration. When talking with a human, I would identify the song I just heard as “If you seek Amy by Britney Spears”. If I talk with a computer, I may need to assign it a name too, so let’s do that: http://www.britneyspears.com/songs/ifyouseekamy.

As far as I know, my hard-drive cannot persist air vibrations to disk. We need a binary stream, because that’s what hard-drives can persist. That binary stream would probably be an mp3 downloaded from a music service. This byte stream is not the song itself (as in the air vibrating), it’s a file in binary format that my computer can process. If the song is a resource, the mp3 file is a Representation of that resource.

What this means for my adding that file to my hard drive is that to download the song, I would need an mp3 file. If I dereference the URI for the song, I may get a representation of this song as an mp3 file. I never transmit the resource itself.

This is why REST is called Representational State Transfer. Now that I have my file, when I want to add it to my hard drive, I could do a POST to http://www.serialseb.example/harddrive/fujitsu and include the Representation of the song. I have effectively changed the state of my resource (my hard drive, the physical thing) by sending it a representation (the mp3 file).

Microsoft says “In REST a resource is an object that represents a specific state”. As you can probably tell by now, my hard-drive doesn’t represent a state, it has a state because it is a resource. I changed its state by sending it a Representation. I didn’t sing to my hard-drive to make it persist an mp3.

Quick fix: clear up the definition to “In REST a resource is a thing that can have state. You can change that state by performing operations on the resources through transferring representations.” You can probably make it more obvious by stipulating that you recommend your business entities to be your resources, acted upon by a representation (your DataContract).

Furthermore, remember that MindTouch’s definition of a REST entity, which Microsoft has included, is a Resource that gets modified only through PUT and DELETE.
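The distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `HardDrive`, `post` and `get` are hypothetical names, and I picked JSON for the listing arbitrarily. The resource holds state, and only bytes (representations) ever cross the boundary in either direction.

```python
import json
from dataclasses import dataclass, field


@dataclass
class HardDrive:
    """The resource: a thing that has state (here, the files it holds)."""
    files: dict = field(default_factory=dict)


def post(drive, file_name, representation: bytes):
    """Change the resource's state by transferring a representation to it."""
    drive.files[file_name] = representation


def get(drive) -> bytes:
    """Return a representation of the drive's state, never the drive itself."""
    listing = {name: len(data) for name, data in sorted(drive.files.items())}
    return json.dumps(listing).encode("utf-8")
```

Note that `get` hands back a byte stream describing the drive, and `post` accepts a byte stream describing the song. Neither the physical drive nor the air vibrations travel anywhere.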

We’ve seen that a representation has by definition no behavior, as it is only a byte stream, and cannot be operated upon. Because Microsoft has specified that their use of the word REST Entity is a representation of a resource, it becomes obvious that they have wrongly applied the pattern proposed by MindTouch, which applies to resources.

Confusing resources and representations is a common problem for people new to REST, and one Microsoft has fallen into.

Fix: Drop the Entity (REST) naming. You have misunderstood the original meaning of the pattern. What you are talking about is a Representation that you would probably advise to be a DataContract.

Finally, we reach the Entity Translator. Microsoft proposes that such a component “translate[s] between business entities and REST entities exposed by the service”. We’ve now seen that REST entities are in fact representations. What is proposed here is a component that can turn a resource (aka the business entity) into a representation (aka your DataContract).

It is not surprising then that the definition “Resources exposed by the service represent an external contract while business entities are internal to the service” is inaccurate.

Fix: “Resources exposed by the service can only be retrieved and modified through Representations, which represent an external contract”.

And indeed, translators are required to move data from one format (your representation) to another (your business entity as an object living in memory).
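As a rough sketch of what such a translator does, assuming a JSON representation and a hypothetical `Song` business entity (neither comes from the guidance, which talks about DataContracts), the two directions look like this in Python:

```python
import json
from dataclasses import dataclass


@dataclass
class Song:
    """Hypothetical business entity: an object living in memory."""
    title: str
    artist: str


def to_representation(song: Song) -> bytes:
    """Business entity -> representation (what goes on the wire)."""
    return json.dumps({"title": song.title, "artist": song.artist}).encode("utf-8")


def from_representation(payload: bytes) -> Song:
    """Representation -> business entity."""
    data = json.loads(payload.decode("utf-8"))
    return Song(title=data["title"], artist=data["artist"])
```

The external contract is the shape of the bytes produced by `to_representation`. The `Song` class can change internally without clients noticing, as long as that shape stays the same.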

The business layer

I won’t comment much on the architectural choices of Transaction Scripts and Facades, as I have little interest in entering this debate. I will however take note of the definition of the service implementation (which, as we’ve seen, is usually called a handler):

“The service implementation is responsible for translating between external contracts and internal entities and then passing the request on to the business layer façade.”

This seems to indicate that the facade deals with external contracts, but the Entity Translator has already been introduced to deal with such a translation. This seems redundant and is probably a mistake.

Furthermore, if a business facade implements the logic of acting upon a business entity, and the translator maps between DataContracts and business entities, it would seem to me that you’d end up with an anemic service implementation. The only reason I can think of for keeping it is to map Resource operations to business processes.

While I think such an infrastructure is redundant, here’s a proposed fix: “The service implementation is responsible for mapping operations on resources to business processes in your layer facade.”

Do we really need Messages?

I’m very confused by the proposed implementation of the business layer.

REST over HTTP is often considered to be a Resource-Oriented architecture. The first, if not the most fundamental, design issue you will face is modeling your resources well. Like any domain modeling activity, this is not an easy process to get right.

Provided you have thought of your architecture in terms of exposed resources, you then spend some time defining your representations, aka what goes on the wire. As we’ve seen, that will end up being your DataContract design.

There is a lot of inherent knowledge in resource instances: they have all the information you need to process your request. When I send a POST to http://www.serialseb.com/harddrives/fujitsu, the request contains the representation of the file I want to persist and the location in which to persist it. Nothing outside of that request is required to process the operation.

Why then would one wrap the notion of adding a file onto a hard-drive into a message, pass it to a facade that dispatches the message, to finally get processed by an operation that reads the message to act upon data structures?

Any time you convert between various data structures, you introduce more complexity. Any time you de-normalize and renormalize, you introduce potential bugs. The proposed solution does the following:

Get the request, and transform the datacontract into an instance of the business entity
Encapsulate the business entity into an untyped Message and pass it to a Facade
The facade reads the message to dispatch it to a process
The process opens up the message to get back to the business entity in the operation and calls the entity framework.

You have achieved absolutely nothing by having a facade. All the data that was required to dispatch the request to a business process was already in the service! You end up with an anemic service that does little, if anything, a business facade that’s not really a facade but a broker, and a business process that has to open up the encapsulation format for no valid reason at all.
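Here is how I read that round trip, sketched in Python next to the direct alternative. Everything in it is my own invention for illustration (`Song`, `add_song`, the dictionary used as an untyped message, the `stored` list standing in for the data layer). It is not code from the guidance.

```python
from dataclasses import dataclass


@dataclass
class Song:
    """Hypothetical business entity."""
    title: str


stored = []  # stands in for the data layer


def add_song(song: Song) -> int:
    """The business process: persist the song, return how many are stored."""
    stored.append(song)
    return len(stored)


def facade_dispatch(message: dict) -> int:
    """The 'facade': reads the message, picks a process, unwraps the body."""
    processes = {"add_song": lambda body: add_song(body[0])}
    return processes[message["process"]](message["body"])


def service_via_facade(contract: dict) -> int:
    song = Song(title=contract["title"])                # data contract -> entity
    message = {"process": "add_song", "body": [song]}   # entity -> untyped message
    return facade_dispatch(message)                     # dispatch and unwrap


def service_direct(contract: dict) -> int:
    """Same outcome without the wrap/unwrap step."""
    return add_song(Song(title=contract["title"]))
```

Both services end up calling `add_song` with the same entity. The first one simply takes a detour through a dictionary that the service itself built a line earlier.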

As this document is supposed to be guidance on how best to architect a solution, the proposed solution is just not acceptable. Proposing an anemic service is a symptom of a bigger problem: the guidance doesn’t talk at all about resource modeling, and assumes that there will be redundant services mapping to the same business process, aka two services doing the same thing.

It looks to me like an attempt to slap a message-oriented architecture, in which the endpoint receives a message to be processed, on top of a resource-oriented architecture, in which the endpoint is the resource on which to apply an operation. The semantics of a resource-oriented architecture eliminate the need for message dispatching.

Proposed fix: Get rid of the business facade as is, and let the service call the business process itself. The message is redundant.

Bad practice: Promote(object[] data) is just bad practice as you’ve now removed any single bit of semantics that were associated with the process.
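To illustrate what gets lost, here is a Python sketch contrasting an untyped signature with a typed one. I am guessing at what a promotion involves: the `Employee` class, the `grade` field and the validation rule are assumptions of mine, not something the guidance defines.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical entity; the guidance does not say what Promote acts on."""
    name: str
    grade: int


def promote_untyped(data: list) -> None:
    """Untyped: callers must know by convention that data[0] is the
    employee and data[1] is the new grade. Nothing in the signature says so."""
    data[0].grade = data[1]


def promote(employee: Employee, new_grade: int) -> None:
    """Typed: the signature carries the semantics, and the rule is explicit."""
    if new_grade <= employee.grade:
        raise ValueError("a promotion must raise the grade")
    employee.grade = new_grade
```

With the typed version, a reader (or a tool) can tell what a promotion needs without opening the implementation. With the untyped one, swapping the two array elements fails only at runtime, if at all.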

Alternative fix

There is a simple alternative fix that can be applied to the document, by removing the REST references and describing it as a POX architecture.

Conclusion

I do not know if this guidance is the result of an incomplete understanding of REST architectures (which is quite widespread in Microsoft’s literature) or an attempt at over-simplification.

What I do know is that much needs to be modified before this document can be proposed as a best practice for delivering a RESTful solution. It lacks the proper and accepted terminology, completely bypasses architectural concerns around resource modeling, and misrepresents what a REST architecture would look like. It also ignores caching (one of the REST constraints) and proposes an architecture that would make leveraging such caching difficult.

I also call for P&P to involve the communities that have formed around topics such as DDD and REST when they deliver beta versions of their guidance documents. The errors I’ve highlighted could have been taken care of much earlier in the process, which would have saved me the 5 hours I spent this afternoon writing this blog entry. That this is considered a beta 2 is, however, completely unacceptable.

Hopefully we will see an updated version of this guidance. If not, hopefully my blog entry will have enough google juice to start fixing the inaccuracies that Microsoft seems to spread about REST way too often.

Does history repeat itself?

January 20, 2009

I’ve been following Opera’s reactions to the EU antitrust regulations against Microsoft’s bundling of IE in Windows, which they have been calling for…

Anyone remember what happened the last time a browser vendor tried to leverage antitrust laws to explain their sinking (or, in the case of Opera, their fairly constant and unimpressive) market share? No?

Bah.