TechEd: Metropolis - Interchangeability of Operations
[EDIT: Unbelievable but you have to sneak in the conference rooms to get access to the one power point that's hidden under traps. Hopefully no one will ask me to unplug myself because of Health and Safety but you never know... An IT conference without power plugs, what a great idea!]
I've used the term Metropolis for a set of talks. Look at cities, transportation and manufacturing, and try to find some trends. From looking at cities we can learn a lot about what will happen in IT.
At a high level, IT shops map to cities. You have evolution that occurs in isolation, and then things get connected, which causes change. Factories and buildings are comparable to applications. Each application is part of an IT shop. Retail maps to business process. It's about tying together deliveries that come from different origins.
Manufactured goods are similar to structured data and operations. What we see is that data structures and operations are being made to fit together better.
I don't believe it's just an analogy that helps you think about a subject; it's the same drive behind both technologies.
What are the challenges in SOA? Explicit boundaries; Autonomy - you can replace your service without impacting the other services.
How do you perform actions across these boundaries? What trust do you use? SOAs behave the way you see people in business behave in the real world. It's an opening up of the autonomy and explicit boundaries.
Those that know of me know I've been doing transactions for a long time, but I've grown to believe that transactions across boundaries are very fragile. I don't think services, as boundaries of trust, will engage in transactions with other services. Some say two-phase transactions will happen across services; some say they shouldn't because of trust issues. For this talk, each service will have a transaction but will not hold locks for another service.
When you look at manufacturing, there was a transition from hand-crafted to automated. A machine creates the part. The shape and form of the machine-made parts were so crude that you had to manually change them before assembling. Interchangeability was driven by being able to exchange pieces between two broken guns on a battlefield to make one that works.
In computing we're massively stuck because we're not very good at exchanging services. The important points that you need to take from this transition:
- Autonomy
- Agreement is different: Attempt and confirm/cancel
- Interchangeability: One operation is as good as the next. Fulfill my requirement, be it that you use A or B. You execute your operation on a pool of services.
- Semantics: How do you make the request so that it can be rearranged and reordered? You need precision but you don't want the intricacy.
- Variety: It can be built out of interchangeable operations. If you look at the different goods in WalMart, the combination is astonishing, but they come from a limited number of pools.
Let's look at the history of manufacturing. If you go back to the early nineteenth century, you'd order one thing and you'd get one gun. You had a lot of craftsmen, and each of them built a complete item. But each person made a different gun because it was based on their own approach. Completing anything needed fitting. Soft parts were stamped because they couldn't be exchanged.
The American way of manufacturing comes from a shortage of labour. They started building machines to do the work. The problem was that the machine-created parts were still inaccurate. Because of that, they needed adjustment. Fitting was still required.
Then came the armory system of manufacturing. LeBlanc proposed interchangeable small parts but it was dropped; the idea, at least, was brought back in America. With jigs, gauges, and the precision work of John Hall, the price of manufacturing the pieces increased because you had to be picky with the pieces you accepted. You had interchangeability but the economics didn't work.
The progression from armories to bicycles to sewing machines, all the way to cars with Ford, impacts what we do today. Take the disassembly plant: the meat arrived in Chicago, and each butcher was responsible for cutting one piece of meat. There was very high throughput. In assembly, Ford moved the parts to the people rather than the people to the parts.
The Ford factory made sure there was no fitting: every part had to be fully designed and custom machines were made, so all the pieces could be interchangeable. The machine gave the precision and the accuracy. Even the building was shaped to optimize the production. But it became impossible to change the car design, as you would have to knock the building down.
Interchangeability is great but there are other requirements! Which brings us to General Motors. GM gradually introduced the concept of a yearly model, which introduced the requirement of being agile. You suddenly need multi-purpose machine tools. For this you had to be able to re-arrange the components of the machine to create a new piece. Mechanization, interchangeability and rapid changeover to new models.
Let's talk about the American system of transaction processing. You start your activities, be it from a human or a machine, the work goes to your database, you calculate and you return your answer. But you have to do fitting inside the transaction. There are no boundaries when you update two databases within a transaction. Within each transaction you need to fit the work. But it gives you volume.
We need transactional "machine tools". We have no precision tools: our current tools can be used in different ways, and different programmers get different results. They're wonderful for labour savings, but they don't help separate behaviour. Today, we have applications whose external behavior is similar but whose insides are different. If we are to take our applications apart, we need to make the internals of our applications interchangeable.
Services are connected by messaging. You don't know how the operation is executed, but you know the contract of the messages and the contract of the order in which messages are exchanged. Interaction is based on business functionality. The services share operations, not data. Sometimes there's reference data. It's like a department store catalog: you use that data to connect to your business.
What about optimistic concurrency control? Can you do optimistic concurrency work with your bank? No, because there's a security boundary. You need to think about trust.
Autonomy means independent control. My local business logic decides how I change my local data. I decide what changes. If you want a change, ask me to do a business operation.
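To make that concrete for myself, here is a minimal Python sketch (mine, not from the talk). The `InventoryService` class and its method names are hypothetical; the only point is that callers can ask for a business operation but never write the service's data directly.

```python
class InventoryService:
    """Hypothetical service that owns its stock levels (illustrative only)."""

    def __init__(self, stock):
        # Local data: nothing outside the service is meant to touch this.
        self._stock = dict(stock)

    def request_shipment(self, item, quantity):
        """Business operation: the service's own logic decides whether
        its local data changes. Returns True if the request was accepted."""
        if quantity <= 0 or self._stock.get(item, 0) < quantity:
            return False  # the service refuses; its data is unchanged
        self._stock[item] -= quantity
        return True

    def stock_level(self, item):
        """Read-only view offered to callers."""
        return self._stock.get(item, 0)
```

A caller can only say "please ship two widgets"; whether the stock actually changes is the service's decision.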
Long-running work. Services don't share transactions. How do they cooperatively make decisions? How do independent businesses take decisions? There are tentative operations, with reservations, cancelable orders, or confirmation. If you buy a house in the States, as a buyer you give everything to the escrow company. The bank, the buyer, the seller, everyone puts their trust in the escrow company. However the sale is only a reservation until you know if you will get it. The only guarantee is that you'll get your money back if the deal doesn't work out.
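The attempt then confirm/cancel pattern can be sketched in a few lines of Python. This is my own illustration, assuming a very simplified escrow: the names `TentativeOperation`, `State` and `run_deal` are made up, and a real system would also need timeouts and durable state.

```python
from enum import Enum


class State(Enum):
    TENTATIVE = "tentative"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"


class TentativeOperation:
    """Hypothetical operation that starts as a reservation and is later
    either confirmed or cancelled, never both."""

    def __init__(self, description):
        self.description = description
        self.state = State.TENTATIVE

    def confirm(self):
        if self.state is not State.TENTATIVE:
            raise ValueError(f"cannot confirm: already {self.state.value}")
        self.state = State.CONFIRMED

    def cancel(self):
        if self.state is not State.TENTATIVE:
            raise ValueError(f"cannot cancel: already {self.state.value}")
        self.state = State.CANCELLED


def run_deal(operations, all_parties_agree):
    """Escrow-style coordination: every party's operation stays tentative
    until the outcome is known, then all are confirmed or all cancelled."""
    for op in operations:
        if all_parties_agree:
            op.confirm()
        else:
            op.cancel()
    return [op.state for op in operations]
```

No locks are held across services here: each party simply lives with a tentative state until the escrow decides.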
Coordinating n systems requires that at least n-1 accept uncertainty. Your system needs to be in a confused state. In two-phase commit, the database maintains locks. With cancelable operations, uncertainty is the reservation you have for a hotel room. There's an intrinsic conflict between consistency and availability. Two-phase commit is the anti-availability protocol.
Because operations are cancelable, you can reorder them. Your cancellation is not an undo but an action you receive to cope with the failure. So what are the semantics for cancellation / confirmation?
Cancellation is about coping with the operation not being done. You accept the right to cancel. Confirmation is the guy that has the right to cancel saying that he won't, for example hotel room confirmations in the morning. Airline companies clear up their overbooking once the plane leaves.
To make your operation cancelable, you want to reserve its effect until the confirmation. If an operation is unique, you must lock its effect until you cancel it. You need to make a decision between provisioning and overbooking. How do you manage the pool of resources?
One operation in a class is the same as another. If you reserve a king-size room, it could be any of the rooms. Interchangeability comes by increasing commonality.
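Here is a small Python sketch of a pool of interchangeable rooms, again my own and not from the talk. `RoomPool` and the `overbook_by` parameter are hypothetical names; the sketch only shows that a reservation is held against the class of room, not a specific room, and that overbooking is a policy choice on top of the pool.

```python
class RoomPool:
    """Hypothetical pool of interchangeable rooms of one class."""

    def __init__(self, room_class, capacity, overbook_by=0):
        self.room_class = room_class
        self.capacity = capacity        # rooms that really exist
        self.overbook_by = overbook_by  # extra reservations we gamble on
        self.reserved = set()           # outstanding reservation ids
        self._next_id = 1

    def reserve(self):
        """Reserve any room of this class. Returns a reservation id,
        or None when the pool plus its overbooking allowance is used up."""
        if len(self.reserved) >= self.capacity + self.overbook_by:
            return None
        reservation_id = self._next_id
        self._next_id += 1
        self.reserved.add(reservation_id)
        return reservation_id

    def cancel(self, reservation_id):
        """Give the room back to the pool; unknown ids are ignored."""
        self.reserved.discard(reservation_id)

    def oversold(self):
        """How many guests could not be housed if everyone showed up."""
        return max(0, len(self.reserved) - self.capacity)
```

With `overbook_by=0` the pool is pure provisioning; a positive value trades the risk of an oversold night against empty rooms.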
It's a pain to offer cancelable operations. But you want to do it because your customers need that in order to function correctly. It's annoying that the customer can be fickle but it's the reality of a complex application.
We used to make decisions atomically by committing or rolling back transactions. You now want to think about how things compose, but for that you need each service to be part of a long-running piece of work.
In the armory system of transaction processing, you have a complete solution when you assemble small services. In manufacturing you need precision. You need to understand the constraints of your operation. We've made components that were multi-purpose. What you need to do is to have the least amount of functionality to make them interchangeable.
Resource-oriented data lives longer but is changed by long-running operations, like bank records. You also have activity-oriented data, which gets created when you start doing a job and retires when the operation is complete, like an order.
These two classes of data have different characteristics.
In both cases, the data is encapsulated within the service, and there's no optimistic concurrency control externally. You mediate access through business logic.
Let's take a Foo operation. First case, the data is isolated. If I change something about someone's order, I only change a bit of data. But Joe's order may modify my resource-oriented data by changing my inventory. Activity-oriented data is used for a single long-running activity and won't span several. You can create a tentative operation, and it's easy to track down in case the operation is canceled; nothing is shared so there's no impact. What happens if you share data?
It becomes harder in resource-oriented services. What is interacting with what? They make commitments and these commitments need to be tracked. You have an association between the activity-oriented data, the reservation, and the resource-oriented data, the list of hotel rooms that are available every day.
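A rough Python sketch of that association, with names I made up (`HotelService`, `available`, `reservations`): the per-day room counts are the resource-oriented data, each reservation is activity-oriented data, and the reservation records which days it touched so a cancellation knows what to give back.

```python
class HotelService:
    """Hypothetical service holding both kinds of data (illustrative only)."""

    def __init__(self, rooms_per_day):
        # Resource-oriented data: long-lived, shared by many activities.
        self.available = dict(rooms_per_day)
        # Activity-oriented data: one entry per reservation, retired on cancel.
        self.reservations = {}
        self._next_id = 1

    def reserve(self, days):
        """Tentatively take one room on each day. Returns a reservation id,
        or None if any requested day has no room left."""
        wanted = sorted(set(days))
        if any(self.available.get(day, 0) < 1 for day in wanted):
            return None
        for day in wanted:
            self.available[day] -= 1
        reservation_id = self._next_id
        self._next_id += 1
        # The commitment is tracked: which days this activity is holding.
        self.reservations[reservation_id] = wanted
        return reservation_id

    def cancel(self, reservation_id):
        """Release exactly what the reservation was holding."""
        days = self.reservations.pop(reservation_id, None)
        if days is None:
            return False
        for day in days:
            self.available[day] += 1
        return True
```

Cancelling is not an undo of the whole service state; it only releases the commitments recorded against that one activity.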
Specificity is not interchangeability. Variety is the enemy of interchangeability.
There's a lot of work around DSLs. You express the intent of the business domain with a domain language. That's a great foundation to think about the issues that have been raised.
These are the heavy-duty reconfigurable manufacturing machines. You don't need much skill to use DSLs if they're constrained enough.
Service operations are like standard parts, whereas tentative operations compose. That lets you do standard software components and use them across several products. All the products look different on the outside but they're all the same inside.
We should learn about that manufacturing part, and do some rambling philosophy.
Atomic transactions are singularities. New challenges happen as you spread work across space and time. Interchangeability helps relax space and time. You can interleave and reorder. Variety means you have lots of options and choices, but you can't keep stocks. Interchangeability needs precision rather than variety.
EDIT: Some form of question about Wal-Mart using Child labour and how it relates to services. Sometimes reality is better than science fiction!
Update 26th November 2007: http://blogs.msdn.com/pathelland/archive/2007/11/25/presentation-of-metropolis-interchangeability-of-operations-at-teched-emea-in-barcelona.aspx