Originally posted on: http://geekswithblogs.net/DougLampe/archive/2014/10/08/why-all-web-apps-should-use-soa-and-what-that.aspx
In the beginning, there was GET and POST. In those days, most of the World Wide Web was read-only and browsed with text-based browsers. Then came graphical browsers, CSS, e-commerce, SSL, streaming video, mobile devices, etc. Two things withstood the test of time: GET and POST (we’ll ignore the other request methods for poetic license). Web applications are, and have always been, “service-oriented”. But what exactly does that mean and how does it change your design?
What Do I Mean by SOA?
SOA stands for “service-oriented architecture”. It does not stand for “Enterprise Service Bus” (ESB), “Universal Description, Discovery and Integration” (UDDI), “Web Services Description Language” (WSDL), or “Simple Object Access Protocol” (SOAP). However, some people (mostly software or training vendors) will say you are “doing it wrong” if you say you are using SOA but aren’t using these things. I strongly disagree. If the acronym were DEXPSBA for “Discoverable Enterprise Cross-Platform SOAP Based Architecture”, then you would have an argument. As long as SOA stands for service-oriented, I take it to mean exactly that – the functionality of the application can be provided as a service. Keeping with the theme, note that I did not say, “must be provided as a service”. SOA just implies an orientation to services.
The next obvious question is, “what is a service?” That is a little more difficult to answer. Clearly Windows services and *nix daemons are “software services” which are services. Software as a Service (SaaS) and Infrastructure as a Service (IaaS) offerings claim to be “services”. The “service industry” typically refers to restaurant and bar workers. I’m sure SOA does not necessarily have to be oriented at waiting tables. Traditionally, the word “service” in SOA refers to discrete units of functionality that can be “consumed” by other applications. I emphasized the word “can” because of the “O” in SOA standing for “oriented”.
Your Web Application is Already SOA
So now let’s put the pieces together and take a look at a typical web application. Typically, a web application runs through an application tier (ASP.NET, JSP, PHP, etc.) that sits on top of the web server tier (IIS, Apache, etc.). That web application handles HTTP requests (typically GET or POST) and provides a response (typically HTML, XML, or JSON). Other than the self-described “agent” in the header of the HTTP request, your application does not know what browser (if any) is submitting that request. I say “if any” because a valid HTTP request will be processed by the server and application no matter what software submitted it. That software includes bots, viruses, malware, and of course browsers – all of which can either honestly represent themselves or report a different agent. By being agnostic to the “application” it is responding to, your web application is inherently “service-oriented” in the broadest sense. In more traditional terms, your application uses a closed (or obfuscated) API to provide responses to a set of known requests. Your inbound message format is HTTP GET and POST and your outbound message format is typically HTML or JSON. Some of those requests may modify the state of data stored by the application. Any application capable of producing properly formatted requests can consume your “services”.
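To make the point about the self-described agent concrete, here is a minimal sketch using Python's standard urllib. It builds a request without sending it; the URL and the agent string are placeholders I made up for illustration. Nothing stops a script from claiming to be a browser.

```python
from urllib.request import Request


def build_request(url: str, claimed_agent: str) -> Request:
    """Build (but do not send) a GET request that reports any agent string."""
    return Request(url, headers={"User-Agent": claimed_agent}, method="GET")


# A script can claim to be a browser; the server only ever sees header text.
# The URL and agent string below are placeholders, not real endpoints.
req = build_request("http://example.com/products?id=42",
                    "Mozilla/5.0 (pretend browser)")
print(req.get_method(), req.full_url)
# urllib stores header names in this capitalized form.
print(req.get_header("User-agent"))
```

From the server's side, this request is indistinguishable from one a browser would send with the same header.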
What Should You Do Differently?
Since your application is already made up of a series of discrete services that are indifferent to their consumers, you should validate each of the following for every transaction:
1. The person making the request is who they say they are.
2. The person making the request is authorized to perform the transaction.
3. The input provided for the request is valid.
4. The current state of the data allows for the transaction to occur.
5. The transaction has completed successfully.
We will look at each of these in more detail, but these steps don’t really do anything to make your application more service-oriented, they just deal with the fact that it already is. So before we go more in-depth, let’s add one more thing to the list:
6. The transaction can be consumed by other applications.
1. Verifying Identity
The first thing you need to do is secure your “service methods”. Each piece of code that handles incoming HTTP requests needs to assume the requestor is malicious. If you don’t care who processes a transaction or who gets the information returned in response, then you have nothing to worry about. For some transactions such as browsing products in an e-commerce application or possibly browsing data in an intranet application, you may not care who is performing the transaction or you may assume adequate security is provided by your infrastructure. However, for more sensitive transactions like accessing bank information, making credit card purchases, or accessing proprietary company data, you will want to be more careful. For public applications, this means you may want to use security certificates, an additional password challenge, multi-factor authentication, IP validation, CAPTCHA, and other techniques to reduce the risk of malicious impersonation. None of these methods are foolproof, but they are much better than assuming that anyone providing a valid session token is the person that token represents. The steps you take to secure the transaction should be proportional to the risk of having the wrong person perform the transaction.
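One way to picture “proportional to the risk” is a check that accepts a signed session token for low-risk transactions but demands a second factor for high-risk ones. This is only a sketch under my own assumptions: the key, the risk labels, and the function names are all hypothetical, and a real system would manage keys and sessions far more carefully.

```python
import hashlib
import hmac

# Assumption: a demo key. A real key would come from secure configuration.
SECRET_KEY = b"demo-only-secret"


def sign(session_id: str) -> str:
    """Return the HMAC-SHA256 signature the server would issue for a session."""
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def token_is_authentic(session_id: str, signature: str) -> bool:
    """Check the signature in constant time to avoid leaking timing hints."""
    return hmac.compare_digest(sign(session_id), signature)


def identity_verified(session_id: str, signature: str, risk: str,
                      second_factor_ok: bool = False) -> bool:
    """A valid token is enough for low risk; high risk needs a second factor."""
    if not token_is_authentic(session_id, signature):
        return False
    if risk == "high":
        return second_factor_ok
    return True


token = sign("session-123")
print(identity_verified("session-123", token, "low"))
print(identity_verified("session-123", token, "high"))
print(identity_verified("session-123", token, "high", second_factor_ok=True))
```

The design choice here is that a valid token alone never unlocks a high-risk transaction, which mirrors the warning above about trusting session tokens too much.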
2. Verifying Authorization
Just like there is no guarantee that the person making a request is who they say they are, there is also no guarantee that they are allowed to make the request. Caching user roles on the client, even if encrypted, may speed up transactions, but it really just means that whoever is making the request has a valid encrypted token, not necessarily that you gave it to them. If you’ve done step one above, you can be more certain that the person is who they say they are, but I recommend double-checking the database to see if they are really a system admin or not.
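Double-checking the database might look something like the sketch below. It uses an in-memory SQLite database so it runs on its own; the table, column, and user names are hypothetical. The important part is that the roles claimed by the client are deliberately ignored.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_roles (user_id TEXT, role TEXT)")
    conn.execute("INSERT INTO user_roles VALUES ('alice', 'admin')")
    conn.execute("INSERT INTO user_roles VALUES ('bob', 'reader')")
    conn.commit()
    return conn


def is_authorized(conn: sqlite3.Connection, user_id: str, required_role: str,
                  claimed_roles=()) -> bool:
    """Ask the record of authority; claimed_roles from the client is ignored."""
    row = conn.execute(
        "SELECT 1 FROM user_roles WHERE user_id = ? AND role = ?",
        (user_id, required_role),
    ).fetchone()
    return row is not None


conn = make_demo_db()
print(is_authorized(conn, "alice", "admin"))
# Bob's request claims the admin role, but the database disagrees.
print(is_authorized(conn, "bob", "admin", claimed_roles=("admin",)))
```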
3. Verifying Data Inputs
If you want to aggravate a developer, find a URL that has a number as a parameter, pass in a letter and see if you get a nice error message saying that they are very sorry for your problem and that the error was recorded and the system administrator has been notified. Then write a script to hit that URL a few hundred thousand times. I’m not actually encouraging anyone to make DoS attacks but filling up a database transaction log with error messages is an easy way to do that. Filling up a developer’s e-mail isn’t very nice either. If real users are getting real errors, just filtering through the nonsense to find the real errors is a real nuisance. Just because your HTML and client-side script prevent users from submitting malicious HTML, SQL, or JavaScript in your request doesn’t mean it can’t be in the HTTP request. Likewise, a product price stored in a cookie in the user’s shopping cart can’t be assumed to be the correct price for that product. As with number 1 above, you should validate data in proportion to the criticality of the transaction. Some of this can be done at the application tier or framework level and some needs to be done at the transaction level. The key is that you don’t assume that the inputs provided by a user are valid even if your client UI limits the possible inputs. At the end of the day you are just receiving an HTTP request.
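Here is a small sketch of both ideas: rejecting a non-numeric id cheaply before it reaches the database or the error log, and looking up the price on the server instead of trusting the client. The catalog, the parameter name, and the exception class are assumptions for illustration only.

```python
from urllib.parse import parse_qs

# Hypothetical server-side catalog: product id -> price in cents.
CATALOG_CENTS = {42: 1999}


class InvalidInput(ValueError):
    """Raised for any request input that fails validation."""


def parse_product_id(query_string: str) -> int:
    """Accept exactly one short, all-digit "id" parameter and nothing else."""
    values = parse_qs(query_string).get("id", [])
    if len(values) != 1:
        raise InvalidInput("expected exactly one id parameter")
    raw = values[0]
    if not (raw.isascii() and raw.isdigit() and len(raw) <= 9):
        raise InvalidInput("id must be a short non-negative integer")
    return int(raw)


def price_cents(product_id: int, client_price_cents=None) -> int:
    """Look the price up server-side; any client-supplied price is ignored."""
    if product_id not in CATALOG_CENTS:
        raise InvalidInput("unknown product")
    return CATALOG_CENTS[product_id]


product_id = parse_product_id("id=42")
# The client claims the item costs one cent; the server-side price is used.
print(price_cents(product_id, client_price_cents=1))
```

Bad input raises one predictable exception type, which a handler can turn into a plain 400 response without logging a full error report for every junk request.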
4. Verifying the State of Data
Once again, just because previous transactions only allowed users to select “valid” data doesn’t mean that the current request is still valid. Users could have kept a page open for a long period of time allowing data to be changed by other users, users can resubmit submitted forms, and of course malicious users or code can submit what appears to be a valid request. The safest thing to do is to retrieve the latest data and validate that applicable rules are met.
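As a sketch of retrieving the latest data before acting, the function below re-reads the product row and checks the rules at the moment of the request, no matter what the page showed when it was loaded. The schema and rule names are hypothetical, and a multi-user system would also need locking or a conditional update to close the gap between the read and the write.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example products."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products "
        "(id INTEGER PRIMARY KEY, stock INTEGER, discontinued INTEGER)"
    )
    conn.execute("INSERT INTO products VALUES (42, 5, 0)")
    conn.execute("INSERT INTO products VALUES (43, 5, 1)")
    conn.commit()
    return conn


def reserve(conn: sqlite3.Connection, product_id: int, quantity: int) -> str:
    """Re-check the current state of the data before changing it."""
    if quantity <= 0:
        return "invalid quantity"
    with conn:  # commits on success, rolls back on an exception
        row = conn.execute(
            "SELECT stock, discontinued FROM products WHERE id = ?",
            (product_id,),
        ).fetchone()
        if row is None:
            return "unknown product"
        stock, discontinued = row
        if discontinued:
            return "product discontinued"
        if quantity > stock:
            return "insufficient stock"
        conn.execute(
            "UPDATE products SET stock = stock - ? WHERE id = ?",
            (quantity, product_id),
        )
    return "ok"


conn = make_demo_db()
print(reserve(conn, 42, 3))
# The same form resubmitted: only 2 are left, so the rules no longer pass.
print(reserve(conn, 42, 3))
```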
5. Verify the Transaction Completed
Unless you are a villain in a spy movie or a spoof of said spy movies, you shouldn’t just assume everything went according to plan and walk away. If you are using an API that returns a message, make sure you look at the message. If your data access or database tiers have error handling that could result in invalid data being ignored, make sure the transaction happened as planned. Don’t simply return the data submitted by the user and assume that it matches the record of authority.
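A rough illustration with SQLite: check how many rows the update actually touched, then read the stored value back instead of echoing the submitted one. The table and the lowercasing rule are made up for the example; the point is that what ends up in the database may differ from what was sent.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one example user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'old@example.com')")
    conn.commit()
    return conn


def update_email(conn: sqlite3.Connection, user_id: int, new_email: str) -> str:
    """Update the e-mail, confirm one row changed, return what was stored."""
    with conn:  # commits on success, rolls back if we raise
        cur = conn.execute(
            "UPDATE users SET email = lower(?) WHERE id = ?",
            (new_email, user_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"expected to update 1 row, updated {cur.rowcount}")
    # Read back from the record of authority rather than echoing the input.
    row = conn.execute(
        "SELECT email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = make_demo_db()
# The stored value is lowercased, so it differs from what was submitted.
print(update_email(conn, 1, "Alice@Example.COM"))
```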
6. Make it Consumable
If you have done 1-5, you have already made your application more secure and more reliable. You have also reduced the reliance of each individual HTTP request on “validation” done in your UI. Therefore each transaction can be considered a service since now any well-formatted request will result in the desired response. To make your “services” more consumable, you just need to be more sensitive to how a different application might want your service to respond. For example, you may respond to a request with an entire fresh page of HTML data when only one value is changed. That may be fine for your application but meaningless to most other applications. Consider providing another version of that transaction that either returns only the modified HTML or a JSON or JSONP result with the impacted data. By taking this approach from the get-go, your users may get faster response times, less “back button” angst, and an overall better experience.
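Providing “another version” of a transaction can be as simple as choosing the response format from the request's Accept header. The sketch below is one possible shape, with a hypothetical render function and a deliberately naive header check (real content negotiation also handles quality values and wildcards).

```python
import json
from html import escape
from typing import Dict, Tuple


def render(result: Dict[str, object], accept: str) -> Tuple[str, str]:
    """Return (content_type, body) for the same result in JSON or HTML."""
    if "application/json" in accept:
        return "application/json", json.dumps(result)
    # Fall back to an HTML fragment, escaping values for safe display.
    items = "".join(
        f"<li>{escape(str(key))}: {escape(str(value))}</li>"
        for key, value in result.items()
    )
    return "text/html", f"<ul>{items}</ul>"


result = {"stock": 3}
print(render(result, "application/json"))
print(render(result, "text/html"))
```

A script or another application asks for JSON and gets only the impacted data; a browser asking for HTML gets a fragment it can drop into the page.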
In order to make “another version” of anything, you need an appropriate level of abstraction. This is a completely different architectural discussion and a topic for another day. At a minimum, you should separate your database transactions from your user interface. This allows for some degree of code reuse. The amount of abstraction you need is proportional to the amount of reuse. If you want to use multiple types of data stores and clients then you will need a high level of abstraction. If you just want multiple versions of the same transaction to produce multiple types of HTTP responses, you just need those methods to be extracted appropriately. Once you have achieved the proper level of abstraction, you may choose to create a SOAP or REST service to expose those abstracted methods. Most web application frameworks make this relatively easy assuming you already have the correct level of abstraction. Then you might want to publish the WSDL, update your UDDI, and fire up your ESB. Then again you might not (http://www.zdnet.com/blog/gardner/dont-use-an-esb-unless-you-absolutely-positively-need-one-mule-cto-warns/3060).
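As a very small taste of that separation, the sketch below keeps the transaction logic free of both HTTP and any particular data store. Every name in it (PriceStore, InMemoryStore, quote_order) is invented for illustration; a database-backed store would only need to offer the same method.

```python
import json
from typing import Dict, Protocol


class PriceStore(Protocol):
    """Anything that can look up a unit price (hypothetical interface)."""

    def get_price_cents(self, product_id: int) -> int: ...


class InMemoryStore:
    """One possible store; a database-backed one would expose the same method."""

    def __init__(self, prices: Dict[int, int]) -> None:
        self._prices = dict(prices)

    def get_price_cents(self, product_id: int) -> int:
        return self._prices[product_id]


def quote_order(store: PriceStore, product_id: int, quantity: int) -> Dict[str, int]:
    """The transaction itself: no HTML, no HTTP, no specific database."""
    unit = store.get_price_cents(product_id)
    return {"product_id": product_id, "quantity": quantity,
            "total_cents": unit * quantity}


def quote_as_json(store: PriceStore, product_id: int, quantity: int) -> str:
    """A thin adapter; an HTML page or a REST endpoint could wrap the same call."""
    return json.dumps(quote_order(store, product_id, quantity))


store = InMemoryStore({42: 1999})
print(quote_order(store, 42, 3))
print(quote_as_json(store, 42, 3))
```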
Putting it All Together
There is nothing worse than getting good advice when it is too late to act on it. That is why the emphasis of this article is on architecture. Like most architectural decisions, these concepts are best put into action when the determination to enact them is made up front. Retroactively making your web application more service-oriented may not pay dividends. However, when building a web application, consider the potential for reuse and weigh that against the impact of implementation. There is a finite amount of logic in every application. There are a finite number of transactions with a finite number of parameters. Your decision to expose those transactions as a service is the equivalent of putting one dish in the dishwasher instead of in the sink, and if you are already writing the code, the dishwasher is already open. Make your significant other/roommate (AKA other coders) happy by putting that fork in the dishwasher.