Today we cover the basics of Web services and information security and the way Web services security builds on existing security technology.
Web services are a transformational technology for integrating information sources from both inside and outside an enterprise. Web services are the newest incarnation of middleware for distributed computing. Unlike all previous forms of middleware, however, this is a simpler, standards-based, and more loosely coupled technology for connecting data, systems, and organizations. That is good news for architects and developers wanting to quickly become proficient in this technology and deploy real systems. It is also somewhat bad news for architects and developers because all middleware needs strong security practices, and Web services need them more than any middleware of the past. Why more? Because Web services create loosely coupled integrations. Because Web services are not just being used to integrate internal systems, but are also integrating data sources from outside the organization. Because Web services are based on the passing of readable and self-describing business messages represented in XML. Because Web services are based on underlying Web technologies that already had their own set of security challenges.
It is true that Web services—like most new transformational information technologies introduced—have been overhyped, and it is true that fears of security problems have also been overblown, which has impeded the development and deployment of Web services. It is also true that the standards for Web services are either quite new or in some cases not even fully baked yet. This book is designed to take the mystery and fear out of how to build secure Web services and to shed light on these new standards and how to best use them. It is also designed to show you the richness and complexity of the security issues around Web services so that you, the designers, builders, and operators of Web services, can fully exploit all the capabilities of Web services to your best advantage but do so knowing full well what all the security challenges are and how to face them.
This chapter covers the basics of Web services and information security and the way Web services security builds on existing security technology. This sets the stage for a deeper understanding of the major standards for information security associated with Web services.
Four technologies form the basis of Web services: eXtensible Markup Language (XML); SOAP (See Footnote 1 at end of article); Web Services Description Language (WSDL); and Universal Description, Discovery, and Integration (UDDI) (Footnote 2).
XML and XML Schema
XML was created as a structured self-describing way to represent data that is totally independent of application, protocol, vocabulary, operating system, or even programming language. Many call XML the lingua franca of business because it is being used so broadly across all industries to portably transmit business data. The use of XML presents a broad set of security challenges.
XML Schema is a way of describing the rules for a particular XML instance (also known as a document). XML can be used independently of XML Schema; however, in Web services and most business situations, the XML that you work with will be governed by an XML Schema (perhaps created by a development tool and put into your Web services WSDL file for you).
XML is the foundation of the Web services standards. All standards for describing, discovering, and invoking Web services are based on XML. SOAP and WSDL are described using XML Schema. The core security standards of XML Encryption, XML Signature, Security Assertion Markup Language (SAML), and WS-Security are XML-based and are also described by an XML Schema.
XML and HTML are both text-based formats that came from the same roots. XML was initially developed to overcome the limitations of HTML, which is good at describing how things should be displayed but is poor at describing what the data is that is being displayed. XML being text-based is very important to Web services; because it is human-readable, no special tools are needed to read the data, and simple text tools and editors are sufficient for its manipulation. XML documents are very wordy, and you can easily become lost in the depth and richness of the tags, but the markup format of tagged elements arranged in a hierarchical structure makes XML documents easy to comprehend. But there is a security price to pay for the open, text-based structure of XML. As you will see later, to provide data integrity (a guarantee that not one bit in the original document has been changed) with XML, you have to guarantee that not one character, even whitespace, of an XML message has been changed. Verifying data integrity is particularly challenging when using XML because differences in platforms and XML parsers can result in logically equivalent documents being physically different; consequently, a process of canonicalization is necessary to make a valid comparison with the originally signed document. This is just one example of the special care needed when dealing with Web services security.
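To make the canonicalization point concrete, here is a minimal Python sketch. The two order documents are hypothetical, and the example relies on the standard library's `canonicalize` function, which implements Canonical XML 2.0; XML Signature deployments often specify a different canonicalization variant, so treat this only as an illustration of the idea, not of any particular standard's processing rules.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+

# Two hypothetical messages carrying the same information. They differ only
# in attribute order, quote style, and spacing inside the start tag.
doc_a = '<order id="1" currency="USD"><item>book</item></order>'
doc_b = "<order  currency='USD'   id='1'><item>book</item></order>"


def digest(text: str) -> str:
    """Return the SHA-256 hex digest of the text's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hashing the raw text treats the two documents as different messages.
raw_match = digest(doc_a) == digest(doc_b)

# Hashing the canonical form compares the logical content instead.
canonical_match = digest(canonicalize(doc_a)) == digest(canonicalize(doc_b))

print("raw digests match:      ", raw_match)
print("canonical digests match:", canonical_match)
```

The raw digests differ even though nothing meaningful changed, which is why a signature has to be computed over a canonical form rather than over whatever bytes a particular parser or platform happened to emit.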
SOAP
SOAP was created as a way to transport XML from one computer to another via a number of standard transport protocols. HTTP is the most common of those transports and is, of course, the most prevalent transport used by the Web itself.
SOAP itself is defined using XML, and it provides a simple, consistent, yet extensible mechanism that allows one application to send an XML message to another. SOAP is what makes application integration possible, because after XML defines the contents of a message, it is SOAP that moves the data from one place to another over the network. SOAP allows the sender and receiver of XML documents to support a common data transfer protocol. SOAP allows you to treat XML messages as requests for remote services. The SOAP model allows a clean separation between infrastructure processing and application processing of messages. Figure 1.1 shows the basic structure of a SOAP message.
SOAP provides an envelope into which an XML message is placed. This envelope is just a container to hold XML data. The idea is for SOAP to create a uniform container that can then be carried by a variety of transports. SOAP frees applications from having to care about the transport; the applications see consistency in the SOAP envelope and its contents.
Inside the SOAP envelope are two parts: the header and the body.
· SOAP header—Contains information about the SOAP message (as opposed to the XML message contained in the SOAP body). This information is used to manage or secure the package. SOAP is designed to be extensible, and a major area for extension is the SOAP header. Chapter 7, "Building Security into SOAP," which describes WS-Security, provides more detail on SOAP header security extensions.
· SOAP body—Contains the message payload. This information is being sent from one application to another. It might be a full document such as a purchase order or contract, or it might be a description of remote-procedure call information, including the methods to call and parameters to those method calls.
The simple SOAP message in Listing 1.1 shows an envelope that contains both a SOAP header and a SOAP body.
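As a rough illustration of that envelope structure (this is a sketch, not the original listing), the following Python builds a small SOAP 1.1 envelope with a header and a body. The `MessageId` header entry, the `PurchaseOrder` payload, and the `urn:example:...` namespaces are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def build_envelope() -> str:
    """Build an envelope with one header entry and one body payload."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # Header: information about the message itself (hypothetical entry).
    message_id = ET.SubElement(header, "{urn:example:tracking}MessageId")
    message_id.text = "msg-0001"

    # Body: the application payload (hypothetical purchase order).
    order = ET.SubElement(body, "{urn:example:orders}PurchaseOrder")
    ET.SubElement(order, "{urn:example:orders}Item").text = "book"

    return ET.tostring(envelope, encoding="unicode")


envelope_xml = build_envelope()

# Parse the result back and list the envelope's direct children.
parsed = ET.fromstring(envelope_xml)
child_names = [child.tag.split("}")[1] for child in parsed]
print(envelope_xml)
print(child_names)
```

Security extensions such as WS-Security would add their own entries under the header element, leaving the body payload to the application.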
To understand SOAP, you need to understand the different "styles" of SOAP bodies. RPC-style SOAP bodies tend to be simple parameters to facilitate calling a remote method. Document-style SOAP bodies tend to be rich XML documents. Document style, in our view, is more appropriate for B2B Web services because it is usually more optimal to have "chunky," coarse-grained calls across a slow network rather than the fine-grained type of RPC call that you might use locally or on a fast network. This is not just due to the network but also due to the cost of marshalling and unmarshalling the XML and performing security-related operations.
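The difference between fine-grained and coarse-grained calls can be sketched as follows. The `addLine` operation, the `PurchaseOrder` document, and the order data are all hypothetical; the point is only to show how many messages each style would put on the wire for the same three-line order.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1
ORDERS_NS = "urn:example:orders"                       # hypothetical namespace


def wrap(payload: ET.Element) -> ET.Element:
    """Place a payload element inside a SOAP envelope body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical order: (sku, quantity) pairs.
order_lines = [("B-100", 2), ("B-200", 1), ("B-300", 5)]

# RPC style: one small method call, and therefore one message, per order line.
rpc_messages = []
for sku, qty in order_lines:
    call = ET.Element(f"{{{ORDERS_NS}}}addLine")
    ET.SubElement(call, "sku").text = sku
    ET.SubElement(call, "quantity").text = str(qty)
    rpc_messages.append(wrap(call))

# Document style: a single message carries the whole purchase order.
purchase_order = ET.Element(f"{{{ORDERS_NS}}}PurchaseOrder")
for sku, qty in order_lines:
    ET.SubElement(purchase_order, f"{{{ORDERS_NS}}}Line", sku=sku, quantity=str(qty))
document_messages = [wrap(purchase_order)]

print("RPC-style messages:     ", len(rpc_messages))
print("Document-style messages:", len(document_messages))
```

Each message pays for a network round trip, XML marshalling, and any signing or encryption, so sending one document instead of three calls reduces all of those costs together.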
SOAP needs to be secured. The messages it carries must be kept secret from unintended recipients. The remote service being called must know who is calling it and know the caller is authorized to do so. SOAP is a packaging mechanism for XML messages and documents. Like any package, it needs to describe important information about its contents, such as who it is from, how a recipient can trust that it really is the sender, what the sender is allowed to do, and much more. These are identity- and trust-related issues; they are the core of SOAP security discussed in detail later in this book.
WSDL
WSDL is an XML language that defines the set of operations that a Web service provides and the structure of their related SOAP messages. That is, the WSDL defines what the input and output structure will be for a Web service, and that will define what you expect to see in the payload XML message. WSDL is how one service tells another which way to interact with it, where the service resides, what the service can do, and how to invoke it. WSDL directly supports developers and is absorbed at application development time into developer tools. WSDL's definitions of remote services are presented to programmers like local objects that can be acted upon as if they were methods in classes, just like any of their other local objects.
When you publish a WSDL for one of your services, you are creating a contract for how other services may interact with you to utilize your service. WSDL is what you publish to describe your Web service and the rules for how to work with it. You might think that security would also be described in WSDL because this is part of the rules for working with a particular Web service; however, the security options (security policy) available are richer than what you typically see in WSDL, so the standards are evolving toward using WS-Policy to describe a Web services security policy and then referring to this policy from the WSDL. Chapter 8, "Communicating Security Policy," goes into more depth on WS-Policy.
A WSDL file has a what section, a how section, and a where section. The what section specifies the input and output messages. The how section defines how the messages should be packaged in the SOAP envelope and how to transfer it. It also defines what information should be included in the SOAP header. The where section describes a specific Web service implementation and ways to find its endpoint.
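The what/how/where split can be seen in a heavily abbreviated WSDL 1.1 document. The sketch below assumes the usual mapping of types, message, and portType elements to the what section, binding to the how section, and service to the where section; the quote service, its names, and its endpoint URL are hypothetical, and the WSDL is too stripped down to drive a real toolkit.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",       # WSDL 1.1
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",  # WSDL 1.1 SOAP binding
}

# Hypothetical, abbreviated service description.
WSDL_DOC = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example:quotes" targetNamespace="urn:example:quotes">
  <message name="GetQuoteRequest"/>
  <message name="GetQuoteResponse"/>
  <portType name="QuotePortType">
    <operation name="GetQuote">
      <input message="tns:GetQuoteRequest"/>
      <output message="tns:GetQuoteResponse"/>
    </operation>
  </portType>
  <binding name="QuoteBinding" type="tns:QuotePortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="QuoteService">
    <port name="QuotePort" binding="tns:QuoteBinding">
      <soap:address location="https://example.com/quotes"/>
    </port>
  </service>
</definitions>"""

# Assumed mapping from top-level WSDL elements to the three sections.
SECTION_OF = {
    "types": "what", "message": "what", "portType": "what",
    "binding": "how",
    "service": "where",
}

root = ET.fromstring(WSDL_DOC)
sections = {"what": [], "how": [], "where": []}
for child in root:
    local_name = child.tag.split("}")[1]
    sections[SECTION_OF[local_name]].append(child.get("name"))

# The where section ultimately resolves to a concrete endpoint address.
endpoint = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")

print(sections)
print(endpoint)
```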
UDDI is typically the fourth leg of the stool used to define Web services. Although we view UDDI as a useful standard, we do not see its usefulness beyond internal promotion of reuse inside large organizations. Given that, we do not put it front and center as a part of our discussions of Web services security and will not treat it further in this book.
Before you dive into the security implications of each of these Web services standards, you need some context: What are Web services really for? The answer is, among other uses that undoubtedly will develop as this new paradigm matures, application integration, B2B business process integration, portals, and service-oriented architectures.
Application Integration
Application integration is critical to organizations large and small because information integration is so fundamentally important. When organizations integrate all their applications that deal with customers (CRM, ERP, accounting, billing), they are trying to create a single view of all the information about those customers. When they integrate all their trading partners into a single supply chain, they are attempting to create a holistic view of their entire supply chain and all the information that describes their trading processes. This kind of information integration is fundamental to the business process. Rarely does a business process (product development, product marketing, product manufacturing, product ordering, product fulfillment, customer relationships, partner relationships, financials, and so on) utilize one and only one source of information (an application). It is because business processes cross application boundaries and even enterprise boundaries that Web services are needed to create those bridges.
Application integration is hard because systems were not designed with the same data structures, protocols, or even the same vocabulary for describing the items they manipulate. Applications were built at different times by different vendors using different technologies. However, many of these different applications need to communicate to perform certain functions. This is where XML comes in. It makes information easy to interchange and therefore easier to integrate.
The glue used to communicate from one application to another has traditionally been called middleware. Middleware has never been pervasive and was always very expensive. Rarely has anyone ever tried to use middleware between enterprises because simply using it within a single enterprise's boundaries is hard enough. SOAP is a critical step in taking XML messages toward being Web services middleware. In one of its modes, SOAP makes XML into a request/response paradigm that is published via WSDL. Web services are becoming pervasive because they are middleware based on the Web, and the Web is pervasive.
B2B Business Process Integration
Business processes don't stop at your company's firewall. Just as internal application integration is partly motivated by the need to break down application barriers, inter-organization business processes motivate B2B application integration. A driving need to integrate across organizations comes from management of supply chains and demand chains with trading partners. Traditional middleware could never be employed to solve this need because it never worked across the Internet.
The good news is that the Internet is pervasive. Most Internet communication occurs via text (for example, HTML is text, email is text), and virtually all applications have some form of text interface. XML is text-based and is designed to make business information transportable and self-describing. XML, plus the fact that all vendors support Web services, has moved us closer to solving the heterogeneous communication problems of different languages, different platforms, and different applications than any middleware technology of the past.
For these reasons, Web services technology is built on XML and Web technologies, which makes it the first middleware that can address the B2B business process integration challenge.
Portals
On the Internet and within intranets, portals are the entry point for customers into a site. Portals have been growing in utility and importance for some time as a way to aggregate information and applications into a single site that is accessible by browsers. Portals as major business models have been common for years. Amazon.com, Yahoo!, and Orbitz are all fundamentally portals. They pull information from their partners' repositories into a single site that consumers with browsers visit to buy books, music, and consumer goods; plan trips; and the like. Most of these major Web e-commerce companies built their portals long before Web services standards existed. They effectively built Web services to integrate all their information content using home-grown approaches. Companies trying to do what they have done now can do it much more cheaply and easily and remain much more interoperable by using the new Web services standards.
Companies are rapidly turning their corporate intranets into portals to provide a wide range of company-related information and services to employees, shareholders, and partners. One type of service they are providing employees is a unified benefits information resource. To make that a complete service, the 401k information from third-party providers must be integrated into the corporate portal. That is a perfect use for secured Web services that bring the employees' 401k account information into the corporate benefits portal by accessing the external services of the 401k provider.
Integrated customer information is so much the lifeblood of all companies that both customer relationship management (CRM) and customer information portals have represented large corporate investments for many years. Naturally, because Web services are less expensive and less complex than any previous form of middleware, they have been brought to bear on this common need.
Service-Oriented Architectures
In a service-oriented architecture (SOA), the interface is completely separated from the implementation. The software is provided strictly as a service that does not have to be downloaded and installed. SOA promotes reuse and sharing of services by numerous applications and even by different organizations. People describe an "SOA nirvana" when all systems are built as SOA and all applications are composite—built by stitching together several useful shared service components into powerful applications. Many people look to Web services to bring us to this SOA nirvana (Footnote 3).
The idea of services as being a powerful computing paradigm is not new. A service is an application that can be consumed by software as opposed to a human at a browser. It is software that does work for other software. It is how RPC mechanisms work. This was the premise of the client/server computing revolution of the early '90s. In this model, the server provided the service.
The benefits of a service-oriented architecture are legion. The complexities of a software system are hidden behind its interface. A complex software system becomes a simpler black box defined only by its external interface. The service so constructed becomes a shared resource that can support many applications.
Web services combine the concept of software-as-a-service with the ubiquity and connectedness of the Web. This is what makes Web services so compelling and so exciting: They create a Web API. We are talking about building applications with a broadly accepted standard API based on Web technologies. Web services enable you to wrap legacy applications with this Web API and turn them into shared services. Now these applications can be integrated with other applications and with trading partners. Previously inaccessible information resident in these legacy applications can be brought out to portals and combined with other application information and all made accessible to any user with a browser. Any application can be modified to provide this type of Web API and therefore can be integrated with any other application, allowing you to use the entire Web for application-to-application integration. This, then, is the power of Web services.
Definition of Web Services
A good working definition of a Web service, then, is an application that provides a Web API. The API enables the software resource to act as a service. Being a Web API means that this service is accessible at an Internet URI. Further, an API supports application integration, so a Web API allows application-to-application integration using XML over Web protocols and infrastructure. All the security and trust issues of being part of the open Web infrastructure will concern us. All the information security and message security issues inherent in sending messages from one network point to another will concern us. All the authorization and authority security issues inherent in middleware that performs RPCs will concern us. Now, let's cover some security basics to build a foundation for our deeper discussions in later chapters.
The Web is an interconnected global information system that provides resources suitable for consumption directly by humans. In this model, security is critical for many of these resources (login-password authentication at restricted sites, SSL encryption of credit cards and other personally identifiable confidential information). It only makes sense, then, that application-to-application Web services need at least this much security as well.
In fact, because Web services expose critical and valuable XML-encoded business information, Web services security is a critically important concept to fully understand. For one thing, trade secret pilfering is already a large problem, and without security, Web services might even make this situation worse. The reason is that Web services can be thought of as allowing in strange, new users who might take your company's valuable business secrets out.
This section covers basic security concepts to establish the vocabulary that will be used throughout this book. Keeping communications secret is the heart of security. The science of keeping messages secret is called cryptography. Cryptography is also used to guarantee trust in a known identity across a network by "binding" that identity to a message that you can see, interpret, and trust. An identity asserting itself must be authenticated by a trust authority to a previously established identity known to the authority for the binding to be valid. After you know the identity, authorization allows you to specify what the individual with that identity is allowed to do. When you receive a secret message, you need to know that nothing in the message has been changed in any way since it was published, an attribute called integrity. When cryptography successfully keeps a message secret, it has satisfied the requirement for confidentiality. At times, you might want to know that someone who received confidential information cannot deny that she received it, an important security concept called non-repudiation.
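As a small illustration of one of these concepts, the sketch below checks message integrity with a keyed hash (HMAC) from the Python standard library. The shared key and the order message are hypothetical. Note what this technique does and does not cover: it detects changes and shows the message came from someone holding the key, but it provides neither confidentiality (the message is not encrypted) nor non-repudiation (both parties hold the same key, so either could have produced the tag).

```python
import hashlib
import hmac

# Hypothetical shared secret; a real key would be random and kept private.
SHARED_KEY = b"demo-key-not-for-production"


def make_tag(message: bytes, key: bytes = SHARED_KEY) -> str:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, received_tag: str, key: bytes = SHARED_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(message, key), received_tag)


original = b"<order><item>book</item><qty>1</qty></order>"
sent_tag = make_tag(original)

# An attacker in the middle changes the quantity but cannot recompute the tag.
tampered = original.replace(b"<qty>1</qty>", b"<qty>9</qty>")

original_ok = verify(original, sent_tag)
tampered_ok = verify(tampered, sent_tag)
print("original verifies:", original_ok)
print("tampered verifies:", tampered_ok)
```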
Most of these core security concepts depend on encryption technologies, so before you look at any of them more closely, take a look at the fundamentals of encryption.