Chapter 11. XML Issues

In this chapter:
Testing Non-XML Security Issues in XML Input Files
Testing XML-Specific Attacks
Simple Object Access Protocol
Testing Tips
Summary

Extensible Markup Language (XML) is a text format designed to represent data so that it can easily be shared between different computer systems. Although XML has existed for almost 10 years, over the last few years the format has become extremely popular. Many Web browsers, word processors, databases, and Web servers use XML today. The XML format is used to send data across the network or to store data as a local file. In this chapter, you’ll learn how to security test applications that interact with XML. The first part of the chapter describes how to test for non-XML vulnerabilities such as HTML scripting, spoofing, and buffer overflows when the data input is through XML. The second part of the chapter describes security issues specific to XML and how to test for them.

Note

XML includes security features such as signatures and Web Service Security; however, these issues are beyond the scope of this book.

Testing Non-XML Security Issues in XML Input Files

Applications that take XML as input typically send the data through an XML parser first. The application then accesses the parsed version of the data. If the XML data can’t be parsed, the application usually doesn’t access the input. For this reason, it is important to craft input that parses successfully yet can still expose security issues in the application consuming the XML. For example, because XML is a tag-based format similar to HTML, sending the <img> tag in the XML input seems logical. Because XML expects a corresponding </img> tag, however, simply sending in <img> causes the XML parser to fail. For XML data to be parsed successfully, the data should be both well formed and valid.

Applications that use an XML parser that supports data streams might obtain parts of an XML document before the document is deemed well formed. For example, the Microsoft .NET Framework XmlReader class can parse XML streams. An application that requests the value of the name element (innerXML) for the following XML would receive “User1”. If the application continues to read the XML stream, the XmlReader class would return an error because the XML isn’t well formed (the closing tag </p> should be </phone>).

<customer>
 <name>User1</name>
 <phone>425-882-8080</p>
</customer>

Because some XML parsers allow access to the data even when it is not well formed, an attacker’s data can reach the application through the parser even when constraints prevent the attacker from forming the XML input correctly. Classes that do not support data streams (for example, the XmlDocument class) do not have this issue.
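The same behavior can be observed with other incremental parsers. The following minimal sketch uses Python’s standard xml.etree.ElementTree.XMLPullParser as a stand-in for XmlReader (an assumption made for illustration; the example in the text is .NET). It collects every element the parser hands back before the parser reports that the document isn’t well formed.

```python
import xml.etree.ElementTree as ET

# The malformed document from the text: </p> should be </phone>.
MALFORMED = (
    "<customer>\n"
    " <name>User1</name>\n"
    " <phone>425-882-8080</p>\n"
    "</customer>"
)


def read_until_error(xml_text):
    """Return (elements seen, parse error or None) for an incremental parse."""
    parser = ET.XMLPullParser(events=("end",))
    parser.feed(xml_text)
    seen = []
    error = None
    try:
        # Completed elements are delivered first; the queued parse error
        # is raised only when the reader reaches it.
        for _event, elem in parser.read_events():
            seen.append((elem.tag, elem.text))
    except ET.ParseError as exc:
        error = exc
    return seen, error


seen, error = read_until_error(MALFORMED)
print(seen)
print("parse error:", error)
```

An application built this way has already consumed the name value by the time the error surfaces, which is the situation described above.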

Important

The XML parser can be tested by creating malformed XML and sending it through the parser. This chapter focuses on testing scenarios where the XML is well formed and valid because most readers are probably more interested in testing their applications than they are the XML parser.

Well-Formed XML

XML is well formed if it is syntactically correct. This means that the following points hold true:

The document has exactly one root element (also known as the document element).
Elements must have a start and an end. Whereas some tags in HTML have only a begin tag (such as <img>), XML must contain a begin tag and an end tag. For example, <tester>Tom</tester> is correct. XML tags also can contain the start and end in the same tag. For example, <br />. Note that XML tag names are case-sensitive.
Elements must be nested properly. Unlike HTML, XML isn’t forgiving. <center><b> Test</center></b> would be rendered correctly as HTML, but would be rejected by an XML parser.
Attributes must be quoted. Attributes of a tag must be enclosed in quotation marks. For example: <tester name="Tom" /> is correct, but <tester name=Tom /> is incorrect.
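These rules can be checked quickly against a parser. The following sketch assumes Python’s standard xml.etree.ElementTree parser and treats any ParseError as “not well formed”; the sample strings come from the rules above, plus two hypothetical ones for case mismatch and multiple root elements.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


SAMPLES = [
    "<tester>Tom</tester>",           # start tag and end tag
    "<br />",                         # start and end in the same tag
    "<img>",                          # no end tag
    "<Tester>Tom</tester>",           # tag names differ in case
    "<center><b> Test</center></b>",  # improperly nested
    '<tester name="Tom" />',          # quoted attribute
    "<tester name=Tom />",            # unquoted attribute
    "<a>1</a><b>2</b>",               # more than one root element
]

results = {sample: is_well_formed(sample) for sample in SAMPLES}
for sample, ok in results.items():
    print("accepted" if ok else "rejected", sample)
```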

Valid XML

XML is valid if it is well formed and also conforms to a schema. A schema is a set of constraints, applied by XML authors, that is used when parsing the XML data. There are several different ways to specify a schema, including Document Type Definition (DTD), XML Schema Definition (XSD), and RELAX NG.

The following XSD specifies that the validated XML contains an element with an attribute named id that is exactly 10 digits long (specified in line 5). This element should contain children elements named “CATEGORY” and “DESCRIPTION” whose values are strings (specified in lines 11 and 12).

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="testCaseIDType">
 <xs:restriction base="xs:string">
  <xs:pattern value="[0-9]{10}"/>
 </xs:restriction>
</xs:simpleType>
<xs:element name="TESTCASE">
 <xs:complexType>
  <xs:sequence>
   <xs:element name="CATEGORY" type="xs:string"/>
   <xs:element name="DESCRIPTION" type="xs:string"/>
  </xs:sequence>
  <xs:attribute name="id" type="testCaseIDType" use="required"/>
 </xs:complexType>
</xs:element>
</xs:schema>
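To make the constraints concrete, the following sketch restates them by hand in Python. It is not an XSD validator: it only approximates the rules of this one schema, it ignores details such as stray text between child elements, and the sample documents are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

ID_PATTERN = re.compile(r"[0-9]{10}")  # XSD patterns must match the whole value


def mirrors_testcase_schema(xml_text):
    """Hand-written approximation of the TESTCASE schema (not real XSD validation)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well formed
    if root.tag != "TESTCASE" or set(root.attrib) != {"id"}:
        return False
    if not ID_PATTERN.fullmatch(root.get("id")):
        return False
    children = list(root)
    if [child.tag for child in children] != ["CATEGORY", "DESCRIPTION"]:
        return False
    # xs:string children may hold text but no nested elements or attributes.
    return all(len(child) == 0 and not child.attrib for child in children)


valid = (
    '<TESTCASE id="0123456789"><CATEGORY>XML</CATEGORY>'
    "<DESCRIPTION>Entity test</DESCRIPTION></TESTCASE>"
)
short_id = valid.replace("0123456789", "12345")
extra_child = valid.replace("</TESTCASE>", "<OWNER>Tom</OWNER></TESTCASE>")

for label, doc in [("valid", valid), ("short id", short_id), ("extra child", extra_child)]:
    print(label, mirrors_testcase_schema(doc))
```

All three samples are well formed; only the first also satisfies the constraints, which is the distinction between well-formed and valid XML.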

Important

Programmers can perform high-level validation of data by using an XML schema. If you use a schema, the possibility that malicious or malformed input will make it through the parser and into the application is greatly reduced, making it easier to secure the application.

Including Nonalphanumeric Data in XML Input

You often need to include nonalphanumeric data when testing an application that accepts XML input. For example, to test an application for script injection, you frequently must include HTML tags and quotation marks. (Script injection attacks are discussed in depth in Chapter 10.) However, HTML tags included as part of XML data often cause the parser to fail because the XML is no longer well formed. The following sections discuss how to include arbitrary data in XML data.

CDATA

A CDATA section begins with <![CDATA[ followed by free-form unescaped character data, and it ends with ]]>. The data within the CDATA section is not interpreted as markup by the parser. Consider the case in which the attacker specifies the name of the car in the XML data <CAR color="purple">Car Name</CAR>, and the car’s name (the text between the <CAR> and </CAR> tags) is stored and later displayed as HTML. In this case, a script injection attack should be attempted. The following XML causes the parser to fail:

<?xml version="1.0" encoding="UTF-8"?>
<CAR color="purple"><IMG SRC="javascript:alert(document.domain)">Car Name</CAR>

The problem is <IMG SRC="javascript:alert(document.domain)">: the XML is not well formed because this tag has no ending tag. The Microsoft XML parser (MSXML) displays the error “End tag ‘CAR’ does not match the start tag ‘IMG’,” as shown in Figure 11-1.

Figure 11-1. Including angle brackets in content of the Car element, which causes a parser error (not well-formed XML)

A CDATA section can be used to include the image tag as shown here:

<?xml version="1.0" encoding="UTF-8"?>
<CAR color="purple">Car Type<![CDATA[<IMG SRC="javascript:alert(document.domain)">]]></CAR>
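As a quick check of what the consuming application receives, the following sketch parses the CDATA version with Python’s standard xml.etree.ElementTree (an assumption; the text uses MSXML). The parser accepts the document and returns the image tag as plain character data rather than as a child element.

```python
import xml.etree.ElementTree as ET

DOC = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<CAR color="purple">Car Type'
    b'<![CDATA[<IMG SRC="javascript:alert(document.domain)">]]></CAR>'
)

car = ET.fromstring(DOC)
car_name = car.text      # CDATA content arrives as ordinary character data
child_count = len(car)   # the <IMG ...> text did not become a child element

print(car_name)
print("child elements:", child_count)
```

If the application later writes car_name into an HTML page without encoding it, the browser sees a real IMG tag.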

Character References

Another way to include character data in XML is by using a Character Entity Reference or a Numeric Character Reference (NCR). Just as characters, such as angle brackets, could be encoded in HTML, they can be encoded in XML. Table 11-1 shows characters and their predefined character entity reference.

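Arbitrary characters can be represented as numeric character references by using the form &#x[character’s hex value]; (the trailing semicolon is required). For example, a null character (hex 00) would be written as &#x00;. Note that XML 1.0 does not permit the null character, so a conformant parser should reject this reference; whether it does is itself worth testing.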

For printing blocks of printable characters, it is easier to use a CDATA section. Character references are good for representing a few characters at a time and nonprintable characters. Character references can also be used as an attribute of a tag (where CDATA sections aren’t permitted).

Table 11-1. Character Entity References

Character	Predefined entity representation
<	&lt;
>	&gt;
&	&amp;
'	&apos;
"	&quot;

Tip

XML parsers understand CDATA and character references in XML data and return the decoded equivalents to the caller of the parser. For example, the value of the text attribute would be returned as “1 < 3” by the parser for the following XML:

<EXAMPLE text= "1 &lt; 3" />

Programs doing additional decoding after parsing likely contain a double decoding bug.
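The following sketch illustrates both points with Python’s standard library. The second document is a hypothetical attacker input, and html.unescape stands in for whatever extra decoding step a buggy consumer might perform.

```python
import html
import xml.etree.ElementTree as ET

# The parser decodes character references exactly once.
example = ET.fromstring('<EXAMPLE text="1 &lt; 3" />')
decoded_once = example.get("text")

# Hypothetical attacker input: the ampersand itself is encoded, so one
# round of decoding still leaves harmless-looking entity text.
tricky = ET.fromstring('<EXAMPLE text="&amp;lt;SCRIPT&amp;gt;" />')
after_parser = tricky.get("text")

# A consumer that decodes again turns that text into live markup.
after_second_decode = html.unescape(after_parser)

print(decoded_once)
print(after_parser)
print(after_second_decode)
```

A filter that inspects the value after the parser but before the second decoding step would see only entity text and let it through.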

Testing Really Simple Syndication

Really Simple Syndication (RSS) is an XML-based format in which a Web site publishes a document known as a feed. RSS readers interpret and display the feed to the user. RSS feeds are commonly used to publish news, mailing lists, and podcasts. Hidetake Jo and I (Gallagher) recently tested parts of an RSS reader written in C++. The data controlled by the attacker was the RSS feed. In addition to attacking the parser itself, we also tried quite a few other test cases. Here is a partial list (the full list is too long for this text):

HTML scripting attacks. Many RSS readers render items in HTML. Often these HTML rendering engines support script. Sometimes this script runs in an elevated security context (example: the My Computer zone). We tried the following:
```
<description>Test <![CDATA[
"><SCRIPT>alert(document.location);</SCRIPT>]]></description>
```
This test case uses a CDATA section to attempt to close off another tag and inject the <script> tag. We also tried some similar cases using javascript protocol URLs, as discussed in Chapter 10.
Directory traversal. One of the features of RSS is called enclosures. Enclosures are file attachments associated with an RSS item. An RSS item containing an enclosure has a URL of an item to download and store to a local directory. We tried cases in which the enclosure name attempted to escape from the enclosure directory using traversal tricks discussed in Chapter 12.
User interface spoofing. We tried various cases to spoof the look and feel of the RSS reader. As discussed in Chapter 6, user interface (UI) spoofing cases often involve using control characters. To include these characters in the RSS feed, we tried both the character itself and the NCR version of the character. For example, attempting to insert a tab character can be done by using &#x09;.
Buffer overflow. We looked at the RSS reader’s code to understand what the application did with each part of the RSS feed once it was returned by the XML parser. We created RSS feeds with specific fields that contained data larger than the code expected. We understood the size limitations by inspecting the code first.
Format strings. We attempted to put format strings in the various fields of the RSS feed. For more information about format strings, see Chapter 9.

These test cases help stress the importance of testing for non-XML vulnerabilities in applications that interact with XML. Although RSS feeds are XML files, all of the bugs we discovered were non-XML bugs. Our test cases had to take into account the fact that we were dealing with XML because we knew the RSS reader used an XML parser to access the RSS feed.

Testing XML-Specific Attacks

Attacks specific to XML should be considered whenever an attacker controls XML input or input that is used to create XML. Many of these attacks take advantage of specific XML functionality. The following sections briefly describe various aspects of XML and related functionality in addition to details of the associated security concerns. For a more detailed discussion of these aspects and other XML functionality, please consult the XML specification (http://www.w3.org/TR/REC-xml).

Entities

Similar to a predefined character entity reference, where &gt; represents the right angle bracket (>), DTD schemas allow user-defined entities, in which a name is associated with replacement text of your choice. When an XML parser encounters an entity reference, the entity name is replaced with its replacement text. C/C++ programmers will find this similar to the notion of #define. The string “New Orleans, Louisiana” can be represented as nola by using the following XML:

<!ENTITY nola "New Orleans, Louisiana">

where nola is the entity name and “New Orleans, Louisiana” is the replacement text. XML also allows entities to reference the contents of files defined by a URL. For example, the contents of http://www.microsoft.com/windowsxp/pro/eula.mspx can be represented as eula with the following XML:

<!ENTITY eula SYSTEM "http://www.microsoft.com/windowsxp/pro/eula.mspx">

Three attacks related to entities are infinite entity reference loops, XML bombs, and external entity attacks.

Infinite Entity Reference Loops

It is possible to create an infinite loop of entities referring to themselves. Consider the following XML that defines two entities named xx and zz:

<!ENTITY % xx '&#x25;zz;'>
<!ENTITY % zz '&#x25;xx;'>
%xx;

The last line of this XML causes %xx; to be replaced with %zz;, which in turn is replaced with %xx;. Now %xx; must be expanded again. As you can see, the entity expansion is now in an infinite loop.

This can be used as a denial of service (DoS) attack against the XML parser. For this reason, the XML specification states that an entity must not contain a recursive reference to itself, either directly or indirectly. Parsers should not assume that XML input follows the specification. Many XML parsers detect this condition today. When testing an XML parser, it is important to verify that recursive entities are not processed.

Tip

This example uses the percent sign (%) in the entity declaration. This is called a parameter entity. These entities can be used only within the DTD.
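A recursion test can be automated. The following sketch uses general entities (hypothetical names a and b) rather than the parameter entities shown earlier, and it assumes Python’s standard xml.etree.ElementTree parser as the parser under test; substitute the parser your application actually uses.

```python
import xml.etree.ElementTree as ET

# General entities a and b refer to each other.
RECURSIVE = (
    "<!DOCTYPE r [\n"
    '  <!ENTITY a "&b;">\n'
    '  <!ENTITY b "&a;">\n'
    "]>\n"
    "<r>&a;</r>"
)


def rejects(xml_text):
    """Return True if the parser refuses the document with a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return True
    return False


recursive_rejected = rejects(RECURSIVE)
print("recursive entities rejected:", recursive_rejected)
```

A parser that hangs or exhausts memory on this input instead of raising an error has the DoS bug described above.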

XML Bombs

Similar to an infinite entity reference loop, an entity can refer to two or more additional entities that also reference several more entities. The following XML is a great example from Rami Jaamour’s article, “Securing Web Services” (http://www.infosectoday.com/IT%20Today/webservices.pdf):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE something [
  <!ENTITY x0 "Developers!">
  <!ENTITY x1 "&x0;&x0;">
  <!ENTITY x2 "&x1;&x1;">
  <!ENTITY x3 "&x2;&x2;">
  <!ENTITY x4 "&x3;&x3;">
  ...
  <!ENTITY x100 "&x99;&x99;">
]>
<something>&x100;</something>

The preceding XML first replaces “&x100;” with “&x99;&x99;” which is then replaced with “&x98;&x98;&x98;&x98;”. This string is next replaced with “&x97;&x97;&x97;&x97;&x97;&x97;&x97;&x97;”. This replacement chain would continue until the replacement string eventually became the string “Developers!” repeated 2^100 times! This is a huge string, and a fair amount of processing occurred to create it. This is another DoS attack against the parser.
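The growth is easy to quantify. The following sketch generates the document for a chosen number of levels and computes the expanded size arithmetically, without parsing it. The function names are hypothetical, and only a tiny three-level instance is actually parsed (with Python’s standard xml.etree.ElementTree, an assumption) to confirm the doubling.

```python
import xml.etree.ElementTree as ET


def build_entity_bomb(levels, seed="Developers!"):
    """Build a document in which entity x<levels> doubles at every level."""
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<!DOCTYPE something ["]
    lines.append('  <!ENTITY x0 "%s">' % seed)
    for i in range(1, levels + 1):
        lines.append('  <!ENTITY x%d "&x%d;&x%d;">' % (i, i - 1, i - 1))
    lines.append("]>")
    lines.append("<something>&x%d;</something>" % levels)
    return "\n".join(lines)


def expanded_length(levels, seed="Developers!"):
    """Characters produced by full expansion, computed without parsing."""
    return len(seed) * 2 ** levels


# A tiny, harmless instance: three levels expand to eight copies.
small = ET.fromstring(build_entity_bomb(3).encode("utf-8"))
small_copies = small.text.count("Developers!")

full_size = expanded_length(100)
print("copies at 3 levels:", small_copies)
print("characters at 100 levels:", full_size)
```

The 100-level document itself is only a few kilobytes, which is what makes the attack cheap to send. Larger instances should be sent only to systems you are authorized to test.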

Important

XML bombs are a form of decompression bomb. Decompression bombs are discussed in Chapter 14.

Tip

Another important DoS test case for an application that accepts XML input is complex XML, that is, XML whose document structure is heavily nested. A complex structure often requires more processing and memory than a simpler XML structure of the same size.

External Entities

As previously stated, an entity can refer to the contents of a file, which is identified by a URL in the XML. The security concern is that attackers might be able to supply XML that gets processed under a different security context (by the server, by another user, and so on). By using an external entity, attackers can reference files that they cannot otherwise access. For example, if the XML is processed on the server, the attacker could specify “c:\dir\SecretPlans.txt” as the URL. When the server loads the XML and processes the external entity references in the DTD, it retrieves the contents of SecretPlans.txt from the server’s hard drive.

Sverre H. Huseby discovered an XML external entity (XXE) bug in Adobe Reader. He found that he could read files from victims’ machines when they opened his PDF document. He accomplished this by including XML that referenced the file of his choice on his victim’s hard drive and then used JavaScript contained in the PDF to submit the contents of the referenced file to his server. More information about this vulnerability is available on Sverre’s Web site (http://shh.thathost.com/secadv/adobexxe/). This issue has been fixed in Adobe Reader 7.0.2.

Important

XXE attacks aren’t limited to accessing the contents of the victim’s local hard disk. Because external entities are referenced by URL, the attacker can reach any URL that the victim can access, including http URLs. This could be used to access Web servers behind a firewall, a Web site on which the victim has been authenticated, and so on.

If you are testing an application that takes XML input, verify that you cannot gain access to files normally not accessible by using XML similar to the following.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myTest [
  <!ELEMENT secTest ANY>
  <!ENTITY xxe SYSTEM "c:/boot.ini">
]>
<secTest>&xxe;</secTest>
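A repeatable version of this test plants a canary file and checks whether its contents come back through the parser. The following sketch is one way to do that; the marker value and function names are hypothetical, and Python’s standard xml.etree.ElementTree is used as the parser under test only for illustration. Replace parse_with_elementtree with a call into the application you are testing.

```python
import pathlib
import tempfile
import xml.etree.ElementTree as ET

MARKER = "XXE-CANARY-0001"  # hypothetical marker value


def external_entity_leaks(parse, target_uri):
    """Return True if parse() output contains the contents of target_uri."""
    doc = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE myTest [\n"
        "  <!ELEMENT secTest ANY>\n"
        '  <!ENTITY xxe SYSTEM "%s">\n'
        "]>\n"
        "<secTest>&xxe;</secTest>" % target_uri
    ).encode("utf-8")
    try:
        result = parse(doc)
    except Exception:
        return False  # the parser refused the document outright
    return MARKER in result


def parse_with_elementtree(doc):
    """Parser under test: returns everything the application would see."""
    return ET.tostring(ET.fromstring(doc), encoding="unicode")


with tempfile.TemporaryDirectory() as tmp:
    canary = pathlib.Path(tmp) / "canary.txt"
    canary.write_text(MARKER)
    leaked = external_entity_leaks(parse_with_elementtree, canary.as_uri())

print("external entity resolved:", leaked)
```

A result of True means the parser fetched the file and handed its contents to the caller, which is the XXE condition described above.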

XML Injection

XML is vulnerable to attacks similar to the HTML script injection issues discussed in Chapter 10 where output contains attacker-supplied data. Three common XML injection attacks are XML data injection, Extensible Stylesheet Language Transformations (XSLT) injection, and XPath/XQuery injection.

XML Data Injection

XML is commonly used to store data. If user-supplied data is stored as XML, it might be possible for an attacker to inject extra XML that the attacker would not normally control. Consider the following XML in which the attacker controls only the text Attacker Text:

<?xml version="1.0" encoding="UTF-8"?>
<USER role="guest">Attacker Text</USER>

What is an interesting test case to try as your input instead of Attacker Text? If developers aren’t cautious, they could mistakenly allow XML injection. If User1</USER><USER role="admin">User2 is the input, the following XML would be generated (the injected input runs from User1 on the first USER line through User2 on the second):

<?xml version="1.0" encoding="UTF-8"?>
<USER role="guest">User1</USER>
<USER role="admin">User2</USER>

If an application reads this file to determine what level of access to give each user, User2 would receive administrative privileges!
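The following sketch contrasts a vulnerable way of building this XML with an escaped one, using Python’s standard library. The USERS wrapper element and the function names are assumptions added so that the generated documents have a single root element and can be parsed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PAYLOAD = 'User1</USER><USER role="admin">User2'


def build_naive(name):
    """Vulnerable: attacker text is pasted into the markup unchanged."""
    return '<USERS><USER role="guest">' + name + "</USER></USERS>"


def build_escaped(name):
    """Safer: &, < and > in the attacker text are escaped first."""
    return '<USERS><USER role="guest">' + escape(name) + "</USER></USERS>"


def roles(xml_text):
    """List the role attribute of every USER element in the document."""
    return [user.get("role") for user in ET.fromstring(xml_text).iter("USER")]


naive_roles = roles(build_naive(PAYLOAD))
escaped_roles = roles(build_escaped(PAYLOAD))
stored_name = ET.fromstring(build_escaped(PAYLOAD)).find("USER").text

print(naive_roles)
print(escaped_roles)
print(stored_name)
```

With escaping, the payload survives as the literal name of a single guest user instead of creating a second USER element.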

Tip

If you are able to inject data into part of the XML, it is worth attempting to send duplicate elements and attributes specified in the earlier part of the XML that you couldn’t control. Some XML parsers will take the last instance of the element/attribute, so you might be able to overwrite the previous values with those of your choice.

Extensible Stylesheet Language (XSL)

In addition to injecting data into the XML, it is possible to get code to run as a result of XML injection. XSL consists of XSL Transformations (XSLT), XML Path Language (XPath) expressions, and XSL Formatting Objects (XSL-FO), and it allows a style sheet to be applied to an XML file. This style sheet can transform existing XML data into new XML data. The transformed output is often HTML that is rendered in the Web browser. In this situation, an attacker can inject data that would result in script running in the browser. For example, the following XML is part of an RSS feed that renders a hyperlink:

<link>Attacker Text</link>

To render the preceding XML, an XSLT is applied to return the following HTML to the Web browser:

<A HREF="Attacker Text">Attacker Text</A>

Can you think of a way to run script if you control the text Attacker Text? Even if the programmer HTML-encodes the attacker-supplied text, a scripting protocol can be used to run script with input such as javascript:alert(). If the HTML is rendered within a site or zone that is different from the origin of the RSS feed, this is an HTML scripting attack that occurs through XML data.
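The following sketch shows why HTML encoding alone does not help here. The render_link function imitates the output of the transform, and is_safe_link is one possible mitigation (a scheme allow list) that is an assumption of this example rather than something described in the text.

```python
import html
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only Web links are expected


def render_link(value):
    """Mimic the transform output: HTML-encode the value into an anchor."""
    encoded = html.escape(value, quote=True)
    return '<A HREF="%s">%s</A>' % (encoded, encoded)


def is_safe_link(value):
    """Accept only absolute URLs whose scheme is on the allow list."""
    return urlsplit(value.strip()).scheme.lower() in ALLOWED_SCHEMES


attack = "javascript:alert()"
rendered = render_link(attack)

print(rendered)
print("allowed:", is_safe_link(attack))
```

The attack string contains no characters that HTML encoding changes, so it reaches the HREF attribute intact; only the scheme check rejects it.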

Important

When testing for XML injection, try sending angle brackets and quotation marks to escape out of the current XML attribute/tag. A correctly protected application does not allow user-supplied data to escape from the XML tag or attribute that contains it. Generally, the same test cases that apply to script injection and cross-site scripting (XSS) apply to XML injection.

XPath/XQuery Injection

XPath and XQuery are languages that allow querying an XML document in ways similar to SQL. In fact, many popular databases allow querying the database using XPath or XQuery. In many scenarios, an attacker cannot access the XML data directly, but some part of the attacker’s data is used to create an XPath or XQuery statement to query the XML. An attacker can carefully construct input to inject arbitrary queries to retrieve data that the attacker would not normally be allowed to access.

An XML file can contain several different pieces or fields of information. Sometimes only certain parts should be exposed to the end user. For example, the following XML contains our names and social security numbers:

<?xml version='1.0'?>
<staff>
  <author>
    <name>Tom Gallagher</name>
    <SSN>123-45-6789</SSN>
  </author>
  <author>
    <name>Bryan Jeffries</name>
    <SSN>234-56-7890</SSN>
  </author>
  <author>
    <name>Lawrence Landauer</name>
    <SSN>012-345-6789</SSN>
  </author>
</staff>

This XML is stored in a location on a Web server not directly accessible to the end user. A Web page on the server that queries the XML is accessible to the end user. Only the author names should be displayed through the Web page. The XML data is retrieved using the following XPath expression:

//*[contains(name,'Attacker-Data')]/name

where Attacker-Data is data specified by the end user. As you can see, an attacker can control parts of the XPath query. By specifying x')] | //*| //*[contains(name,'y as the data, an attacker can return the entire contents of the XML file. This input creates the following XPath expression:

//*[contains(name,'x')] | //*| //*[contains(name,'y')]/name

Notice that the pipe character (|) is the XPath union operator, which combines the results of the expressions on either side (so it behaves like an or), and that two forward slashes and an asterisk (//*) represent all nodes. The preceding XPath expression looks for any of the following three conditions:

Any name that contains x
Any node in the XML file
Any name that contains y

Because the second condition returns all nodes, the attacker will receive all data from the XML file!
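The string handling behind this attack can be sketched in a few lines of Python. The function names are hypothetical, and the guard shown is only illustrative. The sketch builds the expression without evaluating it, because evaluating contains() and the union operator requires a full XPath 1.0 engine, which Python’s standard library does not include.

```python
TEMPLATE = "//*[contains(name,'%s')]/name"


def build_query(user_input):
    """Vulnerable: user input is pasted directly into the XPath expression."""
    return TEMPLATE % user_input


def build_query_checked(user_input):
    """Illustrative guard: refuse input that could close the string literal."""
    if "'" in user_input or '"' in user_input:
        raise ValueError("quotation marks are not allowed in the search text")
    return TEMPLATE % user_input


payload = "x')] | //*| //*[contains(name,'y"
injected = build_query(payload)
print(injected)

try:
    build_query_checked(payload)
    guard_blocked = False
except ValueError:
    guard_blocked = True
print("guard blocked payload:", guard_blocked)
```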

More Info

For more information about XPath injection, see Amit Klein’s paper “Blind XPath Injection” at https://www.watchfire.com/securearea/whitepapers.aspx?id=9.

Tip

The same concepts used in XPath/XQuery injection apply to SQL. SQL injection is discussed in depth in Chapter 16.

Large File References

XML files can reference other files specified by a URL. Sometimes the parser will load and parse these files. Two examples of file references are schemas and XML signatures. An attacker can send XML to the victim’s machine and reference additional files in that XML. The additional files can be extremely large in size and consume resources on the victim’s machine if that file is parsed. For example, an attacker might specify the following XML fragment in hopes that the parser visits http://server/file.html containing a large amount of data and computes the digest on the large file:

<Reference URI="http://server/file.html">
 <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
<DigestValue>qZk+NkcGgWq6PiVxeFDCbJzQ2J0=</DigestValue>
</Reference>

Simple Object Access Protocol

Simple Object Access Protocol (SOAP), defined in a World Wide Web Consortium (W3C) specification (http://www.w3.org/TR/soap12-part1/), is a way for a client to call functions on a server. A SOAP request (also called a SOAP message) is composed of XML that contains the following:

An envelope that defines a framework for describing what is in a message and how to process it
A set of encoding rules for expressing instances of application-defined data types
A convention for representing remote procedure calls and responses

SOAP frameworks parse requests and call into the function that defines the SOAP method. Frameworks include the Microsoft .NET Framework and Apache Axis. A Web request is processed first by the Web server and next by the framework, which then runs the specific code for the SOAP method requested. The SOAP framework handles parsing the request and sending the specified parameters to the requested SOAP method.

The following is a sample HTTP request that includes a SOAP message. Notice that it is part of an HTTP POST and the contents of the POST data are the SOAP XML.

POST /soap HTTP/1.1
Content-Type: text/xml; charset=utf-8
SOAPAction: "urn:xmethods-delayed-quotes#getQuote"
Content-Length: 676
Host: services.xmethods.net

<?xml version="1.0" encoding="utf-16"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
               xmlns:tns="http://www.themindelectric.com/wsdl/net.xmethods.services.stockquote.StockQuote/"
               xmlns:types="http://www.themindelectric.com/wsdl/net.xmethods.services.stockquote.StockQuote/encodedTypes"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <q1:getQuote xmlns:q1="urn:xmethods-delayed-quotes">
      <symbol xsi:type="xsd:string">MSFT</symbol>
    </q1:getQuote>
  </soap:Body>
</soap:Envelope>

Important

The SOAP specification states that the SOAPAction header must be present in requests and that servers and firewalls can filter SOAP requests by looking at this header. Some servers do not enforce this requirement and process SOAP requests that are missing the SOAPAction header. This enables attackers to bypass filters that look at this header.

If you test code that attempts to determine whether network traffic contains a SOAP request, verify that the code doesn’t rely solely on the presence of the SOAPAction header. If you test a SOAP framework or other code that directly parses and processes requests, verify that the code does not process requests that omit this header.
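To run both checks, you need the same request with and without the header. The following sketch builds the raw request bytes for each variant so that they can be replayed through a proxy or a raw socket. The function name, host name, and trimmed envelope are hypothetical, and nothing is sent over the network.

```python
def build_soap_post(host, path, body, soap_action=None):
    """Return raw HTTP request bytes; soap_action=None omits the header."""
    payload = body.encode("utf-8")
    headers = [
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: text/xml; charset=utf-8",
        "Content-Length: %d" % len(payload),
    ]
    if soap_action is not None:
        headers.append('SOAPAction: "%s"' % soap_action)
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + payload


# Trimmed envelope for illustration only.
ENVELOPE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<q1:getQuote xmlns:q1="urn:xmethods-delayed-quotes">'
    "<symbol>MSFT</symbol>"
    "</q1:getQuote>"
    "</soap:Body>"
    "</soap:Envelope>"
)

with_header = build_soap_post(
    "soap.example.test", "/soap", ENVELOPE,
    soap_action="urn:xmethods-delayed-quotes#getQuote",
)
without_header = build_soap_post("soap.example.test", "/soap", ENVELOPE)

print(without_header.decode("utf-8"))
```

If the filter passes without_header and the server still executes getQuote, the filter can be bypassed as described above.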

The server parses the SOAP message (XML), executes the specified code on the server, and returns the results to the client in an XML response. SOAP messages are usually sent over HTTP. The basic idea is similar to HTML forms, as discussed in Chapter 4, except the method of communication is well defined. SOAP methods have parameters and data types associated with them. Unlike HTML forms, the methods are often published so clients can learn how to call in using a Web Services Description Language (WSDL) file. A WSDL can be requested over the network. For example, browsing to http://www.xmethods.net/sd/StockQuoteService.wsdl returns the following WSDL:

<?xml version="1.0" encoding="UTF-8" ?>
<definitions name="net.xmethods.services.stockquote.StockQuote"
             targetNamespace="http://www.themindelectric.com/wsdl/net.xmethods.services.stockquote.StockQuote/"
             xmlns:tns="http://www.themindelectric.com/wsdl/net.xmethods.services.stockquote.StockQuote/"
             xmlns:electric="http://www.themindelectric.com/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
             xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
             xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="getQuoteResponse1">
    <part name="Result" type="xsd:float" />
  </message>
  <message name="getQuoteRequest1">
    <part name="symbol" type="xsd:string" />
  </message>
  <portType name="net.xmethods.services.stockquote.StockQuotePortType">
    <operation name="getQuote" parameterOrder="symbol">
      <input message="tns:getQuoteRequest1" />
      <output message="tns:getQuoteResponse1" />
    </operation>
  </portType>
  <binding name="net.xmethods.services.stockquote.StockQuoteBinding"
           type="tns:net.xmethods.services.stockquote.StockQuotePortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http" />
    <operation name="getQuote">
      <soap:operation soapAction="urn:xmethods-delayed-quotes#getQuote" />
      <input>
        <soap:body use="encoded" namespace="urn:xmethods-delayed-quotes"
                   encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" />
      </input>
      <output>
        <soap:body use="encoded" namespace="urn:xmethods-delayed-quotes"
                   encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" />
      </output>
    </operation>
  </binding>
  <service name="net.xmethods.services.stockquote.StockQuoteService">
    <documentation>net.xmethods.services.stockquote.StockQuote web service</documentation>
    <port name="net.xmethods.services.stockquote.StockQuotePort"
          binding="tns:net.xmethods.services.stockquote.StockQuoteBinding">
      <soap:address location="http://services.xmethods.net/soap" />
    </port>
  </service>
</definitions>

This WSDL contains information stating that a SOAP method named getQuote is exposed.
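Because a WSDL is itself XML, the exposed methods and their parameters can be listed with a few lines of code. The following sketch uses Python’s standard xml.etree.ElementTree on a trimmed copy of the preceding WSDL; the list_operations function is hypothetical and handles only the portType/operation/message structure shown here.

```python
import xml.etree.ElementTree as ET

NS = {"w": "http://schemas.xmlsoap.org/wsdl/"}

# Trimmed copy of the WSDL above (bindings and service omitted).
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://www.themindelectric.com/wsdl/net.xmethods.services.stockquote.StockQuote/">
  <message name="getQuoteResponse1">
    <part name="Result" type="xsd:float" />
  </message>
  <message name="getQuoteRequest1">
    <part name="symbol" type="xsd:string" />
  </message>
  <portType name="net.xmethods.services.stockquote.StockQuotePortType">
    <operation name="getQuote" parameterOrder="symbol">
      <input message="tns:getQuoteRequest1" />
      <output message="tns:getQuoteResponse1" />
    </operation>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Return [(operation, input parts, output parts)] found in the WSDL."""
    root = ET.fromstring(wsdl_text)
    parts = {
        message.get("name"): [
            (part.get("name"), part.get("type"))
            for part in message.findall("w:part", NS)
        ]
        for message in root.findall("w:message", NS)
    }

    def parts_of(node):
        # Strip the namespace prefix from a reference such as tns:getQuoteRequest1.
        if node is None:
            return []
        return parts.get(node.get("message", "").split(":")[-1], [])

    return [
        (op.get("name"), parts_of(op.find("w:input", NS)), parts_of(op.find("w:output", NS)))
        for op in root.findall("w:portType/w:operation", NS)
    ]


operations = list_operations(WSDL)
for name, inputs, outputs in operations:
    print(name, "inputs:", inputs, "outputs:", outputs)
```

The output is a checklist of methods and parameters to target when testing.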

The WSDL isn’t easily human-readable. A useful free tool that enables interactive testing of SOAP methods is WebService Studio (http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=65a1d4ea-0f7a-41bd-8494-e916ebc4159c). WebService Studio takes the URL of a WSDL, displays each method exposed, and calls a method with parameters of your choice. Figure 11-2 shows WebService Studio calling the getQuote method to retrieve the Microsoft stock price.

Important

WSDL files provide a great deal of information to attackers, who otherwise often would not have access to this information. If it isn’t necessary to expose the WSDL, you should talk to the product’s programmers to see if it can be removed or edited to help reduce information disclosure.

Figure 11-2. An easy way to read a WSDL and send custom values when calling SOAP methods in WebService Studio

Testing SOAP

It is important to be able to place arbitrary data in a SOAP request and/or response when testing. An HTTP proxy like Web Proxy Editor or Paros Proxy can be used for this testing. Tools such as WebService Studio make a good starting point, but they do not give you total control of the network traffic; for example, you cannot insert binary data. A good approach is to combine the two: WebService Studio for ease of use and an HTTP proxy for complete access to the network traffic.

Two big areas to check for SOAP security bugs are malicious client/server bugs, in which the attacker sends unexpected input to or from the server (discussed in Chapter 4 and Chapter 5), and XML-specific attacks. Two attacks that directly target the SOAP framework are SOAP array denial of service (DoS) attacks and SOAP XML bombs.

Tip

A tool named WSBang (http://www.isecpartners.com/tools.html) can fuzz each SOAP method listed in a given WSDL and report failures.

SOAP Array DoS Attacks

The second part of the SOAP specification (http://www.w3.org/TR/2001/WD-soap12-part2-20011217) describes how SOAP data can be encoded. Included in this part of the specification is information about SOAP arrays. A SOAP array of six integers looks like the following:

<unluckyNumbers xmlns:xs="http://www.w3.org/2001/XMLSchema"
                   xmlns:enc="http://www.w3.org/2001/12/soap-encoding"
                   enc:arrayType="xs:int[6]" >
  <number>4</number>
  <number>8</number>
  <number>15</number>
  <number>16</number>
  <number>23</number>
  <number>42</number>
</unluckyNumbers>

Some servers allocate memory to prepare for the array following the array size specification (int[6] in the preceding example). This allows for a potential DoS where the attacker specifies a large size that results in the server consuming large amounts of memory. For example, you can test for this condition by using XML similar to the preceding example but specifying an array size of 500,000.
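A small generator makes it easy to vary the claimed size independently of the number of elements actually sent. The following sketch is one way to do that; build_soap_array is a hypothetical helper, and the result still has to be placed inside a SOAP envelope for the method under test.

```python
import xml.etree.ElementTree as ET

ENC_NS = "http://www.w3.org/2001/12/soap-encoding"


def build_soap_array(claimed_size, values):
    """Build the array XML with an arrayType size that need not match the data."""
    items = "".join("  <number>%d</number>\n" % value for value in values)
    return (
        '<unluckyNumbers xmlns:xs="http://www.w3.org/2001/XMLSchema"\n'
        '                xmlns:enc="%s"\n'
        '                enc:arrayType="xs:int[%d]">\n'
        "%s"
        "</unluckyNumbers>" % (ENC_NS, claimed_size, items)
    )


payload = build_soap_array(500000, [4, 8, 15, 16, 23, 42])

# Confirm the test case is still well formed before sending it.
root = ET.fromstring(payload)
claimed = root.get("{%s}arrayType" % ENC_NS)
actual_items = len(root)

print(payload)
```

The request stays tiny while the claimed size is large, so any sharp rise in server memory use points to preallocation based on the attacker-supplied size.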

More Info

Web Service Security (WS-Security) is a security feature that can be used to mitigate some SOAP attacks. However, additional attacks are specific to WS-Security. Karthik Bhargavan, Cédric Fournet, and Andy Gordon of Microsoft Research have done extensive research to identify these attacks. Their research is available at http://securing.ws.

SOAP XML Bombs

As previously discussed, DTDs can be used to build strings dynamically on the victim’s machine and consume large amounts of memory. Like many of the XML attacks discussed in this chapter, XML bombs also apply to SOAP. The SOAP 1.1 specification states that a SOAP message must not contain a DTD. Although this should make XML bombs impossible against SOAP, some programs that parse SOAP XML don’t disable DTDs. Microsoft learned this the hard way: Amit Klein found that he could create XML bombs and send them to the .NET Framework to cause a DoS. Microsoft released an update (http://support.microsoft.com/default.aspx?kbid=826231) that allows disabling DTDs in SOAP messages.

If you test code that parses SOAP XML, it is worth testing to see if DTDs are allowed. If you don’t need to support DTDs, removing this functionality will help reduce your attack surface. If your application relies on another component such as the .NET Framework for SOAP XML parsing, verify that you have disabled DTD processing functionality if possible.
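A harmless way to check whether DTDs are allowed is to send a message that defines a single small entity and see whether its value appears in the processed data. The sketch below makes several assumptions: the marker string, the echo element, and the helper name build_dtd_probe are hypothetical, and Python's ElementTree parser stands in for the code under test purely for illustration. In a real test the probe would be sent to the service and the response inspected.

```python
import xml.etree.ElementTree as ET

MARKER = "dtd-probe-12345"  # hypothetical value, unlikely to occur by chance


def build_dtd_probe(marker: str = MARKER) -> str:
    """Build a SOAP envelope that carries a DTD with one small internal entity."""
    return (
        f'<!DOCTYPE Envelope [<!ENTITY probe "{marker}">]>\n'
        '<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">\n'
        "  <Body>\n"
        "    <echo>&probe;</echo>\n"
        "  </Body>\n"
        "</Envelope>"
    )


probe = build_dtd_probe()

# Stand-in for the parser under test: if the marker shows up in the parsed
# text, the DTD was processed and the entity was expanded.
root = ET.fromstring(probe)
parsed_text = "".join(root.itertext())
dtd_processed = MARKER in parsed_text
```

If the target rejects the message outright or leaves the entity unexpanded, DTD processing is probably disabled; if the marker comes back, the XML bomb cases described earlier are worth running.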

Important

SOAP calls are also potentially susceptible to cross-site request forgery attacks, which are discussed in Chapter 19.

Testing Tips

Use the following tips when you are testing products that use XML:

When you test an application that consumes XML input, do not limit testing to XML-specific cases. Most non-XML-specific attacks (HTML scripting attacks, spoofing, buffer overflows, information disclosure, etc.) can occur through XML.
Use CDATA and character references to include arbitrary characters as part of the XML, while still creating well-formed and valid XML.
When creating XML input, use an editor that allows complete control of all aspects of the data. For example, an XML-specific editor might not allow you to create certain fields or might automatically change data when saving it. A basic text or binary editor is ideal for XML files, and a Web proxy is ideal for SOAP messages.
Don’t forget the XML- and SOAP-specific tests, including infinite entity reference loops, XML bombs, complex XML, external entities, XML injection, large file references, and SOAP array DoS.
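The tip about CDATA and character references can be sketched as a pair of small encoders. The helper names as_cdata and as_char_refs and the item element are hypothetical, and the payload is just a sample HTML scripting test string. Note that character references cannot represent characters that XML forbids outright, such as the null character.

```python
import xml.etree.ElementTree as ET


def as_cdata(payload: str) -> str:
    """Wrap a payload in CDATA, splitting any ']]>' so the section stays intact."""
    return "<![CDATA[" + payload.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def as_char_refs(payload: str) -> str:
    """Encode every character of a payload as a hexadecimal character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in payload)


payload = '<img src="x" onerror="alert(1)">'

cdata_doc = f"<item>{as_cdata(payload)}</item>"
refs_doc = f"<item>{as_char_refs(payload)}</item>"

# Both documents are well formed, and the application behind the parser
# receives the original payload unchanged.
cdata_text = ET.fromstring(cdata_doc).text
refs_text = ET.fromstring(refs_doc).text
```

Sending the raw payload without either encoding would make the parser fail on the unclosed img tag, so the test case would never reach the application.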

Summary

XML usage is becoming popular in both client and server applications. XML data sent to an application should be treated just like any other input code path. Most attacks that are possible with traditional input data are also possible with XML input (HTML scripting attacks, spoofing, buffer overflows, etc.). Testing for these types of issues can require that you encode certain characters so that the parser sees the test case as well-formed and valid XML. As discussed, you should also test XML-specific attacks. When testing SOAP requests, it is important to create custom requests to perform malicious client testing against the server.
