15.5 Ensuring Web Security Properties

Here we examine the challenge of achieving the CIA properties on websites. This section addresses confidentiality and integrity; Section 15.5.1 discusses availability, and Section 15.5.2 examines web privacy.

Web Confidentiality

Web services face a range of confidentiality problems and related security objectives. Although many web servers offer their contents to any and all visitors, some try to restrict access. Moreover, any site that collects confidential information from visitors is obliged to protect that information. We examine both cases here.

Serve Confidential Data

Most sites address this problem by identifying trustworthy users. If a user is authorized to retrieve the information, then we trust that user to protect it once it arrives on the client; we can’t really prevent users from copying data they retrieve from a website. Sites that implement this type of confidentiality rely on server-based authentication and access control to protect data on the server, and on SSL to protect data in transit.
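
The sketch below illustrates this server-side pattern in rough form, assuming the Flask framework, a hypothetical token store, and placeholder certificate files: the server authenticates and authorizes the requester before releasing confidential data, and serves it over TLS so the data is also protected in transit.

    from flask import Flask, request, abort, jsonify

    app = Flask(__name__)

    # Hypothetical credential store mapping bearer tokens to authorized users.
    AUTHORIZED_TOKENS = {"s3cr3t-token": "alice"}

    @app.route("/report")
    def confidential_report():
        token = request.headers.get("Authorization", "")
        user = AUTHORIZED_TOKENS.get(token.removeprefix("Bearer ").strip())
        if user is None:
            abort(401)          # reject unauthenticated or unauthorized visitors
        return jsonify({"owner": user, "data": "confidential contents"})

    if __name__ == "__main__":
        # ssl_context enables TLS; cert.pem and key.pem are placeholder file names.
        app.run(ssl_context=("cert.pem", "key.pem"))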

A variant of this problem arises when a site wishes to enforce digital rights management (DRM). SSL only protects data in transit; it won’t prevent sharing through Transitive Trust (see Section 7.5) once a user retrieves the data from the site. Most DRM implementations apply the security measures directly to the data: each authorized user receives a credential that allows decryption under controlled circumstances. We saw in Section 8.2.4 that credentials for DVD buyers were built into DVD player hardware.
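
The sketch below, which uses the third-party Python cryptography package, shows the general idea rather than any particular DRM product: the content itself is encrypted, and each authorized user receives a credential (here, simply the decryption key) that unlocks it on the client.

    from cryptography.fernet import Fernet

    # The publisher encrypts the content once.
    content_key = Fernet.generate_key()      # credential issued to authorized users
    protected = Fernet(content_key).encrypt(b"licensed media content")

    # On an authorized client, the credential permits decryption.
    plaintext = Fernet(content_key).decrypt(protected)
    assert plaintext == b"licensed media content"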

Collect Confidential Data

Sites often collect confidential data as part of a retail or financial transaction, although such collection occurs in other circumstances as well. The banking industry has developed a broad range of standards and recommendations for protecting web-based transactions, but the best-known standards come from the payment card industry: PCI DSS. (See Section 4.5.2.) These standards specify protection both within retail establishments that use credit cards and within websites that accept them.

Typical implementations rely on a secure server host and SSL protection of network traffic. The client must support SSL but otherwise faces few restrictions.

Web Integrity

There are essentially two integrity problems in the web environment: The first is ensuring integrity of data presented by the server; the second is ensuring integrity of data transmitted between the server and client.

We briefly addressed the server integrity problem when discussing static websites: if we can prevent outsiders from modifying the site’s files, we prevent most problems. The problem becomes more complex as sites allow visitors to store data, because this introduces cross-site scripting (XSS) risks and related attacks. Site integrity should also mean that the site poses no risk to its visitors.

We maintain the integrity of data in transit by using SSL. Although TCP by itself detects and corrects random communications errors, it cannot protect against a malicious attempt to modify traffic. Such assurance requires cryptographic integrity measures.
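
A minimal sketch of such a cryptographic integrity check appears below, using Python’s standard hmac module; the key here is a placeholder, whereas in SSL the keys come from the protocol’s handshake. The sender attaches a keyed hash to each message, and the receiver recomputes it to detect malicious modification.

    import hashlib
    import hmac

    shared_key = b"placeholder-key-from-handshake"
    message = b"GET /account/balance HTTP/1.1"

    # Sender attaches a keyed hash (HMAC) to the message.
    tag = hmac.new(shared_key, message, hashlib.sha256).digest()

    # Receiver recomputes the tag and compares in constant time.
    def verify(key: bytes, msg: bytes, received_tag: bytes) -> bool:
        expected = hmac.new(key, msg, hashlib.sha256).digest()
        return hmac.compare_digest(expected, received_tag)

    assert verify(shared_key, message, tag)
    assert not verify(shared_key, b"GET /account/transfer", tag)  # tampering detected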

15.5.1 Web Availability

We classify levels of web service availability into four categories. These categories often suggest both the level of effort and the types of techniques used to improve availability.

  1. Routine availability: The system may suffer occasional downtime, either expected or unexpected. The system managers take no special steps to ensure availability.

  2. High availability: The system may experience scheduled downtime, but it should not experience unscheduled downtime. Such systems require special designs that mask unexpected failures, typically through redundancy.

  3. Continuous operation: The system is designed to operate continuously with no scheduled outages; however, it might experience unscheduled outages. Such systems often rely on special hardware to allow routine maintenance and component swapping without shutting down the system.

  4. Continuous availability: The system is designed to experience no outages at all, either scheduled or unscheduled. This requires a combination of the techniques used to achieve both high availability and continuous operation.

Redundancy plays a fundamental role in achieving higher levels of availability. To avoid software-based failures, sites may deploy upgraded software in test environments. The sites then perform functional and stability tests before deploying the revised software. Such testing naturally requires separate systems to host the test environment.

High Availability

For high availability, a system generally incorporates redundant components for processing, storage, and networking. Application processing is distributed among an integrated group of processors called a cluster. The system may “load balance” among the processors to increase overall performance while reducing the risk of a crash. If one processor in the cluster crashes, another takes up its work.
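
The sketch below shows the basic failover idea in simplified form; the server names and the health check are hypothetical. Requests rotate round-robin across the cluster, and any server that fails its health check is skipped so its work flows to the survivors.

    from itertools import cycle

    SERVERS = ["app-1.example.net", "app-2.example.net", "app-3.example.net"]
    _rotation = cycle(SERVERS)

    def is_alive(server: str) -> bool:
        # Placeholder health check; a real balancer would probe each server.
        return server != "app-2.example.net"   # pretend app-2 has crashed

    def pick_server() -> str:
        # Rotate round-robin, skipping servers that fail the health check.
        for _ in range(len(SERVERS)):
            candidate = next(_rotation)
            if is_alive(candidate):
                return candidate
        raise RuntimeError("no servers available")

    print(pick_server())   # requests flow only to the surviving servers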

Although RAID provides redundancy on a small scale, large-scale sites often rely on storage area networks (SANs). These systems use network protocols to distribute disk storage across multiple devices. A modern SAN may distribute hardware geographically to provide higher assurance against a site-specific disaster.

Network redundancy relies on multiple links and routers. In some cases, the site simply deploys multiple devices with standard internetworking protocols. For some types of load-balancing, however, the network may use transport-layer header information to route traffic to particular servers.

Continuous Operation

By itself, high availability does not ensure continuous operation. A RAID-configured hard drive system may still need to be taken offline to replace a failed drive. Other types of routine maintenance, including power system modifications and some network modifications, may require that the whole system be taken offline.

We rely on redundancy and special architectures to achieve continuous operation. We build separate systems that rely on separate networking and power connections. We use a SAN to distribute data and to allow ongoing maintenance of the storage system. We use load balancing to distribute the work between separate components, and the load balancing components themselves also must be redundant to allow routine maintenance.

Continuous Availability

We achieve continuous availability by eliminating every reason for the system to shut down, either intentionally or by accident. This combines high-availability design tricks with continuous operation techniques. By itself, continuous operation provides the procedures to perform maintenance without shutting the entire system down. To achieve continuous availability, we must ensure that a second failure can’t bring the system down before we fix the first one.

Continuous availability often relies on geographic dispersion; we place redundant servers in geographically separate locations. This reduces the risk of losing service due to a local problem like a power failure or a local disaster like a flood, tornado, or hurricane.

It can be tricky to deploy a dynamic website across geographically distributed locations. Dynamic sites rely on a database that also must be geographically distributed, and database contents may vary from one location to another. If the site implements user sessions or shopping carts, then the overall system must ensure that each session consistently uses a particular database.
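
One common way to meet that requirement is session affinity: once a session begins, every request in it is pinned to the region, and hence the database replica, where it started. The sketch below illustrates the idea with hypothetical region names and an in-memory session table.

    import hashlib

    REGIONS = ["us-east", "eu-west", "ap-south"]
    session_region = {}     # session id -> pinned region

    def region_for(session_id: str) -> str:
        if session_id not in session_region:
            # First request of the session: choose a region by hashing the id.
            digest = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
            session_region[session_id] = REGIONS[digest % len(REGIONS)]
        return session_region[session_id]

    print(region_for("cart-42"))   # same region for every request in this session
    print(region_for("cart-42"))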

15.5.2 Web Privacy

One shortcoming of a rigorous, enterprise-focused security engineering process is that it often shifts focus away from risks to customers and other outside individuals. A privacy breach might not damage an enterprise directly, but it may cause significant damage to individuals. This has produced legislation in many regions to protect personal privacy and to hold enterprises accountable for data they store.

In addition, many web users try to preserve privacy through other techniques, notably through anonymity and through “private browsing.” We examine these cases here.

Client Anonymity

Anonymity is both a blessing and a curse on the internet. On some occasions, it seems impossible to hold people accountable for internet misbehavior. On other occasions, it seems impossible to browse without leaving an undesired trail.

People prize anonymity on the internet for a variety of reasons. A plausible, and often socially acceptable, reason for anonymity is to mask the identity of a political dissenter or anyone who wants to voice an unpopular opinion. Anonymity becomes particularly important in places where political dissenters face injury or arrest. In countries that do not criminalize dissent, anonymity may be used to hide other legal but disreputable activities.

Even though web browsing seems like an anonymous activity to many people, the browser provides servers with many details about its users. Researchers have noted that browsers will report detailed configuration information about the browsing host, and the information may be enough to “fingerprint” individual hosts. Although this technique has not seen significant practical use, it might provide circumstantial evidence of a user’s visit.

More significantly, the TCP/IP connection must provide the client’s numerical IP address. Even if that address has passed through NAT, the address and port number uniquely identify the visit. Law enforcement organizations in the United States recommend that ISPs preserve information tying IP addresses to specific subscribers for at least six months, so that the data may be available to investigators.

Anonymous Proxies

When people wish to further hide their identities while web browsing, they may use an anonymous proxy. To use one, the user configures the browser’s proxy setting to direct traffic at the proxy’s IP address. The browser essentially “tunnels” the packets to the proxy, which forwards them after applying NAT to the source address. The receiving server sees the anonymous proxy as the packets’ source.
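
The sketch below shows the client side of this arrangement using the third-party Python requests library and a placeholder proxy address; the destination server sees the proxy, not the client, as the source of the connection.

    import requests

    # Hypothetical anonymous proxy; 203.0.113.10 is a documentation address.
    proxies = {
        "http": "http://203.0.113.10:8080",
        "https": "http://203.0.113.10:8080",
    }

    # All traffic for this request is relayed through the proxy.
    response = requests.get("https://example.com/", proxies=proxies, timeout=10)
    print(response.status_code)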

When the server transmits its reply, it directs the packets to the proxy, because that was the source address of the inbound packets. The proxy identifies the appropriate browser and relays the reply to it. The user’s traffic remains anonymous only as long as an eavesdropper can’t correlate the server’s traffic with that user’s traffic. For example, the eavesdropper might simply monitor traffic at the anonymous proxy and associate the user and server addresses by comparing the sizes, contents, and timing of the packets.

Client users must also rely on the proxy to protect their identities. If the proxy keeps a record of visitors and their activities, then there is a risk of the anonymous users being unmasked. Moreover, an attacker who subverts a proxy may interfere with user traffic or indulge in extortion of those relying on the proxy’s anonymity.

The safest anonymous proxies use a technique called onion routing, which routes each connection through a network of proxies. This makes it much more difficult to associate user and server traffic, because connections enter and leave the network at a variety of points. The Tor network, whose development has been supported by the EFF, is the best-known anonymous proxy network. According to the EFF, Tor users include casual users seeking additional privacy as well as journalists, law enforcement, intelligence agents, business executives, and political activists.
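
The sketch below illustrates the layered encryption behind onion routing, with the relays simulated in-process and keys generated with the third-party cryptography package. The client wraps its message in one encryption layer per relay; each relay can peel off only its own layer, so no single relay sees both the sender and the final destination.

    from cryptography.fernet import Fernet

    relay_keys = [Fernet.generate_key() for _ in range(3)]   # one key per relay

    def wrap(message: bytes, keys: list) -> bytes:
        # Encrypt for the last relay first, so the first relay's layer is outermost.
        for key in reversed(keys):
            message = Fernet(key).encrypt(message)
        return message

    onion = wrap(b"GET https://example.com/", relay_keys)

    # Each relay strips exactly one layer before forwarding.
    for key in relay_keys:
        onion = Fernet(key).decrypt(onion)

    assert onion == b"GET https://example.com/"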

Some sites resist anonymity by restricting access from anonymous proxies. Nations that filter their citizens’ web traffic often try to block access to proxies, because proxies circumvent the filtering. Wikipedia, the community-edited online encyclopedia, does not allow visitors to edit its articles via an anonymous proxy, in order to keep editors accountable for their changes.

Private Browsing

Certain companies try to systematically track a user’s browsing activity. The primary example is third-party advertising: the advertiser tries to track each user’s visits and then infers the user’s interests from the types of sites visited. Advertisers typically use cookies to track user behavior and then select ads to display based on each user’s behavior.
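
A highly simplified sketch of that cookie mechanism appears below; the header handling and tracker name are hypothetical. On the first visit the ad server assigns a random identifier in a cookie, and the browser returns that cookie on later visits, letting the advertiser link the visits together.

    import uuid

    def ad_server(request_cookies: dict):
        tracker = request_cookies.get("tracker")
        if tracker is None:
            tracker = uuid.uuid4().hex                     # new visitor
            return {"Set-Cookie": f"tracker={tracker}"}, tracker
        return {}, tracker                                 # recognized visitor

    headers, visitor_id = ad_server({})                    # first visit
    _, same_id = ad_server({"tracker": visitor_id})        # later visit
    assert visitor_id == same_id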

In a perfectly safe and fully functional environment, the server wants to know as much as possible about each client host. The server wants to know the screen size, the contents of its font library, performance capabilities, data formats supported, and anything else that helps optimize the site’s display. Although it isn’t always clear what bits of information will significantly improve the client’s experience when visiting the site, designers are inclined to share as much information as possible. This makes it hard to ensure privacy.

Moreover, browsers routinely maintain a great deal of information about the sites a user visits. Some information resides in the browser’s “history,” which may help users locate a site visited earlier. Browsers also maintain an internal cache with copies of pages and images retrieved from the web. This speeds up the process of displaying a site if the user returns to it; only revised files need to be retrieved.
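
The sketch below shows one way a cached file avoids a second download, using the third-party requests library against a placeholder URL: the client sends the cached ETag back in an If-None-Match header, and the server answers 304 Not Modified if the file has not been revised.

    import requests

    url = "https://example.com/logo.png"   # placeholder resource

    first = requests.get(url, timeout=10)
    etag = first.headers.get("ETag")       # cache validator, if the server supplies one

    if etag:
        revisit = requests.get(url, headers={"If-None-Match": etag}, timeout=10)
        if revisit.status_code == 304:
            print("cached copy is still valid; nothing was re-downloaded")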

Between its cookies, history, and cache, the browser retains a great deal of information about a user’s web-visiting behavior. Users who worry about external eavesdroppers monitoring their web browsing may also worry about people searching their client host.

Some browsers provide special features to support private browsing. These may consist of erasing the history, cookies, and cache after the browsing session ends. Other features might restrict which client details the browser reveals.
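
The sketch below models that cleanup step in simplified form; the data structures are hypothetical stand-ins for real browser state. A private session accumulates cookies, history, and cached files in memory and simply discards them when the session ends.

    class PrivateSession:
        def __init__(self):
            self.cookies = {}
            self.history = []
            self.cache = {}

        def end(self):
            # Erase everything the private session accumulated.
            self.cookies.clear()
            self.history.clear()
            self.cache.clear()

    session = PrivateSession()
    session.history.append("https://example.com/")
    session.end()
    assert not session.history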
