Chapter 2. Attack Motivations

DNN technology is now part of our lives. For example, digital assistants (such as Amazon Alexa, Apple’s Siri, Google Home, and Microsoft’s Cortana) use deep learning models to extract meaning from speech audio. Many algorithms that enable and curate online interactions (such as web searching) exploit DNNs to understand the data being managed. Increasingly, deep learning models are being used in safety-critical applications, such as autonomous vehicles.

Many AI technologies take data directly from the physical world (from cameras, for example) or from digital representations of that data intended for human consumption (such as images uploaded to social media sites). This is potentially problematic: whenever a computer system processes data from an untrusted source, it may open a vulnerability. Motivations for creating adversarial input to exploit these vulnerabilities are diverse, but we can divide them into the following broad categories:

Evasion

Hiding content from automated digital analysis. For example, see “Circumventing Web Filters”, “Camouflage from Surveillance”, or “Personal Privacy Online”.

Influence

Affecting automated decisions for personal, commercial, or organizational gain. See for example “Online Reputation and Brand Management”.

Confusion

Creating chaos to discredit or disrupt an organization. See for example “Autonomous Vehicle Confusion” or “Voice Controlled Devices”.

This chapter presents some possible motivations for creating adversarial examples. The list is by no means exhaustive, but should provide some indication of the nature and variety of the types of threat.

Circumventing Web Filters

Organizations are under increasing pressure to govern web content sourced from outside their boundaries, protecting against material that might be deemed offensive or inappropriate. This applies particularly to companies such as social media providers and online marketplaces, whose business models depend on external data. There may also be legal obligations to monitor offensive material and prevent it from being propagated further.

Such organizations face an increasingly difficult challenge. There just aren’t enough people to constantly monitor, and if necessary take action on, all the data being uploaded at the speeds required. Social media sites boast billions of data uploads per day. The data in those posts is not structured data that is easy to filter; it is image, audio, and text information where the categorization of “offensive”/“not offensive” or “legal”/“illegal” can be quite subtle. It is not possible for humans to monitor and filter all this content as it is uploaded.

The obvious solution is, therefore, to use intelligent machines to monitor, filter, or at least triage the data, as depicted in Figure 2-1. DNNs will be increasingly core to these solutions—they can be trained to categorize sentiment and offense in language, they can classify image content, and they are even able to categorize activities within video content. For example, a DNN could be trained to recognize when an image contains indications of drug use, enabling triage of this category of image for further verification by a human.

For an individual or group wishing to upload content that does not adhere to a target website’s policies, there’s a motivation to circumvent the filtering or monitoring put in place while ensuring that the uploaded content still conveys the intended information to human beings. From the perspective of the organization monitoring its web content, increasingly accurate algorithms are required to judge what is “offensive,” “inappropriate,” or “illegal,” while also catching adversarial input. From the adversary’s perspective, the adversarial web content will need to improve in step with the monitoring system in order to continue to evade detection by the AI while conveying the same semantic meaning when seen, read, or heard by a human.

An adversary might also take another stance. If unable to fool the web upload filter, why not just spam it with lots of data that will cause confusion and additional cost to the defending organization? The AI’s decision as to whether an uploaded item is “offensive” is unlikely to be purely binary; more likely it will be a statistical measure of likelihood compared against some threshold. An organization might use human moderation to consider images or data that fall near this threshold, so generating large amounts of benign data that the AI classifies as “maybe” will impact the organization’s operational capabilities and may reduce confidence in the accuracy of its AI results. If it’s difficult for a human to establish exactly why the data is being triaged as possibly breaking policy (because it appears benign), the influx of data will take up more of the organization’s time and human resources: essentially a denial-of-service (DoS) attack.
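As a rough illustration of this triage logic, the sketch below routes uploads based on a classifier’s confidence score; the classifier interface, threshold values, and return labels are illustrative assumptions rather than any real platform’s implementation.

# A minimal sketch of threshold-based content triage (Python).
# The classifier, thresholds, and labels are assumed, not a real platform's API.

BLOCK_THRESHOLD = 0.9   # confidently offensive: reject automatically
REVIEW_THRESHOLD = 0.5  # uncertain: route to human moderators

def triage_upload(image, classifier):
    """Return 'block', 'human_review', or 'publish' for an uploaded image."""
    p_offensive = classifier.predict_proba(image)  # assumed probability in [0, 1]
    if p_offensive >= BLOCK_THRESHOLD:
        return "block"
    if p_offensive >= REVIEW_THRESHOLD:
        # Borderline scores land here. An adversary flooding this band with
        # benign-looking uploads consumes moderator time: a denial of service.
        return "human_review"
    return "publish"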

Figure 2-1. Images uploaded to a social media site might undergo processing and checking by AI before being added to the site.

Online Reputation and Brand Management

Search engines are complex algorithms that decide not only which results to return when you type “cat skateboard,” but also the order in which those results are presented. From a commercial perspective, it’s obviously good to be a top result. Companies are therefore highly motivated to understand and game the search engine algorithms to ensure that their content appears on the first page of a Google or Bing search and that their adverts are placed prominently when served on web pages. This is known as search engine optimization (SEO) and has been standard industry practice for many years; SEO is often core to a company’s internet marketing strategy.

Automated web crawlers sift through and index pages for search results based on characteristics such as page HTML metadata, inbound links to the page, and content. These web crawlers are automated systems underpinned by AI, operating without human oversight. Because it’s very easy to manipulate header information, search engines rely more heavily on content. Indexing based on content also enables searches on less obvious terms that have perhaps not been included in the metadata.

It’s the characteristics based on content that are particularly interesting in the context of adversarial examples. Updating a website’s image content may affect its position in the search engine results, so, from the perspective of a company wanting to increase its visibility to a target audience, why not exploit adversarial perturbation or patches to alter or strengthen image categorization without adversely affecting its human interpretation?

Alternatively, on the more sinister end of the spectrum, there might be motivation to discredit an organization or individual by generating adversarial images, causing the target to be misassociated with something that could be damaging. For example, adversarial images of a chocolate bar that is misclassified by the search engine as “poison” might appear among search results for poison images. Even a subliminal association may be sufficient to affect people’s perception of the brand.

Camouflage from Surveillance

Surveillance cameras acquire their data from the physical world, opening a very different perspective on adversarial input from that considered in the previous examples. The digital content is generated from sensor (camera) data and is not subject to manipulation by individuals outside the organization.1

The digital rendering of the security footage (the video or image stills) is still often monitored by humans, but this is increasingly infeasible due to the quantities of information and time involved. Most surveillance footage is unlikely to be actively monitored in real time, but may be analyzed later in “slower time”; for example, in the event of some reported criminal activity. Organizations are increasingly turning to automated techniques to monitor or triage surveillance data through AI technologies; for example, to automatically detect and alert to specific faces or vehicles in surveillance footage.

It doesn’t take much imagination to envisage scenarios where an adversary would wish to outwit such a system. The adversary’s aim might be to create a kind of “invisibility cloak” that fools the AI but does not draw undue human attention. The goal might be simply to prevent the AI from generating an alert that would result in human scrutiny of the image or video. For example, an adversary might aim to avoid real-time facial detection by an airport security system. Similarly, there may be greater opportunity to carry out nefarious deeds if a real-time threat-detection system does not recognize suspicious activity. Outside real time, security cameras might have captured information pertaining to a crime, such as a face, a number plate, or another feature that investigators might search for based on witness evidence after the event. Concealing this information from the AI might reduce the chance of criminal detection.
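A minimal sketch of the alerting logic that such camouflage aims to defeat is shown below; the detect_faces and match_score helpers, the watchlist structure, and the alert threshold are hypothetical placeholders rather than any particular vendor’s API. The adversary’s “invisibility cloak” succeeds if every match score stays below the threshold, so no frame is ever escalated to a human.

# Illustrative surveillance triage loop (Python). detect_faces, match_score,
# and ALERT_THRESHOLD are hypothetical placeholders, not a real system's API.

ALERT_THRESHOLD = 0.8

def scan_frame(frame, watchlist, detect_faces, match_score):
    """Flag a video frame for human review if any detected face
    resembles an identity on the watchlist."""
    alerts = []
    for face in detect_faces(frame):
        for identity in watchlist:
            score = match_score(face, identity)  # similarity in [0, 1]
            if score >= ALERT_THRESHOLD:
                alerts.append((identity, score))
    return alerts  # an empty list means no human scrutiny is triggered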

Of course, the motivations may not be criminal; privacy in our increasingly monitored world might be an incentive to camouflage the salient aspects of a face from AI. An individual might attempt this through innocuous clothing or makeup, for example, in order to retain plausible deniability: any deliberate attempt to fool the surveillance system could be passed off as the system simply being in error.2

There’s another interesting scenario here: what if physical changes to the real world could be introduced that, although seeming benign and looking innocent to humans, could cause surveillance systems to register a false alarm? This might enable an adversary to distract an organization into directing its resources to a false location while the actual deed was committed elsewhere.

Personal Privacy Online

Many social media platforms extract information from the images that we upload to improve the user experience. For example, Facebook routinely extracts and identifies faces in images to improve image labeling, searching, and notifications.

Once again, a desire for privacy could motivate an individual to alter images so that faces are not easily detected by the AI the platform uses. Alterations such as adversarial patches applied to the edge of an image might “camouflage” faces from the AI.

Autonomous Vehicle Confusion

A commonly cited use of AI is in autonomous vehicles, consideration of which moves us into the realm of safety-critical systems. These vehicles operate in the messy, unconstrained, and changing physical world, where susceptibility to adversarial input could have disastrous consequences.

Autonomous vehicles are not restricted to our roads. Autonomy is increasingly prevalent in maritime situations, in the air, and underwater. Autonomous vehicles are also used in constrained, closed environments, such as factories, to perform basic or perhaps dangerous tasks. Even these constrained environments could be at risk from camera-sourced adversarial input originating from within the organization (insider threat) or from individuals who have gained access to the area in which the system operates. However, we must remember that autonomous vehicles are unlikely to rely solely on sensor data to understand the physical environment; most autonomous systems will acquire information from multiple sources. Data sources include:

Off-board data

Most autonomous vehicles will rely on data acquired from one or more off-board central sources.3 Off-board data includes relatively static information (maps and speed limits), centrally collected dynamic data (such as traffic information), and vehicle-specific information (such as GPS location). All these types of data sources are already used in GPS navigation applications such as Waze, Google Maps, and HERE WeGo.

Other off-board data is available in other domains. For example, in shipping, Automatic Identification System (AIS) data is used extensively for automatic tracking of the location of maritime vessels. Ships regularly transmit their identity and location through this system in real time, enabling maritime authorities to track vessel movements.

Onboard sensor data

Autonomous vehicle decisions may also be based on onboard sensors such as cameras, proximity sensors, accelerometers, and gyro sensors (to detect positional rotation). This data is critical for providing information on changes in the immediate vicinity, such as unexpected events that demand a real-time response.

There may be occasions when an autonomous vehicle must rely on sensor data alone to make its decisions. Road positioning, as shown in Figure 2-2, is one example that might be derived entirely from sensor data. Such scenarios can pose a significant safety risk because the information generated is potentially untrusted.

In practice, autonomous vehicles will base decisions on information drawn from multiple data sources and will always err on the side of caution. A vehicle is unlikely to be fooled by a stop sign that has been adversarially altered to say “go” if it also has access to central data pertaining to the road’s regulations (speed limits, junctions, and stop and give-way requirements). It is more likely that adversaries will exploit this caution; for example, distributing multiple benign-looking printed stickers on the road that are misinterpreted as hazardous objects could cause disruption across the road network.
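As a sketch of this cautious behavior, the fragment below combines an onboard sign classification with off-board map data; the data shapes, field names, and confidence threshold are simplifying assumptions rather than a description of any production system.

# Sketch of cautious multi-source decision making (Python, assumed data shapes).

def decide_at_junction(sign_detection, map_data):
    """Combine an onboard sign classification with off-board map data,
    defaulting to the most cautious interpretation when they conflict."""
    map_says_stop = map_data.get("stop_required", False)
    camera_says_stop = sign_detection["label"] == "stop"

    if map_says_stop or camera_says_stop:
        # Any source indicating a stop requirement wins, so an adversarially
        # altered "go" sign cannot override trusted central map data.
        return "stop"
    if sign_detection["confidence"] < 0.7:
        # Low-confidence or unexpected detections also trigger caution: the
        # very behavior an adversary might exploit to cause disruption.
        return "slow_and_reassess"
    return "proceed"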

Figure 2-2. Camera data might be used by an autonomous vehicle to ensure correct road positioning

Voice Controlled Devices

Voice control provides a natural, hands-free way to manage many aspects of our lives. Tasks ranging from media control, home automation, and internet search to shopping can all now be accomplished through voice. Voice controlled devices such as smartphones, tablets, and audio assistants are being welcomed into our homes. The speech processing central to these devices is performed by advanced DNN technologies and is highly accurate. Figure 2-3 depicts a simple processing chain for a voice controlled device.

Audio is streamed into the home through radio, television, and online content. Unbeknown to the listener, this audio might also incorporate adversarial content, perhaps instructing a listening audio assistant to increase the volume of a song as it plays. While this might be no more than irritating, adversarial audio might also be used to discredit the audio assistant by causing it to misbehave. An assistant exhibiting unpredictable behavior is irritating and potentially perceived as creepy, and once a device loses trust in the home environment, it is unlikely to regain it easily. Other hidden commands could be more malevolent; for example, sending an unwarranted SMS or social media post, changing the settings on the device, navigating to a malicious URL, or altering home security settings.

Figure 2-3. A digital assistant uses AI to process speech audio and respond appropriately

Voice controlled assistants mandate additional security steps for higher-security functions to avoid accidental or deliberate misuse. The assistant might ask, “Are you sure you wish to purchase a copy of Strengthening Deep Neural Networks?” and await confirmation prior to purchasing. It’s possible, but highly unlikely, that you would say “yes” at precisely the right moment if you had not instigated that purchase. However, if it’s possible to get an adversarial command into the voice assistant, then it is also possible to inject the adversarial response, so long as no one is within earshot to hear the confirmation request.
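A sketch of such a confirmation gate is shown below; the helper functions (speak, listen_for_reply, perform) and the set of high-security intents are illustrative assumptions, not the API of any real assistant.

# Illustrative confirmation gate for high-security voice commands (Python).
# speak, listen_for_reply, and perform are assumed helper functions.

HIGH_SECURITY_INTENTS = {"purchase", "unlock_door", "change_security_settings"}

def execute_command(intent, params, speak, listen_for_reply, perform):
    """Execute a recognized voice intent, demanding spoken confirmation
    first for anything in the high-security set."""
    if intent in HIGH_SECURITY_INTENTS:
        speak(f"Are you sure you wish to {intent.replace('_', ' ')}?")
        reply = listen_for_reply(timeout_seconds=10)
        # Adversarial audio that injected the original command could equally
        # inject this "yes", provided nobody is within earshot to notice.
        if reply != "yes":
            return "cancelled"
    return perform(intent, params)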

Here’s an interesting twist. Perhaps adversarial examples are not always malevolent—they might be used for entertainment purposes to deliberately exploit audio as an additional method to transfer commands. There’s potential to exploit adversarial examples for commercial gain, rather than for just doing bad stuff.

Imagine settling down with popcorn in front of your television to watch The Shining. However, this is no ordinary version of the film; you paid extra for the one with “integrated home automation experience.” This enhanced multimedia version of the film contains adversarial audio—secret messages intended for your voice controlled home automation system to “enhance” the viewing experience. Perhaps, at an appropriate moment, to slam the door, turn out the lights, or maybe turn off the heating (it gets pretty cold in that hotel)…

On that slightly disturbing note, let’s move on to the next chapter and an introduction to DNNs.

1 This is obviously not true if there has been a more fundamental security breach allowing outsiders access to the organization’s internal content.

2 See Sharif et al., “Accessorize to a Crime.”

3 There may be constrained environments, such as within a factory or within the home, where the autonomous vehicle is entirely dependent on onboard sensor data to function.
