Preface

Artificial intelligence (AI) is prevalent in our lives. Every day, machines make sense of complex data: surveillance systems perform facial recognition, digital assistants comprehend spoken language, and autonomous vehicles and robots navigate the messy, unconstrained physical world. AI not only competes with human capabilities in areas such as image, audio, and text processing, but often exceeds human accuracy and speed.

While we celebrate advancements in AI, deep neural networks (DNNs), the algorithms intrinsic to much of AI, have recently been shown to be vulnerable to attack through seemingly benign inputs. It is possible to fool DNNs with subtle alterations to input data that a human would either fail to notice or dismiss as unimportant. For example, changes to an image too small for a person to perceive can cause a DNN to misinterpret the image's content. Because many AI systems take their input from external sources, such as voice recognition devices or social media uploads, this susceptibility to adversarial input opens a new, often intriguing, security threat. This book is about that threat, what it tells us about DNNs, and how we can make AI more resilient to attack.
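To give a flavor of why such tiny alterations can be so effective, here is a minimal sketch (not code from this book or its repository) that uses a toy linear scoring function in place of a real DNN; the weights, input, and epsilon value are all hypothetical. In high dimensions, many per-feature changes that are individually negligible combine to shift a model's output by a large amount, and adversarial perturbations exploit the same effect in DNNs.

    import numpy as np

    # Toy stand-in for a classifier: score an input with a fixed weight vector.
    # All values here are hypothetical; this is not a real DNN.
    rng = np.random.default_rng(0)
    w = rng.normal(size=10_000)        # "model" weights
    x = rng.normal(size=10_000)        # an input to be scored

    score = w @ x                      # decide "positive" if score > 0

    # Nudge every feature by at most epsilon against the gradient of the score.
    # For this linear model the gradient with respect to x is simply w, so the
    # step direction is sign(w), mirroring the fast-gradient-sign idea used
    # against real DNNs.
    epsilon = 0.05                     # per-feature perturbation budget
    x_adv = x - epsilon * np.sign(w)

    print("original score:    ", score)
    print("perturbed score:   ", w @ x_adv)
    print("largest change to any feature:", np.max(np.abs(x_adv - x)))

Each feature moves by no more than 0.05, yet the overall score drops by epsilon times the sum of the absolute weights, typically enough to change the model's decision even though no single feature changes noticeably.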

Through real-world scenarios in which AI processes image, audio, and video data in our daily lives, this book examines the motivations behind adversarial input, its feasibility, and the risks it poses. It provides both intuitive and mathematical explanations of the topic and explores how intelligent systems can be made more robust against adversarial input.

Understanding how to fool AI also gives us insight into the often opaque algorithms of deep learning and into the discrepancies between how these algorithms and the human brain process sensory input. This book considers these differences and how artificial learning may move closer to its biological equivalent in the future.

Who Should Read This Book

The target audiences of this book are:

  • Data scientists developing DNNs. You will gain greater understanding of how to create DNNs that are more robust against adversarial input.

  • Solution and security architects incorporating deep learning into operational pipelines that take image, audio, or video data from untrusted sources. After reading this book, you will understand the risks of adversarial input to your organization’s information assurance and potential risk mitigation strategies.

  • Anyone interested in the differences between artificial and biological perception. If you fall into this category, this book will provide you with an introduction to deep learning and explanations as to why algorithms that appear to accurately mimic human perception can get it very wrong. You’ll also get an insight into where and how AI is being used in our society and how artificial learning may become better at mimicking biological intelligence in the future.

This book is written to be accessible to people from all knowledge backgrounds, while retaining the detail that some readers may be interested in. The content spans AI, human perception of audio and image, and information assurance. It is deliberately cross-disciplinary to capture different perspectives of this fascinating and fast-developing field.

To read this book, you don’t need prior knowledge of DNNs; everything required is covered in an introductory chapter on DNNs (Chapter 3). Conversely, if you are a data scientist already familiar with deep learning methods, you may wish to skip that chapter.

The explanations are written to be accessible to both mathematicians and non-mathematicians. Optional mathematics is included for those who are interested in seeing the formulae that underpin some of the ideas behind deep learning and adversarial input. In case you have forgotten your high school mathematics and need a refresher, the key notations are summarized in Appendix A.

The code samples are also optional and provided for those software engineers or data scientists who like to put theoretical knowledge into practice. The code is written in Python, using Jupyter notebooks. Code snippets that are important to the narrative are included in the book, but all the code is located in an associated GitHub repository. Full details on how to run the code are also included in the repository.

This is not a book about security across the broader field of machine learning; its focus is specifically on DNN technologies for image and audio processing and the mechanisms by which they may be fooled without misleading humans.

How This Book Is Organized

This book is split into four parts:

Part I, An Introduction to Fooling AI

This group of chapters provides an introduction to adversarial input and attack motivations and explains the fundamental concepts of deep learning for processing image and audio data:

  • Chapter 1 begins by introducing adversarial AI and the broader topic of deep learning.

  • Chapter 2 considers potential motivations behind the generation of adversarial image, audio, and video data.

  • Chapter 3 provides a short introduction to DNNs. Readers with an understanding of deep learning concepts may choose to skip this chapter.

  • Chapter 4 then gives a high-level overview of the DNNs used in image, audio, and video processing, providing a foundation for understanding the concepts in the remainder of this book.

Part II, Generating Adversarial Input

Following the introductory chapters of Part I, these chapters explain in detail what adversarial input is and how it is created:

  • Chapter 5 provides a conceptual explanation of the ideas that underpin adversarial input.

  • Chapter 6 then goes into greater depth, explaining computational methods for generating adversarial input.

Part III, Understanding the Real-World Threat

Building on the methods introduced in Part II, this part considers how an adversary might launch an attack in the real world, and the challenges that they might face:

  • Chapter 7 considers real attacks and the challenges that an adversary faces when using the methods defined in Part II against real-world systems.

  • Chapter 8 explores the specific threat posed by adversarial objects and adversarial sounds created in the physical world.

Part IV, Defense

Building on Part III, this part moves the discussion to building resilience against adversarial input:

  • Chapter 9 considers how the robustness of neural networks can be evaluated, both empirically and theoretically.

  • Chapter 10 explores the most recent thinking on how to strengthen DNN algorithms against adversarial input. It then takes a more holistic view, considering defensive measures that can be introduced into the broader processing chain of which the neural network technology is a part.

  • Finally, Chapter 11 looks at future directions and how DNNs are likely to evolve in forthcoming years.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/katywarr/strengthening-dnns.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Strengthening Deep Neural Networks by Katy Warr (O’Reilly). Copyright 2019 Katy Warr, 978-1-492-04495-6.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

The Mathematics in This Book

This book is intended for both mathematicians and nonmathematicians. If you are unfamiliar with (or have forgotten) mathematical notations, Appendix A contains a summary of the main mathematical symbols used in this book.

O’Reilly Online Learning

Note

For almost 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/Strengthening_DNNs.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

I am very grateful to the O’Reilly team for giving me the opportunity to write this book and providing excellent support throughout. Thank you especially to my editor, Michele Cronin, for her help and encouragement, and to the production team of Deborah Baker, Rebecca Demarest, and Sonia Saruba. Thanks also to Nick Adams from the tools team for working out some of the more tricky LaTeX math formatting.

Thank you to my reviewers: Nikhil Buduma, Pin-Yu Chen, Dominic Monn, and Yacin Nadji. Your comments were all extremely helpful. Thank you also to Dominic for checking over the code and providing useful suggestions for improvement.

Several of my work colleagues at Roke Manor Research provided insightful feedback that provoked interesting discussions on deep learning, cybersecurity, and mathematics. Thank you to Alex Collins, Robert Hancock, Darren Richardson, and Mark West.

Much of this book is based on recent research and I am grateful to all the researchers who kindly granted me permission to use images from their work.

Thank you to my children for being so supportive: Eleanor for her continual encouragement, and Dylan for patiently explaining some of the math presented in the research papers (and for accepting that “maths” might be spelled with a letter missing in this US publication).

Finally, thank you to my husband George for the many cups of tea and for reviewing the early drafts when the words were in completely the wrong order. Sorry I didn’t include your jokes.
