Preface

We are experiencing a renaissance of artificial intelligence, and everyone and their neighbor wants to be a part of this movement. That’s quite likely why you are browsing through this book. There are tons of books about deep learning out there. So you might, very reasonably, ask us: why does this book even exist? We’ll get to that in just a second.

During our own deep learning journeys since 2013 (while building products at companies including Microsoft, NVIDIA, Amazon, and Square), we witnessed dramatic shifts in this landscape. Constantly evolving research was a given, and a lack of mature tooling was a fact of life.

While growing and learning from the community, we noticed a lack of clear guidance on how to convert research into an end product for everyday users. After all, the end user is somewhere in front of a web browser, a smartphone, or an edge device. Getting there often involved countless hours of hacking and experimentation, extensive searching through blogs, GitHub issue threads, and Stack Overflow answers, emails to package authors to extract esoteric knowledge, and the occasional “Aha!” moment. Even the books on the market tended to focus more on theory or on how to use a specific tool. The best we could hope to learn from the available books was how to build a toy example.

To fill this gap between theory and practice, we originally started giving talks on taking artificial intelligence from research to the end user with a particular focus on practical applications. The talks were structured to showcase motivating examples, as well as different levels of complexity based on skill level (from a hobbyist to a Google-scale engineer) and effort involved in deploying deep learning in production. We discovered that beginners and experts alike found value in these talks.

Over time, the landscape thankfully became more accessible to beginners, and more tooling became available. Great online material like Fast.ai and DeepLearning.ai made understanding how to train AI models easier than ever. Books, too, did a good job of teaching the fundamentals using deep learning frameworks such as TensorFlow and PyTorch. But even with all of this, the wide chasm between theory and production remained largely unaddressed. We wanted to bridge that gap. Thus, the book you are now reading.

Using approachable language as well as ready-to-run fun projects in computer vision, the book starts off with simple classifiers, assuming no prior knowledge of machine learning and AI, and gradually adds complexity, improves accuracy and speed, scales to millions of users, deploys on a wide variety of hardware and software, and eventually culminates in using reinforcement learning to build a miniature self-driving car.

Nearly every chapter begins with a motivating example, establishes upfront the questions one might ask through the process of building a solution, and discusses multiple approaches for solving problems, each with varying levels of complexity and effort involved. If you are seeking a quick solution, you might end up just reading a few pages of a chapter and be done. Someone wanting to gain a deeper understanding of the subject should read the entire chapter. Of course, everyone should peruse the case studies included in these chapters for two reasons: they are fun to read, and they showcase how people in the industry are using the concepts discussed in the chapter to build real products.

We also discuss many of the practical concerns faced by deep learning practitioners and industry professionals in building real-world applications using the cloud, browsers, mobile, and edge devices. We compiled a number of practical “tips and tricks,” as well as life lessons in this book to encourage our readers to build applications that can make someone’s day just a little bit better.

To the Backend/Frontend/Mobile Software Developer

You are quite likely a proficient programmer already. Even if Python is an unfamiliar language to you, we expect that you will be able to pick it up easily and get started in no time. Best of all, we don’t expect you to have any background in machine learning and AI; that’s what we are here for! We believe that you will gain value from the book’s focus on the following areas:

  • How to build user-facing AI products

  • How to train models quickly

  • How to minimize the code and effort required in prototyping

  • How to make models more performant and energy efficient

  • How to operationalize and scale, and estimate the costs involved

  • How AI is applied in the industry, with 40+ case studies and real-world examples

  • How to develop a broad-spectrum knowledge of deep learning

  • How to develop a generalized skill set that can be applied to new frameworks (e.g., PyTorch), domains (e.g., healthcare, robotics), input modalities (e.g., video, audio, text), and tasks (e.g., image segmentation, one-shot learning)

To the Data Scientist

You might already be proficient at machine learning and potentially know how to train deep learning models. Good news! You can further enrich your skill set and deepen your knowledge in the field in order to build real products. This book will help inform your everyday work and beyond by covering how to:

  • Speed up your training, including on multinode clusters

  • Build an intuition for developing and debugging models, including hyperparameter tuning, thus dramatically improving model accuracy

  • Understand how your model works, uncover bias in the data, and automatically determine the best hyperparameters as well as model architecture using AutoML

  • Learn tips and tricks used by other data scientists, including gathering data quickly, tracking your experiments in an organized manner, sharing your models with the world, and being up to date on the best available models for your task

  • Use tools to deploy and scale your best model to real users, even automatically (without involving a DevOps team)

To the Student

This is a great time to be considering a career in AI—it’s turning out to be the next revolution in technology after the internet and smartphones. A lot of strides have been made, and a lot remains to be discovered. We hope that this book can serve as your first step in whetting your appetite for a career in AI and, even better, developing deeper theoretical knowledge. And the best part is that you don’t have to spend a lot of money to buy expensive hardware. In fact, you can train on powerful hardware entirely for free from your web browser (thank you, Google Colab!). With this book, we hope you will:

  • Aspire to a career in AI by developing a portfolio of interesting projects

  • Learn from industry practices to help prepare for internships and job opportunities

  • Unleash your creativity by building fun applications like an autonomous car

  • Become an AI for Good champion by using your creativity to solve the most pressing problems faced by humanity

To the Teacher

We believe that this book can nicely supplement your coursework with fun, real-world projects. We’ve covered every step of the deep learning pipeline in detail, along with techniques on how to execute each step effectively and efficiently. Each of the projects we present in the book can make for great collaborative or individual work in the classroom throughout the semester. Eventually, we will release presentation slides at http://PracticalDeepLearning.ai that can accompany coursework.

To the Robotics Enthusiast

Robotics is exciting. If you’re a robotics enthusiast, we don’t really need to convince you that adding intelligence to robots is the way to go. Increasingly capable hardware platforms such as Raspberry Pi, NVIDIA Jetson Nano, Google Coral, Intel Movidius, PYNQ-Z2, and others are helping drive innovation in the robotics space. As we move toward Industry 4.0, some of these platforms will become increasingly relevant and ubiquitous. With this book, you will:

  • Learn how to build and train AI, and then bring it to the edge

  • Benchmark and compare edge devices on performance, size, power, battery, and costs

  • Understand how to choose the optimal AI algorithm and device for a given scenario

  • Learn how other makers are building creative robots and machines

  • Learn how to drive further progress in the field and showcase your work

What to Expect in Each Chapter

Chapter 1, Exploring the Landscape of Artificial Intelligence

We take a tour of this evolving landscape, from the 1950s to today, analyze the ingredients that make for a perfect deep learning recipe, get familiar with common AI terminology and datasets, and take a peek into the world of responsible AI.

Chapter 2, What’s in the Picture: Image Classification with Keras

We delve into the world of image classification in a mere five lines of Keras code. We then learn what neural networks are paying attention to while making predictions by overlaying heatmaps on videos. Bonus: we hear the motivating personal journey of François Chollet, the creator of Keras, illustrating the impact a single individual can have.
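
If you are curious what such a listing looks like, here is a minimal sketch in the same spirit (our illustration, not the chapter’s exact code; it assumes a pretrained ImageNet model from tf.keras.applications and a hypothetical local image file named cat.jpg):

    import tensorflow as tf
    from tensorflow.keras.applications.resnet50 import (
        ResNet50, preprocess_input, decode_predictions)

    model = ResNet50(weights="imagenet")  # download a model pretrained on ImageNet
    img = tf.keras.preprocessing.image.load_img("cat.jpg", target_size=(224, 224))
    x = preprocess_input(tf.keras.preprocessing.image.img_to_array(img)[None, ...])
    print(decode_predictions(model.predict(x), top=3))  # top-3 human-readable labels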

Chapter 3, Cats Versus Dogs: Transfer Learning in 30 Lines with Keras

We use transfer learning to reuse a previously trained network on a new custom classification task to get near state-of-the-art accuracy in a matter of minutes. We then slice and dice the results to understand how well it is classifying. Along the way, we build a common machine learning pipeline, which is repurposed throughout the book. Bonus: we hear from Jeremy Howard, cofounder of fast.ai, on how hundreds of thousands of students use transfer learning to jumpstart their AI journey.
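
To give a taste of the idea, here is a minimal transfer-learning sketch (an illustration under assumed settings, not the book’s exact pipeline; it uses a MobileNetV2 base from tf.keras.applications and a two-class cats-versus-dogs head):

    import tensorflow as tf

    # Reuse a pretrained feature extractor; train only the new classification head.
    base = tf.keras.applications.MobileNetV2(
        include_top=False, pooling="avg", input_shape=(224, 224, 3), weights="imagenet")
    base.trainable = False  # freeze the pretrained layers

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(2, activation="softmax"),  # cat vs. dog
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_data, epochs=5)  # train_data: batches of (image, one-hot label)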

Chapter 4, Building a Reverse Image Search Engine: Understanding Embeddings

Just like Google Reverse Image Search, we explore how embeddings (a contextual representation of an image) can be used to find similar images in under ten lines of code. Then the fun starts when we explore different strategies and algorithms to speed this up at scale, from thousands to several million images, making them searchable in microseconds.
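
The core idea can be sketched in a few lines (an illustrative sketch with hypothetical file names; it assumes a MobileNetV2 feature extractor and scikit-learn’s exact nearest-neighbor search, whereas the chapter also covers faster approximate methods for millions of images):

    import numpy as np
    import tensorflow as tf
    from sklearn.neighbors import NearestNeighbors

    extractor = tf.keras.applications.MobileNetV2(
        include_top=False, pooling="avg", input_shape=(224, 224, 3), weights="imagenet")

    def embed(path):
        # Convert an image file into a feature vector (embedding).
        img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
        x = tf.keras.applications.mobilenet_v2.preprocess_input(
            tf.keras.preprocessing.image.img_to_array(img)[None, ...])
        return extractor.predict(x)[0]

    paths = ["img1.jpg", "img2.jpg", "img3.jpg"]  # hypothetical image library
    index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(
        np.stack([embed(p) for p in paths]))
    _, neighbors = index.kneighbors([embed("query.jpg")])
    print([paths[i] for i in neighbors[0]])  # most similar images first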

Chapter 5, From Novice to Master Predictor: Maximizing Convolutional Neural Network Accuracy

We explore strategies to maximize the accuracy that our classifier can achieve, with the help of a range of tools including TensorBoard, the What-If Tool, tf-explain, TensorFlow Datasets, AutoKeras, and AutoAugment. Along the way, we conduct experiments to develop an intuition of what parameters might or might not work for your AI task.

Chapter 6, Maximizing Speed and Performance of TensorFlow: A Handy Checklist

We take the speed of training and inference into hyperdrive by going through a checklist of 30 tricks to reduce as many inefficiencies as possible and maximize the value of your current hardware.

Chapter 7, Practical Tools, Tips, and Tricks

We diversify our practical skills in a variety of topics and tools, ranging from installation, data collection, experiment management, visualizations, and keeping track of state-of-the-art research all the way to exploring further avenues for building the theoretical foundations of deep learning.

Chapter 8, Cloud APIs for Computer Vision: Up and Running in 15 Minutes

Work smart, not hard. We utilize the power of cloud AI platforms from Google, Microsoft, Amazon, IBM, and Clarifai in under 15 minutes. For tasks not solved with existing APIs, we then use custom classification services to train classifiers without coding. And then we pit them against each other in an open benchmark—you might be surprised who won.

Chapter 9, Scalable Inference Serving on Cloud with TensorFlow Serving and KubeFlow

We take our custom-trained model to the cloud (or on-premises) to scalably serve anywhere from hundreds to millions of requests. We explore Flask, Google Cloud ML Engine, TensorFlow Serving, and KubeFlow, showcasing the effort, scenario, and cost-benefit analysis for each.
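
As a preview of the simplest of these options, here is a minimal Flask sketch (the model file name and endpoint are illustrative assumptions; the chapter weighs this approach against TensorFlow Serving, KubeFlow, and managed cloud services):

    import numpy as np
    import tensorflow as tf
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    model = tf.keras.models.load_model("model.h5")  # hypothetical trained model

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expects a JSON body such as {"instances": [[...], [...]]}
        instances = np.array(request.get_json()["instances"], dtype="float32")
        return jsonify({"predictions": model.predict(instances).tolist()})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)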

Chapter 10, AI in the Browser with TensorFlow.js and ml5.js

Every individual who uses a computer or a smartphone has access to one piece of software: their browser. Reach all those users with browser-based deep learning libraries including TensorFlow.js and ml5.js. Guest author Zaid Alyafeai walks us through techniques and tasks such as body pose estimation, generative adversarial networks (GANs), image-to-image translation with Pix2Pix, and more, running not on a server but in the browser itself. Bonus: hear from key contributors to TensorFlow.js and ml5.js on how those projects were incubated.

Chapter 11, Real-Time Object Classification on iOS with Core ML

We explore the landscape of deep learning on mobile, with a sharp focus on the Apple ecosystem with Core ML. We benchmark models on different iPhones, investigate strategies to reduce app size and energy impact, and look into dynamic model deployment, training on device, and how professional apps are built.

Chapter 12, Not Hotdog on iOS with Core ML and Create ML

Silicon Valley’s Not Hotdog app (from HBO) is considered the “Hello World” of mobile AI, so we pay tribute by building a real-time version in not one, not two, but three different ways.

Chapter 13, Shazam for Food: Developing Android Apps with TensorFlow Lite and ML Kit

We bring AI to Android with the help of TensorFlow Lite. We then look at cross-platform development using ML Kit (which is built on top of TensorFlow Lite) and Fritz to explore the end-to-end development life cycle for building a self-improving AI app. Along the way we look at model versioning, A/B testing, measuring success, dynamic updates, model optimization, and other topics. Bonus: we get to hear about the rich experience of Pete Warden (technical lead for Mobile and Embedded TensorFlow) in bringing AI to edge devices.

Chapter 14, Building the Purrfect Cat Locator App with TensorFlow Object Detection API

We explore four different methods for locating the position of objects within images. We take a look at the evolution of object detection over the years, and analyze the tradeoffs between speed and accuracy. This builds the base for case studies such as crowd counting, face detection, and autonomous cars.

Chapter 15, Becoming a Maker: Exploring Embedded AI at the Edge

Guest author Sam Sterckval brings deep learning to low-power devices as he showcases a range of AI-capable edge devices with varying processing power and cost including Raspberry Pi, NVIDIA Jetson Nano, Google Coral, Intel Movidius, and PYNQ-Z2 FPGA, opening the doors for robotics and maker projects. Bonus: hear from the NVIDIA Jetson Nano team on how people are building creative robots quickly from their open source recipe book.

Chapter 16, Simulating a Self-Driving Car Using End-to-End Deep Learning with Keras

Using the photorealistic simulation environment of Microsoft AirSim, guest authors Aditya Sharma and Mitchell Spryn guide us in training a virtual car by driving it first within the environment and then teaching an AI model to replicate its behavior. Along the way, this chapter covers a number of concepts that are applicable in the autonomous car industry.

Chapter 17, Building an Autonomous Car in Under an Hour: Reinforcement Learning with AWS DeepRacer

Moving from the virtual to the physical world, guest author Sunil Mallya showcases how AWS DeepRacer, a miniature car, can be assembled, trained, and raced in under an hour. And with the help of reinforcement learning, the car learns to drive on its own, penalizing its mistakes and maximizing success. We learn how to apply this knowledge to races from the Olympics of AI Driving to RoboRace (using full-sized autonomous cars). Bonus: hear from Anima Anandkumar (NVIDIA) and Chris Anderson (founder of DIY Robocars) on where the self-driving automotive industry is headed.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at http://PracticalDeepLearning.ai. If you have a technical question or a problem using the code examples, please send email to bookquestions@oreilly.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Practical Deep Learning for Cloud, Mobile, and Edge by Anirudh Koul, Siddha Ganju, and Meher Kasam (O’Reilly). Copyright 2020 Anirudh Koul, Siddha Ganju, Meher Kasam, 978-1-492-03486-5.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

O’Reilly has a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/practical-deep-learning. The authors have a website for this book as well: http://PracticalDeepLearning.ai.

Email bookquestions@oreilly.com to comment or ask technical questions about this book; to contact the authors about this book, visit http://PracticalDeepLearning.ai.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

Group Acknowledgments

We’d like to thank the following people for their immense help throughout our journey in writing this book. Without them, this book would not be possible.

This book came to life because of our development editor Nicole Taché’s efforts. She rooted for us throughout our journey and provided important guidance at each step of the process. She helped us prioritize the right material (believe it or not, the book was going to be even larger!) and ensured that we were on track. She was reader number one for every single draft that we had written, so our goal first and foremost was ensuring that she was able to follow the content, despite her being new to AI. We’re immensely grateful for her support.

We also want to thank the rest of the O’Reilly team, including our production editor, Christopher Faucher, who put in tireless hours on a tight schedule to ensure that this book made it to the printing press on time. We are also grateful to our copy editor, Bob Russell, who really impressed us with his lightning-fast edits and his attention to detail. He made us realize the importance of paying attention to English grammar in school (though a few years too late, we’re afraid). We also want to acknowledge Rachel Roumeliotis (VP of Content Strategy) and Olivia MacDonald (Managing Editor for Development) for believing in the project and for offering their continued support.

Huge thanks are in order for our guest authors who brought their technical expertise to share their passion for this field with our readers. Aditya Sharma and Mitchell Spryn (from Microsoft) showed us that our love for playing video racing games can be put to good use to train autonomous cars by driving them in a simulated environment (with AirSim). Sunil Mallya (from Amazon) helped bring this knowledge to the physical world by demonstrating that all it takes is one hour to assemble a miniature autonomous car (AWS DeepRacer) and get it to navigate its way around a track using reinforcement learning. Sam Sterckval (from Edgise) summarized the vast variety of embedded AI hardware available on the market, so we can get a leg up on our next robotics project. And finally, Zaid Alyafeai (from King Fahd University) demonstrated that browsers are no less capable of running serious interactive AI models (with the help of TensorFlow.js and ml5.js).

The book is in its current shape because of timely feedback from our amazing technical reviewers, who worked tirelessly on our drafts, pointed out any technical inaccuracies they came across, and gave us suggestions on better conveying our ideas. Due to their feedback (and the ever-changing APIs of TensorFlow), we ended up rewriting a majority of the book from the original prerelease. We thank Margaret Maynard-Reid (Google Developer Expert for Machine Learning; you might have read her work while reading TensorFlow documentation), Paco Nathan (35+ year industry veteran at Derwen, Inc., who introduced Anirudh to the world of public speaking), Andy Petrella (CEO and founder at Kensu and creator of SparkNotebook, whose technical insights lived up to his reputation), and Nikhita Koul (Senior Data Scientist at Adobe, who read and suggested improvements after every iteration, effectively reading a few thousand pages and making the content significantly more approachable) for their detailed reviews of each chapter. Additionally, we had a lot of help from reviewers with expertise in specific topics, be it AI in the browser, mobile development, or autonomous cars. The chapter-wise reviewer list (in alphabetical order) is as follows:

  • Chapter 1: Dharini Chandrasekaran, Sherin Thomas

  • Chapter 2: Anuj Sharma, Charles Kozierok, Manoj Parihar, Pankesh Bamotra, Pranav Kant

  • Chapter 3: Anuj Sharma, Charles Kozierok, Manoj Parihar, Pankesh Bamotra, Pranav Kant

  • Chapter 4: Anuj Sharma, Manoj Parihar, Pankesh Bamotra, Pranav Kant

  • Chapter 6: Gabriel Ibagon, Jiri Simsa, Max Katz, Pankesh Bamotra

  • Chapter 7: Pankesh Bamotra

  • Chapter 8: Deepesh Aggarwal

  • Chapter 9: Pankesh Bamotra

  • Chapter 10: Brett Burley, Laurent Denoue, Manraj Singh

  • Chapter 11: David Apgar, James Webb

  • Chapter 12: David Apgar

  • Chapter 13: Jesse Wilson, Salman Gadit

  • Chapter 14: Akshit Arora, Pranav Kant, Rohit Taneja, Ronay Ak

  • Chapter 15: Geertrui Van Lommel, Joke Decubber, Jolien De Cock, Marianne Van Lommel, Sam Hendrickx

  • Chapter 16: Dario Salischiker, Kurt Niebuhr, Matthew Chan, Praveen Palanisamy

  • Chapter 17: Kirtesh Garg, Larry Pizette, Pierre Dumas, Ricardo Sueiras, Segolene Dessertine-panhard, Sri Elaprolu, Tatsuya Arai

We have short excerpts throughout the book from creators who gave us a little peek into their world, and how and why they built the project for which they are most known. We are grateful to François Chollet, Jeremy Howard, Pete Warden, Anima Anandkumar, Chris Anderson, Shanqing Cai, Daniel Smilkov, Cristobal Valenzuela, Daniel Shiffman, Hart Woolery, Dan Abdinoor, Chitoku Yato, John Welsh, and Danny Atsmo.

Personal Acknowledgments

“I would like to thank my family—Arbind, Saroj, and Nikhita who gave me the support, resources, time, and freedom to pursue my passions. To all the hackers and researchers at Microsoft, Aira, and Yahoo who stood with me in turning ideas to prototypes to products, it’s not the successes but the hiccups which taught us a lot during our journey together. Our trials and tribulations provided glorious material for this book, enough to exceed our original estimate by an extra 250 pages! To my academic families at Carnegie Mellon, Dalhousie, and Thapar University, you taught me more than just academics (unlike what my GPA might suggest). And to the blind and low-vision community, you inspired me daily to work in the AI field by demonstrating that armed with the right tools, people are truly limitless.”

Anirudh

“My grandfather, an author himself, once told me, ‘Authoring a book is harder than I thought and more rewarding than I could have ever imagined.’ I am eternally grateful to my grandparents and family, Mom, Dad, and Shriya for advocating seeking out knowledge and helping me become the person I am today. To my wonderful collaborators and mentors from Carnegie Mellon University, CERN, NASA FDL, Deep Vision, NITH, and NVIDIA who were with me throughout my journey, I am indebted to them for teaching and helping develop a scientific temperament. To my friends, who I hope still remember me, as I’ve been pretty invisible as of late, I would like to say a big thanks for being incredibly patient. I hope to see you all around. To my friends who selflessly reviewed chapters of the book and acted as a sounding board, a huge thank you—without you, the book would not have taken shape.”

Siddha

“I am indebted to my parents Rajagopal and Lakshmi for their unending love and support starting from the very beginning and their strong will to provide me with a good life and education. I am grateful to my professors from UF and VNIT who taught me well and made me glad that I majored in CS. I am thankful to my incredibly supportive partner Julia Tanner who, for nearly two years, had to endure nights and weekends of unending Skype calls with my coauthors, as well as my several terrible jokes (some of which unfortunately made it into this book). I’d also like to acknowledge the role of my wonderful manager Joel Kustka in supporting me during the process of writing this book. A shout out to my friends who were incredibly understanding when I couldn’t hang out with them as often as they would have liked me to.”

Meher

Last but not least, thank you to the makers of Grammarly, which empowered people with mediocre English grades to become published authors!
