
Book Description

Enterprises in traditional and emerging industries alike are increasingly turning to machine learning (ML) to maximize the value of their business data. But many of their teams are likely to hit significant hurdles and setbacks along the way. In this practical ebook, data scientists and machine learning engineers explore six common challenges that teams face every day when creating, managing, and scaling ML applications.

For each problem, you’ll get hard-earned advice from Hussein Mehanna, AI engineering director for Google Cloud; Nakul Arora, VP of product management and marketing at Infosys; Patrick Hall, senior director for data science products at H2O; Matt Harrison, consultant and corporate trainer at MetaSnake; Joao Natali, data science director at Neustar; and Jerry Overton, data scientist and technology fellow at DXC.

Accomplished data scientist Piero Cinquegrana and Matheen Raza of Qubole examine ways to overcome challenges that include:

  • Reconciling disparate interfaces
  • Resolving environment dependencies
  • Ensuring close collaboration among all ML stakeholders
  • Building or renting adequate ML infrastructure
  • Meeting the scalability needs of your application
  • Enabling smooth deployment of ML projects

Table of Contents

  1. Machine Learning at Enterprise Scale
    1. Introduction
      1. Nakul Arora, Infosys
      2. Patrick Hall, H2O
      3. Matt Harrison, MetaSnake
      4. Hussein Mehanna, formerly of Google Cloud
      5. Joao Natali, Neustar
      6. Jerry Overton, DXC Technology
      7. Sean Downes
    2. Problem 1: Reconciling Disparate Interfaces
      1. Data Scientists and Machine Learning Engineers: Different Roles, Different Tools
      2. The Three Interface Challenges Facing Data Professionals
    3. Problem 2: Resolving Environment Dependencies
      1. Setting Up a Machine Learning Environment and Avoiding DevOps Bottlenecks
      2. Solving the Code Portability Problem
    4. Problem 3: Ensuring Close Collaboration Among All Machine Learning Stakeholders
      1. Collaboration Between Data Scientists
      2. Collaboration with Data, Machine Learning, and Software Engineers
      3. Collaboration with Business Decision Makers and Executives
    5. Problem 4: Building (or Renting) Adequate Machine Learning Infrastructure
      1. To GPU or Not to GPU?
      2. Cost Concerns
      3. Regulatory Constraints Around Data Location
    6. Problem 5: Scaling to Meet Machine Learning Requirements
      1. Data Processing at Scale
      2. Hyperparameter Optimization
      3. Distributed Training
      4. Supporting Growing Numbers of Users and Applications
      5. Why Automation Is Key for Scalability
    7. Problem 6: Enabling Smooth Deployment of Machine Learning Projects
      1. Batch versus Real-Time and Deploying to the Edge
      2. Experimentation versus Production
      3. Continued Auditing Is Critical
    8. Conclusion