Preface

Generative AI, and ChatGPT with GPT-4 in particular, is all the rage these days. Probabilistic machine learning (ML) is a type of generative AI that is ideally suited for finance and investing. Unlike deep neural networks, on which ChatGPT is based, probabilistic ML models are not black boxes. These models also enable you to infer causes from effects in a fairly transparent manner. This is important in heavily regulated industries, such as finance and healthcare, where you have to explain the basis of your decisions to many stakeholders.

Probabilistic ML also enables you to explicitly and systematically encode personal, empirical, and institutional knowledge into ML models to sustain your organization’s competitive advantages. What truly distinguishes probabilistic ML from its conventional counterparts is its capability of seamlessly simulating new data and counterfactual knowledge conditioned on the observed data and model assumptions on which it was trained and tested, regardless of the size of the dataset or the ordering of the data. Probabilistic models are generative models that know their limitations and honestly express their ignorance by widening the ranges of their inferences and predictions. You won’t get such quantified doubts from ChatGPT’s confident hallucinations, more commonly known as fibs and lies.
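
To make the idea of a model "widening the ranges of its inferences" concrete, here is a minimal sketch using only the Python standard library. It is not taken from the book's code: the function name `beta_credible_interval`, the uniform Beta(1, 1) prior, and the sample counts are all illustrative assumptions. The sketch estimates a success probability from 10 observations and from 1,000 observations with the same observed success rate, and compares the widths of the resulting 90% credible intervals.

```python
import random


def beta_credible_interval(successes, trials, level=0.90, draws=20_000, seed=0):
    """Approximate a central credible interval for a success probability.

    Assumes a uniform Beta(1, 1) prior (an illustrative choice), so the
    posterior is Beta(1 + successes, 1 + failures). The interval is
    estimated from sorted random posterior draws, not computed exactly.
    """
    rng = random.Random(seed)
    a = 1 + successes
    b = 1 + (trials - successes)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]


# Same observed success rate (60%), very different amounts of data.
small = beta_credible_interval(successes=6, trials=10)
large = beta_credible_interval(successes=600, trials=1000)

small_width = small[1] - small[0]
large_width = large[1] - large[0]

print(f"10 observations:   interval width = {small_width:.3f}")
print(f"1000 observations: interval width = {large_width:.3f}")
```

With little data the interval is much wider, which is how this kind of model expresses its ignorance instead of reporting a single confident number.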

All ML models are built on the assumption that patterns discovered in training or in-sample data will persist in testing or out-of-sample data. However, when nonprobabilistic ML models encounter patterns in data that they have never been trained or tested on, they make egregious inferences and predictions because of the inherent foundational flaws of their statistical models. Furthermore, these ML models do it with complete confidence and without warning decision makers of their uncertainties.

The increasing adoption of nonprobabilistic ML models for decision making in finance and investments can lead to catastrophic consequences for individuals and society at large, including bankruptcies and economic recessions. It is imperative that all ML models quantify the uncertainty of their inferences and predictions on unseen data to support sound decision making in a complex world with three-dimensional uncertainties. Leading companies clearly understand the limitations of standard AI technologies and are developing their probabilistic versions to extend their applicability to more complex problems. Google recently introduced TensorFlow Probability to extend its established TensorFlow platform. Similarly, Uber has introduced Pyro, which builds on Facebook’s PyTorch platform. Currently, the most popular open source probabilistic ML technologies are PyMC and Stan. PyMC is written in Python, and Stan is written in C++. This book uses the extensive ecosystem of user-friendly Python libraries.

Who Should Read This Book?

The primary audience of this book is the thinking practitioner in the finance and investing discipline. A thinking practitioner is someone who doesn’t merely want to follow instructions from a manual or cookbook. They want to understand the underlying concepts for why they must adopt a process, model, or technology. Generally, they are intellectually curious and enjoy learning for its own sake. At the same time, they are not looking for onerous mathematical proofs or tedious academic tomes. I have provided many scholarly references in each chapter for readers who are looking for the mathematical and technical details underlying the concepts and reasoning presented in this book.

A thinking practitioner could be an individual investor, analyst, developer, manager, project manager, data scientist, researcher, portfolio manager, or quantitative trader. These thinking practitioners understand that they need to learn new concepts and technologies continually to advance their careers and businesses. A practical depth of understanding gives them the confidence to apply what they learn to develop creative solutions for their unique challenges. It also gives them a framework to explore and learn related technologies and concepts more easily.

In this book, I am assuming that readers have a basic familiarity with finance, statistics, machine learning, and Python. I am not assuming that they have read any particular book or mastered any particular skill. I am only assuming that they have a willingness to learn, especially when ChatGPT, Bard, and Bing AI can easily explain any code or formula in this book.

Why I Wrote This Book

There is a paucity of general probabilistic ML books, and none that is dedicated entirely to finance and investing problems. Because of the idiosyncratic complexities of these domains, any naive application of ML in general and probabilistic ML in particular is doomed to failure. A depth of understanding of the foundations of these domains is pivotal to having any chance of succeeding. This book is a primer that endeavors to give the thinking practitioner a solid grounding in the foundational concepts of probabilistic ML and how to apply it to finance and investing problems, using simple math and Python code.

There is another reason why I wrote this book. To this day, books are still a medium for serious discourse. I wanted to remind the readers about the continued grave flaws of modern financial theory and conventional statistical inference methodology. It is outrageous that these pseudoscientific methods are still taught in academia and practiced in industry despite their deep flaws and pathetic performance. They continue to waste billions of research dollars producing junk studies, tarnish the reputation of the scientific enterprise, and contribute significantly to economic disasters and human misery.

We are at a crossroads in the evolution of AI technologies, with most experts predicting exponential growth in their use, fundamentally transforming the way we live, work, and interact with one another. The danger that AI systems will take over humanity imminently is silly science fiction, because even the most advanced AI system lacks the common sense of a toddler. The real clear and present danger is that fools might end up developing and managing these powerful savants based on the spurious models of conventional finance and statistics. This will most likely lead to catastrophes faster and bigger than we have ever experienced before.

My criticisms are supported by simple math, common sense, data, and scholarly works that have been published over the past century. Perhaps one added value of this book is in retrieving many of those forgotten academic publications from the dusty archives of history and making readers aware of their insights in plain, unequivocal language using logic, simple math, or code that anyone with a high school degree can understand. Clearly, the conventional mode of expressing these criticisms hasn’t worked at all. The stakes for individuals, society, and the scientific enterprise are too high for us to care if plainly spoken mathematical and scientific truths might offend someone or tarnish a reputation built on authoring or supporting bogus theories.

Navigating This Book

The contents of this book may be divided into two logical parts interwoven unevenly throughout each chapter. One part examines the appalling uselessness of the prevailing economic, statistical, and machine learning models for finance and investing domains. The other part examines why probabilistic machine learning is a less wrong, more useful model for these problem domains. The singular focus of this primer is on understanding the foundations of this complex, multidisciplinary field. Only pivotal concepts and applications are covered. Sometimes less is indeed more. The book is organized as follows, with each chapter having at least one of the main concepts in finance and investing applied in a hands-on Python code exercise:

Chapter 1, “The Need for Probabilistic Machine Learning” examines some of the woeful inadequacies of theoretical finance, how all financial models are afflicted with a trifecta of errors, and why we need a systematic way of quantifying the uncertainty of our inferences and predictions. The chapter explains why probabilistic ML provides a useful framework for finance and investing.
Chapter 2, “Analyzing and Quantifying Uncertainty” uses the Monty Hall problem to review the basic rules of probability theory, examine the meanings of probability, and explore the trinity of uncertainties that pervade our world. The chapter also explores the problem of induction and its algorithmic restatement, the no free lunch (NFL) theorems, and how they underpin finance, investing, and probabilistic ML.
Chapter 3, “Quantifying Output Uncertainty with Monte Carlo Simulation” reviews important statistical concepts to explain why Monte Carlo simulation (MCS), one of the most important numerical techniques, works by generating approximate probabilistic solutions to analytically intractable problems.
Chapter 4, “The Dangers of Conventional Statistical Methodologies” exposes the skullduggery of conventional statistical inference methodologies commonly used in research and industry, and explains why they are the main cause of false research findings that plague the social and economic sciences.
Chapter 5, “The Probabilistic Machine Learning Framework” explores the probabilistic machine learning framework and demonstrates how inference from data and simulation of new data are logically and seamlessly integrated in this type of generative model.
Chapter 6, “The Dangers of Conventional AI Systems” exposes the dangers of conventional AI systems, especially their lack of basic common sense and how they are unaware of their own limitations, which pose massive risks to all their stakeholders and society at large. Markov chain Monte Carlo simulations are introduced as a dependent sampling method for solving complex problems in finance and investing.
Chapter 7, “Probabilistic Machine Learning with Generative Ensembles” explains how probabilistic machine learning is essentially a form of ensemble machine learning. It shows readers how to develop a prototype of a generative linear ensemble for regression problems in finance and investing using PyMC, Xarray, and ArviZ Python libraries.
Chapter 8, “Making Probabilistic Decisions with Generative Ensembles” shows how to apply generative ensembles to risk management and capital allocation decisions in finance and investing. The implications of ergodicity and the pitfalls of using ensemble averages for financial decision making are explored. The strengths and weaknesses of capital allocation algorithms, including the Kelly criterion, are examined.
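
As a taste of the hands-on style described above, the following is a minimal sketch that combines two topics from the chapter list: the Monty Hall problem (Chapter 2) and Monte Carlo simulation (Chapter 3). It is not the book's own code. The function name `monty_hall_win_rate`, the trial count, and the seed are illustrative assumptions. The simulation approximates the probability of winning the car when the contestant stays with the first pick versus when the contestant switches doors.

```python
import random


def monty_hall_win_rate(switch, trials=100_000, seed=42):
    """Estimate the win probability of a Monty Hall strategy by simulation."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)   # door hiding the car
        pick = rng.randrange(3)  # contestant's first choice
        # The host opens a door that is neither the pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the only door that is neither picked nor opened.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials


stay_rate = monty_hall_win_rate(switch=False)
switch_rate = monty_hall_win_rate(switch=True)

print(f"Win rate when staying:   {stay_rate:.3f}")
print(f"Win rate when switching: {switch_rate:.3f}")
```

The estimates should land close to the analytical answers of 1/3 for staying and 2/3 for switching, and they tighten around those values as the number of trials grows.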

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic: Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width: Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width bold: Shows commands or other text that should be typed literally by the user.
Constant width italic: Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples) is available for download at https://oreil.ly/supp-probabilistic-ML.

If you have a technical question or a problem using the code examples, please send email to [email protected].

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Probabilistic Machine Learning for Finance and Investing by Deepak K. Kanungo (O’Reilly). Copyright 2023 Hedged Capital L.L.C., 978-1-492-09767-9.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at [email protected].

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-829-7019 (international or local)
707-829-0104 (fax)
[email protected]
https://www.oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/Probabilistic_ML.

For news and information about our books and courses, visit https://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media

Watch us on YouTube: https://youtube.com/oreillymedia

Acknowledgments

I would like to thank Michelle Smith, Jeff Bleiel, and the entire O’Reilly Media team for making this book possible. It was a pleasure working with everyone, especially Jeff, whose honest and insightful feedback helped me improve the contents of this book.

I would also like to thank the expert reviewers of my book, Abdullah Karasan, Juan Manuel Contreras, and Isaac Rhea, for their valuable comments.

Furthermore, I would like to thank the following readers of the early releases of the book for their equally valuable feedback: Ian Angell, Bruno Rignel, Jonathan Hugenschmidt, Autumn Peters, and Mike Shwe.
