Preface

Generative AI, and ChatGPT-4 in particular, is all the rage these days. Probabilistic machine learning (ML) is a type of generative AI that is ideally suited for finance and investing. Unlike deep neural networks, on which ChatGPT is based, probabilistic ML models are not black boxes. These models also enable you to infer causes from effects in a fairly transparent manner. This is important in heavily regulated industries, such as finance and healthcare, where you have to explain the basis of your decisions to many stakeholders.

Probabilistic ML also enables you to explicitly and systematically encode personal, empirical, and institutional knowledge into ML models to sustain your organization’s competitive advantages. What truly distinguishes probabilistic ML from its conventional counterparts is its capability of seamlessly simulating new data and counterfactual knowledge conditioned on the observed data and model assumptions on which it was trained and tested, regardless of the size of the dataset or the ordering of the data. Probabilistic models are generative models that know their limitations and honestly express their ignorance by widening the ranges of their inferences and predictions. You won’t get such quantified doubts from ChatGPT’s confident hallucinations, more commonly known as fibs and lies.
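As a minimal sketch of what simulating new data and widening the range of an inference can look like, consider a conjugate Beta-Binomial model written with NumPy alone. The "up day" framing, the uniform Beta(1, 1) prior, the 20-day forecast horizon, and the function name `beta_binomial_summary` are all illustrative assumptions, not taken from the book's own examples.

```python
import numpy as np

def beta_binomial_summary(successes, trials, prior_a=1.0, prior_b=1.0,
                          n_draws=20_000, seed=0):
    """Summarize a Beta-Binomial model (hypothetical helper for illustration).

    The Beta(prior_a, prior_b) prior encodes prior knowledge about the
    probability of an 'up day'; Beta(1, 1) is an assumed, uninformative choice.
    """
    rng = np.random.default_rng(seed)
    # Conjugate update: Beta prior + binomial data gives a Beta posterior
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    # Draw plausible values of the unknown probability from the posterior
    p_draws = rng.beta(post_a, post_b, size=n_draws)
    lo, hi = np.quantile(p_draws, [0.05, 0.95])
    # Simulate new data: up days in the next 20 days, one count per posterior draw
    simulated = rng.binomial(20, p_draws)
    return {"interval": (lo, hi), "width": hi - lo, "simulated": simulated}

# Same observed frequency (60%), very different amounts of evidence
small = beta_binomial_summary(successes=6, trials=10)
large = beta_binomial_summary(successes=600, trials=1000)
print(f"10 observations:   90% interval width = {small['width']:.3f}")
print(f"1000 observations: 90% interval width = {large['width']:.3f}")
```

Both calls see the same 60% frequency of up days, but the model trained on 10 observations reports a much wider interval than the one trained on 1,000. The `simulated` array holds new datasets generated from the fitted model, which is the sense in which such a model is generative.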

All ML models are built on the assumption that patterns discovered in training or in-sample data will persist in testing or out-of-sample data. However, when nonprobabilistic ML models encounter patterns in data that they have never been trained or tested on, they make egregious inferences and predictions because of the inherent foundational flaws of their statistical models. Furthermore, these ML models do it with complete confidence and without warning decision makers of their uncertainties.
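The contrast can be sketched with a small Bayesian linear regression in NumPy. This is an illustration under stated assumptions, not the book's method: the synthetic data, the known noise level `noise_sd`, the wide Gaussian prior `prior_sd`, and the helper names `design` and `predict` are all hypothetical. The predictive standard deviation uses the standard closed-form result for a Gaussian prior with known noise variance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical training data: inputs confined to the range [0, 1]
x_train = rng.uniform(0.0, 1.0, size=30)
noise_sd = 0.2  # assumed known observation noise
y_train = 1.0 + 2.0 * x_train + rng.normal(0.0, noise_sd, size=30)

def design(x):
    """Build a design matrix with an intercept column."""
    x = np.atleast_1d(x)
    return np.column_stack([np.ones_like(x), x])

# Gaussian prior on the weights, N(0, prior_sd^2 * I): an illustrative choice
prior_sd = 10.0
X = design(x_train)
precision = np.eye(2) / prior_sd**2 + X.T @ X / noise_sd**2
cov = np.linalg.inv(precision)             # posterior covariance of the weights
mean = cov @ X.T @ y_train / noise_sd**2   # posterior mean of the weights

def predict(x_new):
    """Return the predictive mean and standard deviation at x_new."""
    Xn = design(x_new)
    mu = Xn @ mean
    # Predictive variance = noise variance + parameter uncertainty at x_new
    var = noise_sd**2 + np.einsum("ij,jk,ik->i", Xn, cov, Xn)
    return mu, np.sqrt(var)

x_query = np.array([0.5, 5.0, 50.0])  # inside, outside, far outside training range
mu, sd = predict(x_query)
for x, m, s in zip(x_query, mu, sd):
    print(f"x = {x:5.1f}: prediction = {m:7.2f}, predictive sd = {s:.2f}")
```

A point-estimate model would report only the prediction column, with equal apparent confidence at all three inputs. The probabilistic version attaches a standard deviation that grows as the query moves away from the data it was trained on, which is the warning a decision maker needs.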

The increasing adoption of nonprobabilistic ML models for decision making in finance and investments can lead to catastrophic consequences for individuals and society at large, including bankruptcies and economic recessions. It is imperative that all ML models quantify the uncertainty of their inferences and predictions on unseen data to support sound decision making in a complex world with three-dimensional uncertainties. Leading companies clearly understand the limitations of standard AI technologies and are developing their probabilistic versions to extend their applicability to more complex problems. Google recently introduced TensorFlow Probability to extend its established TensorFlow platform. Similarly, Uber has introduced Pyro, which is built on PyTorch, the platform developed by Facebook. Currently, the most popular open source probabilistic ML technologies are PyMC and Stan. PyMC is written in Python, and Stan is written in C++. This book uses the extensive ecosystem of user-friendly Python libraries.

Who Should Read This Book?

The primary audience of this book is the thinking practitioner in the finance and investing discipline. A thinking practitioner is someone who doesn’t merely want to follow instructions from a manual or cookbook. They want to understand the underlying concepts for why they must adopt a process, model, or technology. Generally, they are intellectually curious and enjoy learning for its own sake. At the same time, they are not looking for onerous mathematical proofs or tedious academic tomes. I have provided many scholarly references in each chapter for readers who are looking for the mathematical and technical details underlying the concepts and reasoning presented in this book.

A thinking practitioner could be an individual investor, analyst, developer, manager, project manager, data scientist, researcher, portfolio manager, or quantitative trader. These thinking practitioners understand that they need to learn new concepts and technologies continually to advance their careers and businesses. A practical depth of understanding gives them the confidence to apply what they learn to develop creative solutions for their unique challenges. It also gives them a framework to explore and learn related technologies and concepts more easily.

In this book, I am assuming that readers have a basic familiarity with finance, statistics, machine learning, and Python. I am not assuming that they have read any particular book or mastered any particular skill. I am only assuming that they have a willingness to learn, especially when ChatGPT, Bard, and Bing AI can easily explain any code or formula in this book.

Why I Wrote This Book

There is a paucity of general probabilistic ML books, and none that is dedicated entirely to finance and investing problems. Because of the idiosyncratic complexities of these domains, any naive application of ML in general and probabilistic ML in particular is doomed to failure. A depth of understanding of the foundations of these domains is pivotal to having any chance of succeeding. This book is a primer that endeavors to give the thinking practitioner a solid grounding in the foundational concepts of probabilistic ML and how to apply it to finance and investing problems, using simple math and Python code.

There is another reason why I wrote this book. To this day, books are still a medium for serious discourse. I wanted to remind the readers about the continued grave flaws of modern financial theory and conventional statistical inference methodology. It is outrageous that these pseudoscientific methods are still taught in academia and practiced in industry despite their deep flaws and pathetic performance. They continue to waste billions of research dollars producing junk studies, tarnish the reputation of the scientific enterprise, and contribute significantly to economic disasters and human misery.

We are at a crossroads in the evolution of AI technologies, with most experts predicting exponential growth in their use, fundamentally transforming the way we live, work, and interact with one another. The danger that AI systems will take over humanity imminently is silly science fiction, because even the most advanced AI system lacks the common sense of a toddler. The real clear and present danger is that fools might end up developing and managing these powerful savants based on the spurious models of conventional finance and statistics. This will most likely lead to catastrophes faster and bigger than we have ever experienced before.

My criticisms are supported by simple math, common sense, data, and scholarly works that have been published over the past century. Perhaps one added value of this book is in retrieving many of those forgotten academic publications from the dusty archives of history and making readers aware of their insights in plain, unequivocal language using logic, simple math, or code that anyone with a high school degree can understand. Clearly, the conventional mode of expressing these criticisms hasn’t worked at all. The stakes for individuals, society, and the scientific enterprise are too high for us to care if plainly spoken mathematical and scientific truths might offend someone or tarnish a reputation built on authoring or supporting bogus theories.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples) is available for download at https://oreil.ly/supp-probabilistic-ML.

If you have a technical question or a problem using the code examples, please send email to .

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Probabilistic Machine Learning for Finance and Investing by Deepak K. Kanungo (O’Reilly). Copyright 2023 Hedged Capital L.L.C., 978-1-492-09767-9.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/Probabilistic_ML.

For news and information about our books and courses, visit https://oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media

Follow us on Twitter: https://twitter.com/oreillymedia

Watch us on YouTube: https://youtube.com/oreillymedia

Acknowledgments

I would like to thank Michelle Smith, Jeff Bleiel, and the entire O’Reilly Media team for making this book possible. It was a pleasure working with everyone, especially Jeff, whose honest and insightful feedback helped me improve the contents of this book.

I would also like to thank the expert reviewers of my book, Abdullah Karasan, Juan Manuel Contreras, and Isaac Rhea, for their valuable comments.

Furthermore, I would like to thank the following readers of the early releases of the book for their equally valuable feedback: Ian Angell, Bruno Rignel, Jonathan Hugenschmidt, Autumn Peters, and Mike Shwe.
