XGBoost

XGBoost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen of the University of Washington, it has since been enriched with a Python wrapper by Bing Xu and an R interface by Tong He (you can read the story behind XGBoost directly from its principal creator at http://homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html). XGBoost is available for Python, R, Java, Scala, Julia, and C++, and it can work on a single machine (leveraging multithreading) as well as in Hadoop and Spark clusters.

Detailed instructions for installing XGBoost on your system can be found at https://github.com/dmlc/xgboost/blob/master/doc/build.md.

The installation of XGBoost on both Linux and macOS is quite straightforward, whereas it is a little trickier for Windows users; however, the recent release of a pre-built binary wheel for Python has made the procedure a piece of cake for everyone. You simply have to type this in your shell:

$> pip install xgboost

If you want to install XGBoost from scratch because you need the most recent bug fixes or GPU support, you need to first build the shared library from the C++ sources (libxgboost.so for Linux/macOS and xgboost.dll for Windows) and then install the Python package. On a Linux/macOS system, you just have to build the library with the make command, but on Windows, things are a little trickier.

Generally, refer to https://xgboost.readthedocs.io/en/latest/build.html, which provides the most recent instructions for building from scratch. As a quick reference, here we provide the specific installation steps to get XGBoost working on Windows:

  1. First, download and install Git for Windows (https://git-for-windows.github.io/).
  2. Then, you need a MinGW compiler present on your system. You can download it from http://www.mingw.org/ or http://tdm-gcc.tdragon.net/, according to the characteristics of your system.

  3. From the command line, execute the following:
$> git clone --recursive https://github.com/dmlc/xgboost
$> cd xgboost
$> git submodule init
$> git submodule update
  4. Then, still from the command line, copy the configuration for 64-bit systems to be the default one:
$> copy make\mingw64.mk config.mk
  5. Alternatively, you can copy the plain 32-bit version:
$> copy make\mingw.mk config.mk
  6. After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up compilation:
$> mingw32-make -j4
  7. In MinGW, the make command is named mingw32-make. If you are using a different compiler, the previous command may not work; if so, you can simply try this:
$> make -j4
  8. Finally, if the compiler completes its work without errors, you can install the package into your Python environment by using the following:
$> cd python-package
$> python setup.py install
After following all the preceding instructions, if you try to import XGBoost in Python and it fails with an error, it may well be that Python cannot find MinGW's g++ runtime libraries.
You just need to find the location of MinGW's binaries on your computer (in our case, it was C:\mingw-w64\mingw64\bin; just modify the following code to use yours) and place the following code snippet before importing XGBoost:
import os
# Use a raw string so the backslashes are not treated as escapes
mingw_path = r'C:\mingw-w64\mingw64\bin'
os.environ['PATH'] = mingw_path + ';' + os.environ['PATH']
import xgboost as xgb
