Distance correlation

This is a far more complex concept than the Pearson coefficient, and we cannot address it in detail here. The main differences between the two concepts are:

  • The quantities that are multiplied to investigate the type of relationship occurring between the data, which were the distances from the mean for Pearson and are the doubly centered distances for the distance correlation
  • The kind of relationship detected by the statistic, which was only the linear one for the Pearson coefficient, while it is any kind of relationship for the distance correlation.
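
To make the first difference concrete, here is a minimal sketch in base R of the two kinds of quantities. The vector x is hypothetical toy data, not taken from our cash flow report, and the variable names are chosen only for illustration:

```r
# Hypothetical toy data, not taken from the cash flow report
x <- c(1, 2, 4, 7)

# Pearson's building block: deviation of each observation from the mean
mean_deviations <- x - mean(x)

# Distance correlation's building block:
# pairwise distances |x_i - x_j|, then doubly centered
pairwise <- as.matrix(dist(x))
doubly_centered <- pairwise -
  outer(rowMeans(pairwise), colMeans(pairwise), "+") +
  mean(pairwise)

print(mean_deviations)
print(doubly_centered)
```

Double centering subtracts the row mean and the column mean from every distance and adds back the grand mean, so every row and every column of the resulting matrix sums to zero, just as the deviations from the mean do.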

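This value is formally defined as follows:

\operatorname{dCor}(X, Y) = \frac{\operatorname{dCov}(X, Y)}{\sqrt{\operatorname{dVar}(X)\,\operatorname{dVar}(Y)}}

Here dCov is the distance covariance between the two variables and dVar is the distance variance of each of them.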

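As you see, it is, in a way, similar to the Pearson coefficient, since it is a ratio between a covariance and the square root of the product of two variances. Unlike the Pearson coefficient, however, which ranges from -1 to 1, the distance correlation ranges from 0 to 1, and it is equal to 0 only when the two variables are independent.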
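
To see what this ratio involves, the following is a rough sketch of the sample distance correlation written in base R only. The helper names centered_distances and distance_correlation are hypothetical and introduced just for this illustration, and the data is a made-up parabola rather than our cash flow report. The sketch is meant to mirror the usual sample definition, not to replace a tested implementation:

```r
# Doubly center the pairwise distance matrix of a numeric vector
centered_distances <- function(v) {
  d <- as.matrix(dist(v))
  d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
}

# Sample distance correlation (hypothetical helper, base R only)
distance_correlation <- function(x, y) {
  a <- centered_distances(x)
  b <- centered_distances(y)
  dcov2  <- max(mean(a * b), 0)  # squared distance covariance
  dvar2x <- mean(a * a)          # squared distance variance of x
  dvar2y <- mean(b * b)          # squared distance variance of y
  if (dvar2x * dvar2y == 0) return(0)  # a constant variable
  sqrt(dcov2 / sqrt(dvar2x * dvar2y))
}

# Made-up example: a perfectly nonlinear (quadratic) relationship
x <- -10:10
y <- x^2

pearson  <- cor(x, y)
dist_cor <- distance_correlation(x, y)

print(pearson)
print(dist_cor)
```

On this symmetric parabola the Pearson coefficient is zero up to rounding, because there is no linear trend, while the distance correlation is clearly greater than zero, because y depends entirely on x.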

Within R, we have the dcor function from the energy package, which can help us compute this statistic:

library(energy)
dcor(cash_flow_report_mutation$delays, cash_flow_report_mutation$cash_flow)

This results in 0.14. What does this mean? It means that only a weak dependence, of whatever functional form, linear or not, is present within our data. Note that the sign carries no information here, since the distance correlation is never negative. Once again, the value of this coefficient is very close to 0 and therefore cannot be interpreted as strong evidence of a trend in our cash flows over time.
