Understanding the differences between concat, join, and merge

The DataFrame methods merge and join (Series has neither) and the concat function all combine multiple pandas objects. Because they overlap and can replicate each other in certain situations, it is easy to be confused about when and how to use each one. The following outline summarizes their differences:

  • concat:
    • Pandas function
    • Combines two or more pandas objects vertically or horizontally
    • Aligns only on the index
    • Raises an error when it has to align on an index that contains duplicates (duplicates along the concatenation axis itself are allowed)
    • Defaults to outer join with option for inner
  • join:
    • DataFrame method
    • Combines two or more pandas objects horizontally
    • Aligns the calling DataFrame's column(s) or index with the other objects' index (and not the columns)
    • Handles duplicate values on the joining columns/index by performing a cartesian product
    • Defaults to left join with options for inner, outer, and right
  • merge:
    • DataFrame method
    • Combines exactly two DataFrames horizontally
    • Aligns the calling DataFrame's column(s)/index with the other DataFrame's column(s)/index
    • Handles duplicate values on the joining columns/index by performing a cartesian product
    • Defaults to inner join with options for left, outer, and right
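
The default join types in the outline can be checked with a small sketch. The frames and column names below (left, right, x, y, key) are made up for illustration and do not come from the original text:

```python
import pandas as pd

# Hypothetical example frames; index labels overlap only on "b" and "c"
left = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])
right = pd.DataFrame({"y": [20, 30, 40]}, index=["b", "c", "d"])

# concat: a pandas function; horizontal concatenation aligns on the index
# and defaults to an outer join (union of labels)
outer = pd.concat([left, right], axis=1)

# join: a DataFrame method; defaults to a left join on the caller's index
left_joined = left.join(right)

# merge: a DataFrame method; defaults to an inner join
inner = left.merge(right, left_index=True, right_index=True)

# Duplicate join keys produce a cartesian product of the matching rows
dup_left = pd.DataFrame({"key": ["a", "a"], "x": [1, 2]})
dup_right = pd.DataFrame({"key": ["a", "a"], "y": [10, 20]})
product = dup_left.merge(dup_right, on="key")  # 2 rows x 2 rows

print(len(outer), len(left_joined), len(inner), len(product))
```

With these inputs, concat keeps every label from both frames, join keeps only the caller's labels, and merge keeps only the labels found in both.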

The first parameter of the join method is other, which can be either a single DataFrame/Series or a list of any number of DataFrames/Series.
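
As a minimal sketch of both forms of other, assuming small made-up frames (base, other1, other2) and a named Series (extra) that are not part of the original text:

```python
import pandas as pd

# Hypothetical example objects sharing some index labels
base = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])
other1 = pd.DataFrame({"y": [10, 20]}, index=["a", "b"])
other2 = pd.DataFrame({"z": [100, 300]}, index=["a", "c"])
extra = pd.Series([0.5, 1.5], index=["b", "c"], name="w")

# A single DataFrame as other
single = base.join(other1)

# A single Series as other; it needs a name, which becomes the column label
with_series = base.join(extra)

# A list of DataFrames as other; each one is aligned on the index
many = base.join([other1, other2])

print(many)
```

Because join defaults to a left join, each result keeps the three rows of base and fills labels missing from the other objects with NaN.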