If you want to create a calculated field that produces the percentage of the total land flooded during an event, we need to use a data source that has this information in it. Most stories need more than one data source. When you are adding in secondary data sources, it's important to know what the level of aggregation is if you have joined data sources with different levels of aggregation and have not accounted for that when blending them because if you fail to do so, you might get the wrong results in your calculated fields.
The flood data is aggregated at the event level. This means that for every major flood, we have summarized data—the start date, end date, location, total land area affected, and so on. This is moderately granular data. If it were even more summarized, it might be summed up as floods per year per country.
We want to know the total percent of land flooded each year, and for this, we will assume that the area of each country is static throughout time. So, we created a new data source. We filtered the World Bank indicators data (available at http://data.worldbank.org/data-catalog/world-development-indicators/) in this data source to include only the indicators pertaining to the land area or population. The World Bank indicator data is aggregated by country by year, which is not the same level of aggregation as that of the data on floods; it is higher. So, in our calculated field, which will sum the total area flooded and then divide it by the total area of each country, we will need to tell Tableau how to aggregate the total area of each country. Otherwise, we will get the wrong results.
First, we need to join the data sources. In the companion workbook, which can be found at [X], we have a data source called Population and Land Data. This is the version of the World Bank indicators that we have filtered. It has only four dimensions, in this case, we have three, Country, Indicator Name, and Year. The data source on floods has Country Name and several other date fields. We want to join on Country name, as shown in the following screenshot:
If we look at the Population and Land Data source, we do not see any of the gray chain links next to our dimensions that indicate that the fields are linked. There's a reason behind this. The names of the fields are not exactly the same. In order for Tableau Public to identify the join conditions automatically, the field names need to be exactly the same, with both capitalization, punctuation, and content. The field types need to be exactly the same as well.
From the Data menu, we can create a join on Country name in the Floods, as follows:
Next, we need to tell Tableau Public that the Country Name field in the major global floods data, which was created to remediate data quality issues, is the same as the Country field in the major global floods data source, as shown in the following screenshot:
You can create only one custom join at a time. So for instance, if you would like to add more joins, such as a date field, you need to add each one individually by clicking on the OK button and then clicking on the Add button again, which we did in order to add a second join on the year of the event.
Note the small arrow next to the Began field, which is above Country Name. This field allows you to establish a join condition on a specific date part. Since we also wanted to join the data sources on the year of the flood event, we expanded the list to see the appropriate dateparts, clicked on Year, and then selected Year in the secondary data source.
We clicked on OK again to return to the visualization, which is still the map that we created that shows the average flooded land.
The view is now different. Looking at the Data window for the Population and Land Data source, you will see that that there are two chain links, the chain link for Country, and the following is orange. Because the linking field is a part of the visualization, you will see that Country Name is on the Detail shelf on the Marks card. However, Year has a gray, broken chain link because it is not a part of the visualization. If you want it to be used in the calculated fields, click on the gray chain link to activate the join. Even though the field might not be on the visualization, it still be a part of calculations using fields from its data source, as shown in the following screenshot:
We can create a calculated field that produces the average amount of land in each country that was flooded. In this case, since some countries might have had major floods that add up to more than 100 percent of the total flooded land, we will create the maximum percent of land flooded, as shown in the following screenshot:
In order to create this calculation, perform the following steps:
Check out the correct calculated field, It's important to learn how to handle errors. The next two errors are very common and are as follows:
ATTR
function on it. ATTR tells Tableau to use the discrete value for it: