An UpSet plot is an alternative to the Venn diagram for visualizing more than 3 sets. The bar plot on the left shows the total size of each set. The bottom plot shows every possible intersection, and the top bar plot shows how many elements fall in each intersection.
UpSet plots the intersections of sets as a matrix. Each column corresponds to a set, and each row corresponds to one segment in a Venn diagram, as indicated in the figure below.
Cells are either empty (grey), indicating that this set is not part of that intersection, or filled (black), showing that the set is participating in the intersection:
- The first row in the figure is completely empty - it corresponds to all the elements that are in none of the sets.
- The second row corresponds to the elements that are only in set A (not in B or C).
- The fifth row corresponds to the elements that are in sets A and B, but not in C.
- The last row corresponds to the elements that are in all three sets.
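The row-by-row reading above can be reproduced with a few lines of Python. This is only a minimal sketch, not the app's implementation: the function name `exclusive_intersections`, the optional `universe` argument, and the example data are all hypothetical. Each element is assigned to exactly one row, the one whose filled cells match the sets that contain it.

```python
from itertools import product

def exclusive_intersections(sets, universe=None):
    """Map each membership pattern (one boolean per set) to its elements.

    Every element lands in exactly one pattern, like one segment of a Venn diagram.
    `universe` is optional; without it, the "none of the sets" row is always empty.
    """
    names = list(sets)
    if universe is None:
        universe = set().union(*sets.values())
    # Order rows by number of participating sets, then by set order:
    # none, A, B, C, A+B, A+C, B+C, A+B+C
    patterns = sorted(
        product([False, True], repeat=len(names)),
        key=lambda p: (sum(p), tuple(not x for x in p)),
    )
    rows = {pattern: set() for pattern in patterns}
    for element in universe:
        pattern = tuple(element in sets[name] for name in names)
        rows[pattern].add(element)
    return names, rows

# Hypothetical example data
sets = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 6}}
names, rows = exclusive_intersections(sets, universe=range(1, 8))

for pattern, elements in rows.items():
    filled = [name for name, inside in zip(names, pattern) if inside]
    print(filled or "none", sorted(elements))
```

With this ordering, the fifth row printed is the one for A and B (but not C), matching the list above.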
This UpSet plot...
...is equivalent to this Venn diagram.
The plot is run twice,...
...once for each period.
SAVING THE PLOT
To save the plot as a PNG file, right-click on the plot, then select "Save image as".
Open the expander under the 📊 Plot type drop-down menu by clicking on the ➕ sign.
Additional options are available for selection.
Small multiples are not supported for the moment.
You can choose which dimension to plot as the sets and which dimension to use to calculate communality. In this example, "Material" is the dimension plotted as sets, while "Country" (the materials sold in each country) is the dimension used to calculate communality.
You can plot an UpSet diagram with up to ten sets.
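As a rough illustration of how two dimensions can be turned into sets, the sketch below groups the values of one column by the values of another. It assumes the data is a list of records with "Material" and "Country" fields; the function name `build_sets`, the record layout, and the sample rows are hypothetical, and how the app itself computes communality may differ.

```python
from collections import defaultdict

def build_sets(records, set_dim, element_dim):
    """Group the values of `element_dim` by the values of `set_dim`.

    Each distinct value of `set_dim` becomes one set of the plot.
    """
    sets = defaultdict(set)
    for record in records:
        sets[record[set_dim]].add(record[element_dim])
    return dict(sets)

# Hypothetical sales records
records = [
    {"Material": "Steel", "Country": "USA"},
    {"Material": "Steel", "Country": "France"},
    {"Material": "Wood", "Country": "USA"},
    {"Material": "Glass", "Country": "France"},
    {"Material": "Wood", "Country": "USA"},  # duplicates collapse inside a set
]

# "Material" plotted as sets, membership taken from "Country"
material_sets = build_sets(records, set_dim="Material", element_dim="Country")
print(material_sets)

# Swapping the two dimensions gives one set per country instead
country_sets = build_sets(records, set_dim="Country", element_dim="Material")
print(country_sets)
```

Swapping the two arguments changes what the plot compares, which is why the choice of the two dimensions matters.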
In this example, if you decide to plot 4 sets, the app will automatically aggregate the other sets together.
You can also exclude the other sets.
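The effect of limiting the number of sets can be sketched as follows. How the app picks which sets to keep is not stated here, so this sketch assumes the largest sets are kept and that the aggregate counts as an extra set. The function name `limit_sets`, its parameters, and the "Other" label are hypothetical.

```python
def limit_sets(sets, max_sets, keep_other=True, other_label="Other"):
    """Keep `max_sets` sets and optionally merge the remaining ones into one.

    Assumption: sets are ranked by size, largest first, with ties broken by name.
    """
    ranked = sorted(sets, key=lambda name: (-len(sets[name]), name))
    kept = {name: set(sets[name]) for name in ranked[:max_sets]}
    rest = ranked[max_sets:]
    if keep_other and rest:
        # Aggregate every remaining set into a single union
        kept[other_label] = set().union(*(sets[name] for name in rest))
    return kept

# Hypothetical example data
all_sets = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3},
    "C": {2, 5},
    "D": {6},
    "E": {1, 7},
}

print(limit_sets(all_sets, 2))                    # A, B and an aggregated "Other"
print(limit_sets(all_sets, 2, keep_other=False))  # A and B only
```

Note that excluding the other sets does not remove elements from the kept sets; it only drops the aggregated set from the plot.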
If there are many intersections, the app might return an unreadable chart.
You can filter out the intersections with fewer than a given number of elements. For instance, the chart above, filtered to keep only the intersections with at least 6 elements, becomes this...
...and filtered to keep only the intersections with at least 10 elements, becomes this.
If you filter too much...
...you will get an error.
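The filtering described above amounts to dropping every intersection below a size threshold. The sketch below is an illustration under stated assumptions, not the app's code: the function name `filter_intersections` and the sizes are hypothetical, and the `ValueError` stands in for whatever error the app actually reports when nothing is left to plot.

```python
def filter_intersections(sizes, min_size):
    """Keep only the intersections that have at least `min_size` elements."""
    kept = {combo: n for combo, n in sizes.items() if n >= min_size}
    if not kept:
        # Stand-in for the app's error: there is nothing left to draw
        raise ValueError(f"no intersection has at least {min_size} elements")
    return kept

# Hypothetical intersection sizes, keyed by the sets taking part
sizes = {
    ("USA",): 40,
    ("France",): 5,
    ("USA", "France"): 8,
    ("USA", "France", "Japan"): 12,
}

print(filter_intersections(sizes, 6))   # drops the intersection with 5 elements
print(filter_intersections(sizes, 10))  # keeps only the two largest

try:
    filter_intersections(sizes, 100)
except ValueError as error:
    print("error:", error)
```

Choosing a threshold at or below the size of the largest intersection avoids the empty result.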
You can plot the distribution of the variables along the metric as a strip, box, or violin plot.
The strip plot selection returns this:
It shows the distribution of the observations for each column.
For the first column, for example, it answers the question:
"OK I am selling 272 of my products only in the USA, but which products? high sellers? low sellers?"
Apparently mostly average sellers. All the dots are concentrated together in a little cloud just on top of the 272 number. But there is a lone overachiever. This little guy, sold only in the US, is very successful.