Expertise, application examples and advice about visualisation

The whole title:
Big Data, Big Dupe – A little book about a big bunch of nonsense

It was and is much written, talked about big data in recent years. For high hopes and vague opportunities even more money flows. Here comes this little book – published in February 2018 – by Stephen Few just right. Fluid to read, it admits with the errors that are so popular today in connection with large amounts of data:

  • Large amounts of data lead to more information only due to the quantity.
  • It counts the correlation, causality is negligible.
  • Statistical samples are obsolete due to the large amount of data.
  • We should measure everything, because we have data from everything. And a few more mistakes …

The exciting part of this text lies in the way these errors are revealed. Few refers his arguments to original statements by experts, quotes scientists and questions an entire industry based on concrete statements.

 

ConclusionCover: Big Data, Big Dupe - a little book of a big bunch of nonsense von Stephen Few
In short: Worth reading!!!

Slightly longer: It is not a clearly structured textbook, which one must work through. On the contrary, it entertains and provides interesting food for thought through many cross-references, links tips and hints. For those who are deep in the matter, it may be not detailed enough, leaving too many questions unanswered. But it helps to stay awake for the topic, it cleans up in a fun way with big data nonsense. The hope of Stephen Few is that if it´s telling the truth, something small can expose and unmask something great. In this sense, this small format serves the purpose.

 

 

For those Interested:

Title Big Data, Big Dupe
Subtitles A little book about a big bunch of nonsense
Author Stephen Few
Edition Published February 01, 2018 (available in English)
Pages 96
Publisher Analytics Press, New Jersey
Price about 10 $ depending on the provider
ISBN Print: 978-1-938377-10-5 (Hardcover); E-Book: 978-1-118-85841-7

 

If you want to read more about the author: https://www.perceptualedge.com/blog/

 

Data visualization in general means to bring abstract data and relationships into a visually comprehensible form. So far so good. But what are the unfamiliar forms of representation? When do I apply this? When are other forms more meaningful? I would like to introduce the project of Severino Ribecca to all those who ask these questions more often:

www.datavizcatalogue.com

 

This online catalog is a library of various information visualization types. Initially, the project Ribecca served to expand its own knowledge through data visualization and as a tool for one’s own work.

He himself writes about the project: “However, I would like to know how it is. Although there is no such thing as a visualization method, it is not the only way to make sense.

And, fortunately, Severino Ribecca does not hide the knowledge, right on the home page is in the middle:

http://www.datavizcatalogue.com

 

In addition to the first overview, the “Search by Function” option is particularly appealing to practitioners. Here again, the initiator relies on a fast-to-capture image language:

http://www.datavizcatalogue.com/search.html

As soon as you click on an icon, you get a well researched and edited knowledge about the selected visualization type. Each display form is displayed with its functionality and application possibilities and is visualized with dummy data. At the lower end of the description, the reader also finds references to similar display forms:

http://www.datavizcatalogue.com/methods/network_diagram.html

So there are a maximum of three well-structured clicks up to the knowledge about individual visualization forms. I find this project visually very successful and meaningful. Perhaps this tip gives you some inspiration for your information visualization?

 

In any case, I wish you Happy Reporting 🙂

Yours, Silja Wolff

 

 

Portfolio diagrams can be created with the Excel „bubble“ diagram modus for many applications.  What makes this type of diagram special is that a total of three data dimensions can be visualised: X-position (horizontal), Y-position (vertical) and the bubble size (area).

Portfolios are primarily used if a quantity of data are to be segmented. In an investment portfolio, for example, factors such as productivity and risk index can be inserted on the axes, whilst the bubble size represents the investment volume. Typical portfolio applications include product/market observations, employee performance and strategic project management. In all such cases, visualisation enables easily recognisable sub-categorisation of data (products, projects etc.) in certain groups, combined with evaluation options for good/bad or preferred/not preferred. For example, high-risk projects with a low yield can be identified at a glance.

As in all diagrams, portfolios should be concise and informative. Easy to read labelling and precise highlighting play an important role. Unfortunately, standard functions for this are quite limited in the Excel bubble diagrams. We have thought up some good ideas to create professional and informative representation with a low outlay. Find out how to do this in the following.

  • Create a bubble diagram with a single data set (names of countries may not be marked)
  • Set axis length and gridlines according to requirements
  • Set data set colour to grey (neutral, no highlighting)

  • Develop formulae for automatic display/hiding of differently coloured data sets (grey, red, green) Hide using #NV

 

  • Change the data source of the first data sets, so that the bubble size is used from the previously calculated formula area
  • Insert and format two additional data sets with the corresponding source areas for red and green, only a few bubbles are visible in each of these data sets

Tip: Bubbles with a semi-transparent filling and white frame line are also easily identifiable when several data points intersect.

  • Insert an additional data set on the basis of the original source area and format it invisibly (no filling, no line), all bubbles in this data set are visible
  • Display data labelling for this data set and link with the cells containing names of countries

Tip: Linking works with the method from the article “clever data labelling”. For bubble diagrams in particular, it´s best to first click on one of the data labels and then press the cursor button (arrow) on the right. In this way, the first label field is accurately marked, which can then be linked with the first country names using “=” and a mouse click. Then move the cursor button to the right to the second label field and so on. As „column labelling” is not foreseen for the bubble diagram, the linked labels you made yourself are the only option. The labelling option “value from cells” is available from Excel 2013 onwards, allowing the same result to be achieved faster.

Done 🙂

 

 

 

One of our previous posts discussed small multiples, i.e. a series of charts consisting of several small charts of the same type. It also describes conceptual considerations on typical application scenarios involving small multiples.

This post introduces an alternative method which you can use to create small multiples for simpler application scenarios at the blink of an eye. This type of alternative is especially interesting if the speed of implementation is more important that a special, individual chart design.

The idea: Excel sparklines are ‘opened up’ into a small multiple.

Sparklines are very simple diagrams nested within a single cell. Generally speaking, several of these charts are bundled together in a group, for which certain properties are defined, e.g. column colour or axis length. Use the following steps to transform standard sparklines into small multiples in a matter of minutes.

  • “Insert” menu -> Sparklines -> Column / select data and location range so the sparklines appear next to the input area for the values
  • Link the first sparkline with the first country name / copy the formula downwards for the other sparklines

  • Move the cells with sparklines into a cell range with the desired layout

The interim result doesn’t provide a good overview over the data yet because Excel automatically displays each sparkline with an individual axis. Therefore, the charts need to be given a standardised axis using the following option:

  • Click on a sparkline / “Design” menu -> Axis / set Minimum and Maximum value to the same for all sparklines

To increase the informational value, if required, you can also automatically display certain columns in a different colour, e.g. High Point and Last Point.

Done 🙂

 

 

 

 

 

Labelling data points in charts is generally more reader-friendly than using a vertical axis. The axis is automatically displayed in Excel, and therefore it is often found in reports and presentations. However, it forces the chart reader to move their eyes back and forth between the data points and the axis.

If data points are labelled directly, on the other hand, the reader’s eye can rest steadily on the data points. That way, readers of your charts can concentrate on the content (!). Unfortunately, standard labelling options in Excel are often too inflexible, and the results are quite ugly. In line charts, for example, there are often overlaps between label and line, and in the worst case, they are no longer legible.

Excel does offer different label positions (top, bottom, right, left, centred). But these are always applied to all data points at once, although an individual alignment per data point would make more sense. Of course, you can also move the labels manually. But this can also lead to problems, as the manual position also needs to be changed by hand after data updates.

A little trick provides the solution: an invisible auxiliary data range can be used to generate data labels that always position themselves correctly. In the case of a line chart, you can have it calculate whether it ‘bends’ up or down for each point. The formula then compares each data point with the median value of its two neighbours. Of course, you may have to use completely different formulas for different chart types.

Use the following steps to create clever labels for line charts:

  • Create a formula to move the label up (+1) or down (-1)
  • Calculate auxiliary data range “Label” for data labelling (at the points level)

  • Add data range “Label” to the diagram, make points and connecting lines invisible and insert centred data labels

  • Click the first label field of the new data range twice (don’t double click), so that the individual label field is selected
  • Type “=” and click on the table cell showing the value to be displayed (e.g. the value for January), then confirm with enter (this creates a so-called linked label)
  • Link the other data points with the corresponding table cells

Done 🙂

Tip: Excel Version 2013 introduced the labelling option “Value From Cells”, with which you can achieve the desired result much quicker.

 

“Small multiples” are charts that display a range of data in several smaller charts of the same type. This is often easier to read than mixing the same data in a single, more complex chart. The informational value and overview are the result of the aggregate charts in this case.

The specific application scenario determines whether to use a small multiple or a single chart with several data ranges. Stacked column or bar charts visually highlight the sum. Small multiple charts place the visual focus on the individual data ranges.

If the company revenue is displayed as a stacked bar chart (1 chart with 4 data ranges), for example, then total revenue is easy to see, but less so the precise development of individual product groups.

On the other hand, if 4 single charts (with one data range each) are displayed for the product groups, then revenue performance for each product group can easily be gauged. Additional details such as deviation indicators can easily be implemented at the product group level.

To maintain a good overview, however, you need to standardise the individual charts making up the small multiple. A consistent axis scale is especially important. Highlighting improves the informational value, e.g. using the signal colours green and red to mark positive (desired) or negative (undesired) deviations.

Unfortunately, this type of chart isn’t part of Excel’s native functionality. Nevertheless, you can easily implement custom small multiples in just a few steps.

Create Chart

  • Chart Type: Cluster Bar
  • Data ranges: main data range and min/max data range for a consistent axis scale

Format chart

  • Add Data Labels
  • Format Data Series… Series Overlap: 100% / Gap Width: 50% / bar colour: Solid Fill = green, Invert if negative, second colour = red / min/max colour: no fill

  • Format Axis… Axis Options: Values in reverse order / Line: Solid Line, black / Label Position: Low
  • Format Chart Area… Fill: No fill / Border: No line / Font size: 8 pt

Position chart

  • Align chart area (outside) and drawing area (inside) to the cell using the Alt button

Copy charts

Adjust the data source for each chart

Done 🙂

Tip: In one of our next posts, we’ll be presenting a second way of realising small multiples even quicker for simple application scenarios.

 

 

 

 

 

In bar diagrams multiple streams of data can be grouped (side by side) or piled (on top of each other). Both display formats are important. Sometimes it is also practical to combine both forms. E.g. two piled columns side by side per month.

Due to the fact that the combination of piled and grouped columns is not available within Excel, we need our today’s trick to accomplish our illustration:

  • reserve a separate segment area for the streams of data of your diagram. E.g. two columns per month plus one column of space between months
  • Transfer the values from your input area into the reserved segment (via segment source or formula)
Template: Schrittweise zum Säulendiagramm mit Doppelstapel
  • compile a piled bar diagram from the specifically built formula
  • Adjust the space width of columns to a very small value such as 5%
  • As needed pigment the columns and fade in captions
Template Säulendiagramm, Doppelstapel
  • Add legible sums and legends (see our trick “captions”)

Finished 🙂

Säulendiagramm und Doppelstapel

 

It´s mostly about variations when presenting data to decision makers, such as planned, target or empirical values. Only these variations allow us to reasonably evaluate the actual numbers.

Variations should play a major role during the visualisation as well. Today´s trick shows us that this is feasible within Excel only with a few clicks in specific cases:

  • Compile a diagram with two curve streams of data
Abweichungen Schritt-für-Schritt erklärt
  • Mark the diagram and add the variation columns by choosing the menu diagram draft
  • Format the variation columns red or green respectively. The column width can be adjusted to one of the curve stream of data by right click. (parameter “space width”)
Abweichungen mit Excel darstellen
  • Add data caption and legend if applicable (see our trick “captions”)

Finished 🙂

Excel Template Abweichungsdiagramm

 

 

Data captions and legends are important to understand the content of diagrams. Unfortunately the provided alternatives to caption are partially not legible. For example, sums on top of piled columns are not provided. You search in vain for standard functions such as legends which support the viewer not to look back and forth as they are placed directly next to the stream of data.

With our today´s trick it is possible to implement legible and at the same time dynamic captions:

Part 1: Sums on top of piled columns

  • Calculate sum values and insert as new stream of data (right click on diagram, “choose data”, add stream of data)
  • Change stream of data type to “XYZ” of the new stream of data (right click on new column, choose “Change stream of data type”)
  • Invisibly format items and fade in the data caption on top
Summe in Excel Diagramm

Part 2: Legend next to piled column

  • Fade out the standardised legend
  • Extend the column stream of data to an area with empty segments, so that there is enough space for the legend next to the last column
  • Calculate the X- and Y- coordinates for the legend text, depending on the data of the last column (see sample data file)
  • Insert a stream of data of the type “XYZ” (as in part 1)
Steuerung von Excel Legenden
  • (The X-values for the horizontal alignment of the points, can be divided into the data source capture only after the type “point XY” is selected)
  • Formatting points invisible and show data label right
  • First data label of the new data series, click twice (not double), is so marked a single label area
  • “=” type and click on the table cell in the desired legend text is, then Enter to confirm
Excel Template
  • Connect the remaining legend texts with the equivalent chart segments

Finished 🙂

Template Beschriftung direkt an der Säule

In many cases it is important to allocate data to multiple diagrams so the presentation is clear. It results in so-called Small-Multiple illustrations.

If each single diagram uses its own scale, the informative value of the entire page will get lost. In terms of a high quality visualisation, unitary scales are very important. At the same time the diagrams should be of low-maintenance and dynamic.

Negativbeispiel

Skalierung mehrerer Diagramme ohne Excel Trick

This challenge can be managed in Excel with our today´s small trick:

  • Calculate the minimum and maximum of all (!) single values (function MIN and MAX)
  • Insert into each single diagram the minimum and maximum values as an extra data stream
  • Invisibly format the minimum and maximum data stream (no filling, no frame line)
  • Align diagrams of same size next to each other

(keep holding the alt-key when pulling the mouse; then your diagrams lock with the segments)

Excel einfügen einer min-/max-Datenreihe

Done 🙂

Excel to go Skalierung