Saturday, August 19, 2006

Sorry but this is VERY BAD data representation

Now, i am not exactly a rude person, but i must admit to being irritated when i saw the chart below. What's the data, and what's the chart saying?
Were you irritated as well when you saw this chart? Full article here.

This type of chart is called a doughnut chart (abbr. as D-chart henceforth) and is one of the numerous chart types available in MS Office applications.

Irritation reason 1: MAJOR reason, data vs chart-type
- The data is '% growth'. How in heaven's name can that be represented as a D-chart!
- The D-chart is nothing but a type of pie chart.
- These are charts that you can only think of using when you have constituent segments of a category that add up to the whole category. Mostly used with Percentages/Proportions that add up to 100%/1.
- BUT THESE ARE % GROWTHS! HOW, OH HOW CAN YOU PUT THAT INTO A D-CHART.
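
A rough way to see the problem, using the three growth figures quoted later in this post. This is only a sketch: the function name `is_part_to_whole` and the 0.5-point tolerance are my own assumptions, not anything from the article or from Excel.

```python
import math

def is_part_to_whole(values, whole=100.0, tolerance=0.5):
    """Return True if the values look like shares of a single whole.

    Hypothetical helper: non-negative values whose sum is within
    `tolerance` of `whole` are treated as pie/doughnut material.
    """
    values = list(values)
    if any(v < 0 for v in values):
        return False
    return math.isclose(sum(values), whole, abs_tol=tolerance)

# % growth figures quoted in the post
growth_rates = {"India": 7.8, "China": 5.22, "World": 2.7}

# Growth rates are not slices of anything, so the check fails.
print(is_part_to_whole(growth_rates.values()))

# What a doughnut silently does with them: rescale so they sum to 1.
total = sum(growth_rates.values())
implied_shares = {name: rate / total for name, rate in growth_rates.items()}
print(implied_shares)
```

The rescaled "shares" are what the ring actually draws: India ends up with roughly half the doughnut, which answers a question nobody asked.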

Irritation reason 2: Why a D-chart
Why not a pie chart, if at all - though that's not considered a nice option either.

Irritation reason 3: Why a chart at all?
These are three numbers! How about just writing the following:
"India has registered a 7.8% growth rate in its online population as compared to China's 5.22%. India's growth seems impressive considering that the overall worldwide online population grew at 2.7%, but it should be noted that this is coming from a much lower base."

Irritation reason 4: What's with the growth rate?
This continues from the highlighted phrase of the previous point: is growth really a good metric in this case? Let's say the country of Baddatitis had only 10 people who used the internet in July last year. This July, the figure went up to 20. Whoa! A 100% growth rate. Baddatitis rules!
The base is important. What are the base penetration figures? The fact that the 'World' grew at 2.7% means nothing on its own. It includes a country like New Zealand, with a penetration of more than 75%; India is currently at about 5%! More space to grow => higher growth rate.
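
The Baddatitis arithmetic, sketched in a few lines. The 10 and 20 are from the example above; "Bigland" and its user counts are made up purely for contrast and do not describe any real country.

```python
def growth_rate_pct(before, after):
    """Percentage growth from `before` to `after`."""
    if before <= 0:
        raise ValueError("growth rate is undefined for a non-positive base")
    return (after - before) * 100.0 / before

# (users last July, users this July)
countries = {
    "Baddatitis": (10, 20),                # from the example above
    "Bigland": (1_000_000, 1_050_000),     # hypothetical large base
}

for name, (before, after) in countries.items():
    rate = growth_rate_pct(before, after)
    added = after - before
    print(f"{name}: {rate:.1f}% growth, {added:,} new users")
```

Baddatitis wins on growth rate by a mile while adding ten people; the made-up Bigland adds fifty thousand at a "mere" 5%. A growth figure without the base next to it tells you very little.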

For more interesting stats see: http://www.internetworldstats.com/stats3.htm#asia

Irritation reason 5: Laziness
You are justified in thinking i'm picking holes here, but trust me, if you were showing this chart to someone who has worked with data using MS Office before, their first mental reaction would be "Damn, these are the default colours in Excel - these guys didn't even bother changing the colours while preparing the chart".

Irritation reason 6: Definitions used
Two points:
1) What is 'World'?
All countries except India and China? That should be the case, and most likely is, but after seeing that chart one even has to ask this question, since it's not made explicit in the article.
2) Online population

The article defines this as "people aged over 15 accessing internet". Apart from asking why the age filter (since kids are now taking to the net in a big way), my first question is:
"How do you define 'accessing internet'?" Accessed at least once in the last month? At least thrice in the last quarter? What?

I realize that this is an article aimed at general readers, but my humble opinion is that journalism (of whatever kind - even technical journalism) should set standards in thinking. I honestly don't think that has happened here. Sorry.

Update: August 26th, 2006
Arunn, through SS Katte, helpfully dropped a comment here on his post. His post is way more detailed than this tiny piece. It'll give you a nice walk-through of the numbers involved, as well as the interpretations.

12 comments:

Kshitij L said...

And this on CNN-IBN. Brilliant. CNN doesn't know what sort of thing is happening to their reputation here.

Ajith said...

Sharan,
The only possible reason why the creator used this is maybe cos he loves donuts...! :-)
nuts love donuts!

Sandeep Bhasin said...

I saw this on their website and didn't even notice this "goof-up". Thanks for helping me learn a new concept today.

(...... and i am sure almost all of us who saw this on the website did not notice it. The article has got a rating of 9+.)

Sharan Sharma said...

@kshitij,
Am a little stuck here. See, i am associated with this industry so can't comment on individual channels. Just happened to pick this up since it was so striking when i came across it.

@Ajith,
:)

@Sandeep,
Anytime!

Anonymous said...

Sharan Sharma: Good debunking. I have made a post on this one as well, although a bit longish, to explain further why even the interpretation is misleading.

Included your (this) post therein.

Keep writing more such nice things.

Sharan Sharma said...

Hi Arunn,
Thanks for dropping by.
I visited your blog and saw that your entry would actually be a better detailed one. Will update my post accordingly!

Sharan Sharma said...

Hi SK,
ha...ha...Yes, thank you for actually pointing these out:

1) Not having a blogroll

Believe me, if i had a blogroll, there's no way i wouldn't have linked to you.
The reason i don't have a blogroll is that i've seen it become a mutual back-scratching exercise. Also, a way to 'reward' people. And where does one stop? Many of our friends, colleagues, students (past and present) have blogs. Some of them comment. Then not putting them on the blogroll but putting other distinguished people/bloggers on the roll (like you) becomes discriminatory. So to steer away from all these problems, i decided to not have a blog-roll.
But again, believe me, the day i decide to have one, you will be there on it.

2) Not thanking you for leading Arunn to this post

Of course, you said it out of jest, but just to take it at face value for a minute...

Yes, a 'thanks' is in order. The reason i didn't thank you for leading Arunn to this post is that i felt it probably insulting (to you) to do so.

After all, we are involved in our small ways (small goes for me!) in the spreading of knowledge - which does not require a special 'thanks'. Since your motive was purely this, all i did was to include your name appropriately in the post itself, and not thank you explicitly.

Also, i believe that we might get caught in the trap of driving audiences. Or being thrilled when a lot of people are driven to visit our sites. Then our message starts getting diluted, much like MSM.

I am sure you've seen bloggers that way - they start off sincere and then the audience increases - then they are more worried about popularity than expressing themselves freely.

Hence, thanking you would be in a way trying to say "i am happy that more people are visiting this site - thanks for getting them here" - a thought which i am a little uncomfortable with and i thought, might be insulting to you.

* Finally, what strikes me most is that you are such an open minded person.*
After my 'attack' on your blog, to be so open minded is a real rarity in this world - especially when you know that i have a point of view so different from yours. Thanks!

Anonymous said...

Sharan: The attention span of most internet users on any particular page seems to be very short. In that sense, I think my post, although detailed etc., is a bit too long for their liking. Yours IS short and sweet.

Anyway, crosslinking could of course make people of either kind (with long or short attention spans) want to read our posts on the issue and get the purport. Which is fine with me.

About Katte: He has been a friend of mine for many years; an extremely honest and decent person. So, your surprise is not a surprise for me. I would be surprised had he been any different (I read your conversation at his blog on that VM topic)...;)

Blogroll to have or have not: Yes, your "problems" on having one are genuine. Thankfully, many of my friends don't blog. Even otherwise I made a separate stand-alone page for my blogroll...;)

If a blog is also judged by its blogroll (whether by others or by the owner, doesn't matter), then yes, choosing one's blogroll content is an onus.

I believe my Nonoscience blog is for a purpose, however pretentious it may sound to others (I have other blogs with equally different flavors). The purpose of Nonoscience is to discuss the effects of good and bad science, and science-related issues, for a general audience.

I maintain a blogroll whose content also reflects this "character" of my blog. So, even if a friend keeps a blog, if its contents don't match mine, I don't blogroll him/her in that particular blog of mine.

Of course, if I couldn't judge a particular blog's character totally, but still like it because at least some of its posts debunk some nonsense and educate an audience, I still would want to add that blog to my blog roll. So I keep categories under my blogroll ranging from "just for browsing etc." to "good science content ones etc." and so could classify it accordingly. Well that is what I do and it works for me, so far...

Sharan Sharma said...

Thanks for your ideas, Arunn!

Sharan Sharma said...

Thanks SSK!

Anonymous said...

Sharan: Thanks for the mention of my post in yours. Kindly redirect the link from my name to my blog itself (i.e. to the link in my comment's signature) and not to where it goes now. Thanks.

Dr. Katte: I shall make a post on blogrolls...including these discussions...;)