OT - Benford's Law and forum

Jeremy Leach

Senior Member
This is definitely off topic !

In the current New Scientist magazine (16th Oct 2010) there is an article (page 10) about Benford's Law. This law is all about sets of numbers and how some naturally occurring sets of numbers obey this law - areas of rivers, earthquake data etc.
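
For reference, the usual statement of Benford's Law is that a leading digit d (1 to 9) turns up with probability log10(1 + 1/d). A minimal Python sketch of those expected values follows; the function name `benford_expected` is just an illustrative choice, not anything from the article:

```python
import math

def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_expected(d):.1%}")
```

So roughly 30% of values should start with 1 and fewer than 5% with 9.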

I'm NOT knowledgeable on this at all, but they suggest people look at their own data - so I have, at the number of views of posts on the PICAXE forum ! It's only a quick analysis of the first two pages of posts as of today, but my graph is attached.

I won't fight anyone who says this is totally useless data for the forum :) - but it's just interesting that the view data might obey this law (no ?!). It might be because views grow exponentially - ie posts to a thread attract more posts and more views, in an exponential way.
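
The exponential-growth hunch can be checked with a quick sketch. This is not the forum data; it simply takes a made-up series that doubles at every step and tallies the leading digits. The helper names `leading_digit` and `digit_shares` are illustrative:

```python
import math
from collections import Counter

def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def digit_shares(values):
    """Fraction of values starting with each digit 1-9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

# Assumed toy series: 1, 2, 4, 8, ... (1000 terms), not real view counts.
series = [2 ** n for n in range(1000)]
shares = digit_shares(series)

# Compare the observed shares with the Benford prediction log10(1 + 1/d).
for d in range(1, 10):
    print(d, f"observed {shares[d]:.3f}", f"Benford {math.log10(1 + 1 / d):.3f}")
```

Swapping `series` for a list of real view counts would give the same kind of comparison as the attached graph.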

The article says deviations from it can detect tax fraud, voter fraud, digital image manipulation. There's a lot more info out there on the web, for instance Wiki info here :
http://en.wikipedia.org/wiki/Benford's_law
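
One common way to put a number on "deviation" is a chi-squared goodness-of-fit statistic against the Benford proportions. This is a generic textbook sketch, not the method described in the article, and the counts below are invented purely for illustration:

```python
import math

def benford_chi_squared(counts):
    """Chi-squared statistic for leading-digit counts vs Benford's Law.

    `counts` maps each digit 1-9 to how many values start with it.
    """
    total = sum(counts.values())
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Invented example: 900 values with every leading digit equally common.
flat_counts = {d: 100 for d in range(1, 10)}
print(round(benford_chi_squared(flat_counts), 1))
```

With 8 degrees of freedom, a statistic above about 15.5 is usually read as a significant departure at the 5% level, though small samples make the test unreliable.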
 


hippy

Ex-Staff (retired)
Interesting and probably a valid observation but not sure I'd call it a 'law' as such.

If you take all the possible numbers 1 to 19, over 50% start with 1, and you have to go to the set of 1 to 99 to even that out. That's quite a big increase. Add another 100 numbers, so 1 to 199, and you've slanted it in favour of 1 again; you need to add another 800 numbers to bring it back to even. Go past 999 and you need to go up to 9999 to bring it back into balance.
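
The counting argument above is easy to verify by brute force. A small sketch, with `share_starting_with_one` as an illustrative name:

```python
def share_starting_with_one(upper):
    """Fraction of the integers 1..upper whose first digit is 1."""
    hits = sum(1 for n in range(1, upper + 1) if str(n)[0] == "1")
    return hits / upper

# The set sizes mentioned in the post, plus 1999 to show the next swing.
for upper in (19, 99, 199, 999, 1999, 9999):
    print(upper, f"{share_starting_with_one(upper):.1%}")
```

This prints 57.9%, 11.1%, 55.8%, 11.1%, 55.6% and 11.1%, so the share of leading 1s swings between about 56% and 11% depending on where the set stops.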

What the 'law' seems to say is that for a certain sized set the probability of starting with a particular digit is defined by that set size. Can't argue with that.

Where I'd say the 'law' is being exaggerated in usefulness is in trying to apply it universally, any sized set. The Wikipedia section on 'Multiple probability distributions' shows why this doesn't work because the target measurement would generally be in a set which starts with 1. For example, most people will be at least 1 metre in height but less than 2, weigh above 10st and less than 20st, 100kg and less than 200kg and so on, even most resistor values plucked from the air are 1K, 10K and 100K.
 

Jeremy Leach

Senior Member
Hmm ... yes I see what you're saying, and the Wiki does say:
"Benford's law can only be applied to data that is distributed across multiple orders of magnitude"
Oh well, keep posting everyone, but don't try to bump posts to deliberately affect the leading digit of the view count because I'll spot it :)
 

hippy

Ex-Staff (retired)
I don't have a 'pure maths' leaning so don't really understand the underlying law or its usefulness, but suspect that its latest presentation turns the law round on itself to present something which appears to be more interesting or significant than it is.

From 'for any specific set size the probability of first digit is pre-ordained' (true) it's observed that 'for a majority of sets they are of such a size that the probability matches this pattern' (possibly). It seems to me to be more about set size than probability of first digit though the two correlate.
 

techElder

Well-known member
Don't forget to verify how product prices follow the probability of starting with the digit '1' such as, "199.99", "1.95" or the ubiquitous twofer "19.99".

PS. Catch me if you can.
 

BeanieBots

Moderator
If it's a "law" then things such as the native language should have no effect.
What happens if you present the data with for example Roman numerals?
Or, maybe even HEX digits.

Suddenly, the maths goes a bit haywire which suggests it's more 'manipulation' than 'law'. Humans simply love to have rules!
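
For what it's worth, the usual formula is not tied to base 10: in base b the expected share for leading digit d is log_b(1 + 1/d). A quick sketch for hex, with `benford_expected_base` as an illustrative name (Roman numerals are not a positional system, so this formula does not cover them):

```python
import math

def benford_expected_base(digit, base):
    """Expected share of leading digit `digit` (1..base-1) in the given base."""
    return math.log(1 + 1 / digit, base)

# Expected shares for the hex leading digits 1..F.
for d in range(1, 16):
    print(format(d, "X"), f"{benford_expected_base(d, 16):.3f}")
```

In hex, 1 is still the most common leading digit at 25%, so the skew survives the change of base, just with different numbers.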
 

Pauldesign

Senior Member
"199.99", "1.95" or the ubiquitous twofer "19.99".
OT :D; the economists or groups who invented the trailing .99 or .95 system were clever. It usually fools the buyer into thinking the price is less than what he/she is actually paying, e.g. what is the value difference between $2,000,000 and $1,999,999?
But there is a mind difference of $0.000,000,1.:rolleyes:

Interesting and probably a valid observation but not sure I'd call it a 'law' as such.
You're right Hippy, that's an abstract hypothesis (not even a theory) and not a law. ;)
 

Jeremy Leach

Senior Member
I'll leave the next step to someone else :).

The 'law' does also say that it seems to be 'scale invariant' so changing the units of the series still comes up with the same curve ...
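
The scale-invariance claim can be poked at with a toy example. A hedged sketch: the data is an assumed doubling series rather than anything measured, and the factor of 12 stands in for a change of units such as feet to inches:

```python
import math
from collections import Counter

def digit_shares(values):
    """Fraction of positive integers starting with each digit 1-9."""
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

# Assumed toy data: a doubling series, then the same series "in other units".
original = [2 ** n for n in range(1000)]
rescaled = [12 * v for v in original]  # e.g. feet -> inches

before = digit_shares(original)
after = digit_shares(rescaled)

# Columns: digit, share before rescaling, share after, Benford prediction.
for d in range(1, 10):
    print(d, f"{before[d]:.3f}", f"{after[d]:.3f}", f"{math.log10(1 + 1 / d):.3f}")
```

Both columns should sit close to the Benford values, which is the sense in which the curve does not care about the units.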

It's way beyond me, but thought it might generate some web server traffic :)
 

John West

Senior Member
I tend to think of resistors in the 47 Ohm - 470 Ohm - 4.7K - 47K - 470K range when I think of them off the top of my head. But that's because I think I have more of them.
 

Dippy

Moderator
I'm so sorry to hear that John West.
I tend not to think of resistors at all if I can help it.
Take the car down town and have a day off.
 