- Some nerdy stuff below because I was bored and avoiding my PhD (ironically, also on data mining):
- 22 hours of data collected, sampled once per minute: likes and "mentions" (mentions are cached and only update once every 24 hours).
- Raw data is here: http://puu.sh/3YjTi.csv
- This is what Abbott's graph looks like:
- http://puu.sh/3YkGP.png
- This is Rudd's:
- http://puu.sh/3YkIQ.png
- There are four distinct periods:
- "Normal": 8am-11pm EST
- "Rampdown": 11pm-12:15am EST
- "Low": 12:15am-6:45am EST
- "Rampup": 6:45am-8am EST
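- A minimal sketch of how each sample could be bucketed into these periods. The function name `period_for` is made up for illustration, and it assumes the 12:15 boundary means 12:15am (just after midnight) and that timestamps are already in local EST:

```python
from datetime import time


def period_for(t: time) -> str:
    """Return the period name for a local (EST) time of day.

    Boundaries follow the list above; the 12:15 boundary is assumed
    to be 12:15am, since the "Low" period runs overnight.
    """
    minutes = t.hour * 60 + t.minute  # minutes since midnight
    if 8 * 60 <= minutes < 23 * 60:
        return "Normal"      # 8am-11pm
    if minutes >= 23 * 60 or minutes < 15:
        return "Rampdown"    # 11pm-12:15am, wraps past midnight
    if minutes < 6 * 60 + 45:
        return "Low"         # 12:15am-6:45am
    return "Rampup"          # 6:45am-8am


if __name__ == "__main__":
    for t in (time(12, 0), time(23, 30), time(3, 0), time(7, 0)):
        print(t, period_for(t))
```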
- During these periods, the differences in new likes per minute were pretty stark:
- First Normal
- Rudd: Mean 1, Stdev 1.18
- Abbott: Mean 18.22, Stdev 4.31
- Rampdown
- Rudd: Mean 1.19, Stdev 1.07
- Abbott: Mean 12.32, Stdev 4.43
- Low
- Rudd: Mean 0.22, Stdev 0.53
- Abbott: Mean 2.89, Stdev 2.38
- Rampup
- Rudd: Mean 0.39, Stdev 0.69
- Abbott: Mean 10.15, Stdev 3.06
- Second Normal
- Rudd: Mean 0.6, Stdev 0.82
- Abbott: Mean 18.47, Stdev 4.96
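- For anyone wanting to reproduce numbers like these, here is a rough sketch of a per-period summary. It does not read the CSV (its column layout isn't described here), so it takes hypothetical `(period, new_likes_that_minute)` pairs instead. It also assumes the sample standard deviation; whether the figures above use sample or population stdev isn't stated. The demo values are invented, not taken from the raw data:

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(samples):
    """Group per-minute like counts by period and return {period: (mean, stdev)}.

    `samples` is an iterable of (period_name, new_likes_that_minute) pairs.
    Uses the sample standard deviation (an assumption), so each period
    needs at least two samples.
    """
    by_period = defaultdict(list)
    for period, likes in samples:
        by_period[period].append(likes)
    return {
        period: (round(mean(values), 2), round(stdev(values), 2))
        for period, values in by_period.items()
    }


if __name__ == "__main__":
    # Invented demo values, NOT from the real data set.
    demo = [("Low", 0), ("Low", 1), ("Low", 0), ("Low", 3),
            ("Normal", 17), ("Normal", 20), ("Normal", 18)]
    for period, (m, s) in summarise(demo).items():
        print(f"{period}: Mean {m}, Stdev {s}")
```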
- Some notes:
- Not only is the total number of likes way too high to be explained by likebait saturation alone, Abbott's variation is far too low to be entirely human-based. There are also no major spikes of the kind we'd tend to see around policy releases (for instance, his release of indigenous policy this morning). The means of each time period line up too perfectly and, again, lack variation. Overnight, the deviation returns to what we'd tend to expect from this kind of data, indicating that the bot/net is probably turned off overnight after it ramps down, and ramps back up in the morning (so as to not show an immediate jump from a mean of ~16-20 straight to 2-3). In short, this is exactly how I'd code a bot to be difficult to detect (if I were actually ridiculous enough to do so).
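- One way to put the "variation" point in relative terms is the coefficient of variation (stdev divided by mean) for each period, using the means and stdevs listed above. Choosing this particular measure is an assumption for illustration; the notes above don't say which yardstick was used:

```python
# (mean, stdev) of new likes per minute, copied from the period list above.
STATS = {
    "Rudd": {
        "First Normal": (1.0, 1.18),
        "Rampdown": (1.19, 1.07),
        "Low": (0.22, 0.53),
        "Rampup": (0.39, 0.69),
        "Second Normal": (0.6, 0.82),
    },
    "Abbott": {
        "First Normal": (18.22, 4.31),
        "Rampdown": (12.32, 4.43),
        "Low": (2.89, 2.38),
        "Rampup": (10.15, 3.06),
        "Second Normal": (18.47, 4.96),
    },
}


def coefficient_of_variation(mean, stdev):
    """Relative spread: stdev expressed as a fraction of the mean."""
    return stdev / mean


cv = {
    page: {period: coefficient_of_variation(m, s) for period, (m, s) in periods.items()}
    for page, periods in STATS.items()
}

if __name__ == "__main__":
    for page, periods in cv.items():
        for period, value in periods.items():
            print(f"{page:7s} {period:14s} CV = {value:.2f}")
```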
- There are some other notes: the number of cached mentions lines up almost exactly with the number of likes added in the same time period - 6727 likes from collection start to cache refresh, 8010 new mentions since the previous cache refresh. There are several hours unaccounted for after the previous cache refresh (before collection started), which would likely make up the missing number there. Note that in a similar time period, Rudd gained approximately 350 likes and 0 new mentions. As people tend not to reference people they don't know on Facebook except through page likes (not mentions), you can pretty safely assume that each bogus account is also mentioning the page 0-1 times.
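- A back-of-envelope check on the "missing" mentions, assuming (purely for illustration) one mention per like and that the uncollected hours ran at one of the per-minute means reported above. Which period those hours actually fell in isn't known, so this only gives a range:

```python
likes_in_window = 6727     # likes from collection start to cache refresh
mentions_in_cache = 8010   # new mentions since the previous cache refresh

# Mentions not matched by likes observed during collection.
gap = mentions_in_cache - likes_in_window

# Abbott's mean new likes per minute in each period (from the list above).
rates_per_minute = {
    "Normal": 18.22,
    "Rampdown": 12.32,
    "Low": 2.89,
    "Rampup": 10.15,
}

# Hours of uncollected time needed to close the gap at each rate,
# under the one-mention-per-like assumption.
hours_needed = {
    period: gap / rate / 60 for period, rate in rates_per_minute.items()
}

if __name__ == "__main__":
    print(f"gap: {gap} mentions")
    for period, hours in hours_needed.items():
        print(f"at {period} rate: {hours:.1f} h")
```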
- Since collection started, Rudd has gained 582 likes and 0 mentions, and Abbott has gained 15867 likes and 8010 mentions. It's worth noting that Abbott averaged almost exactly 1000 likes per hour yesterday, and 1200 today. Rudd's rate is all over the place by comparison: 20-60 per hour yesterday, 35-50 today.
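- As a sanity check, the per-minute means from the two "Normal" periods can be converted to hourly rates and set against the hourly figures quoted here. This is just arithmetic on numbers already listed; the variable names are made up:

```python
# Mean new likes per minute during the two "Normal" periods (from the list above).
PER_MINUTE_MEANS = {
    "Abbott": {"First Normal": 18.22, "Second Normal": 18.47},
    "Rudd": {"First Normal": 1.0, "Second Normal": 0.6},
}


def per_hour(per_minute):
    """Convert a likes-per-minute mean to likes per hour."""
    return per_minute * 60


hourly = {
    page: {period: per_hour(rate) for period, rate in periods.items()}
    for page, periods in PER_MINUTE_MEANS.items()
}

# Total likes gained since collection started, as quoted in the text.
abbott_total, rudd_total = 15867, 582
like_ratio = abbott_total / rudd_total

if __name__ == "__main__":
    for page, periods in hourly.items():
        for period, rate in periods.items():
            print(f"{page:7s} {period:14s} ~{rate:.0f} likes/hr")
    print(f"Abbott gained ~{like_ratio:.1f}x as many likes as Rudd")
```

- The Normal-period means work out to roughly 1100 likes/hr for Abbott and 36-60 likes/hr for Rudd, in the same ballpark as the hourly figures quoted.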
- tl;dr it's a bot, and it's written pretty much how I'd write one to spam likes. I'd make a couple of modifications - randomised ramping start/end times (within a tolerance of 2 hours), and much greater variation in the number of likes per minute.
- e; bonus stat: it's costing the Libs (or whoever's running it) about $35-$120/hr for the likes.