Untitled

a guest
Jan 19th, 2015
  1. *For the -Kremlinologists- Plusologists out there*
  2.  
  3. TL;DR: 0.2% of G+ accounts have posted something _other_ than a YouTube comment in the past 18 days -- about 4.4 million people _publicly_ using G+ for posts.
  4.  
  5. There are about 2.2 billion G+ _profiles_.
  6.  
  7. Of these, about 9% have _any_ publicly-posted content.
  8.  
  9. Of those, about 37% are YouTube posts, another 8% are profile photo changes.
  10.  
  11. Only 6% of active profiles have any post activity in 2015 (18 days so far).
  12.  
  13. Only _half_ of those, 3% of active profiles, are _not_ YouTube posts.
  14.  
  15. *That is, 0.2% of all G+ profiles, about 4.4 million users, have made a public _post_ in 2015. That's an average of about 244,000 users posting _daily_ (4.4 million over 18 days).*
  16.  
  17. More than "hundreds", but not by all that much.
  18.  
  19. This _doesn't_ include non-public posts or _comments_, but it's a pretty clear indication of the level of activity on G+.
  20.  
  21.  
  22.  
  23. How do we get this?
  24.  
  25. A rough sense of G+ size in terms of profiles can come from the sitemap files.
  26.  
  27. If my rough counts are right, 50,000 sitemap files of 45,429 entries gives around 2.2 billion profile pages.
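
The multiplication behind that figure can be checked in the shell. This is a minimal sketch, assuming every sitemap file holds the same number of entries as the one counted; `estimate_profiles` is a hypothetical helper name, not part of the original crawl.

```shell
# Hypothetical helper: multiply the number of sitemap files by the
# entries counted in one file. Assumes all files are the same size.
estimate_profiles() {
  files=$1
  entries_per_file=$2
  echo $(( files * entries_per_file ))
}

estimate_profiles 50000 45429
```

The exact product is 2,271,450,000, in line with the rough 2.2 billion figure.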
  28.  
  29. Search for blank profiles turns up a count far lower than mentioned in the article (~20k). I suspect funny bidness.
  30.  
  31. See below. A crawl of an arbitrarily selected sitemap (2820 entries so far) shows about 9.2% of Profiles have any public activity. That gives us 202 million users with _any_ activity on G+ at _any_ time. Let's see if I can't find a most recent post date for those that _are_ active.
  32.  
  33. 37 of 100 most recent posts are comments to YouTube videos. That's 37%.
  34.  
  35. OK, of 283 profiles checked so far _with_ comments, 34.6% have as their _most recent_ comment a YouTube video comment -- literally "commented on a video on YouTube".
  36.  
  37. Another fairly common pattern is "changed * profile photo". That's another 8.1% of posts.
  38.  
  39. So, of 283 profiles _with_ posts, what's the most recent post date? Only 245 of 283 _have_ a "Shared publicly" line.
  40.  
  41. By year:
  42.  
  43. 4 2011
  44. 23 2012
  45. 62 2013
  46. 139 2014
  47. 17 2015
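
A per-year tally like this one can be produced with `sort | uniq -c`. The sketch below assumes a hypothetical input with one most-recent-post date per line, each containing a four-digit year. The real format of the "Shared publicly" lines in the lynx output is not shown here, so the pattern may need adjusting.

```shell
# Hypothetical helper: read dates on stdin, extract each four-digit
# year of the form 20xx, and count how often each year occurs.
tally_years() {
  grep -oE '20[0-9]{2}' | sort | uniq -c
}

# Example with made-up dates, for illustration only.
printf '%s\n' 'Dec 3, 2014' 'Jan 7, 2015' 'Aug 21, 2014' | tally_years
```

`uniq -c` prints the count first and the year second, the same layout as the table above.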
  48.  
  49. That is, 17/283, or _six percent_ of *active* profiles, have left _any_ content in the first 18 days of 2015.
  50.  
  51. Of those 17 posts, 8 are comments on YouTube videos -- that is, *this is the payoff of the : it's _doubled_ the _apparent_ traffic on G+.*
  52.  
  53. But this leaves us only 3% of _active_ profiles as active in 2015 -- that's 3% of 9%, or *0.2% of all G+ accounts are active. Roughly 4.4 million people.*
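
The chain of percentages can be reproduced with a short awk call. This is a rough sketch using the rounded shares from the text (9% and 3%) and the 2.2 billion profile estimate; `active_share` is a hypothetical helper name.

```shell
# Hypothetical helper: chain the shares together.
#   $1 = total profiles
#   $2 = share of profiles with any public content
#   $3 = share of those with a non-YouTube post in 2015
active_share() {
  awk -v total="$1" -v active="$2" -v recent="$3" 'BEGIN {
    share = active * recent
    printf "%.2f%% of profiles, about %.1f million\n", share * 100, total * share / 1e6
  }'
}

active_share 2200000000 0.09 0.03
```

With these inputs the product comes out at 0.27%, or about 5.9 million profiles. The 0.2% and 4.4 million figures round that share down, so they are best read as a lower-end estimate.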
  54.  
  55. That's actually far lower than I'd been allowing for previously (30 - 100 million posts).
  56.  
  57.  
  58.  
  59. *More on methods*
  60.  
  61. OK, let's crawl G+.
  62.  
  63. I've picked a single sitemap file and am crawling the profile pages on it with the following script:
  64.  
  65. i=0; time zcat sitemap-25007-of-50000.gz | while read -r URL; do i=$(( i + 1 )); printf '%s: ' "$i"; lynx -dump "$URL" | grep "hasn't shared anything" || echo "Not found"; done | tee log
  66.  
  67. This produced output similar to:
  68.  
  69.  
  70. 1: Jenilee hasn't shared anything with you.
  71. 2: Brian hasn't shared anything with you.
  72. 3: Gene hasn't shared anything with you.
  73. 4: kishor hasn't shared anything with you.
  74. 5: Daniel hasn't shared anything with you.
  75. 6: aping hasn't shared anything with you.
  76. 7: Corey hasn't shared anything with you.
  77. 8: Not found
  78. 9: Ohh hasn't shared anything with you.
  79. 10: kinyo2006 hasn't shared anything with you.
  80. 11: patrik hasn't shared anything with you.
  81. 12: Melina hasn't shared anything with you.
  82. 13: Not found
  83. 14: Akihito hasn't shared anything with you.
  84. 15: Paul hasn't shared anything with you.
  85. 16: Pamela hasn't shared anything with you.
  86. 17: Eddie hasn't shared anything with you.
  87. 18: bekzat hasn't shared anything with you.
  88. 19: H hasn't shared anything with you.
  89. 20: Calm hasn't shared anything with you.
  90.  
  91. ("Page" profiles create dupe output that's filtered out in the analysis above.) The logfile is the data source I've used for the further analysis above.
  92.  
  93.  
  94. With 214 profiles crawled (there are 45,000+ in the file, so this'll take a while or I'll be blocked), I see a 9% rate of profiles _with_ content on their pages. That's 20 of 214 pages.
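
A rate like this can be pulled straight from the logfile. The sketch below assumes the log has the format shown in the sample output, and that duplicate lines from "Page" profiles appear without a leading number; `active_rate` is a hypothetical helper name.

```shell
# Hypothetical helper: summarize a crawl log in the format shown above.
# Only lines starting with "<number>: " are counted, so unnumbered
# lines (presumably the duplicate matches from "Page" profiles) are
# ignored. "Not found" means the empty-profile message was absent,
# i.e. the profile has some public content.
# Usage: active_rate log
active_rate() {
  awk '/^[0-9]+: / {
         total++
         if ($0 ~ /^[0-9]+: Not found$/) active++
       }
       END {
         if (total) printf "%d/%d => %.2f%%\n", active, total, 100 * active / total
       }' "$1"
}
```

Run against the 20-line sample above, this would count 2 of 20 profiles as having content.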
  95.  
  96. I'll posit this is close to a random sampling, though other sitemap pages can be compared as well, and others can replicate the method here.
  97.  
  98. And yes, I could parallelize the process but suspect that I'm in danger of triggering abuse blocks as it is.
  99.  
  100. Going through the list and looking at most-recently-posted dates would also be interesting.
  101.  
  102. Note that those who are using G+ _only_ for non-public or Community discussions won't appear here.
  103.  
  104. But it's pretty clear that the rate of participation is about 8-12% of all _created_ accounts.
  105.  
  106. Now at 623 queries: 55/623 => 8.83%. Percentage is actually trending down.
  107.  
  108. 1,195 queries: 9.4%.
  109. 6,569 queries: 8.91%.
  110.  
  111.  
  112. http://www.labnol.org/internet/google-plus-users/21035/