lyfsy

XSS attacks on Googlebot allow search index manipulation

Feb 22nd, 2020
  1. XSS attacks on Googlebot allow search index manipulation
  2. Last year I published details of an attack against Google’s handling of XML Sitemaps, which allowed an attacker to ‘borrow’ PageRank from other sites and rank illegitimate sites for competitive terms in Google’s search results. Following that, I had been investigating other potential attacks when my colleague at Distilled, Robin Lord, mentioned the concept of JavaScript injection attacks, which got me thinking.
  14. XSS Attacks
  15. There are various types of cross-site scripting (XSS) attack; we are interested in the situation where JavaScript code inside the URL is included in the content of the page without being sanitized. This can result in the JavaScript code being executed in the user’s browser (even though the code isn’t intended to be part of the site). For example, imagine a snippet of PHP code which is designed to show the value of the page URL parameter.
  16. If someone were to craft a malicious URL in which the page parameter contains a snippet of JavaScript instead of a number:
  17.  
  18. https://foo.com/stores/?page=<script>alert('hello')</script>
  19.  
  20. Then it may produce some HTML with inline JavaScript, which the page authors had never intended to be there.
  21. That malicious JavaScript could do all sorts of evil things, such as stealing data from the victim page or tricking the user into thinking the content they are looking at is authentic. The user may be visiting a trusted domain, and therefore trust the contents of the page, even though those contents are being manipulated by a hacker.
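
As a rough illustration of the flaw described above, here is a minimal Python sketch (Python rather than the PHP mentioned in the text). The function names render_unsafe and render_safe and the page markup are hypothetical; only the example URL comes from the post. The first function echoes the page parameter straight into the HTML, the second escapes it first.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def page_parameter(url: str) -> str:
    """Return the raw value of the `page` query parameter (default "1")."""
    return parse_qs(urlsplit(url).query).get("page", ["1"])[0]


def render_unsafe(url: str) -> str:
    """Hypothetical handler: echoes the parameter without sanitizing it."""
    return f"<p>You are on page {page_parameter(url)}</p>"


def render_safe(url: str) -> str:
    """Same handler, but HTML-escapes the parameter before output."""
    return f"<p>You are on page {escape(page_parameter(url))}</p>"


url = "https://foo.com/stores/?page=<script>alert('hello')</script>"
unsafe_html = render_unsafe(url)  # contains a live <script> element
safe_html = render_safe(url)      # the payload is shown as inert text
print(unsafe_html)
print(safe_html)
```

In the unsafe version the browser (or a rendering crawler) sees a real script element; in the escaped version the angle brackets become &lt; and &gt;, so nothing executes.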
  22.  
  23. Chrome to the rescue
  24. It is for that reason that Google Chrome has an XSS Auditor, which attempts to identify this type of attack and protect the user (by refusing to load the page).
  25. So far, so good.
  26.  
  27. Googlebot = Chrome 41
  28. Googlebot is currently based on Chrome version 41, which we know from Google’s own documentation. We also know that for the last couple of years Google have been promoting the fact that Googlebot executes and indexes JavaScript on the sites it crawls. Chrome 41 had no XSS Auditor (that I’m aware of; it certainly doesn’t block any XSS that I’ve tried), and therefore my theory was that Googlebot likely has no XSS Auditor.
  29.  
  30. So the first step was to check whether Googlebot (or Google’s Website Rendering Service [WRS], to be more precise) would actually render a URL with an XSS attack. One of my early tests was on the startup bank Revolut. That a 3-year-old fintech startup with $330M in funding had XSS vulnerabilities demonstrates the breadth of the XSS issue (they’ve now fixed this example).
  31.  
  32. I used Google’s Mobile Friendly Tool to render the page, which quickly confirmed that Google’s WRS executes the XSS JavaScript. In this case I’m crudely injecting a link at the top of the page.
  33. It is often (as in the case with Revolut) possible to entirely replace the content of the page to create your own page and content, hosted on the victim domain.
  34.  
  35. Content + links are cached
  36. I submitted a test page to the Google index. Examining the cache of that page showed that the injected link does appear in the Google index.
  37. Canonicals
  38. A second set of experiments demonstrated (again via the Mobile Friendly Tool) that you can change the canonicals on pages.
  39. I also confirmed this via Google’s URL Inspector Tool, which reports the injected canonical as the true canonical (h/t to Sam Nemzer for the suggestion).
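
One way a site owner might check for this is sketched below in Python, under some assumptions. It takes HTML that has already been rendered (for example, copied from a rendering tool; the sketch does not execute JavaScript itself) and reports any canonical that points at a different host from the page. The names CanonicalFinder and foreign_canonicals and the attacker.example domain are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def foreign_canonicals(page_url, rendered_html):
    """Return canonical URLs whose host differs from the host of page_url."""
    finder = CanonicalFinder()
    finder.feed(rendered_html)
    page_host = urlsplit(page_url).hostname
    # Relative canonicals have no hostname and are treated as same-host.
    return [c for c in finder.canonicals
            if urlsplit(c).hostname not in (None, page_host)]


# Hypothetical rendered output after an injected script rewrote the canonical.
rendered = ('<html><head><link rel="canonical" '
            'href="https://attacker.example/landing"></head><body></body></html>')
injected = foreign_canonicals("https://foo.com/stores/?page=1", rendered)
print(injected)
```

A cross-host canonical is not always malicious (some sites legitimately canonicalise to another domain), so a result here is a prompt to investigate rather than proof of an attack.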
  40. Links are crawled and considered
  41. At this point, I had confirmed that Google’s WRS is susceptible to XSS attacks, and that Google were crawling the pages, executing the JavaScript, indexing the content and considering the search directives within (i.e. the canonicals). The next important question is whether Google finds links on these pages and crawls them. Links placed on other sites are the backbone of the PageRank algorithm and a key factor in how sites rank in Google’s search results.
  42.  
  43. To test this, I crafted a page on Revolut which contained a link to a page on one of my test domains, which I had created moments before and which had not previously existed. I submitted the Revolut page to Google and later on Googlebot crawled the target page on my test domain. The page later appeared in the Google search results.
  44. This demonstrated that Google was identifying and crawling injected links. Furthermore, Google confirms that JavaScript links are treated identically to HTML links (thanks Joel Mesherghi).
  45. All of this demonstrates that there is potential to manipulate the Google search results. However, I was unsure how to test this without actually impacting legitimate search results, so I stopped where I was (I asked Google for permission to do this in a controlled fashion a few days back, but have not had an answer yet).
  46.  
  47. How could this be abused?
  48. The obvious attack vector here is to inject links into other websites to manipulate the search results – a few links from prominent sites can make a very dramatic difference to search performance. Open Bug Bounty (https://www.openbugbounty.org/) lists more than 125,000 unpatched XSS vulnerabilities. This includes 260 .gov domains, 971 .edu domains, and 195 of the top 500 domains (as ranked by the Majestic Million list of the top million sites).
  49.  
  50. A second attack vector is to create malicious pages (maybe redirecting people to a malicious checkout, or directing visitors to a competing product) which would be crawled and indexed by Google. This content could even drive featured snippets and appear directly in the search results. Firefox doesn’t yet have adequate XSS protection, so these pages would load for Google users searching with Firefox.
  51.  
  52. Defence
  53. The most obvious way to defend against this is to take security seriously and try to ensure you don’t have XSS vulnerabilities on your site. However, given the numbers from OpenBugBounty above, it is clear that this is more difficult than it sounds – which is the exact reason that Google added the XSS Auditor to Chrome!
  54.  
  55. One quick thing you can do is check your server logs and search for URLs that have terms such as ‘script’ in them, indicating a possible XSS attempt.
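
The log check can be sketched in a few lines of Python. This is a minimal example under assumptions: it expects access logs in the common/combined format (request line in the first quoted field), and the two patterns in SUSPICIOUS are a starting point rather than a complete XSS signature list. The sample log lines are invented.

```python
import re
from urllib.parse import unquote_plus

# Assumed markers of an XSS attempt; extend for your own site.
SUSPICIOUS = re.compile(r"<\s*script|javascript:", re.IGNORECASE)
# Request line as it appears in common/combined log format.
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (\S+)')


def suspicious_requests(log_lines):
    """Yield log lines whose URL-decoded request path looks like an XSS attempt."""
    for line in log_lines:
        match = REQUEST.search(line)
        if not match:
            continue
        # Decode twice so double-encoded payloads (%253Cscript) are caught too.
        decoded = unquote_plus(unquote_plus(match.group(1)))
        if SUSPICIOUS.search(decoded):
            yield line


sample = [
    '203.0.113.7 - - [22/Feb/2020:10:00:00 +0000] "GET /stores/?page=2 HTTP/1.1" 200 512',
    '203.0.113.7 - - [22/Feb/2020:10:00:05 +0000] '
    '"GET /stores/?page=%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1" 200 530',
]
flagged = list(suspicious_requests(sample))
for line in flagged:
    print(line)
```

Decoding before matching matters because payloads usually arrive percent-encoded, so a plain text search for `<script` on the raw log would miss them. To scan a real file, pass an open file handle instead of the sample list.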
  56.  
  57. Wrap up
  58. This exploit is a combination of existing issues, which combine to form a zero-day exploit that has the potential to be very harmful for Google users. I reported the issue to Google back in November 2018, but they have not confirmed the issue from their side or made any headway addressing it. They cited “difficulties in communication with the team investigating”, which felt a lot like what happened during the report of the XML Sitemaps exploit.
  59.  
  60. My impression is that if a security issue affects a part of Google that is not commonly affected, then the internal lines of communication are not well oiled. It was March before I got the first details, when Google let me know “that our existing protection mechanisms should be able to prevent this type of abuse but the team is still running checks to validate this” – which didn’t agree with the evidence. I re-ran some of my tests and didn’t see a difference. The security team themselves were very responsive, as usual, but unfortunately seemed to have no way to move things forward.
  61.  
  62. It was 140 days after the report when I let Google know I’d be publicly disclosing the vulnerability, given the lack of movement and the fact that this could already be impacting Google search users as well as website owners and advertisers. To their credit, Google didn’t attempt to dissuade me and asked me simply to use my best judgement in what I publish.
  63.  
  64. If you have any questions, comments or information you can find me on Twitter at @TomAnthonySEO, or if you are interested in consulting for technical/specialised SEO, you can contact me via Distilled.
  65.  
  66. Disclosure Timeline
  67. 3rd November 2018 – I filed the initial bug report.
  68. Over the next few weeks/months we went back and forth a bit.
  69. 11th February 2019 – Google responded letting me know they were “surfacing some difficulties in communication with the team investigating”
  70. 17th April 2019 – Google confirmed they have no immediate plans to fix this. I believe this is probably because they are preparing to release a new build of Googlebot shortly (I wonder if this was why the back and forth was slow – they were hoping to release the update?)
  71.  
  72. Originally written on http://www.tomanthony.co.uk/blog/xss-attacks-googlebot-index-manipulation/
  73. Thank you very much. I'm working on this and works like charm.
  74. This deserves much more attention!!
  75. That's crazy. Thank you for the share !
  76. Doesn't work for me. Every time I submit a URL, I get an Uncaught ReferenceError: $ is not defined.
  77.  
  78. Do I need to download the old Chrome version and then check the URL through their mobile friendly tool?
  79. The bug already got patched.
  80. Amazing!!! This is Gold!.
  81. Thank you very much. I'm working on this and works like charm.
  82. How are you working on this? will you please help me too?
  83. It's already fixed by google.