SEO Fundamentals

Feb 14th, 2018
<!-- Optional title format -->
<!-- Primary Keyword - Secondary Keyword | Brand name -->
<!-- 8-foot Green Widgets - Widgets & Tools | Widget World -->

<head>
    <title>Fitness Concept - Fitness &amp; Lifestyle | Brand name</title>
</head>
<!-- Give every page a unique title -->
<!-- [Product Name] - [Product Category] | [Brand Name] -->


<!-- In rare cases, search engines may pull a title from DMOZ (aka Open Directory Project). If your display title in search doesn't match your title tag but does match your DMOZ listing, then you can block that substitution with the Robots NOODP tag, which looks like this: -->
<meta name="robots" content="noodp">
<!-- Meta robots is a fairly technical topic, but if you're seeing an unexplained display title in SERPs, do a quick search on DMOZ for your business. You might save yourself a few headaches. -->

<!-- Meta description example ~300 char -->
<head>
    <meta name="description" content="This is an example of a meta description. This will often show up in search results.">
</head>

<!-- Alt text (alternative text), also known as "alt attributes", "alt descriptions", and technically incorrectly as "alt tags", is used within HTML code to describe the appearance and function of an image on a page. -->
<img src="pupdanceparty.gif" alt="Puppies dancing">

<!-- Okay: -->
    <img src="pancakes.png" alt="pancakes">

<!-- Better: -->
    <img src="pancakes.png" alt="Stack of blueberry pancakes with powdered sugar">

<!-- Not recommended: -->

<img src="pancakes.png" alt="">

<!-- or -->

<img src="pancakes.png" alt="pancake pancakes pan cake hotcakes hotcake breakfast food best breakfast top breakfasts breakfast recipes pancake recipe">

<!-- Don’t forget longdesc="". Explore using the longdesc="" attribute for more complex images that require a longer description.

Don’t neglect form buttons. If a form on your website uses an image as its "submit" button, give it an alt attribute. Image buttons should have an alt attribute that describes the function of the button, like "search", "apply now", "sign up", etc. -->
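<!-- A minimal sketch of an image submit button with alt text. The form action "/search", the field name "q", and the file name "search-button.png" are illustrative assumptions, not taken from a real site. -->
<form action="/search" method="get">
    <input type="text" name="q">
    <!-- The alt text describes what the button does, not what the image looks like -->
    <input type="image" src="search-button.png" alt="Search">
</form>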

<!-- Okay alt text: --> <img src="bird.png" alt="Rooster">

<!-- Better alt text: --> <img src="bird.png" alt="Rooster crowing">

<!-- Best alt text: --> <img src="bird.png" alt="Red-crested rooster crowing">

<!-- Rel="canonical" -->

<!-- General format: -->

<head>
...[other code that might be in your document's HTML head]...
<link href="URL OF ORIGINAL PAGE" rel="canonical" />
...[other code that might be in your document's HTML head]...
</head>

<!-- Additional methods for dealing with duplicate content -->
<!-- Maintain consistency when linking internally throughout a website. For example, if a webmaster determines that the canonical version of a domain is www.example.com/, then all internal links should go to http://www.example.com/page rather than http://example.com/page (notice the absence of www). -->
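<!-- A short illustration of consistent internal linking, assuming the www version has been chosen as canonical. The path "/page" and the link text are placeholders. -->
<!-- Consistent with the canonical domain: -->
<a href="http://www.example.com/page">Example page</a>
<!-- Inconsistent (missing www): -->
<a href="http://example.com/page">Example page</a>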

<!-- robots.txt files indicate whether certain user agents (web-crawling software) can or cannot crawl parts of a website. These crawl instructions are specified by "disallowing" or "allowing" the behavior of certain (or all) user agents. -->
<!-- Basic format: -->

User-agent: [user-agent name]
Disallow: [URL string not to be crawled]

<!-- Within a robots.txt file, each set of user-agent directives appears as a discrete set, separated by a line break: -->
<!-- Three separate sets of user-agent directives, each separated by a line break -->

User-agent: feedjira
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: *
Disallow: /bullpen/

<!-- Example robots.txt: -->
<!-- Robots.txt file URL: www.example.com/robots.txt -->
<!-- Blocking all web crawlers from all content -->

User-agent: *
Disallow: /

<!-- Using this syntax in a robots.txt file would tell all web crawlers not to crawl any pages on www.example.com, including the homepage. -->

<!-- Allowing all web crawlers access to all content -->

User-agent: *
Disallow:

<!-- Using this syntax in a robots.txt file tells web crawlers they may crawl all pages on www.example.com, including the homepage -->

<!-- Blocking a specific web crawler from a specific folder -->

User-agent: Googlebot
Disallow: /example-subfolder/

<!-- This syntax tells only Google’s crawler (user-agent name Googlebot) not to crawl any pages that contain the URL string www.example.com/example-subfolder/. -->

<!-- Blocking a specific web crawler from a specific web page -->

User-agent: Bingbot
Disallow: /example-subfolder/blocked-page.html

<!-- This syntax tells only Bing’s crawler (user-agent name Bingbot) to avoid crawling the specific page at www.example.com/example-subfolder/blocked-page.html. -->


<!-- It’s generally a best practice to indicate the location of any sitemaps associated with this domain at the bottom of the robots.txt file. Here’s an example: -->

User-agent: *
Allow: /*.html$
Disallow: /*/data/*
Disallow: *.dl/html
Sitemap: https://www.workday.com/en-us/sitemap.xml
Sitemap: https://www.workday.com/en-gb/sitemap.xml
Sitemap: https://www.workday.com/en-se/sitemap.xml
Sitemap: https://www.workday.com/en-fr/sitemap.xml
Sitemap: https://www.workday.com/en-nl/sitemap.xml
Sitemap: https://www.workday.com/en-de/sitemap.xml

<!-- Workday.com has called out multiple sitemaps in their robots.txt file -->

<!-- Technical robots.txt syntax -->
<!-- There are five common terms you’re likely to come across in a robots file -->

<!-- User-agent -->
    <!-- The specific web crawler to which you’re giving crawl instructions (usually a search engine) -->
<!-- Disallow -->
    <!-- The command used to tell a user-agent not to crawl a particular URL. Only one "Disallow:" line is allowed for each URL -->
<!-- Allow -->
    <!-- The command to tell Googlebot it can access a page or subfolder even though its parent page or subfolder may be disallowed. -->
<!-- Crawl-delay -->
    <!-- How many seconds a crawler should wait before loading and crawling page content. Note that Googlebot does not acknowledge this command, but crawl rate can be set in Google Search Console -->
<!-- Sitemap -->
    <!-- Used to call out the location of any XML sitemap(s) associated with this URL. Note this command is only supported by Google, Ask, Bing, and Yahoo -->

<!-- Pattern-matching -->
    <!-- When it comes to the actual URLs to block or allow, robots.txt files can get fairly complex, as they allow the use of pattern-matching to cover a range of possible URL options. Google and Bing both honor two pattern-matching characters (wildcards rather than full regular expressions) that can be used to identify pages or subfolders that an SEO wants excluded. These two characters are the asterisk (*) and the dollar sign ($) -->
        <!-- * is a wildcard that represents any sequence of characters -->
        <!-- $ matches the end of the URL -->


<!-- In order to ensure your robots.txt file is found, always include it in your main directory or root domain -->


<!-- Robots Meta Directives -->

<!-- Robots meta directives (sometimes called "meta tags") are pieces of code that provide crawlers instructions for how to crawl or index web page content. Whereas robots.txt file directives give bots suggestions for how to crawl a website's pages, robots meta directives provide more firm instructions on how to crawl and index a page's content -->
<!-- There are two types of robots meta directives: those that are part of the HTML page (like the meta robots tag) and those that the web server sends as HTTP headers (such as x-robots-tag). The same parameters (i.e., the crawling or indexing instructions a meta tag provides, such as "noindex" and "nofollow" in the example below) can be used with both meta robots and the x-robots-tag; what differs is how those parameters are communicated to crawlers. -->

<meta name="robots" content="noindex,nofollow">
                            <!-- parameters -->


<!-- Indexation-controlling parameters: -->


<!-- Noindex -->
    <!-- Tells a search engine not to index a page -->
<!-- Index -->
    <!-- Tells a search engine to index a page. Note that you don’t need to add this meta tag; it’s the default -->
<!-- Follow -->
    <!-- Even if the page isn’t indexed, the crawler should follow all the links on a page and pass equity to the linked pages -->
<!-- Nofollow -->
    <!-- Tells a crawler not to follow any links on a page or pass along any link equity -->
<!-- Noimageindex -->
    <!-- Tells a crawler not to index any images on a page -->
<!-- None -->
    <!-- Equivalent to using both the noindex and nofollow tags simultaneously -->
<!-- Noarchive -->
    <!-- Search engines should not show a cached link to this page on a SERP -->
<!-- Nocache -->
    <!-- Same as noarchive, but only used by Internet Explorer and Firefox -->
<!-- Nosnippet -->
    <!-- Tells a search engine not to show a snippet (i.e. meta description) of this page on a SERP -->
<!-- Noodp/noydir [OBSOLETE] -->
    <!-- Prevents search engines from using a page’s DMOZ description as the SERP snippet for this page. However, DMOZ was retired in early 2017, making this tag obsolete -->
<!-- Unavailable_after -->
    <!-- Search engines should no longer index this page after a particular date -->
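
<!-- A few illustrative combinations of the parameters listed above. This is a sketch: each tag is a separate alternative rather than a set to use together, and which parameters a given crawler honors varies. The date is a made-up value in ISO 8601 form, one of the date formats Google documents for unavailable_after. -->
<!-- Keep the page out of the index but let crawlers follow its links: -->
<meta name="robots" content="noindex, follow">
<!-- Allow indexing but suppress the cached link and the snippet: -->
<meta name="robots" content="noarchive, nosnippet">
<!-- Stop showing the page after a given date: -->
<meta name="robots" content="unavailable_after: 2030-12-31">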


<!-- Types of robots meta directives -->
    <!-- There are two main types of robots meta directives: the meta robots tag and the x-robots-tag. Any parameter that can be used in a meta robots tag can also be specified in an x-robots-tag -->

<!-- Meta robots tag -->
    <!-- The meta robots tag, commonly known as "meta robots" or colloquially as a "robots tag", is part of a web page’s HTML code and appears as code elements within a web page’s <head> section -->

    <meta name="robots" content="[PARAMETER]">
            <!-- standard -->

    <!-- You can also provide directives to specific crawlers by replacing "robots" with the name of a specific user-agent -->

    <meta name="googlebot" content="[DIRECTIVE]">

    <!-- Want to use more than one directive on a page? As long as they’re targeted to the same "robot" (user-agent), multiple directives can be included in one meta directive; just separate them with commas -->

    <meta name="robots" content="noimageindex, nofollow, nosnippet">

<!-- X-robots-tag -->
    <!-- The x-robots-tag can be included as part of the HTTP header to control indexing of a page as a whole, as well as very specific elements of a page -->


<!-- Schema.org Markup -->
    <!-- Schema.org (often called Schema) is a semantic vocabulary of tags (or microdata) that you can add to your HTML to improve the way search engines read and represent your page in SERPs -->

        <div itemscope itemtype="https://schema.org/Book">
            <span itemprop="name"> Inbound Marketing and SEO: Insights from the Moz Blog </span>
            <span itemprop="author"> Rand Fishkin </span>
        </div>

    <!-- What is Schema.org Structured Data -->
        <!-- Schema.org is the result of collaboration between Google, Bing, Yandex, and Yahoo! to help you provide the information their search engines need to understand your content and provide the best search results possible at this time. Adding Schema markup to your HTML improves the way your page displays in SERPs by enhancing the rich snippets that are displayed beneath the page title -->

        <div itemprop="aggregateRating" itemscope itemtype="https://schema.org/AggregateRating">
            <span itemprop="ratingValue">[Aggregate rating given]</span> stars -
            <span itemprop="reviewCount">[Number of reviews]</span> reviews
        </div>

<!-- HTTP Status Codes -->
    <!-- An HTTP status code is a server response to a browser’s request. When you visit a website, your browser sends a request to the site’s server, and the server then responds to the browser’s request with a three-digit code: the HTTP status code -->

<!-- Common status code classes -->
    <!-- 1xxs - Informational responses -->
        <!-- The server is thinking through the request -->
    <!-- 2xxs - Success! -->
        <!-- The request was successfully completed and the server gave the browser the expected response -->
    <!-- 3xxs - Redirection -->
        <!-- You got redirected somewhere else. The request was received, but there's a redirect of some kind -->
    <!-- 4xxs - Client errors -->
        <!-- Page not found. The site or page couldn't be reached. (The request was made, but the requested page isn't valid; this often appears when a page doesn't exist on the site) -->
    <!-- 5xxs - Server errors -->
        <!-- Failure. A valid request was made by the client but the server failed to complete the request -->


    <!-- HTTP Status Code 200 - OK -->
        <!-- This is your ideal status code for your normal, everyday, properly functioning page. Visitors, bots, and link equity pass through linked pages like a dream. You don’t need to do anything and you can happily go about your day secure in the knowledge that everything is just as it should be -->

    <!-- HTTP Status Code 301 - Permanent Redirect -->
        <!-- A 301 redirect should be used any time one URL needs to be redirected to another permanently. A 301 redirect means that visitors and bots that land on that page will be passed to the new URL. In addition, link equity (the power transmitted by all those hard-earned links to your content) is also passed to the new URL through a 301 redirect. Despite talk from Google that all 3xx redirects are treated equally, tests have shown this is not completely true. A 301 redirect remains the preferred method for permanent page redirects -->
    <!-- HTTP Status Code 302 - Temporary Redirect -->
        <!-- A 302 redirect is similar to a 301 in that visitors and bots are passed to the new page, but link equity may not be passed along. We do not recommend using 302 redirects for permanent changes. Using 302s will cause search engine crawlers to treat the redirect as temporary, meaning that it may not pass along the link equity that the magical 301 does -->
    <!-- HTTP Status Code 404 - Not Found -->
        <!-- This means the file or page that the browser is requesting wasn’t found by the server. 404s don’t indicate whether the missing page or resource is missing permanently or only temporarily. You can see what this looks like on your site by typing in a URL that doesn't exist. It’s like hitting a brick wall. Just as you’ve experienced, your visitors will hit a page that has a 404 error and either try again (if you’re lucky) or wander away to another site that has the information they’re seeking -->
    <!-- HTTP Status Code 410 - Gone -->
        <!-- A 410 is more permanent than a 404; it means that the page is gone. The page is no longer available from the server and no forwarding address has been set up -->
    <!-- HTTP Status Code 500 - Internal Server Error -->
        <!-- Instead of the problem being with pages missing or not found, this status code indicates a problem with the server. A 500 is a classic server error and will affect access to your site. Human visitors and bots alike will be lost, and your link equity will go nowhere fast -->
    <!-- HTTP Status Code 503 - Service Unavailable -->
        <!-- Another variety of the 500, a 503 response means that the server is unavailable. Everyone (human or otherwise) is asked to come back later. This could be due to temporarily overloading the server or maintenance of the server -->

<!-- Page Speed -->
    <!-- Page speed is a measurement of how fast the content on your pages loads -->
    <!-- Optimize images -->
        <!-- PNGs are generally better for graphics with fewer than 16 colors while JPEGs are generally better for photographs -->

<!-- Conversion Rate Optimization -->
    <!-- Conversion rate optimization (CRO) is the systematic process of increasing the percentage of website visitors who take a desired action, be that filling out a form, becoming customers, or otherwise. The CRO process involves understanding how users move through your site, what actions they take, and what's stopping them from completing your goals -->

<!-- What is a conversion -->
    <!-- A conversion is the general term for a visitor completing a site goal. Goals come in many shapes and sizes. If you use your website to sell products, the primary goal (known as the macro-conversion) is for the user to make a purchase. There are smaller conversions that can happen before a user completes a macro-conversion, such as signing up to receive emails. These are called micro-conversions -->
    <!-- Examples of conversions -->
        <!-- Macro-conversions -->
            <!-- purchasing a product from a site -->
            <!-- requesting a quote -->
            <!-- subscribing to a service -->
        <!-- Micro-conversions -->
            <!-- signing up for email lists -->
            <!-- creating an account -->
            <!-- adding a product to the cart -->

<!-- What is a conversion rate -->
    <!-- Your site's conversion rate is the number of times a user completes a goal divided by your site traffic. If a user can convert in each visit (such as by buying a product), divide the number of conversions by the number of sessions (the number of unique times a user came to your site). If you sell a subscription, divide the number of conversions by the number of users -->
    <!-- Conversion rate optimization happens after the visitor makes it to your site. This is different from conversion optimization for SEO or paid ads, which focuses on who clicks through to your site from the organic search results, how many clicks you get, and which keywords are driving traffic -->

<!-- How to Calculate Conversion Rate -->
    <!-- If a user can convert each time they visit the site -->
        <!-- Imagine we own an ecommerce site, Roger's Robotics. A user could make a new purchase each session. We want to optimize so they make as many purchases as possible. If a user visited the site three times, that would be three sessions, and three opportunities to convert -->
            <!-- Session 1: No conversion -->
                <!-- user was familiarizing themselves with the site and poking around -->
            <!-- Session 2: User bought a shiny new antenna -->
                <!-- this is a conversion -->
            <!-- Session 3: User came back and bought a new set of gears and a blinking light -->
                <!-- another conversion -->
    <!-- If a user can only convert once -->
        <!-- Now imagine we owned a second site, Roger's Monthly Gear Box. Our site sells a subscription for a monthly delivery of robot parts. A user could come back multiple times, but once they purchase a subscription, they won't convert again -->
            <!-- Session 1: User came to the site for the first time to explore the service -->
                <!-- no conversion -->
            <!-- Session 2: User subscribed to our monthly GearBox service -->
                <!-- this is our conversion -->
            <!-- Session 3: User came back to read blog articles and poke around -->
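    <!-- Worked through with the two examples above (a sketch of the arithmetic, using only the sessions listed): -->
        <!-- Roger's Robotics: 2 conversions / 3 sessions = about 66.7% conversion rate -->
        <!-- Roger's Monthly Gear Box: 1 conversion / 1 user = 100% conversion rate -->
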
<!-- 5 Ways CRO benefits SEOs -->
    <!-- Improved customer insights -->
        <!-- Conversion rate optimization can help you better understand your key audience and find what language or messaging best speaks to their needs. Conversion rate optimization looks at finding the right customers for your business. Acquiring more people doesn't do your business any good if they're not the right kind of people -->
    <!-- Better ROI -->
        <!-- A higher conversion rate means getting more out of the resources you have. By studying how to get the most out of your acquisition efforts, you'll get more conversions without having to bring in more potential customers -->
    <!-- Better scalability -->
        <!-- While your audience size may not scale as your business grows, CRO lets you grow without running out of resources and prospective customers. Audiences aren't infinite. By turning more browsers into buyers, you'll be able to grow your business without running out of potential customers -->
    <!-- Better user experience -->
        <!-- When users feel smart and sophisticated on your website, they tend to stick around. CRO studies what works on your site. By taking what works and expanding on it, you'll make a better user experience. Users who feel empowered by your site will engage with it more, and some may even become evangelists for your brand -->
    <!-- Enhanced trust -->
        <!-- In order for a user to share their credit card, email, or any sort of personal information, they have to genuinely trust the site. Your website is your number-one salesperson. Just like an internal sales team, your site needs to be professional, courteous, and ready to answer all of your customers' questions -->

<!-- The key to successful optimization -->
    <!-- The analytics method -->
        <!-- Where people enter your site, i.e., which webpage they land on first -->
        <!-- Which features they engage with, i.e., where on a page or within your site they spend their time -->
        <!-- What channel and referrer brought them in, i.e., where they found and clicked on a link to your site -->
        <!-- What devices and browsers they use -->
        <!-- Who your customers are (age, demographic, and interest) -->
        <!-- Where users abandon your conversion funnel, i.e., where or during what activity users leave your site -->
    <!-- The people method -->
        <!-- On-site surveys -->
        <!-- User testing -->
        <!-- Satisfaction surveys -->
    <!-- The bad method -->
        <!-- Guesses, hunches and gut feelings -->
        <!-- Doing it because your competitor is doing it -->
        <!-- Executing changes based on the highest paid person's opinion -->

<!-- DOMAINS -->
    <!-- Domain names are the unique, human-readable Internet addresses of websites; they are made up of three parts: -->
        <!-- a top-level domain (sometimes called an extension or domain suffix) -->
        <!-- a domain name (or IP address) -->
        <!-- an optional subdomain -->

         http://www.tinydancinghorse.com
<!-- Protocol -->         http://
<!-- Subdomain -->        www.
<!-- Domain name -->      tinydancinghorse
<!-- Top-level Domain --> .com
<!-- Root Domain -->      tinydancinghorse.com

<!-- Top-level domain -->
    .com
    .net
    .org
    .edu
<!-- Domain name -->
    www.example.org
    https://moz.com
    www.blogspot.com
<!-- Root domain -->
    moz.com
    Ilovedogs.net
    PawneeIN.gov
<!-- Subdomain -->
    <!-- "blog.example.com" and "english.example.com" are both subdomains of the "example.com" root domain. Subdomains are free to create under any root domain that a webmaster controls -->
    http://www.example.com <!-- www is the subdomain -->
    http://example.com <!-- has no subdomain -->

<!-- SEO best practices for domains -->
    <!-- Make your domain name memorable -->
    <!-- Use broad keywords when sensible -->
    <!-- Avoid hyphens if possible -->
    <!-- Avoid non-.com top-level domains (TLDs) -->
    <!-- Favor subfolders/subdirectories over subdomains -->
    <!-- Don't sweat over domain age -->
    <!-- Moving domains -->

<!-- URLs -->
    <!-- A URL (Uniform Resource Locator), more commonly known as a "web address", specifies the location of a resource (such as a web page) on the internet. The URL also specifies how to retrieve that resource, also known as the "protocol", such as HTTP, HTTPS, FTP, etc -->

        http://www.exampledomain.com

    <!-- Imposed limit: shorter than 2,083 characters -->
    <!-- Optimal format -->

        http://www.example.com/category-keyword/primary-keyword.html

    <!-- A URL consists of a protocol, domain name, and path (which includes the specific subfolder structure where a page is located) and has the following basic format -->

        protocol://domain-name.top-level-domain/path

    <!-- The protocol indicates how a browser should retrieve information about a resource. The web standard is http:// or https:// (the "s" stands for "secure"), but it may also include things like mailto: (to open your default mail client) or ftp: (to handle file transfers) -->

    <!-- Why do URLs matter for SEO? -->
        <!-- Improved user experience -->
        <!-- Rankings -->
        <!-- Links -->

    <!-- SEO best practices for URLs -->
        <!-- Keeping URLs as simple, relevant, compelling, and accurate as possible is key to getting both your users and search engines to understand them (a prerequisite to ranking well). Although URLs can include ID numbers and codes, the best practice is to use words that people can comprehend -->
        <!-- URLs should be definitive but concise. By seeing only the URL, a user (and search engine!) should have a good idea of what to expect on the page -->
        <!-- When necessary for readability, use hyphens to separate words. URLs should not use underscores, spaces, or any other characters to separate words -->
        <!-- Use lowercase letters. In some cases, uppercase letters can cause issues with duplicate pages. For example, moz.com/Blog and moz.com/blog might be seen as two distinct URLs, which might create issues with duplicate content -->
        <!-- Avoid the use of URL parameters, if possible, as they can create issues with tracking and duplicate content. If parameters need to be used (UTM codes, e.g.), use them sparingly -->

<!-- Canonicalization -->
    <!-- What is a canonical tag -->
        <!-- A canonical tag (aka "rel canonical") is a way of telling search engines that a specific URL represents the master copy of a page. Using the canonical tag prevents problems caused by identical or "duplicate" content appearing on multiple URLs. Practically speaking, the canonical tag tells search engines which version of a URL you want to appear in search results -->
            <link rel="canonical" href="https://moz.com/beginners-guide-to-content-marketing" />
                <!-- The rel=canonical tag indicates that the page on which this tag appears should be treated as a duplicate of the specified URL -->
    <!-- Why does canonicalization matter? -->
        <!-- Duplicate content is a complicated subject, but when search engines crawl many URLs with identical (or very similar) content, it can cause a number of SEO problems -->
    <!-- Canonical tag best practices -->
        <!-- Canonical tags can be self-referential -->
        <!-- Proactively canonicalize your home-page -->
        <!-- Spot-check your dynamic canonical tags -->
        <!-- Avoid mixed signals -->
            <!-- Search engines may avoid a canonical tag or interpret it incorrectly if you send mixed signals. In other words, don’t canonicalize page A -> page B and then page B -> page A. Likewise, don’t canonicalize page A -> page B and then 301 redirect page B -> page A. It’s also generally not a good idea to chain canonical tags (A -> B, B -> C, C -> D), if you can avoid it. Send clear signals, or you force search engines to make bad choices -->
        <!-- Be careful canonicalizing near-duplicates -->
        <!-- Canonicalize cross-domain duplicates -->
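        <!-- A minimal sketch of a self-referential canonical tag: the page names its own URL as the canonical version. The URL is a placeholder, not a real page. -->
        <!-- Markup served at https://www.example.com/blue-widgets/ -->
            <head>
                <link rel="canonical" href="https://www.example.com/blue-widgets/" />
            </head>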
  407.     <!-- Canonical tags vs. 301 redirects -->
  408.         <!-- If you 301 redirect Page A-->Page B, then human visitors will be taken to Page B automatically and never see Page A. If you rel-canonical Page A-->Page B, then search engines will know that Page B is canonical, but people will be able to visit both URLs. Make sure your solution matches the desired outcome -->
  409. <!-- Redirection -->
  410.     <!-- Redirection is the process of forwarding one URL to a different URL -->
  411.         <!-- What is a Redirect? -->
  412.             <!-- A redirect is a way to send both users and search engines to a different URL from the one they originally requested. The three most commonly used redirects are 301, 302, and Meta Refresh -->
  413.     <!-- Types of Redirects -->
  414.         <!-- 301, "Moved Permanently"—recommended for SEO -->
  415.         <!-- 302, "Found" or "Moved Temporarily" -->
  416.         <!-- Meta Refresh  -->
  417.     <!-- 301 Moved Permanently -->
  418.         <!-- A 301 redirect is a permanent redirect which passes between 90-99% of link equity (ranking power) to the redirected page. 301 refers to the HTTP status code for this type of redirect. In most instances, the 301 redirect is the best method for implementing redirects on a website -->
  419.     <!-- 302 Found (HTTP 1.1) / Moved Temporarily (HTTP 1.0) -->
  420.         <!--  The Internet runs on a protocol called HyperText Transfer Protocol (HTTP) which dictates how URLs work. It has two major versions, 1.0 and 1.1. In the first version, 302 referred to the status code "Moved Temporarily." This was changed in version 1.1 to mean "Found." -->
  421.     <!-- 307 Moved Temporarily (HTTP 1.1 Only) -->
  422.         <!-- A 307 redirect is the HTTP 1.1 successor of the 302 redirect. While the major crawlers will treat it like a 302 in some cases, it is best to use a 301 for almost all cases. The exception to this is when content is really moved only temporarily (such as during maintenance) AND the server has already been identified by the search engines as 1.1 compatible -->
  423.     <!-- Meta Refresh -->
  424.         <!-- Meta refreshes are a type of redirect executed on the page level rather than the server level. They are usually slower, and not a recommended SEO technique. They are most commonly associated with a five-second countdown with the text "If you are not redirected in five seconds, click here." -->
  425.     <!-- SEO Best Practice -->
  426.         <!--  Serving a 301 indicates to both browsers and search engine bots that the page has moved permanently. Search engines interpret this to mean that not only has the page changed location, but that the content—or an updated version of it—can be found at the new URL -->
  427.     <!-- 301 Redirects in Apache -->
  428.         <!-- Problem -->
  429.             <!-- Back when we launched our first website, seomoz.org, it was hosted at www.socengine.com/seo/ rather than on its own domain. When the original developers were moving seomoz.org to its own dedicated server, they wanted it to be accessed as its own domain rather than as a subdirectory of socengine.com. They needed visitors accessing anything in www.socengine.com/seo/ to be redirected to www.seomoz.org. The redirection had to accommodate several file and folder name changes and had to be done with 301 redirects in order to be search engine-friendly. They also needed to forward http://seomoz.org, too, for aesthetic purposes and to avoid canonicalization errors -->
  430.         <!-- Solution -->
  431.             <!-- The simplest approach to do this would have been to add 301 redirects to the PHP code that powered seomoz.org using PHP's header function. Utilizing the power of the apache module mod_rewrite, however, the developers realized they could match specific patterns for entire folders and redirect them to their new URLs without having to go through every PHP script -->
  432.         <!-- Installation -->
  433.             <!-- Most Apache installations have mod_rewrite installed by default. SEOmoz's original server ran the FreeBSD operating system, and mod_rewrite was included by default. To check that the module is installed and working, a developer can add the following line to the Apache configuration file or to an applicable .htaccess file; if the module is missing, Apache reports an error for the unrecognized directive -->
  434.  
  435.                 RewriteEngine ON
  436.  
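            <!-- If the directive above is rejected, the module may simply not be loaded. On many builds it is enabled in the Apache configuration file with a LoadModule line like the one below; the modules/mod_rewrite.so path is an assumption that varies by platform -->

                LoadModule rewrite_module modules/mod_rewrite.so
                RewriteEngine On
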
  437.         <!-- Context -->
  438.             <!-- The per-server context requires a developer to edit the Apache configuration file, httpd.conf. The per-directory context uses .htaccess files that exist in each folder a user wants to configure. A webmaster who cannot access httpd.conf will have to use .htaccess files -->
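            <!-- A sketch of how the two contexts relate, assuming a document root of /var/www/html (a hypothetical path). In httpd.conf, AllowOverride FileInfo lets .htaccess files in that tree use mod_rewrite directives, and FollowSymLinks is the option mod_rewrite expects to be enabled in the per-directory context -->

                <Directory "/var/www/html">
                    Options FollowSymLinks
                    AllowOverride FileInfo
                </Directory>
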
  439.         <!-- Regular Expressions (aka Regexes) -->
  440.             <!-- The following is a list of the characters and operators that are used in the regexes described in this document -->
  441.                 . Period: matches any single character
  442.                 * Asterisk: matches zero or more of the preceding character
  443.                 + Plus sign: matches one or more of the preceding character
  444.                 () Parentheses: enclosing a value in parentheses stores what was matched in a variable to be used later; this is also referred to as a back-reference
  445.                 (value1|value2) Enclosing two or more values in parentheses and separating them with a pipe character is the equivalent of saying "match value1 OR value2"
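            <!-- To see the operators above working together, here is a hypothetical rule (the /blog, /news, and /articles paths and www.example.com are made up for illustration). It matches either folder name, requires one or more characters after it, and reuses that captured text as $2. The ^ and $ characters, which are not in the list above, anchor the pattern to the start and end of the path -->

                RedirectMatch 301 ^/(blog|news)/(.+)$ http://www.example.com/articles/$2
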
  446.         <!-- Redirecting Specific Files and Folders from one Domain to Another -->
  447.     <!-- Example -->
  448.  
  449.     Redirect: http://www.socengine.com/seo/s... To: /somefile.php
  450.  
  451.     <!-- Solution -->
  452.     <!-- Add the following directive to the applicable file on socengine.com's server -->
  453.  
  454.     RedirectMatch 301 /seo/(.*) http://www.seomoz.org/$1
  455.  
  456.     <!-- Explanation -->
  457.         <!-- The regular expression /seo/(.*) tells Apache to match the seo folder followed by zero or more of any character. Surrounding the .* in parentheses tells Apache to save the matched string as a back-reference. That back-reference, $1, is then placed at the end of the URL being redirected to -->
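        <!-- One possible refinement, not part of the original rule: without an anchor the pattern can match /seo/ anywhere in the path, so a leading ^ restricts it to paths that start with /seo/ -->

    RedirectMatch 301 ^/seo/(.*) http://www.seomoz.org/$1
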
  458.     <!-- Redirecting Canonical Hostnames -->
  459.         <!-- Redirect -->  http://seomoz.org/
  460.         <!-- To -->        http://www.seomoz.org/
  461.         <!-- Redirect -->  http://mail.seomoz.org/
  462.         <!-- To -->        http://www.seomoz.org
  463.         <!-- Redirect -->  http://seomoz.org/somefile.php
  464.         <!-- To -->        http://www.seomoz.org/somefile...
  465.     <!-- Solution -->
  466.         <!-- Add the following directive -->
  467.  
  468.         RewriteCond %{HTTP_HOST} !^www\.seomoz\.org [NC]
  469.         RewriteRule (.*) http://www.seomoz.org/$1 [L,R=301]
  470.  
  471.     <!-- Explanation -->
  472.         <!-- This directive tells apache to examine the host the visitor is accessing, and if it does not equal www.seomoz.org, to redirect to www.seomoz.org. The exclamation point (!) in front of www.seomoz.org negates the comparison, saying, “If the host IS NOT www.seomoz.org, then perform RewriteRule.” In our case RewriteRule redirects them to www.seomoz.org while preserving the exact file they were accessing in a back-reference -->
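        <!-- For context, a sketch of the complete block as it might appear in an .htaccess file at the site root. RewriteEngine On must be set before the rule takes effect. In httpd.conf the matched path keeps its leading slash, so a pattern of ^/(.*) would be the safer choice there to avoid a doubled slash in the target -->

        RewriteEngine On
        RewriteCond %{HTTP_HOST} !^www\.seomoz\.org [NC]
        RewriteRule ^(.*)$ http://www.seomoz.org/$1 [L,R=301]
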
  473.     <!-- Redirecting Without Preserving the Filename -->
  474.         <!-- Several files that existed on the old server were no longer present on the new server. Instead of preserving the file names in the redirection (which would result in a 404 Not Found error on the new server), the old files needed to be redirected to the root URL of the new domain -->
  475.         <!-- Redirect -->       http://www.socengine.com/seo/s..
  476.         <!-- To -->             http://www.seomoz.org/
  477.     <!-- Solution -->
  478.         <!-- Add the following directive -->
  479.  
  480.             RedirectMatch 301 /seo/someoldfile.php http://www.seomoz.org
  481.  
  482.     <!-- Explanation -->
  483.         <!-- Because the pattern contains no parentheses, nothing is captured, and all requests for /seo/someoldfile.php redirect to the root URL of http://www.seomoz.org -->
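        <!-- If several retired files should all land on the home page, alternation keeps this to one rule. A sketch, with oldfile1 and oldfile2 as hypothetical names; the parentheses here only group the alternatives, since $1 is never used in the target -->

            RedirectMatch 301 ^/seo/(oldfile1|oldfile2)\.php$ http://www.seomoz.org/
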
  484.     <!-- Redirecting the GET String -->
  485.         <!-- Some of the PHP scripts had different names but the GET string stayed the same. The Moz developers needed to redirect the visitors to the new PHP scripts while preserving these GET strings. The GET string is the set of characters that come after a filename in the URL and are used to pass data to a web page. An example of a GET string in the URL /myfile.php?this=that&foo=bar would be ?this=that&foo=bar. -->
  486.         <!-- Redirect -->       http://www.socengine.com/seo/c...
  487.         <!-- To -->             http://www.seomoz.org/artcat.p...
  488.     <!-- Solution -->
  489.         <!-- Add the following directive -->
  490.  
  491.         RedirectMatch 301 /seo/categorydetail.php(.*) http://www.seomoz.org/artcat.php$1
  492.  
  493.     <!-- Explanation -->
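  494.         <!-- Once again the regular expression (.*) tells Apache to match zero or more of any character and save it as the back-reference $1, which is appended after artcat.php in the new URL. Note that RedirectMatch compares its pattern against the URL path only; mod_alias carries the GET string over to the new URL on its own when the target has no query string, so requests keep their GET strings when they are redirected to the new PHP file -->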
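        <!-- For comparison, a mod_rewrite sketch of the same redirect, assuming it lives in an .htaccess file at the site root with RewriteEngine On already set. RewriteRule also matches the path only and, when the target has no query string of its own, passes the original GET string through unchanged -->

        RewriteRule ^seo/categorydetail\.php$ http://www.seomoz.org/artcat.php [L,R=301]
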
    <!-- Redirecting While Changing File Extensions -->
  531.         <!-- Redirect -->       http://www.socengine.com/seo/g...
  532.         <!-- To -->             http://www.seomoz.org/articles...
  533.     <!-- Solution -->
  534.         <!-- Add the following directive -->
  535.  
  536.         RedirectMatch 301 /seo/guide/(.*)\.(php|html) http://www.seomoz.org/articles/$1.php
  537.     <!-- Explanation -->
  538.         <!-- (.*) matches zero or more of any character and saves it as the back-reference $1. \.(php|html) tells Apache to match a period followed by either "php" or "html" and saves the extension as the back-reference $2 (although this isn't used in this example). Notice that the period is escaped with a backslash. This ensures Apache interprets it as an actual period rather than as "any character." Enclosing "php" and "html" in parentheses and separating them with a pipe "|" character means to match either one of the values. So if it were to say (php|html|css|js|jpg|gif), the regex would match files with any of the extensions php, html, css, js, jpg, or gif -->
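        <!-- As a variation (not what the Moz developers did), the second back-reference could be used to keep each file's original extension instead of forcing .php -->

        RedirectMatch 301 /seo/guide/(.*)\.(php|html) http://www.seomoz.org/articles/$1.$2
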
  539.     <!-- Conclusion -->
  540.         <!-- By harnessing the power of mod_rewrite, the RedirectMatch directive from mod_alias, and a little regular expression magic, the original developers at Moz developed a set of simple rules for redirecting web pages. By using 301 redirects, they did this in a way that was search engine-friendly -->