Any HTACCESS People In The House?

  • RummyBoy
    Confirmed User
    • Dec 2009
    • 2157

    #1

    Tech Any HTACCESS People In The House?

    Code:
    RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
    RewriteRule ^/*$ http://www.domain.com/$1 [R=301,L]
    At the moment, if I use the above code, Google doesn't spider our site; it spiders the redirect URL (domain.com) because Googlebot has a US-based IP.

    So the question is: how do I use this code without redirecting search engine spiders such as Googlebot? I found this advice:

    https://www.webmasterworld.com/apache/4276815.htm

    The above URL suggests it is best to:

    (1) Not redirect search engine spiders
    (2) Not redirect the Google IP ranges

    Using both options will be the safest bet, but I'm a complete novice, so I don't fully understand the advice. Does anyone know how we need to edit this code to make it work as required?

    Nice little crossword puzzle for ya....
  • Panty Snatcher
    eat a peach for Hours
    • Aug 2013
    • 540

    #2
    Just add a user-agent rule for the known bots.


    • RummyBoy
      Confirmed User
      • Dec 2009
      • 2157

      #3
      Well, if I just take the code from the above page, I get something like the code below.
      Does it look like it would work?


      Code:
      RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
      RewriteRule ^/*$ http://www.domain.com/$1 [R=301,L]
      RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|[78][0-9]|9[0-5])\.
      RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Msnbot|Bingbot|Slurp|Teoma) [NC]
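
      A note on the snippet above: mod_rewrite attaches RewriteCond lines to the next RewriteRule that follows them, so conditions placed after the rule do not restrict it. Below is a minimal sketch of the intended ordering, assuming a GeoIP module sets GEOIP_COUNTRY_CODE as in the original and keeping domain.com as the placeholder from the first post. Conditions are ANDed by default, so the redirect fires only when the visitor is in a listed country, is outside the 66.249.64-95 range, and does not send a listed crawler user agent. The (.*) capture is an assumption about the intent (forward the requested path); the original pattern ^/*$ has no capture group, so $1 would be empty there.

      Code:
      RewriteEngine On
      # Redirect only visitors from the listed countries ...
      RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
      # ... whose IP is outside the Googlebot range quoted from the linked page ...
      RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|[78][0-9]|9[0-5])\.
      # ... and whose user agent is not one of the listed crawlers
      RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Mediapartners-Google|Adsbot-Google|Msnbot|Bingbot|Slurp|Teoma) [NC]
      # Assumption: pass the requested path along via the (.*) capture
      RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

      User-agent strings are easy to spoof, and the IP range here is only the one quoted in the thread, so treat both checks as best-effort. While testing, R=302 may be safer than R=301, since browsers cache permanent redirects.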


      • timlover
        Confirmed User
        • Nov 2002
        • 540

        #4
        Take the US out of the .htaccess. Then use a JS redirect for US visitors on the site pages...


        • RummyBoy
          Confirmed User
          • Dec 2009
          • 2157

          #5
          Originally posted by timlover
          Take the US out of the .htaccess. Then use a JS redirect for US visitors on the site pages...
          We want to continue using .htaccess for numerous reasons; it's actually better, and JS can cause some issues on a JS-heavy page.


          • timlover
            Confirmed User
            • Nov 2002
            • 540

            #6
            Originally posted by RummyBoy
            We want to continue using .htaccess for numerous reasons; it's actually better, and JS can cause some issues on a JS-heavy page.
            Then your only other option is to deny by a country IP range list for the US and edit out the ranges where the Googlebots are located.

            You can use this generator:

            Block Visitors by Country | IP2Location.com

            Here are the bot ranges:

            Robots IP Address Ranges - Googlebot, Yahoo Slurp, MSNBot
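
            For reference, the IP-range approach can also be written with CIDR blocks rather than a regex, and used for a redirect instead of a deny. This is a rough sketch, assuming Apache 2.4, where RewriteCond accepts an expression via the "expr" keyword. 192.0.2.0/24 and 198.51.100.0/24 are documentation-only placeholder ranges standing in for a generated country list, and 66.249.64.0/19 is simply the CIDR form of the 66.249.64-95 range quoted earlier in the thread, not a verified list of Googlebot addresses. domain.com is the placeholder from the first post.

            Code:
            RewriteEngine On
            # Placeholder country ranges (replace with a real generated list)
            RewriteCond expr "%{REMOTE_ADDR} -ipmatch '192.0.2.0/24'" [OR]
            RewriteCond expr "%{REMOTE_ADDR} -ipmatch '198.51.100.0/24'"
            # Carve out the crawler range so it is never redirected
            RewriteCond expr "! %{REMOTE_ADDR} -ipmatch '66.249.64.0/19'"
            RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

            The [OR] flag groups the two country ranges together; the crawler exclusion is then ANDed with that group.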


            • Barry-xlovecam
              It's 42
              • Jun 2010
              • 18083

              #7
              Above URL suggests it is best to:

              (1) Not redirect search engine spiders
              (2) Not redirect the google IP number set
              What that means: if you game Googlebot with different content than the people referred by Google search will see, that domain will end up banned and delisted in Google search.

              So is that domain a throwaway blackhat domain?


              • SpyCam
                PoiSEO.com
                • Apr 2001
                • 1201

                #8
                Google now crawls from international IPs as well, by the way.


                • RummyBoy
                  Confirmed User
                  • Dec 2009
                  • 2157

                  #9
                  If you game googlebot with different content
                  No, that's not the issue. It's a mainstream site and we are redirecting, for example, US (and other non-UK) users from a UK site to the International site. We were trying to avoid search engine bots precisely so that they can still correctly index the first site.

                  Google now crawls from international IPs as well, by the way.
                  That's what I expected, but I was proven wrong. Once we instituted the .htaccess redirect for US users, Googlebot indexed our site with content from the site we were redirecting the US users to, which tells me that Googlebot was redirected as well. I checked our web stats on the redirecting site and found that the Googlebot IP is a US IP (Mountain View).

                  So for our redirect to work, we need to not redirect search engines (primarily Bing & Google).


                  • Barry-xlovecam
                    It's 42
                    • Jun 2010
                    • 18083

                    #10
                    <link rel="alternate" href="http:///path/" hreflang="x-default">
                    <link rel="alternate" href="http:///path/" hreflang="en">
                    <link rel="alternate" href="http:///path/" hreflang="ru">
                    <link rel="alternate" href="http:///path/" hreflang="de">
                    <link rel="alternate" href="http:///path/" hreflang="pl">
                    <link rel="alternate" href="http:///path/" hreflang="es">
                    </head>
                    Let the googlebot find the right content.
                    Let your users select their right language.

                    That is the right way to do it.
                    Set a language cookie for the user if you want continuity.

                    This redirect crap is way overused and unnecessary IMHO.
