GoFuckYourself.com - Adult Webmaster Forum

Any HTACCESS People In The House? (https://gfy.com/showthread.php?t=1172944)

RummyBoy 08-28-2015 07:44 AM

Any HTACCESS People In The House?
 
Code:

RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
RewriteRule ^/*$ http://www.domain.com/$1 [R=301,L]

At the moment, if I use the above code, Google doesn't spider our site. It spiders the redirect URL (domain.com) instead, because Googlebot has a US-based IP.

So the question is: how do I use this code without redirecting search engine spiders such as Google? I found this advice:

https://www.webmasterworld.com/apache/4276815.htm

Above URL suggests it is best to:

(1) Not redirect search engine spiders
(2) Not redirect the google IP number set

Using both options will be the safest bet but I'm a complete novice so I don't fully understand the advice. Anyone know how we need to edit this code to make it work as required?

Nice little crossword puzzle for ya....

Panty Snatcher 08-28-2015 09:42 AM

Just add a user-agent rule for the known bots.
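
Roughly like this, as a sketch rather than a tested config. It assumes mod_rewrite and mod_geoip are enabled and reuses the placeholder www.domain.com from the first post. The bot list is illustrative, not complete. The conditions have to sit in front of the RewriteRule they guard, since mod_rewrite attaches RewriteCond lines only to the rule that follows them:

```apache
RewriteEngine On

# Skip the redirect when the User-Agent looks like a known crawler.
# (Illustrative list only; User-Agent strings can be spoofed.)
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Mediapartners-Google|Adsbot-Google|Msnbot|Bingbot|Slurp) [NC]

# Redirect only visitors whose GeoIP country code is in the list.
# Consecutive RewriteCond lines are ANDed by default.
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$

# In .htaccess the leading slash is stripped before matching,
# so ^/*$ matches the site root only. The pattern has no capture
# group, so the target does not use $1.
RewriteRule ^/*$ http://www.domain.com/ [R=301,L]
```

If every path should be redirected and not just the root, the pattern would need a capture group, e.g. ^(.*)$ with /$1 in the target.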

RummyBoy 08-28-2015 08:37 PM

Well, if I just take the code from the above page, I get something like this.
Does it look like it would work?


Code:

RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
RewriteRule ^/*$ http://www.domain.com/$1 [R=301,L]
RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|[78][0-9]|9[0-5])\.
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Msnbot|Bingbot|Slurp|Teoma) [NC]


timlover 08-28-2015 08:51 PM

Take out the US in the htaccess. Then use a JS redirect for US visitors on the site pages...

RummyBoy 08-28-2015 11:23 PM

Quote:

Originally Posted by timlover (Post 20564315)
Take out the US in the htaccess. Then use a JS redirect for US visitors on the site pages...

We want to keep using htaccess for a number of reasons. It's actually better for us, and JS can cause issues on a JS-heavy page.

timlover 08-28-2015 11:38 PM

Quote:

Originally Posted by RummyBoy (Post 20564362)
We want to keep using htaccess for a number of reasons. It's actually better for us, and JS can cause issues on a JS-heavy page.

Then your only other option is to deny by a country IP range list for the US and edit out the ranges where the Googlebots are located.

You can use this generator

Block Visitors by Country | IP2Location.com

Here are the bot IP ranges

Robots IP Address Ranges - Googlebot, Yahoo Slurp, MSNBot
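
If you stick with the GeoIP rule instead of a generated deny list, the same exemption could be written as one more condition in front of the existing rule. A sketch only: the 66.249.64.x to 66.249.95.x range is simply what the regex in the snippet posted above covers. It is assumed here, not checked against Google's current published ranges, and other bots would need their own lines.

```apache
RewriteEngine On

# Do not redirect requests coming from 66.249.64.x through 66.249.95.x
# (assumed Googlebot range; verify before relying on it).
RewriteCond %{REMOTE_ADDR} !^66\.249\.(6[4-9]|[78][0-9]|9[0-5])\.

# Redirect the listed countries as before.
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(US|CA|CN|RU|BR|JP|PK|SA|ID|BY|IR|IQ)$
RewriteRule ^/*$ http://www.domain.com/ [R=301,L]
```

An IP check is harder to spoof than a User-Agent check, but the ranges have to be kept up to date by hand.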

Barry-xlovecam 08-29-2015 08:17 AM

Quote:

Above URL suggests it is best to:

(1) Not redirect search engine spiders
(2) Not redirect the google IP number set
What that means: if you serve Googlebot different content than the people Google search refers will see, that domain will end up banned and delisted from Google search.

So is that domain a throwaway blackhat domain?

SpyCam 08-29-2015 11:21 AM

Google now crawls from international IPs as well, by the way.

RummyBoy 08-29-2015 12:03 PM

Quote:

If you game googlebot with different content
No, that's not the issue. It's a mainstream site and we are redirecting, for example, US (and other non-UK) users from a UK site to the International site. We were trying to avoid search engine bots precisely so that they can still correctly index the first site.

Quote:

Google now crawls from international IPs as well, by the way.
That's what I expected, but I was proven wrong. Once we set up the htaccess redirect for US users, Googlebot indexed our site with content from the site we were redirecting US users to, which tells me Googlebot was redirected as well. I checked the web stats on the redirecting site and found that the Googlebot IP is a US IP (Mountain View).

So for our redirect to work, we need to not redirect search engines (primarily Bing & Google).

Barry-xlovecam 08-29-2015 12:29 PM

<link rel="alternate" href="http:///path/" hreflang="x-default">
<link rel="alternate" href="http:///path/" hreflang="en">
<link rel="alternate" href="http:///path/" hreflang="ru">
<link rel="alternate" href="http:///path/" hreflang="de">
<link rel="alternate" href="http:///path/" hreflang="pl">
<link rel="alternate" href="http:///path/" hreflang="es">
</head>
Let the googlebot find the right content.
Let your users select their right language.

That is the right way to do it.
Set a language cookie for the user if you want continuity.

This redirect crap is way overused and unnecessary IMHO.


All times are GMT -7.