GoFuckYourself.com - Adult Webmaster Forum


cj-design 01-23-2004 07:49 PM

GoogleBot Activity Script
 
Hi guys,

I'm just finishing off a script that will allow you to log each and every crawl that Google makes on your website. Some screenshots can be found here:

http://www.cj-design.com/downloads/g...activity/demo/

The script will be up for free download very soon (just finishing off the installation guide), so if you don't have anything like this on your site, stay tuned.

:thumbsup

P.S. The script is written in PHP, so it can only run in pages that can include PHP files (i.e. .php, .php3, .shtml, .shtm).

hudson 01-23-2004 07:51 PM

:glugglug :thumbsup

davidd 01-23-2004 08:09 PM

Quote:

Originally posted by cj-design
Hi guys,

I'm just finishing off a script that will allow you to log each and every crawl that Google makes on your website. Some screenshots can be found here:

http://www.cj-design.com/downloads/g...activity/demo/

The script will be up for free download very soon (just finishing off the installation guide), so if you don't have anything like this on your site, stay tuned.

:thumbsup

P.S. The script is written in PHP, so it can only run in pages that can include PHP files (i.e. .php, .php3, .shtml, .shtm).

What are the features?

Is it any better than:

grep googlebot access.log | cut -f4 -d'/' | sed s/HTTP//g | sort | uniq

:)

This is a serious question, as here is my count from today alone, across my domains:

% grep googlebot * |wc -l
18974
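
For anyone following along, here is a minimal sketch of what those two commands produce. It assumes Apache combined log format; the sample log lines, IP addresses, and function names are made up for illustration and are not from davidd's servers. Cutting on '/' and taking field 4 yields only the first path segment of each request, so /scripts/quote.php shows up as just "scripts".

```shell
#!/bin/sh
# Hypothetical sample log in Apache combined format (illustrative data only).
make_sample_log() {
  cat <<'EOF'
66.249.64.1 - - [23/Jan/2004:19:49:00 -0700] "GET /index.html HTTP/1.0" 200 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
66.249.64.2 - - [23/Jan/2004:19:50:10 -0700] "GET /scripts/quote.php HTTP/1.0" 200 2048 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
10.0.0.5 - - [23/Jan/2004:19:51:30 -0700] "GET /about.html HTTP/1.1" 200 900 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"
66.249.64.1 - - [23/Jan/2004:19:52:45 -0700] "GET /index.html HTTP/1.0" 200 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
EOF
}

# Unique first path segments requested by Googlebot.
# Same pipeline as in the post, except the sed also trims the space before HTTP.
googlebot_paths() {
  grep -i googlebot "$1" | cut -f4 -d'/' | sed 's/ *HTTP$//' | sort -u
}

# Number of log lines that mention Googlebot (like grep ... | wc -l).
googlebot_hits() {
  grep -ci googlebot "$1"
}

log=$(mktemp)
make_sample_log > "$log"
googlebot_paths "$log"   # prints index.html and scripts, one per line
googlebot_hits "$log"    # prints 3
rm -f "$log"
```

Note that a request for the site root ("GET / HTTP/1.0") would come out as an empty line with this approach, which is one reason a dedicated log viewer can be easier to read.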

cj-design 01-23-2004 08:13 PM

It's for webmasters who don't understand those commands, so it is laid out in a more attractive interface. Features are pretty minimal at the moment as it's only V1.0: sort the log, export the log, clear the log. In the future you will be able to search the log, say, to see which pages "crawler8.googlebot.com" indexed.

Doesn't require MySQL either, so pretty simple all around.
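
A rough sketch of the kind of search described above, run against a raw access log rather than the script's own log (whose format is not shown in this thread). It assumes Apache common/combined format with hostname lookups enabled, so the first field is a name like crawler8.googlebot.com; the sample lines and the function name are hypothetical.

```shell
#!/bin/sh
# Print the unique request paths fetched by one crawler host.
# Assumes field 1 is the client hostname and field 7 is the request path,
# which holds for Apache common/combined log format.
pages_for_crawler() {
  awk -v host="$2" '$1 == host { print $7 }' "$1" | sort -u
}

# Hypothetical sample data for a quick try-out.
log=$(mktemp)
cat > "$log" <<'EOF'
crawler8.googlebot.com - - [23/Jan/2004:20:01:00 -0700] "GET /downloads/ HTTP/1.0" 200 700 "-" "Googlebot/2.1"
crawler4.googlebot.com - - [23/Jan/2004:20:02:00 -0700] "GET /index.php HTTP/1.0" 200 900 "-" "Googlebot/2.1"
crawler8.googlebot.com - - [23/Jan/2004:20:03:00 -0700] "GET /contact.php HTTP/1.0" 200 300 "-" "Googlebot/2.1"
EOF
pages_for_crawler "$log" crawler8.googlebot.com   # prints /contact.php and /downloads/
rm -f "$log"
```

If the server logs IP addresses instead of hostnames, the same function works with an IP passed as the second argument.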

hudson 01-23-2004 08:17 PM

Quote:



grep googlebot access.log | cut -f4 -d'/' | sed s/HTTP//g | sort | uniq


shhhhhh...top secret!!

Quote:



This is a serious question, as here is my count from today alone, across my domains:

% grep googlebot * |wc -l
18974

hmmmm...want to buy some scripts? :glugglug

flashfreak 01-23-2004 08:24 PM

nice but too easy

cj-design 01-23-2004 08:33 PM

I would hope that most of the webmasters on these forums would find the script 'easy' as you put it.

:winkwink:

Wiz 01-23-2004 08:37 PM

Your script can handle 150 to 250k pages? ;)

hudson 01-23-2004 08:39 PM

Quote:

Originally posted by Wiz
Your script can handle 150 to 250k pages? ;)
yo...domains....so you got 40k pages in a domain, well...only one stat file there :2 cents:

cj-design 01-23-2004 08:40 PM

Quote:

Originally posted by Wiz
Your script can handle 150 to 250k pages? ;)

If it was driven by MySQL, yeah - coming soon of course.

Brujah 01-23-2004 08:41 PM

Quote:

Originally posted by cj-design
Hi guys,

I'm just finishing off a script that will allow you to log each and every crawl that Google makes on your website. Some screenshots can be found here:

http://www.cj-design.com/downloads/g...activity/demo/

The script will be up for free download very soon (just finishing off the installation guide), so if you don't have anything like this on your site, stay tuned.

:thumbsup

P.S. The script is written in PHP, so it can only run in pages that can include PHP files (i.e. .php, .php3, .shtml, .shtm).

Looks great, keep us informed :)

cj-design 01-23-2004 08:48 PM

OK,

I typed like a bitch to finish it, download as you please:

http://www.cj-design.com/index.php?id=downloads&page=13

Report bugs to me via email - should be ok though

AdultNex 01-23-2004 08:52 PM

Hrm, I thought I remembered going to your site before... I was just on your site yesterday looking for a quote generator.

cj-design 01-23-2004 08:56 PM

Quote:

Originally posted by AdultNex
Hrm, I thought I remembered going to your site before... I was just on your site yesterday looking for a quote generator.
Ah the classic CJ Random Quote - never fails.

Kevin2 01-23-2004 09:10 PM

cj-design you have some great scripts on your site :thumbsup

cj-design 01-23-2004 09:11 PM

Quote:

Originally posted by Kevin2
cj-design you have some great scripts on your site :thumbsup
Thanks. Some of them are a bit out of date now due to PHP moving on and me becoming a better programmer, but they do the job!

hudson 01-23-2004 09:13 PM

Quote:

Originally posted by cj-design


Ah the classic CJ Random Quote - never fails.

what's the classic CJ Random Quote...

cj-design 01-23-2004 09:15 PM

:) do you really need to ask, everybody, this guy has never heard of the classic CJ Random Quote.... everybody? Oh...

Well here it is in action:

http://www.cj-design.com/downloads/randomquote/demo/

hudson 01-23-2004 09:17 PM

Quote:

Originally posted by cj-design
:) do you really need to ask, everybody, this guy has never heard of the classic CJ Random Quote.... everybody? Oh...

Well here it is in action:

http://www.cj-design.com/downloads/randomquote/demo/

nice site...very nice! ;-)

cj-design 01-23-2004 09:26 PM

Quote:

Originally posted by hudson


nice site...very nice! ;-)

Cheers mate, it's not often you post on these forums and receive praise.

:winkwink:

hudson 01-23-2004 09:28 PM

Quote:

Originally posted by cj-design


Cheers mate, it's not often you post on these forums and receive praise.

:winkwink:

hehe...:1orglaugh

davidd 01-25-2004 12:40 AM

Quote:

Originally posted by cj-design
It's for webmasters who don't understand those commands, so it is laid out in a more attractive interface. Features are pretty minimal at the moment as it's only V1.0: sort the log, export the log, clear the log. In the future you will be able to search the log, say, to see which pages "crawler8.googlebot.com" indexed.

Doesn't require MySQL either, so pretty simple all around.

I downloaded and installed it a short while ago. I will let you know my results in a week or so. So far, I like what I see, as you have consolidated a lot of my cron jobs and hodgepodge of shell scripts into one unit.

Update the install .html with:

If you are executing this file using SSI, you must add the location of your PHP binary to the top of googlebot.php (e.g. #!/usr/local/bin/php), and googlebot.php must be CHMOD 755.
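
The SSI note above could be applied with something like the following sketch. The function name is hypothetical and the default interpreter path is only the example from the post; the actual location of the PHP binary varies by server.

```shell
#!/bin/sh
# Add a PHP shebang line to a script if it lacks one, then chmod it 755.
# The default interpreter path is only an example; check yours with `which php`.
prepare_for_ssi() {
  script=$1
  php_bin=${2:-/usr/local/bin/php}
  case $(head -n 1 "$script") in
    '#!'*) ;;  # shebang already present, leave the file contents alone
    *)
      tmp=$(mktemp) || return 1
      { printf '#!%s\n' "$php_bin"; cat "$script"; } > "$tmp" || return 1
      cat "$tmp" > "$script"   # overwrite in place rather than replacing the file
      rm -f "$tmp"
      ;;
  esac
  chmod 755 "$script"
}

# Usage (run in the directory holding the script):
#   prepare_for_ssi googlebot.php /usr/local/bin/php
```

Running it twice is harmless, since the shebang is only added when the first line does not already start with "#!".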

The one thing that I thought was kind of ->un-cool<- was you added output to the html pages via googlebot.php. So upon execution, the bot will see:

CJ GoogleBot Activity V1.0 is running on this site
<!-- CJ GoogleBot Activity V1.0 is running on this site -->

I am cool with self promotion, but not in the search engine game. If the above was used as an indicator by the Google people for blacklisting or deeper inspection, a large number of people would be penalized unknowingly.

My comments should always be taken as constructive criticism; I am not in the game of slamming people's work.

-dd

blackmonsters 01-25-2004 03:45 AM

What would I do with this information if I used the script?

Trax 01-25-2004 06:03 AM

how stable is that script?

DarkJedi 01-25-2004 06:15 AM

where to download it ?

AdultNex 01-25-2004 06:20 AM

Quote:

Originally posted by davidd


I downloaded and installed it a short while ago. I will let you know my results in a week or so. So far, I like what I see, as you have consolidated a lot of my cron jobs and hodgepodge of shell scripts into one unit.

Update the install .html with:

If you are executing this file using SSI, you must add the location of your PHP binary to the top of googlebot.php (e.g. #!/usr/local/bin/php), and googlebot.php must be CHMOD 755.

The one thing that I thought was kind of ->un-cool<- was you added output to the html pages via googlebot.php. So upon execution, the bot will see:

CJ GoogleBot Activity V1.0 is running on this site
<!-- CJ GoogleBot Activity V1.0 is running on this site -->

I am cool with self promotion, but not in the search engine game. If the above was used as an indicator by the Google people for blacklisting or deeper inspection, a large number of people would be penalized unknowingly.

My comments should always be taken as constructive criticism; I am not in the game of slamming people's work.

-dd

This too should not be looked upon as flaming. What is the purpose of having a self-promotion tagline when the "stats" aren't publicly displayed?

cj-design 01-25-2004 12:00 PM

Quote:

Originally posted by davidd


I downloaded and installed it a short while ago. I will let you know my results in a week or so. So far, I like what I see, as you have consolidated a lot of my cron jobs and hodgepodge of shell scripts into one unit.

Update the install .html with:

If you are executing this file using SSI, you must add the location of your PHP binary to the top of googlebot.php (e.g. #!/usr/local/bin/php), and googlebot.php must be CHMOD 755.

The one thing that I thought was kind of ->un-cool<- was you added output to the html pages via googlebot.php. So upon execution, the bot will see:

CJ GoogleBot Activity V1.0 is running on this site
<!-- CJ GoogleBot Activity V1.0 is running on this site -->

I am cool with self promotion, but not in the search engine game. If the above was used as an indicator by the Google people for blacklisting or deeper inspection, a large number of people would be penalized unknowingly.

My comments should always be taken as constructive criticism; I am not in the game of slamming people's work.

-dd

Hi dd,

Thanks for the installation file addition.

As for the tagline (HTML comment): that was left in by mistake. It was put there just to test that the script was being included, and I forgot to take it out, so I've changed that too.


Quote:

This too should not be looked upon as flaming. What is the purpose of having a self-promotion tagline when the "stats" aren't publicly displayed?
AdultNex - you basically just repeated what dd said, so same to you really - it's sorted now.


Quote:

where to download it ?
Read previous posts, doofus.

Quote:

how stable is that script?
Pretty stable, but not to be used on sites that get crawled 500+ times a day (unless you're gonna clear your own log every 30 minutes or something).
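
One way to automate that clearing, sketched under assumptions: the thread never names the script's log file, so the file path is a placeholder argument, and the 500-line default simply mirrors the figure above. The function archives the log to a timestamped copy and then empties it once it grows past the limit, so a small wrapper script could call it from cron every 30 minutes.

```shell
#!/bin/sh
# Archive and empty a log file once it exceeds max_lines lines.
# Both arguments are illustrative; the real log name is not given in the thread.
clear_log_if_large() {
  log=$1
  max_lines=${2:-500}
  lines=$(( $(wc -l < "$log") ))   # arithmetic expansion strips any padding from wc
  if [ "$lines" -gt "$max_lines" ]; then
    cp "$log" "$log.$(date +%Y%m%d%H%M%S)" || return 1
    : > "$log"   # truncate in place so the file keeps its permissions
  fi
}

# Usage (hypothetical path):
#   clear_log_if_large /path/to/googlebot-log-file 500
```

Truncating instead of deleting means the web server user does not need permission to recreate the file.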

Quote:

What would I do with this information if I used the script?
It was written for two reasons (my own reasons):

A) To monitor GoogleBot (how it acts)

B) To monitor the pages of my site that were crawled

What I found out...

A) Googlebot (usually crawler2) will pick up a link, say from hotscripts.com, and find its way to my site; crawler4, say, would then come back and crawl that link properly. I also found some other stuff, like crawler12 being a deep crawler (it takes most of your site when it crawls).

B) I found that my recently updated pages were crawled in a matter of minutes (pretty sweet how it knows)

What you can do with the information you get....

A) The same as what I have done
B) Wipe your ass on it

acctman 01-25-2004 10:40 PM

it's free

pradaboy 01-26-2004 12:57 AM

:thumbsup thnx for sharing man... this is great!


All times are GMT -7.