Is there an affordable online service that shuts down people who copy your entire site and scrape your content to displace you from Google search results?
Service that kills mirrors and scrapers
-
You can protect your sites from scrapers in .htaccess. Are your sites WP installs?
Do you want me to paste some code here for you?
-
It's really cat and mouse. If you have root, firewall them out with UFW or iptables. However, they will change IPs or move to another AS network, so it is a never-ending game.
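For what it's worth, here is a minimal sketch of the bookkeeping side of that, assuming you keep your own list of CIDR ranges to drop. The ranges are reserved documentation networks, not real scraper ranges, and the function names are made up for illustration. The generated lines use the usual `ufw deny from <range>` form; read them over before running anything as root.

```python
import ipaddress

# Hypothetical block list: reserved documentation ranges, not real scrapers.
BLOCKED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if ip falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


def ufw_rules(ranges=BLOCKED_RANGES):
    """Build one 'ufw deny from' command string per range."""
    return [f"ufw deny from {r}" for r in ranges]


if __name__ == "__main__":
    print(is_blocked("192.0.2.77"))
    for rule in ufw_rules():
        print(rule)
```

The list still has to be maintained by hand as they hop ranges, which is exactly the cat-and-mouse part.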
-
I have a site that's scraped to hell and back. If you exclude Googlebot and all of the scrapers, probably less than 2% of the traffic is left (page loads by a real browser).
Over the years I've added bits and pieces to log various interesting information. The big red flag that sticks out, at least for my site: scrapers use proxies, so their IPs can change without notice, but the headers they send are usually a fixed pattern that is nothing like a real browser, so they're super easy to block.
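A rough sketch of that kind of header check, not my actual rules. Which headers to require and the `max_missing` threshold are assumptions you would tune against your own logs, and the function names are hypothetical.

```python
# Headers a mainstream browser normally sends on a page load.
# Which ones to require is an assumption; tune it against your own logs.
EXPECTED_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def missing_browser_headers(headers):
    """Return the expected header names that are absent or empty."""
    present = {name.lower() for name, value in headers.items() if value.strip()}
    return [h for h in EXPECTED_HEADERS if h not in present]


def looks_like_scraper(headers, max_missing=1):
    """Flag a request that is missing more than max_missing expected headers."""
    return len(missing_browser_headers(headers)) > max_missing


if __name__ == "__main__":
    # A request that spoofs a browser User-Agent but sends little else.
    bare = {"User-Agent": "Mozilla/5.0", "Accept": "*/*"}
    print(missing_browser_headers(bare), looks_like_scraper(bare))
```

Because the check is on the header pattern rather than the IP, it keeps working when the proxy address changes.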
Even a simple CAPTCHA that is triggered after say 10 loads without presenting a cookie manages to block most of them. Some IPs are constantly bashing at the site, day after day, even though they are almost perpetually 403'd or firewalled.
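The counting part could look something like this, as a sketch only. It assumes an in-memory counter in a single process, and `client_id` is whatever key you pick (IP, header fingerprint, or both); the class and method names are hypothetical.

```python
from collections import defaultdict


class CookielessCounter:
    """Count page loads per client that arrive without the site cookie."""

    def __init__(self, limit=10):
        self.limit = limit
        self.loads = defaultdict(int)

    def needs_captcha(self, client_id, has_cookie):
        """Record one load; return True once the client exceeds the limit."""
        if has_cookie:
            # A client that presents the cookie starts over with a clean count.
            self.loads.pop(client_id, None)
            return False
        self.loads[client_id] += 1
        return self.loads[client_id] > self.limit
```

With `limit=10`, the first ten cookie-less loads go through and the eleventh gets the CAPTCHA.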
I guess there is a market for a service like this, if one doesn't exist already... but integrating it into a customer's existing site would be interesting...
-
Any other questions
Code:
barry@paragon-DS-7:/media/barry/ADATA3$ dig NS gfy.com

; <<>> DiG 9.10.3-P4-Ubuntu <<>> NS gfy.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21089
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;gfy.com. IN NS

;; ANSWER SECTION:
gfy.com. 86399 IN NS june.ns.cloudflare.com.
gfy.com. 86399 IN NS theo.ns.cloudflare.com.

;; Query time: 56 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Sep 02 12:42:15 EDT 2017
;; MSG SIZE rcvd: 88

=======================

barry@paragon-DS-7:/media/barry/ADATA3$ curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 (Official Build) (64-bit)" "http://gfy.com/"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en">
<head>
<!-- no cache headers -->
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="-1" />
<meta http-equiv="Cache-Control" content="no-cache" />
<!-- end no cache headers -->
<meta name="robots" content="noarchive">
<title>GFY Webmaster Board</title>
-
What are you trying to show?
-
They need to tweak their Cloudflare settings, is all.
Not sure if they changed their old IP after moving to Cloudflare, as I suggested before. If not, they're still wide open.
Edit: nope, they didn't change it, and they have 64 hostnames pointing to that same IP :/
Step 10 here - change IP after setup
-
GFY IS using a Cloudflare IP.
Luckily botnets are made of stupid computer users. Maybe Cloudflare can control DDoS spikes in traffic.
Cloudflare will not stop lone-wolf hackers or scrapers. I just repeated that curl successfully from a DigitalOcean server I have in Amsterdam. So the AS block is not checked at Cloudflare either.
-
You dope.
The GFY server IP is the same as when it was first attacked.
Then GFY hid behind Cloudflare, with the same IP, and continued to go down.
So clearly the only reason the attack stopped was because the attacker stopped hitting the original IP.
I pay for Cloudflare Business and use Cloudflare. And I looked up GFY's IP.
I just PM'd it to you.
-
I still can't figure out GFY's IP from that. What am I missing?
Thanks
-
BW I don't give a fuck how ...
PM Eric and tell him which DNS server needs to have its records fixed.