Blog Plagiarism or Blog scraping: What to do?

Recently, I learned firsthand about blog plagiarism, aka “scraping.” Basically, scraping is when another site copies copyrighted content from one or more blogs as an illegal means to increase its own traffic and generate revenue.

I am not alone: this particular scraper site is doing the same thing to a dozen other fabric artists. The scrapers are also obnoxious and arrogant, in that they use “fu.org” in their URL.

How can I tell if this is happening to my blog? If you have WordPress as your blog host, as I do, then check your stats and monitor your traffic. Simply click on “Dashboard,” then “Blog Stats,” then scroll down the page to “Incoming Links.” If you place your cursor over each line, you can see the website that is linking to you (hint: look at the bottom left of your screen). You will note that most of the sites linking to you belong to people you know. A scraper usually has a really unusual address, so if you slowly check each linking address, you can tell if an address looks fishy.
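If you are comfortable with a little scripting, the same eyeball check can be sketched in Python. This is only an illustration under my own assumptions: the function name, the sample links, and the list of known domains are all made up, and you would paste in the addresses copied from your own stats page.

```python
from urllib.parse import urlparse


def unfamiliar_links(incoming_links, known_domains):
    """Return the incoming links whose host is not a domain you recognize."""
    flagged = []
    for link in incoming_links:
        host = (urlparse(link).hostname or "").lower()
        # A host counts as familiar if it is a known domain or a subdomain of one.
        familiar = any(host == d or host.endswith("." + d) for d in known_domains)
        if not familiar:
            flagged.append(link)
    return flagged


# Hypothetical data for illustration only.
links = [
    "http://friend.wordpress.com/2007/12/01/quilt-show/",
    "http://quilting.example-scraper.org/copied-post/",
]
known = {"wordpress.com", "technorati.com"}
print(unfamiliar_links(links, known))
```

A flagged link is not proof of scraping. It is just an address worth visiting to see whether your posts show up there.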

Another way is to check your stats on any other traffic-tracking site; in Technorati, for instance, you can check your blog reactions. Be sure to check your stats and traffic frequently to see if your posts, images or artwork are being lifted or plagiarized without your knowledge.
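Once an address looks suspicious, you can compare the text on that page with your own post. Here is a rough Python sketch of one way to measure the overlap, using runs of five consecutive words. The function names and the five-word window are my own assumptions, not a standard, and you would paste in the two texts yourself.

```python
def shingles(text, size=5):
    """Split text into the set of all runs of `size` consecutive words."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's word runs that also appear in the suspect text."""
    own = shingles(original, size)
    if not own:
        # The original is shorter than one word run, so there is nothing to compare.
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)
```

A value near 1.0 means nearly every passage of your post appears on the other page; a value near 0.0 means the two texts share almost nothing. Small differences in punctuation will lower the score, so treat it as a hint rather than a verdict.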

I’m being scraped too! What do I do now? First off, you can do as I did and write a letter to the host server for the site. There are many “Whois” search sites; I used this one. [see postscript below for more info]
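A whois search on the domain name tells you who registered it, which is not always the company that hosts it. To find the host, it helps to look up the site's IP address first and then run that address through an IP whois tool. A minimal Python sketch of the first step follows; the function name and the example domain are my own placeholders.

```python
import socket
from urllib.parse import urlparse


def host_ip(url_or_domain, resolve=socket.gethostbyname):
    """Return (hostname, IP address) for a site, ready to paste into an IP whois tool."""
    # Accept either a bare domain or a full URL.
    target = url_or_domain if "//" in url_or_domain else "//" + url_or_domain
    hostname = urlparse(target).hostname
    if not hostname:
        raise ValueError(f"No hostname found in {url_or_domain!r}")
    # `resolve` defaults to a real DNS lookup; it can be swapped out for testing.
    return hostname, resolve(hostname)


# Example call (needs an internet connection; the domain is a placeholder):
# print(host_ip("http://scraper.example.org/some-copied-post/"))
```

The IP address it returns is what you enter into the IP whois search; the organization listed there is usually the hosting company to write to.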

Does the site use any particular RSS feeds such as Technorati? Thanks to Janice Harayda over at her WordPress blog, “One Minute Book Review,” for the tip that you can report a site that is infringing on your copyright to Technorati. Thanks, Janice!!

Next, you can hit them where it hurts: in the pocketbook! If they are using Google AdSense, follow the directions here to file a formal, legal inquiry. Keep filing a Notice of Copyright Infringement for every occurrence.

Lastly, you can file suit under US law for copyright infringement. The type of attorney you hire is one who specializes in “Intellectual Property.”

Anything else I can do? Yes, there is strength in numbers. If the scraper is also doing this to other similar blogs, then contact their owners to let them know what the offending site is doing. Encourage them to file a complaint with Google AdSense and with the offending site’s host.

Many scrapers, once they have identified a blog worth harvesting, program their software, or “bot,” to look for certain “buzz words.” If you can work out which words trigger the harvesting bot, avoid using them.

If all else fails, you can take your blog private, which cuts off their RSS feed. I will certainly consider this if scraping gets too far out of hand.

—————————————————-

Hope you have found this post informative. If you have any questions, then feel free to leave your question or comment below.

December 11, 2007- The host server for the ____.unix-fu.org site was contacted at billing@mojohost.com (attn: Brad Mitchell, CEO of MojoHost). The host responded in a very professional manner and does not tolerate plagiarism or scraping. If your site is being plagiarized by a unix-fu.org site, then contact Brad to let him know.

December 12, 2007- the unix-fu site removed my posts.

I would like to thank Jonathan Bailey over at Plagiarism Today for his assistance! -Carla

13 thoughts on “Blog Plagiarism or Blog scraping: What to do?”

  1. Carla,
    I’m sorry to hear you’re being ‘scraped’… this totally stinks.
    Whatever happened to civility??
    I enjoy reading your blog; you’re so creative and accomplish so much, it makes me tired!

  2. Thanks, Carla and Terri. So far, they haven’t scraped this post yet. Perhaps they will leave me alone for easier pickings.

    It is a sad commentary on how people will do anything to make money. Perhaps, they should get a real job. Grrrr….

    Happy Holidays though to you and yours!!

  3. You know, Carla, I have also seen a footer on some blogs in my reader that says something along the lines of: if you are reading this anywhere but an official blog reader, then someone is guilty of plagiarism, so please contact us so we can take legal action. I think Brotherhood of the Bean is one of the blogs where I’ve seen this added. I am sorry. I know the day someone decided to steal my header quilt and use it was a really cranky day for me… and it is one of the reasons I no longer have a fancy header quilt. Hope you get it resolved. Some days it’s a fine line between wanting to share and wanting to protect myself.

  4. A few things.

    First, I am sorry to hear that this has happened to you. However, it seems to be something that will, eventually, happen to all bloggers.

    Second, looking at how you handled the situation. I don’t know if you’ve found the host. A whois tool tells you who owns the domain, but not who hosts the site. For that, you need an IP whois, which you can do at DomainTools.com. Just punch in the domain there and look at the server information.

    I have a video on how to use it on my site, as well as some stock DMCA notices to get your work removed. Also, if you need help, feel free to email me or post a follow-up comment here and I’ll have a look at it to see what I can do.

    I can often track down sites that elude others.

    Let me know if I can help! I’ll gladly do what I can!

  5. Last night I took a moment to google my blog URL and what did I find? Yup, TWO of those fu.org sites scraping, or as they put it “syndicated”, one of my posts. Those rotten buggers! Their comment was:

    “It’s come to my attention that some users are syndicating from other people’s feeds to create their content. This is perfectly fine! That’s what RSS was made for!”

    HA! and double HA!

    If I took a book and copied it, put a new cover on it, and called it my own with a very tiny statement at the end giving the original author credit, do you think I could get away with it?? NO! So why should they think blogs are any different?

    The only positive thing is they removed the posts within 24 hours of my asking. But I will be keeping an eye on them.

  6. Pingback: Blog Content Theft « Deems’s Weblog

  7. Pingback: “article” concerning sploggers and scrapers « ++ got splog? ++

  8. I personally use the http://www.copygator.com website to find duplicated content. To me it has a number of benefits over copyscape and copyrightspot:

    1. It’s automated and brings me results instead of me searching for duplicated content. All I had to do was submit my feed, and it started monitoring it and showing me who has republished my articles on the web.

    2. I get notified by email, so it contacts me when it finds copies of my articles online.

    3. I use their image badge feature to alert me directly on my website when my content is being lifted.

    4. It’s a free service, as opposed to the “per page” cost of copyscape/copysentry.

  9. Sorry to hear that, but I am glad you have learned something new. You can also periodically check for plagiarism of your content on the internet. Here’s a free website that could help you find plagiarism: Free Plagiarism Checker
