RSS scraping is when a website owner takes someone else's RSS feed and republishes it on their own website without attribution or a link back to the original author. Some people feel that any RSS feed republished on another website is scraping.
Have you ever had an RSS feed scraped? What did you do about it? How much content were you delivering via the feed? Did you change how you sent your posts after you found the scraper?
If you've never had a feed scraped, what would you do if you found your site scraped like that? Is RSS scraping something that should be allowed? What do you think of RSS scraping?
Share Your Thoughts
- Bloggers should be pleased if other sites hook a scraper up to their RSS feed... that is what it is frickin' for, bro! ...to increase participation! Are there really bloggers out there who want to prevent more user participation? Outdated copyright laws just keep getting in the way of progress again... Jeez!
- —Guest benkeeler
RSS scraping and image bandwidth theft
- Some servers can be configured not to send an image unless the image has been requested by a page from the same server. It would be possible for an RSS scraper to defeat this check but, in many cases, the perpetrator might not want to do the extra work. The Apache server can be configured to prevent image bandwidth theft: http://www.thesitewizard.com/archive/bandwidththeft.shtml
- —Guest Ian E. Gorman
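To make the check in the comment above concrete, here is a minimal Python sketch of the decision a server would make. It assumes the server looks at the HTTP Referer header of each image request. The host names and the function name `should_serve_image` are hypothetical, and this is not the Apache configuration from the linked article, only the same idea written out as code.

```python
from urllib.parse import urlsplit

# Hypothetical host names; replace with the hosts of your own site.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def should_serve_image(referer, allowed_hosts=ALLOWED_HOSTS, allow_empty=True):
    """Decide whether to serve an image based on the request's Referer header."""
    if not referer:
        # Some browsers, proxies and feed readers send no Referer at all,
        # so rejecting empty values can also block legitimate readers.
        return allow_empty
    # hostname is None when the header is not a parseable absolute URL.
    host = urlsplit(referer).hostname
    return host is not None and host.lower() in allowed_hosts


if __name__ == "__main__":
    print(should_serve_image("http://www.example.com/post/1"))
    print(should_serve_image("http://scraper.example.net/stolen"))
```

As the comment says, a scraper can defeat this, since the Referer header is supplied by the client and can be forged or stripped. The check mainly discourages casual hotlinking.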
Are you really, really sure?
- Are you really, really sure about this? It has been my impression that anytime you syndicate content you are giving others permission to reference and/or re-publish that information. This is certainly true of traditional syndication like AP. Since syndication is built right into the name, i.e., RSS (really simple syndication), I'd be inclined to think anyone who puts their content into syndication that way is essentially giving permission to use the information. Nearly all feed directories list at least the introductory text of such RSS feeds, so if that's okay, where is one supposed to draw the line? If it is not okay, does that mean that all those very well established sites are in violation of copyright law? I mean, the whole reason most people create an RSS feed in the first place is to get posts listed on just those kinds of sites in order to drive traffic. I believe if the original source is cited, it's okay -- but I don't know for sure, especially after reading this artic
- —Guest V4Vendetta
Victimized by Malicious RSS Scraper
- I've been the ongoing victim of a fiendish guy whose entire blog consists of my feed, including my photos. He doesn't stop there though - that much would be entirely legal, because the links to my site would stay intact with a regular RSS feed. What he has done is use some sort of "word-exchange" script which randomly substitutes nonsense words for the originals. The result is gibberish which has nothing to do with the original article. I receive no benefit whatsoever from this particular RSS feed. Also, since the photos are hotlinked to my server, in any other case it would be considered theft of bandwidth. I tried every resource known to me, but all failed because this guy has completely cloaked his activities by manipulating the system. I do think that RSS scraping should be disallowed; however, policing it would be virtually impossible in many cases - just as policing content theft has become. If I had a choice I'd skip RSS feeds entirely - not worth the grief.