If a technology develops without the continued support of a single major vendor, can you still call it a standard? That's the question two groups of enterprising XML developers are asking themselves today as they move forward on a syndication technology called Rich Site Summary (RSS).
RSS is an XML format designed to let content providers share headlines and content with other sites without having to create a completely new Web page. In essence, RSS defines specific criteria about a story including the headline, the URL, and a brief summary. When a content provider makes that information available in the form of an RSS feed, anyone can grab the information and put it on their Web site. The best part is, since the URL points back to the originator's site, traffic isn't lost.
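To make the headline, URL, and summary idea concrete, here is a minimal sketch of how a site might read those three fields out of a feed. The feed below is invented for illustration (the site name and example.com URLs are placeholders), and it assumes a simple RSS 0.91-style layout; real feeds may carry more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; the site name and URLs are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Headlines from a made-up site</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/stories/1</link>
      <description>A brief summary of the first story.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/stories/2</link>
      <description>A brief summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def read_headlines(feed_xml):
    """Return (headline, url, summary) tuples from an RSS 0.91-style feed."""
    root = ET.fromstring(feed_xml)
    headlines = []
    for item in root.iter("item"):
        headlines.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return headlines


# Each link points back to the originating site, so the publisher keeps the hit.
for title, url, summary in read_headlines(SAMPLE_FEED):
    print(f"{title} <{url}>: {summary}")
```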
"It's very easy. The content provider owns the hits and the content. It's like free advertising," says Rael Dornfest, a researcher with technology research firm O'Reilly and Associates and a member of the RSS-DEV Working Group.
RSS isn't a fringe technology. In the beginning, Netscape Communications supported RSS as a way to get free content for its NetCenter channels. Other sites, such as CNet, The Motley Fool, and Wired, caught on, using the technology to propagate their headlines. The resulting interest and use have made RSS one of the most popular and widely used XML formats out there today, despite the fact that Netscape, now a subsidiary of AOL Time Warner Inc., has since pulled its support of the technology.
The reason is fairly simple. RSS is a free and easy way to promote a site and its content without the need to advertise or create complicated content-sharing arrangements. It also levels the playing field for smaller sites. LISNews, a site that specializes in library and information science news and events, has been using the technology for two years after stumbling on it by accident, says Blake Carver, the site's owner. "After I launched the site, someone emailed me and asked me if I had a feed set up. At the time, I didn't know anything about it but I decided to give it a try," he says.
Although Blake knew nothing about XML, he created an RSS feed within two hours of the initial email using preformatted code. Today, more than twenty sites grab headlines from LISNews, despite the fact that he's never advertised his RSS feed, he says. Blake's experience mirrors that of other sites. Thousands of sites around the world are able to share links and story descriptions without the need for existing content distribution deals—or the expense that often comes along with them. And RSS isn't just for small sites. Larger sites use automated database tools that strip out RSS information from pages—title, description, and links, among other things—and generate RSS feeds automatically.
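The automated approach can be sketched in a few lines. This is an illustration only, not a description of any particular site's tool: the `build_feed` function, the story records, and the example.com URLs are all hypothetical, and the output is a bare-bones RSS 0.91-style document that a real feed would likely extend with further channel elements.

```python
import xml.etree.ElementTree as ET


def build_feed(site_title, site_url, site_description, stories):
    """Build an RSS 0.91-style feed string from a list of story records."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_description
    for story in stories:
        item = ET.SubElement(channel, "item")
        # ElementTree escapes special characters such as & when serializing.
        ET.SubElement(item, "title").text = story["title"]
        ET.SubElement(item, "link").text = story["link"]
        ET.SubElement(item, "description").text = story["description"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical rows, standing in for whatever a site's database would return.
stories = [
    {"title": "First headline",
     "link": "http://www.example.com/stories/1",
     "description": "A brief summary of the first story."},
    {"title": "Second headline",
     "link": "http://www.example.com/stories/2",
     "description": "A brief summary of the second story."},
]

feed_xml = build_feed("Example News", "http://www.example.com/",
                      "Headlines from a made-up site", stories)
print(feed_xml)
```

Because the feed is assembled by an XML library rather than typed by hand, the markup stays well-formed no matter what the story text contains.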
What to Look Out For
1. Content Integrity
Of course, there are a few issues to think about before you create your own newsfeed or grab a few newsfeeds for your site. Most importantly, don't forget the cardinal rule of the Web: just because a company creates a newsfeed doesn't mean the content is automatically reputable, says Jeff Barr, a Redmond, Washington-based software developer who created the Syndic8 site. "Right now, there's no way to establish an individual's reputation so you really don't know if a news story is biased or a person has an agenda," he says. In the past, the various working groups have talked about creating a reputation system so people can feel more comfortable using newsfeeds that come from unknown sources. Today, however, there's nothing like that, which brings up another point: will you be sued if you post a link to a slanderous or problematic news story? To date, there have been several well-publicized copyright infringement lawsuits against sites that cut and pasted content instead of linking to it, but no suits that dealt specifically with liability for linked content. Even so, people should be careful about what they're linking to. "They should make sure they put a disclaimer on their sites explaining that they aren't liable for content that comes from a different source," says Rita Knox, vice president and research director with Gartner Inc., a Stamford, Connecticut-based research firm.
2. Bandwidth Issues
Aside from legal issues, sites thinking about creating newsfeeds should consider their bandwidth limitations. Every time a site checks to see if an RSS file is updated, it counts as a hit, which can be good and bad. Larger sites can rack up much-wanted page views, but smaller sites with limited bandwidth can have their entire infrastructure grind to a halt if they get too many RSS hits—something Blake of LISNews found out the hard way. "There was a medical library that was hitting the RSS file once a minute, which was too much," says Blake. Unfortunately, since sites don't have to register to use an RSS file, Blake had to email every email address he could find on the medical site's home page before he found someone who could turn off the automatic update feature. Blake was lucky. Some sites have been forced to take down RSS feeds because they were interfering with normal usage.
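On the consuming side, a little bookkeeping goes a long way toward avoiding the once-a-minute problem. The sketch below is one possible approach, not something the sites mentioned here are known to use: the `FeedPoller` class and its one-hour default are assumptions made for illustration. It enforces a minimum gap between checks and prepares the standard HTTP conditional-request headers (`If-Modified-Since` and `If-None-Match`) so that a server which supports them can answer "not modified" without resending the whole file.

```python
import time


class FeedPoller:
    """Tracks when a feed was last fetched so a site does not poll it too often."""

    def __init__(self, min_interval_seconds=3600):
        self.min_interval_seconds = min_interval_seconds
        self.last_fetch = None      # time of the last fetch, in seconds
        self.last_modified = None   # Last-Modified value from the last response
        self.etag = None            # ETag value from the last response

    def should_fetch(self, now=None):
        """Return True if enough time has passed since the last fetch."""
        now = time.time() if now is None else now
        if self.last_fetch is None:
            return True
        return now - self.last_fetch >= self.min_interval_seconds

    def request_headers(self):
        """Headers that let the server reply 304 Not Modified when nothing changed."""
        headers = {}
        if self.last_modified:
            headers["If-Modified-Since"] = self.last_modified
        if self.etag:
            headers["If-None-Match"] = self.etag
        return headers

    def record_fetch(self, now=None, last_modified=None, etag=None):
        """Remember the fetch time and any validators the server sent back."""
        self.last_fetch = time.time() if now is None else now
        if last_modified:
            self.last_modified = last_modified
        if etag:
            self.etag = etag


poller = FeedPoller(min_interval_seconds=3600)
print(poller.should_fetch(now=0))      # True: nothing fetched yet
poller.record_fetch(now=0, etag='"abc123"')
print(poller.should_fetch(now=60))     # False: only a minute has passed
print(poller.should_fetch(now=3600))   # True: a full hour has passed
print(poller.request_headers())        # {'If-None-Match': '"abc123"'}
```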
Finally, RSS feed creators should budget time, at least in the beginning, to check their newsfeeds for errors. If you're not using a database to generate the feeds, grammatical errors and XML errors can slip into the RSS feed, rendering it unusable by other sites. There are ways around manual verification, such as sending RSS files to one of the many newsfeed Web sites for automatic or peer verification. However, if you're creating feeds several times a day or creating a large volume of feeds, this can be cost- and time-prohibitive.
"RSS is a text file at heart and sometimes human error is the biggest issue we have to deal with," says Syndic8's Barr. "Even an ampersand rendered incorrectly can screw up a feed. The last time AT&T had a big news day, all of the newsfeeds were completely messed up," he says.
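The ampersand problem is easy to reproduce. In XML, a bare `&` must be written as `&amp;`, so a headline pasted in as-is can break the whole file. The sketch below uses Python's standard library to escape a headline and to check whether a fragment is well-formed; the headline text and the `is_well_formed` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(feed_xml):
    """Return True if the XML parses, False if the parser rejects it."""
    try:
        ET.fromstring(feed_xml)
        return True
    except ET.ParseError:
        return False


# Hypothetical headline containing a bare ampersand.
raw_title = "AT&T announces earnings"

broken = f"<item><title>{raw_title}</title></item>"
fixed = f"<item><title>{escape(raw_title)}</title></item>"

print(escape(raw_title))        # AT&amp;T announces earnings
print(is_well_formed(broken))   # False
print(is_well_formed(fixed))    # True
```

A check like this could run every time a hand-edited feed is saved, which is cheaper than finding out from a partner site that the feed stopped loading.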
While thousands of sites continue to use RSS, a feud of Apple/Microsoft proportions is brewing in the background.
Today, there are not one but two standards based on the original RSS design: 0.91 and 1.0. RSS 0.91, which is the most popular version of RSS on the Web, is often called the purist's RSS, while 1.0 is called the XML purist's approach.
RSS 1.0 was born out of the need to expand the depth and breadth of data in an RSS feed. Content providers, wanting to give potential syndication partners a way to insert additional fields into an RSS feed, such as language or time stamp, started cramming additional information into the feeds on their own. Unfortunately, when fields are added in a non-standard way, problems can and do arise, says Rael Dornfest, an O'Reilly researcher who is also co-author of the 1.0 specification. "For me, an RSS subject could be a path or a URL. For someone else, it's keywords. Certain things mean different things for different people."
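To solve the problem, RSS 1.0 developers added XML namespaces to the RSS code. A namespace declaration ties each extra tag to a shared, published vocabulary, so a field such as subject or language means the same thing in every feed that uses it. In addition, RSS 1.0 is built on the Resource Description Framework (RDF), a way of expressing metadata, that is, information about the content being described. Most importantly, that metadata makes RSS feeds easier to search.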
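Here is a rough sketch of what namespaced fields look like to a program reading an RSS 1.0 feed. The feed itself is invented (example.com URLs, placeholder date), and the extra fields come from the Dublin Core vocabulary, which is one common choice for subject, language, and date; other feeds may use different modules.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, RSS 1.0, and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical RSS 1.0 feed; the site, URLs, and date are placeholders.
SAMPLE_RSS10 = """<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.com/feed.rdf">
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Headlines from a made-up site</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.example.com/stories/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.com/stories/1">
    <title>First headline</title>
    <link>http://www.example.com/stories/1</link>
    <dc:subject>libraries</dc:subject>
    <dc:language>en</dc:language>
    <dc:date>2001-01-01</dc:date>
  </item>
</rdf:RDF>"""


def read_items(feed_xml):
    """Return one dict per item, including the namespaced Dublin Core fields."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.findall("rss:item", NS):
        results.append({
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "link": item.findtext("rss:link", default="", namespaces=NS),
            "subject": item.findtext("dc:subject", default="", namespaces=NS),
            "language": item.findtext("dc:language", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
        })
    return results


for entry in read_items(SAMPLE_RSS10):
    print(entry)
```

Because `dc:subject` is bound to a published namespace, a reader does not have to guess whether one site's "subject" is a path, a URL, or a keyword list in the way Dornfest describes; the vocabulary says what the field is for.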
So which RSS format should you support? It depends on your familiarity with XML. In simple terms, 1.0 is more complex, making it harder for someone without XML experience to create an RSS file. However, if you need to provide more information about your specific newsfeeds, 1.0 may be a better option, says Dornfest. "There's a definite migration path between the two so it's really going to depend on the individual," he says.
Stepping Stones to RSS
If you want to see what's out there before you create your own RSS newsfeed, there are plenty of places you can go to get started. The easiest point of reference is one of the numerous Web sites that aggregate headlines and act as a portal for sites looking for headlines to grab. For example, NewsIsFree, My.UserLand, and Meerkat solicit RSS feeds and, in some cases, actively seek out newsfeeds, posting them to their sites.
Users who don't want to slog through thousands of sites manually can download newsfeed readers that can search using keywords or subjects. Headline Viewer and Amphetadesk are popular readers that can be downloaded from most shareware sites.
"There are centralized applications and server applications that read the RSS source and generate information on topics that you're interested in," says Dave Winer of UserLand Software, the developer of Radio UserLand, another newsfeed reader and aggregation tool.
Even with the problems that can be associated with RSS, it's still the best syndication option available today since, unfortunately, there's not much else out there. There are two main RSS competitors—ICE and NewsML—but neither has caught on in a way that threatens RSS' dominance of the syndication world. NewsML, which was developed by the International Press Telecommunications Council (IPTC), a consortium of print and wire providers, is also based on XML. Unlike RSS, instead of creating a separate newsfeed page, NewsML-compatible content features NewsML code on the same page as the actual content. This allows content providers to link to other related stories more effectively and to mark up non-text-based news, such as images, video and audio. "NewsML comes out of the XML development community and it's much more complicated," says Dave Winer of UserLand Software, one of the original RSS standards developers. "They don't understand what publishers deal with every day. They're interested in the benefit, but not the art."
Like RSS and NewsML, the Information and Content Exchange (ICE) standard is also an XML syndication protocol used to automate syndication. However, unlike NewsML or RSS, ICE requires cooperation between two or more parties since its premise is based on running an ICE syndication server in conjunction with a content management system. It doesn't allow just anyone to grab a specific newsfeed. Instead, it's designed to facilitate existing content distribution agreements, which few publishers use today.
Blake Carver of LISNews says he isn't likely to make a switch to the new technologies any time soon. "Our RSS file is our biggest referrer outside of Google," he says. Pamela Parker, managing editor of ChannelSeven.com, a network of marketing sites, agrees.
"We just put the code on our site and let the Web do its magic," says Parker. "We think there are a lot of sites out there that want marketing content, but don't want to create it. RSS lets us be there for them."