RSS and Atom: The Protocols in Detail and Their Shortcomings

Author: yifei / Created: Dec. 27, 2016, 7:35 p.m. / Modified: Nov. 13, 2017, 7:36 p.m.


CDATA stands for Character Data. Text placed between the <![CDATA[ and ]]> delimiters may contain characters that would otherwise be interpreted as XML markup, but the parser treats it as plain character data.

So CDATA can be used to smuggle HTML into an XML document without the HTML disturbing the document's structure; XSLT can later pull it out and emit it into the HTML document being generated. In short, you don't have to escape every < and & inside a CDATA section; the only sequence that cannot appear there is ]]>, which ends the section.
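A small sketch of the point above, using Python's standard `xml.etree.ElementTree`: the same HTML payload is written once inside a CDATA section and once with manual escaping, and the parser hands back identical text for both. The `description` element and the sample payload are only illustrative.

```python
import xml.etree.ElementTree as ET

# The same HTML payload, once wrapped in CDATA and once escaped by hand.
with_cdata = "<description><![CDATA[<p>Tom & Jerry</p>]]></description>"
escaped = "<description>&lt;p&gt;Tom &amp; Jerry&lt;/p&gt;</description>"

cdata_text = ET.fromstring(with_cdata).text
escaped_text = ET.fromstring(escaped).text

print(cdata_text)                  # <p>Tom & Jerry</p>
print(cdata_text == escaped_text)  # True
```

CDATA is therefore only a convenience for whoever writes the document; a consumer sees the same character data either way.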

RSS 2.0

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Example Feed</title>
        <description>Insert witty or insightful remark here</description>
        <lastBuildDate>Sat, 13 Dec 2003 18:30:02 GMT</lastBuildDate>
        <managingEditor> (John Doe)</managingEditor>
        <item>
            <title>Atom-Powered Robots Run Amok</title>
            <pubDate>Sat, 13 Dec 2003 18:30:02 GMT</pubDate>
            <description>Some text.</description>
            <source>Shit News</source>
        </item>
    </channel>
</rss>
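As a rough illustration of how a reader might consume such a feed, here is a minimal sketch using only Python's standard library. It assumes the usual RSS 2.0 layout of `rss > channel > item`; the `parse_rss` helper and the `RSS_SAMPLE` constant are names made up for this example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# A trimmed-down feed in the RSS 2.0 layout (rss > channel > item).
RSS_SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <description>Insert witty or insightful remark here</description>
    <item>
      <title>Atom-Powered Robots Run Amok</title>
      <pubDate>Sat, 13 Dec 2003 18:30:02 GMT</pubDate>
      <description>Some text.</description>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_bytes):
    """Return (feed title, list of item dicts) for an RSS 2.0 document."""
    channel = ET.fromstring(xml_bytes).find("channel")
    items = []
    for item in channel.findall("item"):
        items.append({
            "title": item.findtext("title"),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
            "description": item.findtext("description"),
        })
    return channel.findtext("title"), items


feed_title, items = parse_rss(RSS_SAMPLE)
print(feed_title, len(items), items[0]["title"])
```

RSS 2.0 elements are not namespaced, so plain tag names are enough for lookups here.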

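Shortcomings of the RSS protocol and directions for improvement

  1. No field that marks how important an article is
  2. No way to report information such as subscriber counts back to the RSS publisher
  3. No branding features
  4. No machine recommendation
  5. If RSS could be packaged as a service like Amazon Prime, users might be quite willing to pay for it

In practice, creating, deleting, updating, and reading articles is a combined set of operations. A single RSS document used only as a list is clearly not enough for that, so the protocol has to be extended.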
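One way such an extension could look: RSS 2.0 allows extra elements as long as they are defined in an XML namespace. The sketch below adds an importance field to an item. The namespace URI `http://example.com/ns/feed-ext`, the `ext` prefix, and the `importance` element are all invented for this illustration and are not part of any published extension.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element name, invented for this sketch.
EXT_NS = "http://example.com/ns/feed-ext"
ET.register_namespace("ext", EXT_NS)


def build_item(title, importance):
    """Build an RSS <item> carrying a namespaced importance score."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, f"{{{EXT_NS}}}importance").text = str(importance)
    return item


xml_text = ET.tostring(build_item("Atom-Powered Robots Run Amok", 5),
                       encoding="unicode")
print(xml_text)

# A reader that knows the namespace can pick the value up again;
# a reader that does not simply ignores the unknown element.
parsed = ET.fromstring(xml_text)
importance = int(parsed.findtext(f"{{{EXT_NS}}}importance"))
print(importance)  # 5
```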

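A modern RSS reader needs to cover four areas

  1. A community
  2. RSS feeds for every service, including sites that do not offer RSS themselves
  3. A commenting service
  4. Transcoding. Some feeds provide only a summary of each article, some are time-sensitive, and some ship their own fonts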
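For the second point in the list above, a reader could synthesize a feed for a site that has none. The following is a minimal sketch that assumes the articles have already been fetched by some other means; `build_rss`, the `articles` structure, and the `http://example.com/` link are hypothetical. Placing the full body in `description` also touches on the summary-only problem mentioned under transcoding.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(feed_title, feed_link, feed_description, articles):
    """Serialize a list of article dicts as an RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        # Full text rather than a summary, so the reader needs no extra fetch.
        ET.SubElement(item, "description").text = article["body"]
        # usegmt=True requires a timezone-aware UTC datetime.
        ET.SubElement(item, "pubDate").text = format_datetime(
            article["published"], usegmt=True)
    return ET.tostring(rss, encoding="unicode")


# Hypothetical article data standing in for scraped content.
articles = [{
    "title": "Atom-Powered Robots Run Amok",
    "body": "<p>Some text.</p>",
    "published": datetime(2003, 12, 13, 18, 30, 2, tzinfo=timezone.utc),
}]
feed_xml = build_rss("Example Feed", "http://example.com/",
                     "Generated for a site without RSS", articles)
print(feed_xml)
```

ElementTree escapes the `<` and `&` characters in the body automatically, so the HTML survives as character data without a CDATA section.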


Atom 1.0

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <subtitle>Insert witty or insightful remark here</subtitle>
    <link href=""/>
    <author>
        <name>John Doe</name>
    </author>
    <entry>
        <title>Atom-Powered Robots Run Amok</title>
        <link href=""/>
        <summary>Some text.</summary>
    </entry>
</feed>
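Unlike RSS 2.0, Atom elements live in the `http://www.w3.org/2005/Atom` namespace, so lookups have to be namespace-qualified. A minimal parsing sketch with Python's standard library follows; `parse_atom`, `ATOM_SAMPLE`, and the `atom` prefix are names chosen for this example only.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for lookups; the prefix itself is arbitrary.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

ATOM_SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <subtitle>Insert witty or insightful remark here</subtitle>
  <author>
    <name>John Doe</name>
  </author>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <summary>Some text.</summary>
  </entry>
</feed>"""


def parse_atom(xml_bytes):
    """Extract the feed title, author name, and entries from an Atom feed."""
    root = ET.fromstring(xml_bytes)
    entries = [
        {
            "title": entry.findtext("atom:title", namespaces=ATOM_NS),
            "summary": entry.findtext("atom:summary", namespaces=ATOM_NS),
        }
        for entry in root.findall("atom:entry", ATOM_NS)
    ]
    return {
        "title": root.findtext("atom:title", namespaces=ATOM_NS),
        "author": root.findtext("atom:author/atom:name", namespaces=ATOM_NS),
        "entries": entries,
    }


feed = parse_atom(ATOM_SAMPLE)
print(feed["title"], feed["author"], feed["entries"][0]["title"])
```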