: C# 2008 Programmer

An Example Using RSS

Let's now take a look at the usefulness of LINQ to XML. Suppose that you want to build an application that downloads an RSS document, extracts the title of each posting, and displays the link to each post.

Figure 14-13 shows an example of an RSS document.


Figure 14-13

To load an XML document directly from the Internet, you can use the Load() method from the XDocument class:

XDocument rss =
    XDocument.Load(@"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");

To retrieve the title of each posting and then reshape the result, use the following query:

var posts =
    from item in rss.Descendants("item")
    select new {
        Title = item.Element("title").Value,
        URL = item.Element("link").Value
    };

In particular, the query finds every <item> element in the document and, for each one, extracts the values of its <title> and <link> child elements:

<rss>
  <channel>
    ...
    <item>
      <title>...</title>
      <link>...</link>
    </item>
    <item>
      <title>...</title>
      <link>...</link>
    </item>
    <item>
      <title>...</title>
      <link>...</link>
    </item>
    ...

Finally, print out the title and URL for each post:

foreach (var post in posts) {
    Console.WriteLine("{0}", post.Title);
    Console.WriteLine("{0}", post.URL);
    Console.WriteLine();
}

Figure 14-14 shows the output.


Figure 14-14
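
The load, query, and print steps above can also be tried without network access. The following is a minimal sketch, not the book's listing: it assumes a small hypothetical feed held in a string (SampleFeed), parses it with XDocument.Parse() instead of XDocument.Load(), and projects into a hypothetical named type (Post) so that the query result can be returned from a method. The class and member names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical named type standing in for the anonymous type used in the text.
public class Post
{
    public string Title;
    public string Url;
}

public static class RssSample
{
    // Hypothetical two-item feed used in place of the live feed.
    public const string SampleFeed =
        @"<rss version=""2.0"">
            <channel>
              <title>Sample feed</title>
              <item>
                <title>First post</title>
                <link>http://example.com/first</link>
              </item>
              <item>
                <title>Second post</title>
                <link>http://example.com/second</link>
              </item>
            </channel>
          </rss>";

    // Same query shape as in the text: find every <item>, then read
    // its <title> and <link> child elements.
    public static List<Post> ExtractPosts(string xml)
    {
        XDocument rss = XDocument.Parse(xml);
        var posts =
            from item in rss.Descendants("item")
            select new Post {
                Title = item.Element("title").Value,
                Url = item.Element("link").Value
            };
        return posts.ToList();
    }

    public static void Main()
    {
        foreach (Post post in ExtractPosts(SampleFeed)) {
            Console.WriteLine("{0}", post.Title);
            Console.WriteLine("{0}", post.Url);
            Console.WriteLine();
        }
    }
}
```

Note that the <title> element directly under <channel> is not picked up, because Element() only looks at the direct children of each <item>.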

Query Elements with a Namespace

If you look at the RSS document structure carefully, you will notice that the <creator> element is qualified with the dc namespace prefix (see Figure 14-15).


Figure 14-15

The dc namespace is defined at the top of the document, within the <rss> element (see Figure 14-16).


Figure 14-16

When using LINQ to XML to query elements defined with a namespace, you need to specify the namespace explicitly. The following example shows how to do so by creating an XNamespace object and prepending it to the element name:

XDocument rss =
    XDocument.Load(@"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");
XNamespace dcNamespace = "http://purl.org/dc/elements/1.1/";
var posts =
    from item in rss.Descendants("item")
    select new {
        Title = item.Element("title").Value,
        URL = item.Element("link").Value,
        Creator = item.Element(dcNamespace + "creator").Value
    };
foreach (var post in posts) {
    Console.WriteLine("{0}", post.Title);
    Console.WriteLine("{0}", post.URL);
    Console.WriteLine("{0}", post.Creator);
    Console.WriteLine();
}

Figure 14-17 shows the query result.


Figure 14-17
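
To see why the namespace must be specified, it can help to compare a qualified and an unqualified lookup side by side. The sketch below is an illustration under stated assumptions: the feed in SampleFeed is hypothetical, and the second <item> deliberately has no <dc:creator> element. It also uses the explicit (string) conversion that XElement provides, which returns null for a missing element, whereas reading .Value on a missing element throws a NullReferenceException. The "(unknown)" placeholder is an arbitrary choice.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CreatorSample
{
    // Hypothetical feed; the second item has no <dc:creator> element.
    public const string SampleFeed =
        @"<rss version=""2.0"" xmlns:dc=""http://purl.org/dc/elements/1.1/"">
            <channel>
              <item>
                <title>With creator</title>
                <dc:creator>Jane Doe</dc:creator>
              </item>
              <item>
                <title>Without creator</title>
              </item>
            </channel>
          </rss>";

    // Returns the creator of each <item>, or "(unknown)" when it is absent.
    public static string[] Creators(string xml)
    {
        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        XDocument rss = XDocument.Parse(xml);
        return rss.Descendants("item")
            // (string) on a null XElement yields null instead of throwing.
            .Select(item => (string)item.Element(dc + "creator") ?? "(unknown)")
            .ToArray();
    }

    // True when an unqualified "creator" lookup finds nothing in the first item.
    public static bool UnqualifiedLookupFails(string xml)
    {
        XElement first = XDocument.Parse(xml).Descendants("item").First();
        return first.Element("creator") == null;
    }

    public static void Main()
    {
        Console.WriteLine(UnqualifiedLookupFails(SampleFeed));
        foreach (string creator in Creators(SampleFeed)) {
            Console.WriteLine(creator);
        }
    }
}
```

The unqualified name "creator" refers to an element in no namespace, so it does not match <dc:creator>; only the name built from the XNamespace object does.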

Retrieving Postings in the Last 10 Days

The <pubDate> element in the RSS document contains the date the posting was published. To retrieve all postings published in the last 10 days, use the Parse() method of the DateTime class to convert the string into a DateTime value, subtract it from the current time, and keep only the postings for which the difference is less than 10 days. Here's how that can be done:

XDocument rss =
    XDocument.Load(
        @"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");
XNamespace dcNamespace = "http://purl.org/dc/elements/1.1/";
var posts =
    from item in rss.Descendants("item")
    where (DateTime.Now -
        DateTime.Parse(item.Element("pubDate").Value)).Days < 10
    select new {
        Title = item.Element("title").Value,
        URL = item.Element("link").Value,
        Creator = item.Element(dcNamespace + "creator").Value,
        PubDate = DateTime.Parse(item.Element("pubDate").Value)
    };
Console.WriteLine("Today's date: {0}",
    DateTime.Now.ToShortDateString());
foreach (var post in posts) {
    Console.WriteLine("{0}", post.Title);
    Console.WriteLine("{0}", post.URL);
    Console.WriteLine("{0}", post.Creator);
    Console.WriteLine("{0}", post.PubDate.ToShortDateString());
    Console.WriteLine();
}
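
The date filter above parses each <pubDate> twice and throws if a date is missing or malformed. One possible variation, offered as a sketch rather than as the book's method, parses the date once with a let clause, uses DateTime.TryParse() so that unparseable items are skipped, and takes the current time as a parameter so that the result is repeatable. The feed in SampleFeed, the helper ParseUtc(), and the method RecentTitles() are all hypothetical names. Unlike the query in the text, this version also excludes future-dated postings, which is a design choice and not something the original requires.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class RecentPostsSample
{
    // Hypothetical feed: one recent item, one old item, one without a date.
    public const string SampleFeed =
        @"<rss version=""2.0"">
            <channel>
              <item>
                <title>Recent post</title>
                <pubDate>Mon, 15 Sep 2008 10:00:00 GMT</pubDate>
              </item>
              <item>
                <title>Old post</title>
                <pubDate>Fri, 01 Aug 2008 10:00:00 GMT</pubDate>
              </item>
              <item>
                <title>Undated post</title>
              </item>
            </channel>
          </rss>";

    // Titles of items published within the given number of days before nowUtc.
    public static List<string> RecentTitles(string xml, DateTime nowUtc, int days)
    {
        XDocument rss = XDocument.Parse(xml);
        var titles =
            from item in rss.Descendants("item")
            // Parse once; a missing or malformed <pubDate> gives null.
            let published = ParseUtc((string)item.Element("pubDate"))
            where published.HasValue
               && published.Value <= nowUtc
               && (nowUtc - published.Value).TotalDays < days
            select item.Element("title").Value;
        return titles.ToList();
    }

    // Converts a date string to UTC, or returns null if it cannot be parsed.
    static DateTime? ParseUtc(string text)
    {
        DateTime parsed;
        bool ok = DateTime.TryParse(text, CultureInfo.InvariantCulture,
            DateTimeStyles.AdjustToUniversal | DateTimeStyles.AssumeUniversal,
            out parsed);
        return ok ? parsed : (DateTime?)null;
    }

    public static void Main()
    {
        // A fixed "now" keeps the sample output stable; use DateTime.UtcNow
        // against a live feed.
        DateTime now = new DateTime(2008, 9, 20, 0, 0, 0, DateTimeKind.Utc);
        foreach (string title in RecentTitles(SampleFeed, now, 10)) {
            Console.WriteLine(title);
        }
    }
}
```

Feeds whose dates use time zone abbreviations that DateTime.TryParse() does not recognize would simply be skipped by this sketch, so a production reader may need a more tolerant date parser.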

