Parsing and Writing RSS Using Jython and Project ROME

Submitted By: Josh Juneau

Introduction

RSS is an old technology now...it has been around for years. However, it is a technology which remains very useful for disseminating news and other information. The ROME project on java.net is helping to make parsing, generating, and publishing RSS and Atom feeds a breeze for any Java developer.

Since I am particularly fond of translating Java to Jython code, I've taken simple examples from the Project ROME wiki and translated Java RSS reader and writer code into Jython. It is quite easy to do, and it only takes a few lines of code.

Keep in mind that you would still need to build a front-end viewer for such an RSS reader, but I think you will get the idea of how easy it can be just to parse a feed with Project ROME and Jython.

Setting Up The CLASSPATH

In order to use this example, you must obtain the ROME and JDOM jar files and place them into your CLASSPATH.

Windows:

set CLASSPATH=C:\Jython\Jython2.2\rome-0.9.jar;%CLASSPATH%
set CLASSPATH=C:\Jython\Jython2.2\jdom.jar;%CLASSPATH%

OSX:

export CLASSPATH=/path/to/rome-0.9.jar:/path/to/jdom.jar

Parsing Feeds

Parsing feeds is easy with ROME. Using ROME with Jython makes it even easier with the elegant Jython syntax. I am not a professional Python or Jython programmer, I am a Java programmer and DBA by profession, so my Jython interpretation may be even wordier than it should be.

I took the FeedReader example from the ROME site and translated it into Jython below. You can copy and paste the following code into your own FeedReader.py module and run it to parse feeds. However, the output is unformatted and ugly...creating a good looking front end is up to you.

FeedReader.py

########################################
# File: FeedReader.py
#
# This module can be used to parse an RSS feed
########################################
from java.net import URL
from java.io import InputStreamReader
from java.lang import Exception
from java.lang import Object
from com.sun.syndication.feed.synd import SyndFeed
from com.sun.syndication.io import SyndFeedInput
from com.sun.syndication.io import XmlReader

class FeedReader(Object):
   def __init__(self, url):
      self.inUrl = url

   def readFeed(self):
      ok = False
      #####################################
      # If url passed in is blank, then use a default
      #####################################
      if self.inUrl != '':
         rssUrl = self.inUrl
      else:
         rssUrl = "http://www.dzone.com/feed/frontpage/java/rss.xml"
      #####################################
      # Parse feed located at given URL
      #####################################
      try:
         feedUrl = URL(rssUrl)
         input = SyndFeedInput()
         feed = input.build(XmlReader(feedUrl))
         ####################################
         # Do something here with feed data
         ####################################
         print(feed)
         ok = True
      except Exception, e:
         print 'An exception has occurred', e
      if ok != True:
         print 'An error has occurred in this reader'

if __name__== "__main__":
    reader = FeedReader('')
    reader.readFeed()
    print '****************Command Complete...RSS has been parsed*****************'

Creating Feeds

Similar to parsing a feed, writing a feed is also quite easy. When one creates a feed, it appears to be a bit more complex than parsing, but if you are familiar with XML and it's general structure then it should be relatively easy.

Creating a feed is a three step process. You must first create the feed element itself, then you must add individual feed entries, and lastly you must publish the XML.

FeedWriter.py

########################################
# File: FeedReader.py
#
# This module can be used to create an RSS feed
########################################
from com.sun.syndication.feed.synd import *
from com.sun.syndication.io import SyndFeedOutput
from java.io import FileWriter
from java.io import Writer
from java.text import DateFormat
from java.text import SimpleDateFormat
from java.util import ArrayList
from java.util import List
from java.lang import Object

class FeedWriter(Object):
    ####################################
    # Set up the date format
    ####################################
    def __init__(self, type, name):
       self.DATE_PARSER = SimpleDateFormat('yyyy-MM-dd')
       self.feedType = type
       self.fileName = name

    def writeFeed(self):
       ok = False
       try:
        ################################
        # Create the feed itself
        ################################
          feed = SyndFeedImpl()
          feed.feedType =self.feedType
          feed.title = 'Sample Feed (created with ROME)'
          feed.link = 'http://rome.dev.java.net'
          feed.description = 'This feed has been created using ROME and Jython'
 
         ###############################
         # Add entries to the feed
         ###############################
          entries = ArrayList()
          entry = SyndEntryImpl()
          entry.title = 'ROME v1.0'
          entry.link = 'http://wiki.java.net/bin/view/Javawsxml/Rome01'
          entry.publishedDate = self.DATE_PARSER.parse("2004-06-08")
          description = SyndContentImpl()
          description.type = 'text/plain'
          description.value = 'Initial Release of ROME'
          entry.description = description
          entries.add(entry)
     
          entry = SyndEntryImpl()
          entry.title = 'ROME v2.0'
          entry.link = 'http://wiki.java.net/bin/view/Javawsxml/Rome02'
          entry.publishedDate = self.DATE_PARSER.parse("2004-06-16")
          description = SyndContentImpl()
          description.type = 'text/plain'
          description.value = 'Bug fixes, minor API changes and some new features'
          entry.description = description
          entries.add(entry)

          entry = SyndEntryImpl()
          entry.title = 'ROME v3.0'
          entry.link = 'http://wiki.java.net/bin/view/Javawsxml/Rome03'
          entry.publishedDate = self.DATE_PARSER.parse("2004-07-27")
          description = SyndContentImpl()
          description.type = 'text/plain'
          description.value = '<p>More Bug fixes, mor API changes, some new features and some Unit testing</p>'
          entry.description = description
          entries.add(entry)

          feed.entries = entries
         ###############################
         # Publish the XML
         ###############################
          writer = FileWriter(self.fileName)
          output = SyndFeedOutput()
          output.output(feed,writer)
          writer.close()
 
          print('The feed has been written to the file')

          ok = True
  
       except Exception, e:
          print 'There has been an exception raised',e

       if ok == False:
          print 'Feed Not Printed'

if __name__== "__main__":
    ####################################
    # You must change his file location
    # if not using Windows environment
    ####################################
    writer = FeedWriter('rss_2.0','C:\\TEMP\\testRss.xml')
    writer.writeFeed()
    print '****************Command Complete...RSS XML has been created*****************'

After you have created the XML, you'll obviously need to place it on a web server somewhere so that others can use your feed. The FeedWriter.py module would probably be one module amongst many in an application for creating and managing RSS Feeds, but you get the idea.

Conclusion

As you can see, using the ROME library to work with RSS feeds is quite easy. Using the ROME library within a Jython application is straight forward. As you have now seen how easy it is to create and parse feeds, you can apply these technologies to a more complete RSS management application if you'd like. The world of RSS communication is at your fingertips!

Resources

Project ROME Tutorials

JythonMonthly/Articles/October2007/1 (last edited 2014-06-12 09:43:11 by AdamBurke)