Homework Assignment 0: RSS Data Extraction/Organization


Fall 2011


The objectives of this homework assignment:

  1. Understand the concept of web scraping

  2. Design an algorithm to extract information from an RSS feed

  3. Organize information from an RSS feed after scraping

  4. Use the concepts of object-oriented programming to organize data

  5. Design an algorithm to search for objects by given criteria


Problem Description
You are working for Mobile Media, Inc., a company that specializes in software solutions for media on the go. Your first project is to develop an Android application that organizes and displays the most recent news entries from Google News RSS feeds.

RSS (Really Simple Syndication) is a family of web feed formats used to publish frequently updated works (such as blog entries, news headlines, audio, and video) in a standardized format. Your boss feels that extracting data directly from RSS feeds is fairly straightforward, but would be a good challenge for you. Your boss recommends looking at the source code of some RSS feeds to get a feel for the structure of the individual entries (more commonly known as ‘items’) in a feed.

In this assignment, you will examine the RSS feed for a basic Google search. The basic format of the URL for a Google news search as an RSS feed is http://news.google.com/news?q=SEARCHKEY&output=rss.
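
As a minimal sketch of how the feed URL above might be built from a user's search key: the class name FeedUrlBuilder and the method buildFeedUrl() are hypothetical and are not part of the provided template. The sketch assumes the search key should be URL-encoded so that spaces and symbols survive in the query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: neither the class nor the method name comes from the template.
final class FeedUrlBuilder {
    private static final String PREFIX = "http://news.google.com/news?q=";
    private static final String SUFFIX = "&output=rss";

    // Replaces SEARCHKEY in the URL pattern with the URL-encoded search key.
    static String buildFeedUrl(String searchKey) {
        try {
            return PREFIX + URLEncoder.encode(searchKey.trim(), "UTF-8") + SUFFIX;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, a search key such as "android tablets" is encoded with a plus sign in place of the space before it is placed in the URL.
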
So your program should do the following:

  1. Read the user's search query from an EditText. EditText is an Android widget class very similar to Swing's JTextField. See Figure 1 below for a screenshot of the user input during application runtime.

  2. Fetch the source code of the RSS feed that corresponds to the search query and extract the data for each article listed at the time of access.

  3. Organize the information for each article into a NewsArticle class. Each object of this type should have the following information that can be extracted from the RSS feed:

    1. Title (Headline)

    2. Original URL

    3. Date/time published

    4. Basic description

  4. Store each NewsArticle object in an ArrayList.
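
One possible shape for the NewsArticle class is sketched below. The four data members follow the list above, but the field names, constructor, and accessor names are assumptions; the provided NewsArticle.java and the display code may expect different ones.

```java
// Sketch only: field and accessor names are assumptions, not the template's.
class NewsArticle {
    private final String title;        // headline
    private final String url;          // original article URL
    private final String published;    // date/time published, kept as the raw feed string
    private final String description;  // basic description

    NewsArticle(String title, String url, String published, String description) {
        this.title = title;
        this.url = url;
        this.published = published;
        this.description = description;
    }

    String getTitle() { return title; }
    String getUrl() { return url; }
    String getPublished() { return published; }
    String getDescription() { return description; }

    @Override
    public String toString() {
        return title + " (" + published + ")";
    }
}
```

Objects of this type can then be collected with, for example, `ArrayList<NewsArticle> articles = new ArrayList<NewsArticle>();` followed by `articles.add(...)` for each item in the feed.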

To do this, you must implement a searchArticles() method that takes the URL of a feed as a parameter and returns an ArrayList of NewsArticle objects holding the data collected from the page source at that URL.
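
The extraction step could be approached in several ways. The sketch below uses the standard DOM parser rather than hand-written string scanning, and assumes each entry is an RSS 2.0 `<item>` element with `<title>`, `<link>`, `<pubDate>`, and `<description>` children. The names RssFeedParser, Article, and parse() are hypothetical; Article is only a stand-in for your own NewsArticle class.

```java
import java.io.InputStream;
import java.util.ArrayList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: class and method names are not part of the template.
final class RssFeedParser {

    // Stand-in for NewsArticle so that this example is self-contained.
    static final class Article {
        final String title, link, pubDate, description;

        Article(String title, String link, String pubDate, String description) {
            this.title = title;
            this.link = link;
            this.pubDate = pubDate;
            this.description = description;
        }
    }

    // Reads an RSS document from the stream and returns one Article per <item>.
    static ArrayList<Article> parse(InputStream in) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(in);
            NodeList items = doc.getElementsByTagName("item");
            ArrayList<Article> result = new ArrayList<Article>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                result.add(new Article(text(item, "title"), text(item, "link"),
                        text(item, "pubDate"), text(item, "description")));
            }
            return result;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers see a single unchecked exception type.
            throw new IllegalArgumentException("Could not parse RSS feed", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

In a method that receives the feed URL, the stream could come from `new java.net.URL(feedUrl).openStream()`, which must be closed after parsing. A missing child element yields an empty string here instead of null, which keeps the display code free of null checks; that is a design choice of this sketch, not a requirement of the assignment.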

If the scrape is successful, the program should display a list of articles with links. Clicking a link should open the article in the Android browser application. See Figure 2 below for an example of this display during application runtime. Please remember that these displays have been implemented for you. Your search results display will work properly if your NewsArticle class has been implemented with the data members described above and if the searchArticles() method returns a non-empty ArrayList object. Some demonstration code is given in the searchArticles() method.

Figure 1 - NewsMain, the first screen that appears when you run the application. The user will input a string into the text box and press the Search button to search an RSS feed for that search query.

Figure 2 - NewsDisplay, the screen that appears when you type a search query and press the Search button. It contains the results of the RSS feed for the particular search query.

Finally, to enhance the functionality of your program, your boss would like you to add a feature that lets the user save search keys to a list of “Saved Searches”. The user can load these saved searches by pressing the MENU button in the NewsMain activity.

Clicking the “Saved Searches” option will load the activity SavedSearchesList. In this activity, the user will be able to view the saved searches. If the user clicks a search, it will run that search and go directly to NewsDisplay, where it displays articles.
It is acceptable to use a file located on the external storage of the Android device. Code has been provided that will create File objects for the file “savedSearches.txt” which is located at the root of the external storage.
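
A minimal sketch of one way to persist saved searches follows, assuming one search key per line in the text file. The class SavedSearchStore and its methods are hypothetical; the File object would come from the provided code that locates “savedSearches.txt”.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;

// Hypothetical helper: stores one search key per line (an assumed file format).
final class SavedSearchStore {

    // Appends a search key to the end of the file, creating the file if needed.
    static void append(File file, String searchKey) throws IOException {
        // The second FileWriter argument (true) selects append mode.
        try (BufferedWriter out = new BufferedWriter(new FileWriter(file, true))) {
            out.write(searchKey.trim());
            out.newLine();
        }
    }

    // Returns all non-blank lines of the file, or an empty list if it does not exist.
    static ArrayList<String> load(File file) throws IOException {
        ArrayList<String> keys = new ArrayList<String>();
        if (!file.exists()) {
            return keys;
        }
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = in.readLine()) != null) {
                String key = line.trim();
                if (key.length() > 0) {
                    keys.add(key);
                }
            }
        }
        return keys;
    }
}
```

The list returned by load() could then back the ListView in SavedSearchesList. Note that writing to external storage on Android also depends on the permissions declared in AndroidManifest.xml.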

SavedSearchesList Activity GUI

Extra Credit

Left: NewsMain with options menu displaying “Saved Searches” and “Favorites” options

Right: FavoritesList Activity GUI

Similar to the “saved searches” feature described above, your boss would like a feature where the user can save articles, called “Favorites”. While viewing articles in the NewsDisplay activity, the user should be able to long-press the desired article and be offered an option to save it to their Favorites. The user can also view these “Favorites” through the MENU button in the NewsMain activity. Clicking the “Favorites” option will load the FavoritesList activity. In this activity, the user will be able to view saved Favorites. If the user clicks one of these favorites, it will load in the Android browser, just as it would from the normal NewsDisplay.

Just as in the “saved searches” feature, it is acceptable to use a file located on the external storage of the Android device. Code has been provided that will create File objects for the file “favorites.txt” which is located at the root of the external storage.
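
Because a favorite needs at least a title to display and a URL to open, each line of “favorites.txt” has to hold more than one value. The sketch below assumes a tab-separated format with the title first and the URL second; both the format and the FavoriteLine class are hypothetical choices, not requirements of the assignment.

```java
// Hypothetical helper: encodes one favorite as "title<TAB>url" (an assumed file format).
final class FavoriteLine {

    // Builds one line of text for the favorites file.
    static String encode(String title, String url) {
        // Tabs and line breaks inside the title would corrupt the format, so collapse them to spaces.
        String safeTitle = title.replaceAll("[\\t\\r\\n]+", " ").trim();
        return safeTitle + "\t" + url.trim();
    }

    // Splits a line back into { title, url }; rejects lines that contain no tab.
    static String[] decode(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("Malformed favorites line: " + line);
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }
}
```

Each encoded line can be written and read with ordinary line-based file I/O, and the decoded URL can be handed to the browser in the same way as an article in NewsDisplay.
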
Supplied Solution Components
A template Eclipse project, including a graphical user interface (GUI), is provided. If you wish to create your own project, you can copy the required files from the provided project into yours. The files that must be copied are:

  1. /src/packageName/NewsMain.java

  2. /src/packageName/NewsDisplay.java

  3. /src/packageName/NewsArticle.java

  4. /src/packageName/SavedSearchesList.java

  5. /src/packageName/FavoritesList.java

  6. /res/layout/news_main.xml (XML layout for NewsMain.java)

  7. /res/layout/news_display.xml (XML layout for NewsDisplay.java)

  8. /res/layout/row.xml (used for ListView in NewsDisplay)

  9. /res/layout/list_item.xml (used for ListView in SavedSearchesList)

  10. /res/values/strings.xml (GUI variables)

  11. /res/menu/saves_menu.xml (Options menu in NewsMain)

  12. /AndroidManifest.xml (application settings, permissions, etc.)

Submission Requirements
You will need to submit the following:

  1. All of the .java classes in the /src folder that you implement (NewsDisplay, NewsArticle)

  2. If you did extra credit, please also submit SavedSearchesList.java and FavoritesList.java


  3. The Android package (.apk), which must work when installed directly on an Android device. The .apk file is located in the /bin directory of your Eclipse project and is updated after each build. Make sure that you have done a final build and fully tested it before submitting this file.
