When Piggy Bank collects “pure” information from a web page that does not itself include that information, it must invoke software code that interprets the page’s content and reconstructs the “pure” information contained inside. A different piece of code is required for each web page; that piece of code is called a “screen scraper”.
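To make the idea concrete, here is a minimal sketch in Python of what a scraper does in principle: it takes a page's markup and returns structured statements about the page. This is only an illustration, not how Piggy Bank's own scrapers are written; the `TitleScraper` class, the `scrape` function, and the `"title"` property name are all hypothetical.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Hypothetical scraper: collects the text inside the page's <title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect all of them.
        if self._in_title:
            self.title_parts.append(data)


def scrape(url, html):
    """Return (subject, property, value) statements recovered from the markup."""
    parser = TitleScraper()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    # An empty list means this scraper found nothing it understands.
    return [(url, "title", title)] if title else []
```

A real scraper would recover far richer data (authors, prices, dates), which is why each kind of page needs its own code.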
Find
Screen scrapers are themselves described as “pure” information that you can collect from web pages. This wiki contains a list of available screen scrapers.
If you write a scraper yourself, you are welcome to submit it to that list.
Inspect
Each screen scraper has two properties of interest:
- A URL pattern, which, when it matches a web page’s URL, tells Piggy Bank that the screen scraper can be used to reconstruct information from that web page.
- A URL pointing to the screen scraper's code. You can click on this link to examine the code yourself.
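The two properties above can be sketched as a small record plus a lookup. This is a hedged illustration in Python, not Piggy Bank's actual code: it assumes the URL pattern is a regular expression, and the `Scraper` class, the `find_scrapers` function, and the example URLs are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Scraper:
    url_pattern: str  # assumed here to be a regular expression
    code_url: str     # where the scraper's code can be examined or downloaded


def find_scrapers(page_url, scrapers):
    """Return the scrapers whose URL pattern matches the given page URL."""
    return [s for s in scrapers if re.search(s.url_pattern, page_url)]


# Hypothetical scrapers for two made-up sites.
scrapers = [
    Scraper(r"^https?://example\.org/books/", "http://example.org/scrapers/books.js"),
    Scraper(r"^https?://example\.com/news/", "http://example.com/scrapers/news.js"),
]
matches = find_scrapers("http://example.org/books/42", scrapers)
```

Here `matches` holds only the first scraper, because only its pattern matches the page URL.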
Install
To install a screen scraper, first treat it like any other data: while viewing a web page or document that describes the scraper, click the data coin in your status bar (usually in the lower right corner of the browser window). Save its information to your Piggy Bank, then activate it. To activate a screen scraper, click on the icon at the upper right corner of the screen scraper’s information. The icon changes to indicate that the screen scraper is activated.
To deactivate a screen scraper, click on its icon. To find the screen scrapers you have saved, go to the “My Piggy Bank” page and click on “Screenscrapers” in the list of types on the right side.
For a brief visual recording of a user installing and using screen scrapers, see the Piggy Bank Screencasts; you can also read the accompanying tutorial.
Use
When you command Piggy Bank to collect “pure” information from a web page and Piggy Bank finds a suitable screen scraper for that page, it automatically invokes the screen scraper’s code. The first time a screen scraper is used, Piggy Bank asks you for permission to download its code. The code is then cached for subsequent invocations. Some scrapers come with more precise instructions regarding their use; be sure to note those if you intend to use a specific scraper.
To purge the cached code of a screen scraper and cause it to be re-fetched, deactivate the screen scraper and then re-activate it.
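The download, caching, and purging behaviour described in this section can be sketched as follows. This is a rough model in Python under stated assumptions, not Piggy Bank's implementation: the `ScraperCodeCache` class and its method names are hypothetical, and the sketch assumes that the first use after a purge is treated like a first use, including the permission prompt.

```python
class ScraperCodeCache:
    """Hypothetical model of how scraper code is fetched once and then reused."""

    def __init__(self, download, ask_permission):
        self._download = download              # callable: code_url -> source text
        self._ask_permission = ask_permission  # callable: code_url -> bool
        self._cache = {}

    def get_code(self, code_url):
        """Return the scraper's code, downloading it only on first use."""
        if code_url in self._cache:
            return self._cache[code_url]
        # First use: the user must agree before anything is downloaded.
        if not self._ask_permission(code_url):
            return None
        code = self._download(code_url)
        self._cache[code_url] = code
        return code

    def deactivate(self, code_url):
        """Drop the cached code so the next use fetches it again."""
        self._cache.pop(code_url, None)
```

In this model, deactivating and re-activating a scraper amounts to calling `deactivate` and then letting the next `get_code` call download a fresh copy.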