Crowbar Web Service

Crowbar offers its services via an HTTP web service. This page describes the parameters and gives examples of how to interact with the service.

Design

Crowbar is built around an embarrassingly simple HTTP server written in JavaScript that uses some of the Mozilla XPCOM components to access the network. A simple HTTP web service (sometimes called RESTful) is much easier to implement than a SOAP-based one, and clients are also much easier to write (if you're not blinded by Visual Studio, that is).

Supported HTTP Methods

You can make requests using either the GET or the POST method; the result is exactly the same. The reason is that some HTTP clients are easier to work with using GET (say, a web browser) and others using POST (say, 'curl' from the command line).

As the HTML spec recommends for form submissions, the parameters are encoded using the 'application/x-www-form-urlencoded' MIME type. For GET, they are appended to the request URL after the '?'; for POST, they are sent as the payload of the request.

You shouldn't need to care about this, though, as any HTTP client API or program will do the above automatically.
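
To make the difference between the two methods concrete, here is a minimal sketch in Python that builds (but does not send) the same request both ways using only the standard library. The endpoint address is an assumption, since this page does not say where Crowbar listens, and the target page is just a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the address Crowbar listens on is not given on this page;
# replace it with the address of your own Crowbar instance.
CROWBAR_ENDPOINT = "http://127.0.0.1:10000/"

# Placeholder parameters: the page to process and the delay in milliseconds.
params = {"url": "http://example.org/", "delay": "3000"}

# Encode the parameters as application/x-www-form-urlencoded.
encoded = urlencode(params)

# GET: the encoded parameters go after the '?' in the request URL.
get_request = Request(CROWBAR_ENDPOINT + "?" + encoded)

# POST: the same encoded string is sent as the payload of the request.
post_request = Request(
    CROWBAR_ENDPOINT,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Passing either request object to urllib.request.urlopen would send it, which requires a running Crowbar instance.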

Supported Parameters

Name     Default  Values   Description
url      [none]   URL      the URL of the page that Crowbar should process
mode     dump     dump     serializes the DOM of the requested page (after waiting for the delay set by the 'delay' parameter below)
                  links    lists the links the page contains
                  scrape   uses the scraper located by the 'scraper' parameter to scrape data from the page and returns it in RDF/XML format [not yet implemented]
                  exhibit  obtains the data from the page by invoking the Exhibit embedded in it and returns the data formatted as RDF/XML [not yet implemented]
view     as-is    as-is    returns the result directly as the payload of the HTTP response
                  browser  returns the result embedded in an HTML page (useful when invoking Crowbar directly through a browser)
delay    3000     [ms]     how long Crowbar should wait after the page has finished loading before attempting to serialize its DOM
scraper  [none]   URL      the URL of the scraper that Crowbar should use to extract data from the web page
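
As a rough illustration of how these parameters combine, the following Python sketch assembles the form-encoded parameter string for a request. The helper build_crowbar_query is hypothetical and not part of Crowbar; it only mirrors the names, defaults, and allowed values listed above, and the target page is a placeholder.

```python
from urllib.parse import urlencode

# Allowed values as listed in the parameter table. Note that 'scrape' and
# 'exhibit' are marked as not yet implemented there.
MODES = {"dump", "links", "scrape", "exhibit"}
VIEWS = {"as-is", "browser"}


def build_crowbar_query(url, mode="dump", view="as-is", delay=3000, scraper=None):
    """Return the form-encoded parameters for one Crowbar request.

    Hypothetical helper for illustration; defaults follow the table above.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if view not in VIEWS:
        raise ValueError(f"unknown view: {view!r}")
    if delay < 0:
        raise ValueError("delay must be a non-negative number of milliseconds")

    params = {"url": url, "mode": mode, "view": view, "delay": delay}
    # 'scraper' has no default, so it is only sent when given.
    if scraper is not None:
        params["scraper"] = scraper
    return urlencode(params)


# List the links of a placeholder page, wrapped in an HTML page for
# viewing in a browser, after waiting 5 seconds for the page to settle.
query = build_crowbar_query(
    "http://example.org/", mode="links", view="browser", delay=5000
)
print(query)
```

The resulting string can be appended after the '?' of a GET request URL or sent as the payload of a POST request.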