Architecture
Citeline is a web application built with Java and Javascript on top of the Butterfly web application framework. The data is currently stored on disk as flat files, while access control data, user profiles and exhibit ownership are stored in a Sesame triple store (which can be hosted on the same machine or shared across several clustered instances of Citeline).
Software Dependencies
Citeline depends on these projects:
- Butterfly - a javascript-based server-side web application framework (written as a Java servlet)
- Playground - a general-purpose data manipulation web application (written as a series of Butterfly modules)
- jsTeX - a javascript client-side library used to translate generic (yet simple) TeX commands into HTML (meant to be used with jsMath, which does the same for TeX Math commands)
- Exhibit - a javascript client-side library that turns static HTML pages into interactive faceted browsers
- Babel - used to convert BibTeX data into other formats (such as RDF and JSON)
Where is the code?
Citeline emerged as a citation-specific flavor of Playground, a general-purpose data manipulation infrastructure that the SIMILE Project was developing. For that reason, Citeline's codebase is found inside the Playground codebase in the SIMILE repository.
More specifically, you can browse Citeline's code at
http://simile.mit.edu/repository/playground/trunk/modules/citeline/
How do I install and run Citeline on my own machine?
First, obtain butterfly's source code and install it
$ svn co http://simile.mit.edu/repository/butterfly/trunk/ butterfly
$ cd butterfly
$ mvn install
Then, obtain playground's source code and install it
$ cd ..
$ svn co http://simile.mit.edu/repository/playground/trunk/ playground
$ cd playground
$ mvn install
Note that you need to have Subversion and Maven 2.0.4 or later properly installed on your machine (Maven also requires Java, so install that too if you don't already have it), and the svn and mvn commands need to be in your shell's path. If you don't know what this means or how to do it, you probably shouldn't be doing this.
Also note that both Subversion and Maven need an internet connection to download all the software required to build Playground. If you get "file not found" errors, check that both tools are configured with the appropriate proxy settings for your network before reporting the failure to us (this happens regularly).
Once the build is complete and successful, run Citeline by executing
$ ./playground (on unix)
$ playground.bat (on windows)
This will start a Jetty instance running a Butterfly instance, which executes Playground with the Citeline flavor.
Access your local Citeline by pointing your browser to
http://127.0.0.1:9000/playground/
Code Description
Citeline is built as a Butterfly module and is constructed as an extension of the playground module. Butterfly module extension is similar to the way CSS stylesheets can 'cascade': if a URL is not present in the extending module, the request falls through to the extended one. In this case, the Citeline module contains the citeline-specific code and configurations, while everything that is more general about data playgrounds falls back to the playground module. So, as a general rule, if you don't understand where some data is coming from, look in the Citeline module first and then in the playground one if you can't find it there.
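The fall-through lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not Butterfly's actual API: the makeModule and resolve helpers and the file lists are hypothetical, chosen to mirror the files discussed on this page.

```javascript
// Illustrative only: models "fall through" lookup between an extending
// module and the module it extends. This is NOT Butterfly's real API.
function makeModule(name, files, parent) {
  return { name: name, files: new Set(files), parent: parent || null };
}

// Walk from the extending module up the chain and return the name of the
// first module that contains the requested path, or null if none does.
function resolve(module, path) {
  for (let m = module; m !== null; m = m.parent) {
    if (m.files.has(path)) {
      return m.name;
    }
  }
  return null;
}

// Hypothetical file lists, loosely based on the module descriptions.
const playground = makeModule("playground", [
  "index.html",
  "MOD-INF/main.js",
  "MOD-INF/authentication.js",
]);
const citeline = makeModule("citeline", [
  "index.html",
  "exhibit-builder.html",
  "MOD-INF/main.js",
], playground);

console.log(resolve(citeline, "index.html"));                // citeline
console.log(resolve(citeline, "MOD-INF/authentication.js")); // playground
console.log(resolve(citeline, "missing.html"));              // null
```

A file present in both modules is always served from the extending module, which is why the playground copy of index.html is never seen when running the Citeline flavor.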
Citeline Module
Here is a detailed look at the Citeline module, its folder structure and its main files:
[citeline]
  index.html
  exhibit-builder.html
  +-- [scripts]
  +-- [target]
  +-- [MOD-INF]
        module.properties
        server.js
        builder.js
index.html is the entry point to the citeline application. Note that many of the pages that citeline shows are actually different pieces of the same HTML page that get shown or hidden by a client-side javascript library embedded in the page. So don't look for many HTML pages; look inside that page for any tag that has a pane or dialog class.
exhibit-builder.html is the page that contains the skeleton code for the exhibit builder, which is populated dynamically via javascript on the client.
MOD-INF/module.properties is the Butterfly metadata for this module.
MOD-INF/server.js is the javascript entry point for the server side controller. This module does not contain one, so Butterfly automatically cascades to use the one found in the playground module.
MOD-INF/main.js manages the citeline-specific URL space that doesn't have to deal with the exhibit builder.
MOD-INF/builder.js manages the server-side controlling logic of the exhibit builder.
scripts/ contains the client side javascript that is sent over to the browser and not executed by the server.
target/ is the temporary directory that Maven uses to store its own files, you can safely ignore its content.
Playground Module
Here is a detailed look at the Playground module, its folder structure and its main files:
[playground]
  index.html
  +-- [scripts]
  +-- [target]
  +-- [templates]
  +-- [MOD-INF]
        module.properties
        server.js
        main.js
        builder.js
        authentication.js
        error.js
        +-- src
        +-- classes
        +-- lib
index.html is the entry point to the Playground application. Note that the index.html file in the citeline module masks this file, meaning that when this file is requested by code in either the citeline module or code that was cascaded to the playground module from the citeline one, this file will always be ignored.
MOD-INF/module.properties is the Butterfly metadata for this module.
MOD-INF/server.js is the javascript entry point for the server side controller, which will then delegate to other javascript files in this folder. Note how this javascript is executed by the server and it's never sent over to the client side and note how the citeline module overwrites it.
MOD-INF/main.js manages the playground-specific URL space that doesn't have to deal with the exhibit builder. Note, this file is not used by citeline because it is shadowed by its own instance.
MOD-INF/builder.js manages the server-side controlling logic of the exhibit builder.
MOD-INF/authentication.js manages the server-side controlling logic that manages user authentication.
MOD-INF/error.js manages the server-side controlling logic that manages error handling and error reporting.
MOD-INF/src contains the source code of the Java objects that this module exposes to the javascript scripts on the server side.
MOD-INF/classes contains the compiled Java bytecode of the above objects.
MOD-INF/lib contains the Java libraries that the java objects exposed by the module need to function.
scripts/ contains the client side javascript that is sent over to the browser and not executed by the server.
templates/ contains the templates used by the exhibit builder to build the exhibits (basically these are the 'skins' used by the builder to come up with the different exhibit looks)
target/ is the temporary directory that Maven uses to store its own files, you can safely ignore its content.
Required Modules
This is the list of the other butterfly modules used by playground and what they do:
- skin - exposes the CSS styles and the images used by the module
- jquery - exposes jQuery and its plugins
- jsmath - exposes jsMath and its plugins
- jstex - exposes jsTeX
- ajax - exposes the SIMILE Ajax library (needed by Timeline and Exhibit)
- timeline - exposes Timeline
- exhibit - exposes Exhibit
- firebug - exposes the Firebug Lite error reporting javascript library
- uploader - wraps around the Apache Commons FileUpload library that is used to parse file uploading HTTP requests
- openid - wraps around the OpenID4Java library that provides OpenID functionality
- repository - exposes a repository that Playground uses to store data
- triplestore - exposes a triple store interface that playground uses to store information about users and metadata about playgrounds
- mailer - exposes a service to send email (used by the authentication system for confirmation information)
- cache - exposes a service to cache temporary objects in memory (to speed up performance)
- babel - exposes data translation services offered by Babel
- error_handler - exposes a service to handle errors
How Citeline Works
The Citeline workflow is the following:
- when a browser accesses Citeline for the first time, a long-term persistent cookie is set. This is the browser identifier and will be used to identify all the actions that browser will perform on the system (for example, this is used by Citeline to send you back to the exhibit you were building when you left off).
- when a BibTeX file is uploaded to Citeline (either thru the web page upload form or thru Zotz), a new playground ID is randomly generated and a new playground is created in the data repository. Then the uploaded file is stored, as is, in the repository (in a file called original).
- Once the data has been fully transferred, the Babel BibTeX transformer is invoked on the file (locally: Babel is *not* used via the web service located at http://simile.mit.edu/babel/ but directly as a Java library) and the transformed data is saved in the repository in the same playground (in a file called data, which contains RDF data in RDF/N3 serialization).
- At this time, two other files are generated from this data via Babel: one contains the JSON serialization of the data (it is called json and will be used by Exhibit) and the other is a textual representation of the data called text (it will be embedded in the page to allow web crawlers to index the exhibits, since no web crawler is currently capable of extracting the information out of Ajax-driven dynamic web pages).
- Another file called metadata is generated, containing the metadata associated with the playground. This metadata stores the state of all the configurations of this playground (such as facet selection, title, template style, location of the facet boxes, ordering, views, etc).
- Once all the files are generated (this normally takes very little time even for pretty big files), Citeline loads the exhibit-builder.html page, which loads the javascript libraries that populate the exhibit builder (a javascript layer on top of the Exhibit library itself). The exhibit runs inside an <iframe> that is controlled by the exhibit builder itself.
- Note that there is no save button: every time the user makes a change in the state of the exhibit, the exhibit builder automatically catches that change and saves the state via an ajax call. This state is written to the metadata file in the current playground.
- When the user clicks on "preview" or "download", Citeline will assemble an HTML page that contains all the current configurations and information. Then the user can upload this page to her own web site. NOTE: the json data that will be used to populate her exhibit will always come from Citeline!
- If the user is not authenticated, there will be only one playground at a time per browser, even if the user opens different windows (this is because of the browser-identifying cookie). If that cookie is lost, there is no way for the non-authenticated user to go back to the exhibit that she was working on, unless she saved the exhibit identifier.
- If the user wants to control more exhibits in Citeline, she has to authenticate.
- Authentication works via email registration and sign in, or via OpenID.
- Once authenticated, another cookie is set that identifies the session (kept on the server side). This session is temporary and will last only for a certain amount of time (configurable in the servlet container running Citeline on the server). After this time, the user will have to authenticate again to be able to see a list of her exhibits.
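As a rough illustration of the upload steps above, the following sketch builds the set of files that ends up in a playground and then merges builder changes into the metadata. Everything here is an assumption made for illustration: the fakeBabel object stands in for the real Babel Java library, the createPlayground and saveState helpers are hypothetical, and storing the metadata as JSON is a guess (the actual on-disk format is not described on this page).

```javascript
// Hypothetical stand-ins for the Babel conversions. The real conversions are
// Java library calls; these only return placeholder strings.
const fakeBabel = {
  toN3: (bibtex) => "# N3 derived from " + bibtex.length + " chars of BibTeX",
  toJson: (n3) => JSON.stringify({ items: [] }),
  toText: (n3) => "plain text rendering",
};

// Build the set of files described above for one new playground.
function createPlayground(id, bibtex, babel) {
  const files = {};
  files.original = bibtex;                // uploaded file, stored as is
  files.data = babel.toN3(bibtex);        // RDF data in N3 serialization
  files.json = babel.toJson(files.data);  // consumed by Exhibit
  files.text = babel.toText(files.data);  // embedded for web crawlers
  files.metadata = JSON.stringify({});    // builder state (format assumed)
  return { id: id, files: files };
}

// There is no save button: each change made in the builder is merged into
// the metadata file of the current playground.
function saveState(playground, change) {
  const state = JSON.parse(playground.files.metadata);
  playground.files.metadata = JSON.stringify(Object.assign(state, change));
  return playground;
}

const pg = createPlayground("abc123", "@article{k, title={T}}", fakeBabel);
saveState(pg, { title: "My publications" });
saveState(pg, { template: "default" });
console.log(Object.keys(pg.files).join(","));
// original,data,json,text,metadata
console.log(pg.files.metadata);
// {"title":"My publications","template":"default"}
```

In Citeline itself the playground ID is randomly generated; it is passed in explicitly here only to keep the sketch deterministic.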
Adding a new Template
In order to add a new template to Citeline, go to the templates/ directory in the playground module and add a new folder with a unique name for your new template. The template should contain at least one file called layout.html. My suggestion is to copy an existing template and start modifying from there.
Once you're done adding it, you have to list it in the exhibit-builder.html file in the citeline module, under the template selector <select id="template-select">.
Description of a Citeline Template
A Citeline template is a static HTML page that includes special attributes that are used by Exhibit (those prefixed with ex:) and by the exhibit builder (those prefixed with pg:).
Here is a breakdown of a template:
<div ex:role="collection" ex:itemTypes="Publication"></div>
tells Exhibit which types of items should be used as the main focus of the collection. In this case, we are interested in objects of type "Publication" only.
Lenses are small HTML snippets that Exhibit uses to present the information about a particular item (in our case, a Publication).
<div ex:role="lens"
ex:itemTypes="Publication"
ex:onshow="renderTeX(this);">
<div ex:if-exists=".author">
<span ex:content=".author" class="author"></span>
<span ex:if-exists=".year">
(<span ex:content=".year"></span>)
</span>.
</div>
<h1 ex:content=".title"></h1>
</div>
The above div should be treated by Exhibit as a lens (the ex:role attribute says this) for items of type "Publication" (ex:itemTypes), and Exhibit should invoke the renderTeX(this); javascript function after the lens is generated (this is used to convert TeX encodings into HTML at runtime).
Also, the above lens is able to conditionally show fragments of HTML depending on whether some data is present. Specifically in this example, if the publication has an author, the author span is populated with it, and if it also has a year, the year is shown in parentheses right after the author.
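To make the conditional behaviour concrete, here is a small sketch that produces, as a string, roughly what the lens above displays for a given item. It is purely illustrative: renderLens is a hypothetical helper, Exhibit itself works on the DOM rather than on strings, and no HTML escaping or TeX rendering is attempted.

```javascript
// Illustrative only: mimics the output of the lens above for one item.
// Exhibit does not work this way internally.
function renderLens(item) {
  let html = "";
  if (item.author !== undefined) {          // ex:if-exists=".author"
    html += '<div><span class="author">' + item.author + "</span>";
    if (item.year !== undefined) {          // ex:if-exists=".year"
      html += " (<span>" + item.year + "</span>)";
    }
    html += ".</div>";
  }
  html += "<h1>" + item.title + "</h1>";    // ex:content=".title"
  return html;
}

console.log(renderLens({ title: "A Paper", author: "J. Doe", year: "2007" }));
console.log(renderLens({ title: "Anonymous Note" }));
```

An item without an author skips the whole author block, including the year, because the year span is nested inside the author div.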
<div pg:role="view-region" id="bibtex-views">
<div ex:role="view"
ex:viewLabel="List"
ex:orders=".year"
ex:directions="ascending"
ex:grouped="false"
ex:showAll="false">
</div>
</div>
The above indicates to the exhibit builder that the "#bibtex-views" div should be used as the "view-region" (which adds view creation and selection controls at the top). It also indicates to Exhibit itself that the child div should be the actual view (ex:role), and it gives Exhibit a set of parameters describing the view's defaults and how it should behave. Note how the type of view, if omitted, results in a list-type view by default.
<div pg:role="facet-region" id="bibtex-facets">
<div ex:role="facet"
ex:facetClass="TextSearch"
ex:facetLabel="Search">
</div>
<div ex:role="facet"
ex:expression=".pub-type"
ex:facetLabel="Publication Type"
ex:sortMode="count"
ex:height="10em">
</div>
<div ex:role="facet"
ex:expression=".year"
ex:facetLabel="Year"
ex:height="10em"
ex:sortDirection="reverse">
</div>
<div ex:role="facet"
ex:expression=".author"
ex:facetLabel="Author"
ex:sortMode="count"
ex:height="10em"
ex:formatter="renderTeX">
</div>
</div>
This tells the exhibit builder that the "#bibtex-facets" div should be the location of the facets (and this adds the facet selector at the bottom). Then it tells Exhibit which facets the template should use by default, their queries (ex:expression), their labels (ex:facetLabel), their starting height (ex:height), their sort mode (ex:sortMode) and what javascript code should be used to format the facet values (ex:formatter; here we use the same TeX escaping code we used for the lenses).
Note how the builder will later be able to add and remove facets from this selection: the exhibit builder will use this template as the starting point, but then, if the metadata is present, it will overwrite these defaults with the state saved in your metadata by the exhibit builder.
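The "template first, saved metadata wins" rule can be sketched as a simple merge. The object shapes and the effectiveConfig helper below are hypothetical (the real metadata format is not described here); only the precedence rule is taken from the text above.

```javascript
// Hypothetical shapes: defaults as they might be read from the template.
const templateDefaults = {
  facets: [".pub-type", ".year", ".author"],
  view: { label: "List", orders: ".year", directions: "ascending" },
};

// Use the template as the starting point; any key present in the saved
// metadata replaces the corresponding template default.
function effectiveConfig(defaults, metadata) {
  return Object.assign({}, defaults, metadata || {});
}

const fresh = effectiveConfig(templateDefaults, null);
const edited = effectiveConfig(templateDefaults, { facets: [".year", ".journal"] });
console.log(fresh.facets.join(" "));   // .pub-type .year .author
console.log(edited.facets.join(" "));  // .year .journal
console.log(edited.view.label);        // List
```

A playground with no saved state simply shows the template as written, while settings the user never touched (the view, in this example) keep their template defaults.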
The good thing about this template system is that it doesn't really matter how complicated your HTML page is: Exhibit and the exhibit builder will inject themselves right into it.

