[status] weekly report

From: Stefano Mazzocchi <stefanom_at_mit.edu>
Date: Mon, 23 Jan 2006 12:07:04 -0500

Things I've done last week:

  1) started thinking seriously about how to make Longwell scale to a
size where it becomes useful

  2) collected the entire MIT library catalog in MARC21 (1.2 million
records)

  3) started to write an RDFizer to transform all that data into RDF,
doing MARC21 -> MARCXML -> MODS/XML -> MODS/RDF/XML (the first 3 stages
are done).

  4) started working on a MODS/XML -> MODS/RDF/XML XSLT transformer

  5) started working on how to scale Gadget using some sort of disk
index (to help achieve #5)
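
To make the last stage of the chain in item 3 concrete, here is a minimal
sketch of mapping a MODS/XML record to RDF statements. It is written in
Python rather than XSLT and is not the stylesheet mentioned above. The
choice of Dublin Core predicates, the sample record, and the function
names mods_to_triples and to_ntriples are all assumptions made for
illustration.

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace and Dublin Core element namespace.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical, hand-written sample record (not taken from the MIT catalog).
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>An Example Title</title></titleInfo>
  <name><namePart>Doe, Jane</namePart></name>
</mods>"""


def mods_to_triples(mods_xml, subject):
    """Return (subject, predicate, literal) tuples for titles and names."""
    root = ET.fromstring(mods_xml)
    triples = []
    for title in root.findall("mods:titleInfo/mods:title", MODS_NS):
        triples.append((subject, DC + "title", (title.text or "").strip()))
    for part in root.findall("mods:name/mods:namePart", MODS_NS):
        triples.append((subject, DC + "creator", (part.text or "").strip()))
    return triples


def to_ntriples(triples):
    """Serialize the tuples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> <%s> "%s" .' % (s, p, escaped))
    return "\n".join(lines)


if __name__ == "__main__":
    # "urn:example:record:1" is a made-up subject identifier.
    print(to_ntriples(mods_to_triples(SAMPLE_MODS, "urn:example:record:1")))
```

A real MODS-to-RDF mapping would have to cover far more of the schema
(subjects, origin info, identifiers, nested names); this only shows the
shape of the transformation.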



Things I plan to do this week (in this order due to dependencies):

  a) finish converting the MIT MARC records to MODS (converted 300K so
far, but ran into massive disk I/O slowdowns due to the large number of
files; need to rethink the disk storage strategy)

  b) finish the work on Gadget so that I can generate the spectrum of
the MIT MODS dataset

  c) finish a first draft of the MODStoRDF XSLT stylesheet and get a
sense of where the problems are.
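
On the storage problem in item a: one common way to avoid very large
directories is to spread files over a tree of hashed subdirectories. The
sketch below is only an illustration of that idea, not the strategy that
was actually chosen; the function names shard_path and write_record and
the two-level, two-hex-digit layout are assumptions. With 256 * 256 =
65,536 leaf directories, 1.2 million records come to roughly 18 files per
directory.

```python
import hashlib
import os


def shard_path(root, record_id, levels=2, width=2):
    """Map a record id to root/ab/cd/<record_id>.xml using an MD5 prefix."""
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    # Take `levels` consecutive chunks of `width` hex digits as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, record_id + ".xml")


def write_record(root, record_id, xml_text):
    """Write one record into its shard directory, creating it if needed."""
    path = shard_path(root, record_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(xml_text)
    return path
```

The hash is used here only to spread records evenly, so any stable hash
would do. Packing many records into a few large files is another option
when even this many small files is too slow.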


-- 
Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
-------------------------------------------------------------------
Received on Mon Jan 23 2006 - 17:06:33 EST
