Last week:
- For UIST paper
- Added a sidebar to PB where PB can offer browsing controls while
keeping the original Web page (scraped) in place. This will be referred
to as "augmentation".
- Wrote an "augmenting" scraper for the Mozilla extensions site.
Concluded that such an augmenting scraper is pretty darn complicated to
write (400+ lines of JavaScript).
- Thought about how to make scrapers more declarative.
- Added tree-edit-distance heuristics to Solvent to align items'
sub-DOMs. This will allow us to move a lot more of the exception-handling
logic out of scrapers and into the scraping framework, and thus make
scrapers more declarative.
- Had a discussion with Stefano on his Gadget 2 stuff.
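
To illustrate the tree-edit-distance idea above, here is a minimal sketch.
It is not Solvent's actual heuristic: it is a simplified top-down
(Selkow-style) edit distance over ordered trees, and the node shape
{ tag, children } plus the names node, size and treeDistance are all
hypothetical, chosen only for this example.

```javascript
// Hypothetical node shape standing in for a DOM element: { tag, children }.
function node(tag, ...children) {
  return { tag, children };
}

// Number of nodes in a subtree; used as the cost of inserting or
// deleting that subtree as a whole.
function size(t) {
  return 1 + t.children.reduce((sum, c) => sum + size(c), 0);
}

// Top-down edit distance between two ordered trees:
// relabeling a node costs 1, inserting or deleting a subtree costs its size,
// and child lists are aligned with a standard sequence-edit-distance table.
function treeDistance(a, b) {
  const relabel = a.tag === b.tag ? 0 : 1;
  const m = a.children.length;
  const n = b.children.length;
  // d[i][j]: cost of turning the first i children of a
  // into the first j children of b.
  const d = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0));
  for (let i = 1; i <= m; i++) d[i][0] = d[i - 1][0] + size(a.children[i - 1]);
  for (let j = 1; j <= n; j++) d[0][j] = d[0][j - 1] + size(b.children[j - 1]);
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      d[i][j] = Math.min(
        d[i - 1][j] + size(a.children[i - 1]), // delete a's child
        d[i][j - 1] + size(b.children[j - 1]), // insert b's child
        d[i - 1][j - 1] + treeDistance(a.children[i - 1], b.children[j - 1])
      );
    }
  }
  return relabel + d[m][n];
}

// Two scraped "items": the second lacks one optional field.
const itemA = node("div", node("a", node("img")), node("span"), node("span"));
const itemB = node("div", node("a", node("img")), node("span"));
console.log(treeDistance(itemA, itemB));
```

A small distance suggests the two sub-DOMs are the same kind of item with
an optional field missing, which is the sort of exception a framework
could absorb instead of each scraper handling it by hand.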
This week:
- Prepare for a talk at Northeastern University ACM.
- Work on supporting declarative scrapers
- Work on augmentation support
On the backburner:
- 3.0 stuff
- Bloom
David
Received on Mon Jan 30 2006 - 17:16:35 EST