Craigslist and coordinates bugfix [was RE: Craigslist scraper coordinates]

From: Prokopp, Christian <christian.prokopp_at_sap.com>
Date: Wed, 11 Jan 2006 10:44:30 +0800

Hi,

I used the latest build (at that time 4th of Jan.) and updated again to
the 10th of Jan. build - no change - still in Kansas :o)

I did a bit of bugfixing and stumbled over something...
The request sent to Google by Piggy Bank is e.g.:
maps.google.com/maps?output=js&q=loc%253A+Humboldt+St.+at+Silva+St.+Santa+Rosa+CA+US
On craigslist the link to google is e.g.:
maps.google.com/maps?q=loc%3A+Humboldt+St.+at+Silva+St.+Santa+Rosa+CA+US
I am not sure where this extra '25' comes from but if you change
craigslist-apt-listing-scraper.js
Line 72: address = href.substr(href.indexOf("?q=") + 3).replace(/\+/g, " ");
to
Line 72: address = href.substr(href.indexOf("?q=") + 10).replace(/\+/g, " ");

This changes the request to e.g.
'maps.google.com/maps?output=js&q=Humboldt+St.+at+Silva+St.+Santa+Rosa+CA+US'
which works with Google AND also removes the 'loc%3A ' that was added in
front of every address scraped into your semantic bank.
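
A note on the '25': it is consistent with the already-encoded 'loc%3A'
being percent-encoded a second time, since '%' itself encodes to '%25'.
The fixed offset of 10 also assumes every link starts with 'loc%3A+'.
Below is a rough sketch of a less brittle variant that decodes the query
and strips the prefix only when it is present. extractAddress is a
hypothetical helper of mine, not something in the Piggy Bank sources, and
I have not tried it inside the scraper itself.

```javascript
// Hypothetical helper (not part of Piggy Bank): pull the address out of a
// Google Maps link as found on craigslist.
function extractAddress(href) {
  var start = href.indexOf("?q=");
  if (start < 0) {
    return null; // no q= parameter in this link
  }
  var query = href.slice(start + 3);
  var amp = query.indexOf("&");
  if (amp >= 0) {
    query = query.slice(0, amp); // drop any further parameters
  }
  // "+" stands for a space in a query string; decode %3A etc. afterwards.
  // Note: decodeURIComponent throws a URIError on malformed escapes.
  var decoded = decodeURIComponent(query.replace(/\+/g, " "));
  // Strip the "loc:" prefix only if the link actually carries one.
  return decoded.replace(/^loc:\s*/, "");
}

var href = "http://maps.google.com/maps?q=loc%3A+Humboldt+St.+at+Silva+St.+Santa+Rosa+CA+US";
var address = extractAddress(href);
// address is now "Humboldt St. at Silva St. Santa Rosa CA US"

// Where the extra "25" can come from: encoding text that is already encoded.
var once = encodeURIComponent("loc:");    // "loc%3A"
var twice = encodeURIComponent("loc%3A"); // "loc%253A"
```

Because the address is decoded before it is handed on, encoding it once
for the request gives 'loc%3A'-free output without relying on a magic
offset.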

Now I am mostly in California and only the odd address cannot be found
and ends up in Kansas. (Yesterday I had ~70-80 out of 100 in Kansas and
barely any in CA; now it is 20 out of 100 in Kansas and a lot in CA.)



I just checked a bit more and made another fix which will (hardcoded)
ignore the specific Kansas location which Google is so fond of.
Find 'scraping-utilities.js' either in your source
(SOURCEFOLDER\piggy-bank\firefox\chrome\content\scripts) or profile
(PROFILEFOLDER\piggy\extensions\{e29a3ba7-2b91-4bf1-8c04-b9738c77aa3d}\chrome\piggy-bank\content\scripts)
and change:
Line 200: keyToLL[key] = ll;
to
Line 200: if(ll.indexOf("37.0625,-95.677068") < 0) keyToLL[key] = ll;
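
To make the intent of that one-liner clearer, here is the same check as a
small standalone sketch. Only keyToLL and the coordinate string come from
the change above; DEFAULT_LL and storeLatLng are names I made up for
illustration, and the real cache in scraping-utilities.js may look
different.

```javascript
// The Kansas coordinates that keep coming back for unresolved addresses.
var DEFAULT_LL = "37.0625,-95.677068";

// Stand-in for the cache of address key -> "lat,lng" string.
var keyToLL = {};

// Hypothetical wrapper: store a result unless it is the Kansas fallback.
// Returns true when the value was stored, false when it was skipped.
function storeLatLng(key, ll) {
  if (ll.indexOf(DEFAULT_LL) >= 0) {
    return false; // skip the fallback so the address stays unplaced
  }
  keyToLL[key] = ll;
  return true;
}

var skipped = storeLatLng("unknown address", "37.0625,-95.677068");
var stored = storeLatLng("Santa Rosa CA US", "38.44,-122.71");
```

The coordinates are still hardcoded, so a listing that really is at that
exact spot in Kansas would be dropped too.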

Follow both fixes and bye, bye Kansas :o)

All my changes were made in a local copy. I don't know how the scrapers
are bugfixed/updated, and my Subversion still fails with my company's
network setup, so I'll leave it up to the project team to decide whether
to do something. If you want me to file a bug report for any of these,
let me know.

Cheers,
Christian
Received on Wed Jan 11 2006 - 02:44:33 EST
