Re: import from Google Maps?

From: David Huynh <dfhuynh_at_csail.mit.edu>
Date: Sun, 02 Oct 2005 08:51:04 -0400

Alf,

In Solvent, switch to the Code tab, click Insert Template, and choose
One page. Then, where the code says "Put your code here", paste this
code:

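// Collect the DIVs that hold the address lines.
var xpath = '/HTML/BODY/DIV/DIV/DIV/TABLE/TBODY/TR/TD/DIV';
var elmts = utilities.gatherElementsOnXPath(doc, doc, xpath);
var gAddressParts = new Array();
for (var i = 0; i < elmts.length; i++) {
  gAddressParts.push(elmts[i].innerHTML);
}
// Join the lines into a single address string.
var gAddress = gAddressParts.join(', ');
// Use the page URL as the item's URI.
var uri = doc.location.href;

// Last argument: false when the object is a URI, true when it is a literal.
model.addStatement(uri, prefixRDF + "type", prefixLoc + "Property", false);
model.addStatement(uri, prefixDC + "title", doc.title, true);
model.addStatement(uri, prefixLoc + "address", gAddress, true);
// Let geoHelper look up the latitude/longitude for the address.
geoHelper.add(uri, gAddress);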


I've modified your code a little. Note that the URL parameter spn is not
the latitude/longitude pair. It's the span of the map, in effect
specifying the zoom level. So I've used the geoHelper object to look up
the latitude/longitude instead.
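
To make the spn point concrete, here is a small sketch in plain
JavaScript, meant to be run outside Solvent (for example in Node). It
pulls the query parameters out of the example URL from earlier in the
thread. The helper name describeMapUrl is made up for illustration, and
it relies on the standard URL class rather than anything in Solvent or
Piggy Bank:

```javascript
// Hypothetical helper, not part of Solvent or Piggy Bank.
// Uses the standard URL class to read the query parameters.
function describeMapUrl(href) {
  var params = new URL(href).searchParams;
  var spn = params.get('spn');
  return {
    // q holds the search text; "+" in the query string decodes to a space.
    query: params.get('q'),
    // spn is the span of the visible map (two numbers), not a position.
    span: spn === null ? null : spn.split(',').map(Number)
  };
}

var info = describeMapUrl(
  'http://maps.google.com/maps?q=75+bellevue+avenue,+toronto&spn=0.031208,0.062622&hl=en'
);
console.log(info.query);
console.log(info.span);
```

Neither value is a coordinate for the address itself, which is why the
address string goes to geoHelper rather than anything parsed from the
URL.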

Run the code, then switch to the Results tab and click Show in Piggy
Bank. A new window will pop up with the data encoded in N3.
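
For a rough idea of what the addStatement calls record, here is a
stand-alone sketch that swaps model for a stub and prints each statement
in an N-Triples-like form. Everything in it is a stand-in:
makeStubModel is invented, the prefixLoc value is a placeholder rather
than Piggy Bank's real namespace, the title and address are sample
values, and the printed form only approximates what Piggy Bank shows.
It also assumes that the last argument of addStatement marks whether
the object is a literal.

```javascript
// Stand-in for the model object that Solvent provides (hypothetical).
function makeStubModel() {
  var lines = [];
  return {
    // Assumption: the fourth argument is true when the object is a literal.
    addStatement: function (subject, predicate, object, isLiteral) {
      var obj = isLiteral
        ? JSON.stringify(String(object)) // quote and escape the literal
        : '<' + object + '>';            // wrap a URI in angle brackets
      lines.push('<' + subject + '> <' + predicate + '> ' + obj + ' .');
    },
    lines: lines
  };
}

var prefixRDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
var prefixDC = 'http://purl.org/dc/elements/1.1/';
var prefixLoc = 'http://example.org/location#'; // placeholder namespace

var model = makeStubModel();
// Sample values standing in for doc.location.href, doc.title and the address.
var uri = 'http://maps.google.com/maps?q=75+bellevue+avenue,+toronto';
var gAddress = ['75 Bellevue Avenue', 'Toronto'].join(', ');

model.addStatement(uri, prefixRDF + 'type', prefixLoc + 'Property', false);
model.addStatement(uri, prefixDC + 'title', 'Example page title', true);
model.addStatement(uri, prefixLoc + 'address', gAddress, true);

console.log(model.lines.join('\n'));
```

The stub leaves out geoHelper entirely, so no latitude/longitude
statements appear in its output.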

As for packaging everything into a final scraper, we haven't gotten
around to implementing that in Solvent yet. For now, please see the
beginning of the following page:
    http://simile.mit.edu/piggy-bank/screen-scrapers-howto.html

By the way, while trying out your example, I've discovered two bugs in
PB's map embedding code and have checked in a fix. You can check out the
code and build it yourself or get the latest xpi from
http://people.csail.mit.edu/dfhuynh/research/downloads/ .

Cheers,

David


Alf Eaton wrote:

> OK, I have the following:
>
> var xpath = '/HTML/BODY/DIV/DIV/DIV/TABLE/TBODY/TR/TD/DIV';
> var elmts = utilities.gatherElementsOnXPath(doc, doc, xpath);
> var gAddressParts = new Array();
> for (var i = 0; i < elmts.length; i++) {
>   gAddressParts.push(elmts[i].innerHTML);
> }
> var gAddress = gAddressParts.join(', ');
> var latlong = window.content.location.href.split('spn=')[1];
>
> and a URL matching pattern, but I'm not sure how to fit this into
> Solvent's code template to make the final scraper.
>
> Any hints?
>
> alf.
>
> On 01 Oct 2005, at 11:47, David Huynh wrote:
>
>> You could give our second Firefox extension (called Solvent) a try:
>> http://people.csail.mit.edu/dfhuynh/research/downloads/
>> It needs the latest PB xpi found in the same place.
>>
>> Not all of Solvent is working yet. But your XPath there seems to
>> have grabbed the address just fine.
>>
>> David
>>
>>
>> Alf Eaton wrote:
>>
>>
>>> On 01 Oct 2005, at 10:55, David Huynh wrote:
>>>
>>>
>>>> Alf Eaton wrote:
>>>>
>>>>
>>>>
>>>>> Just wondering, because I really have a need for it at the
>>>>> moment, are there any scrapers for importing data directly into
>>>>> Piggy Bank from Google Maps pages?
>>>>>
>>>>> alf.
>>>>>
>>>>>
>>>>
>>>> Not that I'm aware of. Are you looking at such a page as this?
>>>>
>>>> http://maps.google.com/maps?q=pizza&sll=37.062500,-95.677068&spn=1.034791,1.896927&sspn=37.888181,64.775391&num=10&start=0&hl=en
>>>>
>>>> David
>>>>
>>>>
>>>
>>> More like this, actually:
>>> http://maps.google.com/maps?q=75+bellevue+avenue,+toronto&spn=0.031208,0.062622&hl=en
>>>
>>> so it has the coordinates in the URL and the details in
>>> /HTML[1]/BODY[1]/DIV[2]/DIV[3]/DIV[3]/TABLE[1]/TBODY[1]/TR[1]/TD[2]
>>> or thereabouts (though that bit is in the external iframe, I think).
>>>
>>> alf.
>>>
>>
>>
>>
>
Received on Sun Oct 02 2005 - 12:48:05 EDT
