Re: Question about using Xpath with Frames

From: Chung Le <chung_le_at_dslextreme.com>
Date: Tue, 02 May 2006 08:42:48 -0700

With help from Viktor Zigo, the author of Xpather, I was able to come
up with a simple way to work with web pages with Frames.

Here is the function that does the work.
// collects and returns all FRAME nodes in the document
// each FRAME node has its own document in "contentDocument"
// so this function works recursively

function getFrame(docNode) {
    var frames = docNode.getElementsByTagName("FRAME");
    var foundFrames = [];
    for (var i = 0; i < frames.length; i++) {
        foundFrames.push(frames[i]);
        // contentDocument is null when the frame's document is not available
        var frameDoc = frames[i].contentDocument;
        if (frameDoc) {
            foundFrames = foundFrames.concat(getFrame(frameDoc));
        }
    }
    return foundFrames;
}

Given an XPath expression, it needs to be evaluated against the
contentDocument of each collected frame.
Hope this helps.
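
As a rough sketch of that last step (not part of the original function;
the names collectDocuments and evaluateInAllFrames are made up for
illustration), the expression could be run in every frame document like
this. It assumes each frame's contentDocument is accessible to the script
and skips any frame where it is null.

```javascript
// Sketch only: collectDocuments and evaluateInAllFrames are hypothetical names.
// 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE; the
// fallback only matters outside a browser, where XPathResult is undefined.
var SNAPSHOT_TYPE = (typeof XPathResult !== "undefined")
    ? XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    : 7;

// Returns the given document followed by the documents of all nested frames.
function collectDocuments(doc) {
    var docs = [doc];
    var frames = doc.getElementsByTagName("FRAME");
    for (var i = 0; i < frames.length; i++) {
        var frameDoc = frames[i].contentDocument; // null if not accessible
        if (frameDoc) {
            docs = docs.concat(collectDocuments(frameDoc));
        }
    }
    return docs;
}

// Evaluates one XPath expression in every collected document and
// returns all matching nodes in a single flat array.
function evaluateInAllFrames(topDoc, xpath) {
    var matches = [];
    var docs = collectDocuments(topDoc);
    for (var d = 0; d < docs.length; d++) {
        // Each document evaluates the expression itself, with itself as
        // the context node.
        var result = docs[d].evaluate(xpath, docs[d], null, SNAPSHOT_TYPE, null);
        for (var i = 0; i < result.snapshotLength; i++) {
            matches.push(result.snapshotItem(i));
        }
    }
    return matches;
}
```

In a scraper this might be called as evaluateInAllFrames(document, "//a"),
which would return the matching nodes from the top-level document and from
every nested frame in document order per frame.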

-- Chung --

David Huynh wrote:

> Thanks, Chung. I was hoping that we can get inside frames using purely
> XPaths rather than a mix of XPaths and DOM calls.
>
> It looks like Mozilla has a bug filed about this, as mentioned at the
> end of
> http://wiki.mozilla.org/DOM:XPath_Generator
>
> David
>
> Chung Le wrote:
>
>> David,
>>
>> It seems to me that in order to process a DOM tree with FRAMEs, one
>> needs to navigate from the top of the document to get to the
>> appropriate frame. From there, you can get to its html doc via the
>> contentDocument attribute. For example, the document I want to
>> process has the following path:
>> /document1/html/frameset/frame[2]/document2/html/frameset/frame[1]/document3/html/frameset/frame[2]/document4
>>
>>
>> Breaking it down to three xpaths for the frames, I got:
>> /html/frameset/frame[2]
>> /html/frameset/frame[1]
>> /html/frameset/frame[2]
>>
>> To get document2,
>> var xpath = "/html/frameset/frame[2]";
>> var frameResult = doc.evaluate(xpath, doc, null,
>> XPathResult.ANY_UNORDERED_NODE_TYPE,null).singleNodeValue;
>> var document2 = frameResult.contentDocument;
>> It's probably not the most efficient way, but it works.
>>
>> Cheers,
>> -- Chung --
>>
>> David Huynh wrote:
>>
>>> Chung,
>>>
>>> I've downloaded Xpather and couldn't figure out their xpath feature,
>>> either :-) Let us know if you are successful at contacting Xpather's
>>> author. I'll let you know if I run into anything relevant.
>>>
>>> David
>>>
>>>
>>> Chung Le wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been successfully writing screen scrapers using Xpath and
>>>> JavaScript for my PiggyBank application. Recently, I discovered
>>>> that my scraper doesn't work if the DOM has Frames. I looked at
>>>> Xpather, which says that Frames are supported, but I couldn't find
>>>> any documentation about how to use the feature.
>>>> I would appreciate any help on how to get Xpath to work with
>>>> Frames. Many thanks.
>>>>
>>>> -- Chung --
Received on Tue May 02 2006 - 15:41:32 EDT
