Re: Scaling facetted browsing to a very large corpus

From: David R. Karger <>
Date: Tue, 2 Nov 2004 01:12:50 -0500

   Date: Fri, 29 Oct 2004 14:35:21 -0400
   From: Stefano Mazzocchi <>

   David R. Karger wrote:

> > A set of 1000 facets is a corpus.
> First of all, I would like to introduce a terminological distinction so
> that we know what we are talking about:
> facet := metadata field that is considered important enough in a
> particular search&browse (s&b) context.
> facet value := literal content of a facet

Good clarification. I had assumed you meant 1000 facets. This
affected the rest of my message (i.e., by predicate I meant facet).


> well, one good way to explore the facet values is to look at the
> reverse arrows. i.e., for a given facet value, what are the items that
> have that facet value? This can give the user some feel for how that
> value's facet is working.

   oh, interesting. Are you suggesting we use the literal as an identifier
   and group all the nodes that have the same literal as if they were
   pointing to the same URI? hmmmm

That's it. I don't draw any distinction between forward and
backward arcs. They're all relations. And I don't differentiate
literals from other resources when it comes to thinking about how to
browse them.
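
To make the idea concrete, here is a minimal sketch in Python, not
anything taken from Haystack. It assumes the metadata is available as
(item, facet, facet value) triples; the sample data and the names
build_arcs and items_with_value are made up for illustration. Each
literal is used as a key just as a URI would be, so the backward arc
from a facet value to its items is an ordinary lookup.

```python
from collections import defaultdict

# Hypothetical sample data: (item, facet, facet value) triples.
TRIPLES = [
    ("doc1", "author", "Alice"),
    ("doc2", "author", "Alice"),
    ("doc2", "topic", "RDF"),
    ("doc3", "topic", "RDF"),
    ("doc3", "reviewer", "Alice"),
]


def build_arcs(triples):
    """Index every relation in both directions.

    forward:  item  -> {(facet, value)}
    backward: value -> {(facet, item)}
    Literal values are used as keys exactly like any other resource.
    """
    forward = defaultdict(set)
    backward = defaultdict(set)
    for item, facet, value in triples:
        forward[item].add((facet, value))
        backward[value].add((facet, item))
    return forward, backward


def items_with_value(backward, value, facet=None):
    """Follow the reverse arcs from a facet value back to its items.

    If facet is given, only arcs labelled with that facet are followed.
    """
    return sorted(
        item
        for f, item in backward.get(value, ())
        if facet is None or f == facet
    )


forward, backward = build_arcs(TRIPLES)
print(items_with_value(backward, "Alice"))            # ['doc1', 'doc2', 'doc3']
print(items_with_value(backward, "Alice", "author"))  # ['doc1', 'doc2']
```

The unrestricted lookup groups every item that shares the literal,
whichever facet it came through; passing a facet narrows it to one
relation.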

> > Given the metadata, there are various tools for searching it. I'd like
> > to see Vineet's metadata-based fuzzy browser applied to this problem,
> > for example.
> Can you give us more information on this? Will this be part of the
> minimal Haystack distribution that Steve is working on?

The best source of info is Vineet. He'd be happy to demo it for
you.

Received on Tue Nov 02 2004 - 06:12:50 EST
