[Insight-users] KD Tree Search Question...

Fri Mar 24 09:28:08 EST 2006

Tuhin,

This sounds like a bug.  The KdTree should do one or the other
but not sometimes one and sometimes the other.

Now, the question becomes what should it do?

I see use cases for both.  For instance, you may not know (realize)
that you point you are querying with is already in the KdTree and
you want to know if this position has already been used. So in this
case, you would want to be able to query a position in the domain
and if that position is in the tree, return it in the nearest
neighbor list.  This would occur if you are "testing" the tree
against data other than what it was built with.

On the other hand, if you built a tree, you may now want to 
reason/cluster based on the tree.  So when you query for the
k-nearest neighbors, you want the k-nearest that do not 
include the point the in question.

I see two solutions:

1) If the point exists in the tree, it is returned as part of the 
nearest neighbor set.  It would be an algorithms responsibility to 
decide whether to use that point (sort by distance and ignore any
point with a distance of zero).

2) Provide two sets of methods.  One which will return the query point
in the nearest neighbor set if the query point is in the tree. And one
which will not return the query point in the nearest neighbor set even
if the query point is the tree.

Any preferences?

The bottom line is that it should be deterministic and not do one
thing sometimes and another thing another time.  Can you construct
a small test case (maybe lower dimension) that illustrates the bug?

Jim

-----Original Message-----
From: insight-users-bounces+millerjv=crd.ge.com at itk.org
[mailto:insight-users-bounces+millerjv=crd.ge.com at itk.org]On Behalf Of
Tuhin Sinha
Sent: Thursday, March 23, 2006 5:43 PM
To: insight-users at itk.org
Subject: [Insight-users] KD Tree Search Question...

Hello insight-users,

I have a quick question about the KdTree implementation in ITK.  I have
started using the KdTree in ITK to help speed up nearest-neighbor
searches in high-dimensional spaces (D > ~500).  In the process of
testing the KdTree, I noticed that for a k-nearest-neighbor search: 1)
the nearest-neighbors are not returned in any specific order, and 2)
_sometimes_ the query point itself (which is guaranteed to be in the
KdTree) is returned as a nearest-neighbor.

The first observation is not a big issue, but the second one is
problematic for my specific needs.  I would like to enforce some
consistency in the behavior of the KdTree in terms of the returned
nearest-neighbors; either give me the query point every time or never.

  Does the behavior I am getting from the KdTree correlate with what the
developers intended?  That is, if one query's a KDTree in ITK with a
point that is in the KDTree what should be the returned
nearest-neighbor, itself or it's closest neighbor?  In my testing a
1-neighbor search yeilds both results; sometimes I get the query point,
and sometimes I don't.  Is the end-user supposed to enforce some sort of
redundancy checking of the returned nearest neighbors for KD searches in
ITK? Or, do I have a bug in my code?  I am running ITK 2.4 under Linux.

Thanks,
-- 
Tuhin K. Sinha, Ph.D.
Institute of Imaging Science
Department of Radiology and Radiological Sciences
Vanderbilt University
Medical Center North,  Suite R-1302
Nashville, TN 37232-2310

_______________________________________________
Insight-users mailing list
Insight-users at itk.org
http://www.itk.org/mailman/listinfo/insight-users