[Insight-users] Kd-tree implementation in ITK is buggy and unreliable

Fri Apr 25 10:07:40 EDT 2008

Hi Kevin,

Thanks again for providing this test.

It has been added to the toolkit as

   Insight/Testing/Code/Numerics/Statistics/itkKdTreeTest1.cxx

The reason why your test was failing is that a value
of bucket size = 16 is too low for the number of points
that you are feeding into the itkKdTree generator.

When increasing the bucket size to 1000, the test seems to
work fine.

Could you please let us know, the typical number of
points that you have been using (in your experiments)
and what value of BucketSize() have you used for them ?

    Thanks

       Luis

--------------------
Kevin H. Hobbs wrote:
> On Tue, 2008-04-22 at 20:15 +0100, Ali - wrote:
> 
>>Kevin,
>>
>>As you know, testing algorithms like nearest neighbour search on some specific datasets do not provide the sufficient condition for the health of the code. At the moment, I cannot think of a more general solution than this trial-error approach.
>>
> 
> 
> How about trial and error with random numbers?
> 
> 
>>The bug I mentioned seems to be different to the one reported. In the attached point-sets p1 and p2, if you try to find out the nearest neighbours of each point in p1 in p2 by the implementation of kd-tree in ITK, one of them is assigned to a wrong neighbour. By visualising the result it is easy to spot the problem.
> 
> 
> Since you just provide another specific data set I'm not going to use
> your data either, but I wrote a little test code based on the test I
> linked to but using the MersenneTwisterRandomVariateGenerator to set the
> tree and test points.
> 
> I put 1000 points in the tree. I queried the tree 1000 times with points
> that were also generated pseudo-randomly. 
> 
> For each query point I dumbly calculate the distance to every point in
> the tree I exit with a failure if this distance is less than the search
> distance... At least that's what I'm trying to do.
> 
> Unfortunately what I've written does seem to say that there is a
> problem. Repeated runs produce :
> 
> [kevin at gargon kdtest]$ ./kdtest
> 0.10136 0.133504
> Test FAILED.
> [kevin at gargon kdtest]$ ./kdtest
> 0.0690362 0.0897317
> Test FAILED.
> [kevin at gargon kdtest]$ ./kdtest
> 0.0148049 0.041369
> Test FAILED.
> [kevin at gargon kdtest]$ ./kdtest
> 0.0475471 0.0522354
> Test FAILED.
> [kevin at gargon kdtest]$ ./kdtest
> 0.0815875 0.141648
> Test FAILED.
> [kevin at gargon kdtest]$ ./kdtest
> 0.0730086 0.130535
> Test FAILED.
> 
> There probably should be a tolerence on the distance comparison but for
> now it dosen't matter these are too big to be due to numerical
> precision.