Allow for AbstractVector #29

KristofferC · 2016-08-14T22:03:00Z

Allow for entering a vector of points where a point is an instance of an AbstractVector. Tests should pass except the DataFreeTree stuff.

Need to benchmark to make sure nothing slows down.

Help in testing / benchmarking appreciated.

Should fix: #24, #17, #13

KristofferC · 2016-08-14T22:04:57Z

CC @andyferris

andyferris · 2016-08-15T03:42:17Z

src/NearestNeighbors.jl

@@ -1,11 +1,12 @@
-__precompile__()
+# __precompile__()


StaticArrays should now be precompiled, if that was the problem here (use 0.0.4 in the REQUIRE file above)

Yeah, it is just commented out because it is a WIP, so I don't have to wait for recompilation :)

andyferris · 2016-08-15T04:22:12Z

Looks like a nice generalization, @KristofferC.

So I've managed to get it working with inrange on KDTree, but there isn't really any speed advantage at the moment. One minor improvement (for the way we are doing things) is to have a single-point version of inrange that doesn't make another allocation...

To make a large improvement to the timings I think we will want more parts of the code, like HyperRectangle, to use SVector. I'll have a play!

andyferris · 2016-08-15T04:30:02Z

src/NearestNeighbors.jl

-    ndim_tree = size(tree.data, 1)
-    if ndim_points != ndim_tree
+function check_input{V1, V2}(tree::NNTree{V1}, points::Vector{V2})
+    if length(V1) != length(V2)


Here and elsewhere you are allowing V to be V <: AbstractVector, and then calling length() on the type. Do you think V <: StaticVector is more appropriate?

Not really, since someone might want to use a FixedSizeArray, for example. I don't want to bind V too much; I'd rather specify an interface that V has to follow (like defining length and eltype).

Well, that's true! (Except FixedSizeArray isn't a subtype of AbstractArray).

Yeah hehe. But still :P
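For readers following this thread, here is a minimal sketch of what "an interface that V has to follow" could look like. `Point3` and `same_dimension` are hypothetical names made up for illustration and are not part of NearestNeighbors.jl. The sketch assumes only that the point type must answer `length` when given the type itself, which is what the `length(V1) != length(V2)` check in the diff relies on.

```julia
# Hypothetical fixed-size point type, used only for illustration.
struct Point3{T} <: AbstractVector{T}
    x::T
    y::T
    z::T
end

# Minimal AbstractVector interface for instances.
Base.size(::Point3) = (3,)
Base.getindex(p::Point3, i::Int) = getfield(p, i)

# The extra requirement: the length must be known from the type alone.
Base.length(::Type{<:Point3}) = 3

# Hypothetical stand-in for the dimension check shown in the diff.
function same_dimension(::Type{V1}, ::Type{V2}) where {V1 <: AbstractVector, V2 <: AbstractVector}
    return length(V1) == length(V2)
end

ok = same_dimension(Point3{Float64}, Point3{Float32})

# A plain Vector does not encode its length in the type, so the same check throws.
threw = try
    same_dimension(Point3{Float64}, Vector{Float64})
    false
catch err
    err isa MethodError
end
```

Under these assumptions, any type that defines `length` on the type (and `eltype`, which `AbstractVector{T}` already provides) passes the check, while an ordinary `Vector` raises a `MethodError`.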

KristofferC · 2016-08-15T10:00:21Z

I am not surprised that there is not much speed improvement (for now I'm glad if there aren't any regressions). The majority of the time is spent fetching data from memory, and I have already been careful to optimize that as much as I can. What this PR brings is more of a generalization: the possibility to use a vector of points instead of a matrix.

The only computations done on the hyperrectangle happen during the creation of the tree. Otherwise, computations are performed only in the dimension along which the points were split:

NearestNeighbors.jl/src/kd_tree.jl

Line 175 in 6517578

split_diff_pow = eval_pow(M, split_diff)
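As an illustration of the "vector of points instead of a matrix" generalization, here is a small sketch in plain Julia. It assumes the column-per-point matrix layout implied by `size(tree.data, 1)` in the diff; the variable names are made up, and plain `Vector`s stand in for fixed-size point types such as `SVector`.

```julia
# Matrix layout: each column is one 3D point.
data = [1.0 4.0 7.0;
        2.0 5.0 8.0;
        3.0 6.0 9.0]

# The same points as a vector of points.
points = [data[:, j] for j in axes(data, 2)]

# Both layouts carry the same dimension and number of points.
ndim, npoints = size(data)
same_count = length(points) == npoints
same_dim = all(p -> length(p) == ndim, points)
```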

andyferris · 2016-08-15T10:47:28Z

Right, makes sense! Thanks

andyferris · 2016-08-15T10:51:06Z

src/inrange.jl

-do_return_inrange(idxs, ::AbstractVector) = idxs[1]
-do_return_inrange(idxs, ::AbstractMatrix) = idxs
+function inrange{V, T <: Number}(tree::NNTree{V}, point::AbstractVector{T}, radius::Number, sortres=false)
+    idxs = inrange(tree, Vector{T}[point], radius, sortres)


Should be Vector{typeof(point)}[point].

Also, this one seems a little wasteful. We could copy the kernel from the above function (lines 21-29) - measured about a 10% speedup.

Yeah, there are some optimizations to be made with regards to only querying for 1 point at a time. Could also be good to provide an API where everything is preallocated, like the vector for storing indices and distances.

The easiest thing is probably to split up the knn function into an outer part and an inner part where the inner only takes a single point.
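The outer/inner split suggested here can be sketched as follows. This is not the PR's implementation: a brute-force linear scan stands in for the tree traversal, and `sqdist`, `inrange_point!`, and `inrange_brute` are hypothetical names. The point is only the shape: an inner kernel that handles one point and fills a caller-supplied buffer, and thin outer methods for one point or many.

```julia
# Squared Euclidean distance between two points of equal length.
function sqdist(a, b)
    s = zero(promote_type(eltype(a), eltype(b)))
    for i in eachindex(a, b)
        s += (a[i] - b[i])^2
    end
    return s
end

# Inner kernel: exactly one query point, results written into a reusable buffer.
function inrange_point!(idxs::Vector{Int}, data, point, radius)
    empty!(idxs)
    for i in eachindex(data)
        sqdist(data[i], point) <= radius^2 && push!(idxs, i)
    end
    return idxs
end

# Outer method for a single point: no wrapping of the point in a one-element vector.
inrange_brute(data, point::AbstractVector{<:Number}, radius) =
    inrange_point!(Int[], data, point, radius)

# Outer method for many points: one call to the inner kernel per point.
inrange_brute(data, points::AbstractVector{<:AbstractVector}, radius) =
    [inrange_point!(Int[], data, p, radius) for p in points]

data = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
single = inrange_brute(data, [0.0, 0.0], 1.5)
many = inrange_brute(data, [[0.0, 0.0], [3.0, 4.0]], 0.5)
```

A caller that queries repeatedly could keep one `idxs` buffer and call `inrange_point!` directly, which is the preallocated style of API mentioned above.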

KristofferC · 2016-08-15T17:14:50Z

I updated this. It should deal with single point query better now. Feel free to try it out. :)

KristofferC · 2016-08-24T07:31:03Z

Some updates: Benchmarking is looking great:

https://gist.github.com/KristofferC/d3666e6428a473ea93ccfea396391ab3

KristofferC · 2016-08-24T07:35:06Z

I think changing the HyperSpheres from each having their own Vector to holding an SVector instead did a lot for cache locality.

codecov-io · 2016-08-24T09:16:08Z

Current coverage is 93.61% (diff: 94.41%)

Merging #29 into master will increase coverage by 9.48%

@@             master        #29   diff @@
==========================================
  Files            14         14          
  Lines           523        517     -6   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            440        484    +44   
+ Misses           83         33    -50   
  Partials          0          0

Powered by Codecov. Last update d8e718c...7043829

andyferris · 2016-08-24T23:06:00Z

OK, I had a look at those benchmark comparisons, and it really seems to have helped BallTree, which is great!

I've been using knn on KDTree which is where I saw no improvement. Oh well :)

How efficient is BallTree in general? When should I consider using it? We have 3D point clouds and I thought knowing about the geometry (3D space) would make it most efficient.

KristofferC · 2016-08-25T06:11:42Z

BallTree is not as heavily optimized as KDTree but should in theory be faster for higher dimensions. It also works with more general metrics than Minkowski. If you run the benchmarks and then run the markdown generator on the resulting file, you can compare the absolute performance of the trees. If your data is biased in some way, you should probably benchmark with that type of data.

KristofferC mentioned this pull request Aug 14, 2016

WIP: Allow more types in the tree data #16

Closed

2 tasks

andyferris reviewed Aug 15, 2016

KristofferC mentioned this pull request Aug 23, 2016

add benchmarks #30

Merged

KristofferC added 5 commits August 24, 2016 08:34

Allow for AbstractVector

a45d291

better handling of single point

a286424

fix buggie

880dd04

fixes

62143f4

fix some lint stuff

e3a061e

KristofferC force-pushed the static_arrays branch from 53ea5da to e3a061e Compare August 24, 2016 07:30

KristofferC added 5 commits August 24, 2016 09:41

fix extra k

35edce3

final updates

439eccd

update README and CI

ac22211

reduce monkey testing a bit

4956f8f

add reqs to REQUIRE

5d37c5c

KristofferC added 3 commits August 24, 2016 11:16

disable codecov comments

4ddb74d

fix

ab2c457

improve coverage

7043829

KristofferC merged commit 815f96a into master Aug 24, 2016

KristofferC deleted the static_arrays branch August 24, 2016 09:50

KristofferC mentioned this pull request Aug 24, 2016

Investigate into Fixed Size Arrays / dim parameterization #17

Closed

KristofferC mentioned this pull request Sep 8, 2016

using KDtree as a type? #32

Closed

KristofferC mentioned this pull request Oct 22, 2017

[Breaking] MethodError: no method matching length(::Type{Array{Float64,1}}) #54

Closed
