could be more efficient for mixtures of UnitRange and Vector

It seems like it should be possible to do this without collecting the full range

julia> intersect(1:typemax(Int), [1,3])
julia(30683,0x116666e00) malloc: can't allocate region
:*** mach_vm_map(size=4611686018427392000, flags: 100) failed (error code=3)
julia(30683,0x116666e00) malloc: *** set a breakpoint in malloc_error_break to debug
ERROR: OutOfMemoryError()
Stacktrace:
  [1] _growend!
    @ ./array.jl:982 [inlined]
  [2] resize!
    @ ./array.jl:1207 [inlined]
  [3] rehash!(h::Dict{Int64, Nothing}, newsz::Int64)
    @ Base ./dict.jl:184
  [4] sizehint!
    @ ./dict.jl:244 [inlined]
  [5] sizehint!
    @ ./set.jl:76 [inlined]
  [6] union!(s::Set{Int64}, itr::UnitRange{Int64})
    @ Base ./abstractset.jl:95
  [7] Set
    @ ./set.jl:10 [inlined]
  [8] _Set
    @ ./set.jl:25 [inlined]
  [9] Set
    @ ./set.jl:23 [inlined]
 [10] _shrink(shrinker!::Function, itr::UnitRange{Int64}, itrs::Tuple{Vector{Int64}})
    @ Base ./array.jl:2609
 [11] intersect(itr::UnitRange{Int64}, itrs::Vector{Int64})
    @ Base ./array.jl:2613
 [12] top-level scope
    @ REPL[10]:1

1 thought on “could be more efficient for mixtures of UnitRange and Vector

  1. Nice. Looks good to me!

    It’s more efficient than the vec vec version

    julia> @btime intersect(1:typemax(Int), [1,3])
      302.226 ns (8 allocations: 688 bytes)
    2-element Vector{Int64}:
     1
     3
    
    julia> @btime Base.intersect([1,2,3,4,5], [1,3])
      654.689 ns (15 allocations: 1.11 KiB)
    2-element Vector{Int64}:
     1
     3
    
    julia> @btime Base.intersect(1:typemax(Int), 1:2:3)
      0.044 ns (0 allocations: 0 bytes)
    1:2:3

    I’d also define

    intersect(vec::AbstractVector, r::AbstractRange) = intersect(r, vec)

    Could you PR this? I’d like to use this in another Base PR. Thanks!

Comments are closed.