long wait times for first run of @map and secondary problem of REPL hanging. #232

@affans

I can only reproduce this with my data file, which is hosted on GitHub (link). It's not a large file: 270 KB, 250 rows by 500 columns of tab-delimited integer data. Please download this file and save it as symp.dat (or alter the code below accordingly).

using DataFrames
using Statistics
using Query
using Base.Filesystem
using CSV

headers = ["sim$i" for i = 1:500]  
dt = CSV.File("symp.dat", delim='\t', header=headers) |> DataFrame
dt.time = 1:250 ## add a time column for `melt` purposes

f(g) = g |> @map(mean(_))
@time f(dt)

Output:

julia> @time f(dt)
 31.975361 seconds (7.63 M allocations: 415.528 MiB, 0.84% gc time)

And then the REPL hangs. It hangs for about 2 minutes before it finishes printing the 250-element query result.

julia> @time f(dt)
 31.190349 seconds (7.64 M allocations: 416.400 MiB, 0.94% gc time)

250-element query result
 0.249501
 0.265469
 0.0838323
 0.237525
 0.323353
 0.359281
 0.457086
 0.566866
 0.798403
 1.06587
... with 240 more elements

In summary:

  • The first run takes over 30 seconds even when @map is inside a function (maybe this has something to do with the fact that dt is global?)
  • The REPL hangs for about 2 minutes after the @time macro prints its information. It seems to me it takes a long time to "collect" the results of the query and display it back in the REPL.
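
To help tell one-time compilation cost apart from actual run time, the same call can be timed twice in a fresh session. The snippet below is only a minimal sketch of that pattern: `data` and `rowmeans` are hypothetical stand-ins (a synthetic matrix and a plain Base function), not the real file or the Query.jl pipeline, so the numbers it prints say nothing about @map itself.

```julia
using Statistics

# Synthetic stand-in with the same shape as symp.dat (250 rows, 500 columns).
# Assumption: the real data are small non-negative integers.
data = rand(0:5, 250, 500)

# Hypothetical stand-in for the query: one mean per row.
rowmeans(m) = vec(mean(m, dims=2))

# The first call includes JIT compilation of rowmeans for this argument type.
t_first = @elapsed rowmeans(data)

# The second call reuses the compiled code, so it measures run time only.
t_second = @elapsed rowmeans(data)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

At the REPL, ending a line with a semicolon (for example `r = f(dt);`) suppresses display of the result, which would show whether the hang happens while the result is being printed rather than while the query runs.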

Version info:

julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, ivybridge)
