VTK support #10
As a first step, I can test whether I can use WriteVTK.jl to reproduce the AMR GaussianPulse source.
As a second step, we need a function that can extract cell data as "blocks" on each refinement level. A first set of functions is implemented in 448b3bb.
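As a rough sketch of what I have in mind (hypothetical grid sizes and variable names, assuming WriteVTK.jl's image-data interface), each refinement level could be written as its own `.vti` image file:

```julia
using WriteVTK

# Sketch: one ImageData (.vti) file per refinement level. Dimensions, spacing,
# and the "rho" cell data are made up for illustration only.
nlevels = 3
base_ncells  = (8, 8, 8)            # coarsest-level cell counts
base_spacing = (1.0, 1.0, 1.0)

for lvl in 0:nlevels-1
    ncells  = base_ncells .* 2^lvl  # cells double in each direction per level
    spacing = base_spacing ./ 2^lvl
    vtk = vtk_grid("amr_level$(lvl)", (ncells .+ 1)...;
                   origin = (0.0, 0.0, 0.0), spacing = spacing)
    vtk["rho", VTKCellData()] = rand(Float32, ncells...)  # placeholder cell data
    vtk_save(vtk)
end
```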
Based on Mathieu's answer, a better file structure would be one image file per refinement level, and then use […]. However, I am having some difficulties using […]: I believe this is somehow a bug in VTK, since the visibility behavior changes from v5.9.0 to the v5.9.1 nightly. Still waiting for Mathieu's answer. I found a person posting a Python script for generating NonOverlappingAMR data on Discourse.
VTK OverlappingAMR with image files can support true 2D, but it requires several parameters to be consistent. Since DCCRG works in 3D and only produces fake 2D, it makes sense to always stick to 3D VTK formats.
The first implementation is here: f30bdc8, but the performance is not so good. For converting a 36 MB 3D file with 1 level of refinement,

```julia
julia> @time Vlasiator.write_vtk(meta)
 14.460983 seconds (134.01 M allocations: 44.671 GiB, 7.07% gc time)
```

This can certainly be improved. Currently most of the time is spent in reading individual parent cells:

```julia
for (i, cid) = enumerate(get1stcell(maxamr, ncells)+1:get1stcell(maxamr+1, ncells))
    cidparent = getparent(meta, cid)
    if cidparent in cellid
        celldata[iv][end][:,index_[i]] = readvariable(meta, var, cidparent)
    end
end
```

Changing this to

```julia
cids = [get1stcell(maxamr, ncells)+1, get1stcell(maxamr+1, ncells)]
cidparents = Vector{Int64}(undef, cids[2]-cids[1]+1)
cidmask = similar(cidparents)
nc = 0
for (i, cid) = enumerate(cids[1]:cids[2])
    if cid ∉ cellid
        nc += 1
        cidmask[nc] = i
        cidparents[nc] = getparent(meta, cid)
    end
end
celldata[iv][end][:,index_[cidmask[1:nc]]] = readvariable(meta, var, cidparents[1:nc])
```

makes it 5x faster and 7x more memory efficient. With this change, the 36 MB file conversion is now

```julia
julia> @time Vlasiator.write_vtk(meta)
  5.231831 seconds (18.53 M allocations: 5.272 GiB, 2.51% gc time)
```

Another issue is that averaging integer-valued variables has round-off errors on lower resolution levels, e.g. […]
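A minimal sketch, with made-up values, of the kind of truncation this refers to when an integer-typed variable is averaged down to a coarser level:

```julia
# Four fine cells collapse into one coarse cell; integer arithmetic drops the
# fractional part of the mean.
children = Int32[1, 1, 1, 2]
sum(children) ÷ length(children)   # 1    (integer division, truncated)
sum(children) / length(children)   # 1.25 (exact mean in floating point)
```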
Later I realized that reading cells together is more efficient in memory consumption, and that this should also be done on the other coarse levels. The previous version gives

```julia
julia> @btime write_vtk(meta)
  5.551 s (18532171 allocations: 5.27 GiB)
```

while the new one, with bundled reading also on the coarse levels, gives

```julia
julia> @btime write_vtk(meta)
  4.085 s (3562007 allocations: 168.68 MiB)
```

The current version is ~4x faster and 267x more efficient in memory usage compared with the initial implementation.
Until now, the saved velocity distributions are not ported to the generated VTK files. I don't think it is really needed now, but we can think about how to do that in the future.
I tested the conversion on a large 3D 1-level AMR VLSV file, and it becomes very expensive:

```julia
julia> @time write_vtk(meta)
422.187386 seconds (59.59 M allocations: 2.583 GiB, 0.09% gc time, 0.22% compilation time)
```

By moving the parent cell finding out of the variable loop, this can be improved:

```julia
julia> @time write_vtk(meta)
196.123699 seconds (58.79 M allocations: 2.393 GiB, 0.17% gc time, 0.28% compilation time)
```

Looking into the details, I find that filling the fine cells from their parents' values here

```julia
celldata[iv][end][:, seq_[cidmask]] = readvariable(meta, var, cidparents)
```

by itself takes 120 s, which is about 60% of the total time. There is really not much I can do to further speed this up. I tried to add multithreading to the variable loop, but it turned out that since there is only one […]. One thing to note is that […]
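Schematically, the "parent cell finding out of the variable loop" change looks like this (a sketch with hypothetical argument names, not the actual Vlasiator.jl source):

```julia
# `getparent` and `readvariable` are the Vlasiator.jl calls discussed above;
# `finecells`, `variables`, and `seq_` stand in for the real bookkeeping arrays.
function fill_fine_cells!(celldata, meta, variables, finecells, seq_)
    cidparents = [getparent(meta, cid) for cid in finecells]  # looked up once
    for (iv, var) in enumerate(variables)
        # only the bulk parent-value read remains inside the variable loop
        celldata[iv][end][:, seq_] = readvariable(meta, var, cidparents)
    end
    return celldata
end
```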
I later realized that the previous implementation, which went from the finest to the coarsest level, cannot handle the case where finest-level cells have no direct parent on the 2nd finest level! After fixing this, for the test file:

```julia
julia> @time write_vtk(meta)
  3.885190 seconds (18.45 M allocations: 562.523 MiB, 4.09% gc time)
```

For the large 3D 1-level AMR VLSV file:

```julia
julia> @time write_vtk(meta)
 49.502307 seconds (148.39 M allocations: 4.404 GiB, 0.99% gc time)
```

which is 4 times faster than before.
Now what about time series file conversion? ParaView can understand that files with the pattern […]
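Independently of ParaView's file-name pattern detection, one option (a sketch, assuming WriteVTK.jl's `paraview_collection` interface; file names and data are made up) is to write an explicit `.pvd` collection that groups one VTK file per time step:

```julia
using WriteVTK

# Sketch: each time step is written to its own file, and all of them are
# registered in a single .pvd collection that ParaView opens as a time series.
paraview_collection("timeseries") do pvd
    for (n, time) in enumerate(0.0:0.5:2.0)
        vtk_grid("timestep_$n", 0:9, 0:9, 0:9) do vtk
            vtk["pressure"] = rand(9, 9, 9)   # cell data (one value per cell)
            pvd[time] = vtk                   # attach this file at the given time
        end
    end
end
```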
Another micro-optimization I found is to avoid using counters in the innermost loops. This gives a surprising 33% improvement when reading 2 variables from the 3D 1-level AMR data.
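A generic sketch of the pattern (made-up arrays, not the actual Vlasiator.jl loop): let precomputed indices do the bookkeeping instead of maintaining a separate counter in the hot loop.

```julia
# With a manually incremented counter in the innermost loop:
function gather_counter(data, mask)
    out = similar(data, count(mask))
    nc = 0
    for i in eachindex(data)
        if mask[i]
            nc += 1          # counter updated on every matching iteration
            out[nc] = data[i]
        end
    end
    out
end

# Counter-free variant: precompute the destination indices, then copy directly.
function gather_indexed(data, mask)
    idx = findall(mask)
    data[idx]
end

data = rand(10)
mask = data .> 0.5
gather_counter(data, mask) == gather_indexed(data, mask)   # same result
```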
Adding […]:

```julia
julia> @time write_vtk(meta) # test file as before
  3.650974 seconds (16.24 M allocations: 528.821 MiB, 1.41% gc time)
```

With 2 threads:

```julia
julia> @time write_vtk(meta)
  2.248806 seconds (16.24 M allocations: 528.837 MiB, 1.74% gc time)
```

With 4 threads:

```julia
julia> @time write_vtk(meta)
  1.629885 seconds (16.24 M allocations: 528.871 MiB, 2.59% gc time)
```

50 MB data:

```julia
julia> @time write_vtk(meta) # 1 lvl AMR 50 MB
 49.470840 seconds (130.16 M allocations: 4.132 GiB, 0.62% gc time)
```

With 2 threads:

```julia
julia> @time write_vtk(meta)
 40.754638 seconds (130.16 M allocations: 4.132 GiB, 1.12% gc time)
```

However, for some unknown reason I cannot reproduce these results! What I can tell from the benchmark reports and the large-file timing is that FLoops.jl certainly helps with VDF and fsgrid reading, but not with AMR cell data filling.
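For reference, a minimal self-contained sketch of the FLoops.jl threading pattern mentioned here (generic per-element work, not the actual reader code):

```julia
using FLoops

# Threaded elementwise transform with FLoops.jl: each iteration writes to a
# distinct slot, so no reduction is needed and iterations can run in parallel.
function transform_threaded!(out, data)
    @floop ThreadedEx() for i in eachindex(data, out)
        out[i] = sqrt(abs(data[i]))   # stand-in for per-cell work
    end
    out
end

data = rand(10^6)
out  = similar(data)
transform_threaded!(out, data)
```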
By reducing the duplicate calls to […]:

```julia
julia> @time write_vtk(filename)
 12.544818 seconds (36.07 M allocations: 1.440 GiB, 0.91% gc time)
```

In fact, this can be improved further by reusing arrays of the same type and size for all the variables. However, as of now I am unable to come up with elegant code for this idea, and it is no longer the bottleneck. The bottleneck is still the […]
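A rough, generic sketch of the array-reuse idea (hypothetical names; the in-place read is faked with random numbers): allocate one scratch buffer and overwrite it for every variable instead of allocating a fresh array per variable.

```julia
using Random

# Sketch: the same preallocated buffer serves all variables in turn.
function reduce_all_variables(variables, ncells)
    buffer  = Vector{Float32}(undef, ncells)   # allocated once, reused below
    results = Dict{String,Float32}()
    for var in variables
        rand!(buffer)                          # stand-in for an in-place read of `var`
        results[var] = sum(buffer)             # stand-in for writing VTK cell data
    end
    results
end

reduce_all_variables(["proton/vg_rho", "vg_b_vol"], 10_000)
```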
Some of the profiling above is not trustworthy! I cannot reproduce the results! Or am I operating on a different file?

v0.7.0 (`git checkout v0.7.0`):

```julia
julia> @benchmark write_vtk(filename)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 31.622 s (1.46% GC) to evaluate,
 with a memory estimate of 4.35 GiB, over 145913677 allocations.
```

v0.7.1 (`git checkout v0.7.1`):

```julia
julia> @time write_vtk(filename)
 27.981823 seconds (145.91 M allocations: 4.285 GiB, 1.96% gc time)
```

v0.6.2 (`git checkout v0.6.2`):

```julia
julia> @time write_vtk(filename)
 30.849516 seconds (145.91 M allocations: 4.350 GiB, 1.03% gc time)
```
Further improvements!

v0.7.5:

```julia
julia> @time write_vtk("bulk.0000003.vlsv") # 1st time execution
 28.804633 seconds (155.89 M allocations: 4.797 GiB, 2.23% gc time, 0.14% compilation time)
```

v0.7.7:

```julia
julia> @time write_vtk("bulk.0000003.vlsv") # 1st time execution
 19.531925 seconds (77.73 M allocations: 3.039 GiB, 2.62% gc time, 0.20% compilation time)
```

v0.8.1 (compile time up due to the switch to EzXML):

```julia
julia> @time write_vtk(filename) # 1st time execution
 18.405493 seconds (78.39 M allocations: 3.075 GiB, 3.00% gc time, 2.81% compilation time)
```

v0.8.4 (bug fix on […]):

```julia
julia> @benchmark write_vtk(filename)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 14.580 s (2.62% GC) to evaluate,
 with a memory estimate of 2.33 GiB, over 67060114 allocations.
```

v0.8.5 (function barriers on celldata filling):

```julia
julia> @time write_vtk(filename) # 1st time execution
 15.531484 seconds (55.72 M allocations: 2.692 GiB, 3.02% gc time, 2.84% compilation time)

julia> @benchmark write_vtk(filename)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 10.266 s (3.25% GC) to evaluate,
 with a memory estimate of 1.97 GiB, over 42733579 allocations.
```

v0.8.6 (replace […]):

```julia
julia> @time write_vtk(filename) # 1st time execution
 12.583741 seconds (73.09 M allocations: 3.304 GiB, 4.06% gc time, 3.92% compilation time)

julia> @benchmark write_vtk(filename)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 7.074 s (4.95% GC) to evaluate,
 with a memory estimate of 2.59 GiB, over 60255050 allocations.
```

v0.8.25 (doing function barriers properly!):

```julia
julia> @benchmark write_vtk(file)
BenchmarkTools.Trial: 6 samples with 1 evaluation.
 Range (min … max):  838.351 ms … 958.141 ms  ┊ GC (min … max): 0.27% … 13.05%
 Time  (median):     872.960 ms               ┊ GC (median):    2.04%
 Time  (mean ± σ):   881.696 ms ±  43.051 ms  ┊ GC (mean ± σ):  4.42% ± 4.74%

  █    █        █       █        █                            █
  █▁▁▁▁█▁▁▁▁▁▁▁█▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  838 ms           Histogram: frequency by time          958 ms <

 Memory estimate: 148.01 MiB, allocs estimate: 3579.
```

v0.9.20:

```julia
julia> @benchmark write_vtk(file)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (min … max):  1.096 s …  1.222 s  ┊ GC (min … max): 1.24% … 10.51%
 Time  (median):     1.131 s             ┊ GC (median):    1.20%
 Time  (mean ± σ):   1.143 s ± 46.907 ms ┊ GC (mean ± σ):  3.96% ±  4.09%

  █            ██   █                                    █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁██▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.1 s            Histogram: frequency by time        1.22 s <

 Memory estimate: 151.54 MiB, allocs estimate: 3575.
```

v0.9.21 after refactoring:

```julia
julia> @benchmark write_vtk(file)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (min … max):  1.043 s …  1.124 s  ┊ GC (min … max): 0.08% … 7.16%
 Time  (median):     1.097 s             ┊ GC (median):    5.12%
 Time  (mean ± σ):   1.083 s ± 36.244 ms ┊ GC (mean ± σ):  3.82% ± 2.97%

  █  █                                 █    █            █
  █▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.04 s           Histogram: frequency by time       1.12 s <

 Memory estimate: 133.15 MiB, allocs estimate: 3534.
```

v0.9.22 (improving […]):

```julia
julia> @benchmark write_vtk(file)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (min … max):  1.046 s …  1.064 s  ┊ GC (min … max): 0.12% … 0.11%
 Time  (median):     1.047 s             ┊ GC (median):    0.12%
 Time  (mean ± σ):   1.052 s ±  7.656 ms ┊ GC (mean ± σ):  0.15% ± 0.11%

  ██ █                   █                               █
  ██▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.05 s           Histogram: frequency by time       1.06 s <

 Memory estimate: 125.95 MiB, allocs estimate: 3469.
```
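Since function barriers come up twice above (v0.8.5 and v0.8.25), here is a minimal generic sketch of the pattern (not the actual Vlasiator.jl code): pull the data out of a type-unstable container in a small outer function, then hand it to an inner kernel that gets compiled for the concrete array type.

```julia
# Inner kernel: compiled separately for each concrete vector type it receives.
function kernel(v::AbstractVector)
    s = zero(eltype(v))
    for x in v
        s += x
    end
    s
end

# Outer function: the compiler cannot infer the type of `v` here, but the call
# into `kernel` acts as a barrier, so the hot loop above stays type-stable.
function process(container::Dict{String,Any})
    v = container["celldata"]
    kernel(v)
end

process(Dict{String,Any}("celldata" => rand(Float32, 10^5)))
```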
It is currently possible to convert VLSV format to VTK format which can be used directly in ParaView and VisIt. This is mostly useful for visualizing 3D AMR data.
I know very little about the AMR support in ParaView. This blog post briefly describes what they have done for Enzo and Flash, which are both actually pretty similar to what DCCRG does under the hood. This Discourse post describes a user's experience in implementing his own AMR reader in Python, but his original data is already saved in HDF5 format.
What he did is take advantage of the vtk package bundled with ParaView, where vtkOverlappingAMR is a class in VTK.
There is another relatively old link describing how to render AMR data, but I found it hard to follow. However, this Python interface for generating an AMR Gaussian pulse might be very useful.
In pvpython, it is actually very easy to create a sample AMR dataset.
The sample scripts provided by the user in the above link should in principle work, but I feel it would be better to do it purely in Python.
The structure of DCCRG makes it possible to know the cell neighbors given the level of refinement and the dimension on the coarsest level.
It might be possible to directly write AMR data in VTK format. See this issue.
When doing real 3D analysis, having a ParaView reader would be really useful.