Merge pull request #15 from JuliaIO/teh/refactor
Rewrite from scratch
timholy committed Aug 3, 2015
2 parents 6886b7c + c9ca6f9 commit eeb94ff
Showing 17 changed files with 710 additions and 182 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -8,4 +8,4 @@ julia:
notifications:
email: false
after_success:
- julia -e 'cd(Pkg.dir("FixedSizeArrays")); Pkg.add("FileIO"); using Coverage; Coveralls.submit(Coveralls.process_folder())'
- julia -e 'cd(Pkg.dir("FileIO")); Pkg.add("Coverage"); using Coverage; Coveralls.submit(Coveralls.process_folder())'
119 changes: 94 additions & 25 deletions README.md
@@ -1,35 +1,104 @@
# FileIO

[![Build Status](https://travis-ci.org/SimonDanisch/FileIO.jl.svg?branch=master)](https://travis-ci.org/SimonDanisch/FileIO.jl)
[![Build Status](https://travis-ci.org/JuliaIO/FileIO.jl.svg?branch=master)](https://travis-ci.org/JuliaIO/FileIO.jl)

FileIO aims to provide a common framework for detecting file formats
and dispatching to appropriate readers/writers. The two core
functions in this package are called `load` and `save`, and offer
high-level support for formatted files (in contrast with Julia's
low-level `read` and `write`). To avoid name conflicts, packages that
provide support for standard file formats through functions named
`load` and `save` are encouraged to extend the definitions here.

## Installation

None of the packages in JuliaIO are registered yet. Please add them via `Pkg.clone("git-url")`.

## Usage

If your format has been registered, it might be as simple as
```jl
using FileIO
obj = load(filename)
```
to read data from a formatted file. Likewise, saving might be as simple as
```
save(filename, obj)
```

Meta package for FileIO.
Purpose is to open a file and return the respective Julia object, without doing any research on how to open the file.
```Julia
f = file"test.jpg" # -> File{:jpg}
read(f) # -> Image
read(file"test.obj") # -> Mesh
read(file"test.csv") # -> DataFrame
If you just want to inspect a file to determine its format, then
```jl
file = query(filename)
s = query(io) # io is a stream
```
So far only Images are supported and MeshIO is on the horizon.

It is structured the following way:
There are three levels of abstraction. First comes FileIO, which defines the `file_str` macro and related basics. Next is a meta package for a certain class of file, e.g. Images or Meshes. This meta package defines the Julia datatype (e.g. Mesh, Image) and organizes the importer libraries. It is also a good place to define tests for different file formats that are independent of the IO library.
Then on the last level, there are the low-level importer libraries, which do the actual IO.
They're included via Mike Innes's [Requires](https://github.com/one-more-minute/Requires.jl) package, so that no extra load time is incurred when they are not needed. This way, using FileIO without reading/writing anything should have short load times.

As an implementation example please look at FileIO -> ImageIO -> ImageMagick.
This should already work as a proof of concept.
Try:
```Julia
using FileIO # should be very fast, thanks to Mike Innes Requires package
read(file"test.jpg") # takes a little longer as it needs to load the IO library
read(file"test.jpg") # should be fast
read(File("documents", "images", "myimage.jpg")) # automatic joinpath via File constructor
will return a `File` or `Stream` object that also encodes the detected
file format.

## Adding new formats

You register a new format by calling `add_format(fmt, magic,
extension)`. `fmt` is a `DataFormat` type, most conveniently created
as `format"IDENTIFIER"`. `magic` typically contains the magic bytes
that identify the format. Here are some examples:

```jl
# A straightforward format
add_format(format"PNG", [0x89,0x50,0x4e,0x47,0x0d,0x0a,0x1a,0x0a], ".png")

# A format that uses only ASCII characters in its magic bytes, and can
# have one of two possible file extensions
add_format(format"NRRD", "NRRD", [".nrrd",".nhdr"])

# A format whose magic bytes might not be at the beginning of the file,
# necessitating a custom function `detecthdf5` to find them
add_format(format"HDF5", detecthdf5, [".h5", ".hdf5"])

# A fictitious format that, unfortunately, provides no magic
# bytes. Here we have to place our faith in the file extension.
add_format(format"DICEY", (), ".dcy")
```
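
As a rough illustration of the idea behind the registrations above, the following self-contained sketch matches a file's leading bytes against registered magic bytes and falls back to the file extension. It does not use FileIO itself and targets current Julia syntax; `REGISTRY`, `register!`, and `detect` are hypothetical names invented for this sketch, not the package's API or its actual implementation.

```julia
# Hypothetical, self-contained sketch; it does not call FileIO.
# Registry mapping a format name to (magic bytes, file extensions).
const REGISTRY = Dict{Symbol,Tuple{Vector{UInt8},Vector{String}}}()

function register!(name::Symbol, magic::Vector{UInt8}, exts::Vector{String})
    REGISTRY[name] = (magic, exts)
    return nothing
end

# Return the registered format whose magic bytes are a prefix of `bytes`.
# If none matches, fall back to the file extension; otherwise `nothing`.
function detect(bytes::Vector{UInt8}, filename::AbstractString="")
    for (name, (magic, _)) in REGISTRY
        n = length(magic)
        if n > 0 && length(bytes) >= n && bytes[1:n] == magic
            return name
        end
    end
    ext = lowercase(splitext(filename)[2])
    for (name, (_, exts)) in REGISTRY
        ext in exts && return name
    end
    return nothing
end

register!(:PNG, [0x89,0x50,0x4e,0x47,0x0d,0x0a,0x1a,0x0a], [".png"])
register!(:NRRD, Vector{UInt8}("NRRD"), [".nrrd", ".nhdr"])
register!(:DICEY, UInt8[], [".dcy"])  # no magic bytes: extension only
```

With these registrations, a byte vector beginning with the PNG signature is detected as `:PNG` regardless of the file name, while a `.dcy` file can only be recognized by its extension.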

You can also declare that certain formats require certain packages for
I/O support:

```jl
add_loader(format"HDF5", :HDF5)
add_saver(format"PNG", :ImageMagick)
```
These packages will be automatically loaded as needed.
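
One way to picture loading "as needed" is a table from format to package name that is consulted on first use. The sketch below is only an assumption about how such a mechanism could look in current Julia. `LOADERS`, `LOADED`, `add_loader!`, and `ensure_loader` are hypothetical names, and the standard library `Dates` stands in for a real I/O package.

```julia
# Hypothetical sketch of on-demand package loading; not FileIO's implementation.
const LOADERS = Dict{Symbol,Symbol}()  # format name => package name
const LOADED = Set{Symbol}()           # packages already imported

add_loader!(fmt::Symbol, pkg::Symbol) = (LOADERS[fmt] = pkg; nothing)

function ensure_loader(fmt::Symbol)
    haskey(LOADERS, fmt) || error("no loader registered for $fmt")
    pkg = LOADERS[fmt]
    if !(pkg in LOADED)
        # Equivalent to running `import <pkg>` in Main, done only on first use.
        Core.eval(Main, Expr(:import, Expr(:., pkg)))
        push!(LOADED, pkg)
    end
    return pkg
end

# `Dates` is a stand-in for an I/O package, chosen because it ships with Julia.
add_loader!(:DEMO, :Dates)
ensure_loader(:DEMO)
```

Nothing is imported at registration time; the cost of loading the package is paid only when the format is first requested.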

Users are encouraged to contribute these definitions to the
`registry.jl` file of this package, so that information about file
formats exists in a centralized location.

## Implementing loaders/savers

In your package, write code like the following:

```jl
using FileIO

function load(f::File{format"PNG"})
    io = open(f)
    skipmagic(io, f) # skip over the magic bytes
    # Now do all the stuff you need to read a PNG file
end

# You can support streams and add keywords:
function load(s::Stream{format"PNG"}; keywords...)
    io = stream(s) # io is positioned after the magic bytes
    # Do the stuff to read a PNG file
end

function save(f::File{format"PNG"}, data)
    io = open(f, "w")
    # Don't forget to write the magic bytes!
    write(io, magic(format"PNG"))
    # Do the rest of the stuff needed to save in PNG format
end
```
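
The pattern above relies on dispatching on a format carried in a type parameter. A minimal round trip can be sketched as follows, assuming current Julia syntax and a made-up `DEMO` format. The `DataFormat` and `File` types and the `magic`, `save`, and `load` functions here are simplified stand-ins defined for the sketch, not the package's own definitions.

```julia
# Hypothetical stand-ins for the package's types; simplified for illustration.
struct DataFormat{sym} end
struct File{F<:DataFormat}
    filename::String
end

# Made-up magic bytes for a made-up format.
magic(::Type{DataFormat{:DEMO}}) = Vector{UInt8}("DEMO")

# Saver: write the magic bytes first, then the payload.
function save(f::File{DataFormat{:DEMO}}, data::Vector{UInt8})
    open(f.filename, "w") do io
        write(io, magic(DataFormat{:DEMO}))
        write(io, data)
    end
    return nothing
end

# Loader: verify and skip the magic bytes, then read the remaining bytes.
function load(f::File{DataFormat{:DEMO}})
    open(f.filename) do io
        m = magic(DataFormat{:DEMO})
        read(io, length(m)) == m || error("not a DEMO file")
        read(io)
    end
end

path = tempname()
save(File{DataFormat{:DEMO}}(path), UInt8[1, 2, 3])
payload = load(File{DataFormat{:DEMO}}(path))
```

Because the format is part of the type, each format gets its own `load` and `save` methods, and a file of an unsupported format fails at dispatch rather than deep inside a reader.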
Please open issues if things are not clear or if you find flaws in the concept/implementation.

If you're interested in working on this infrastructure I'll be pleased to add you to the group JuliaIO.
## Help

You can get an API overview by typing `?FileIO` at the REPL prompt.
Individual functions have their own help too, e.g., `?add_format`.
4 changes: 4 additions & 0 deletions REQUIRE
@@ -0,0 +1,4 @@
julia 0.3
# The rest are needed only on julia 0.3
Compat
Docile
107 changes: 92 additions & 15 deletions src/FileIO.jl
@@ -1,19 +1,96 @@
module FileIO

import Base: read,
write,
(==),
open,
abspath,
readbytes,
readall

export File,
@file_str,
readformats,
writeformats,
ending

include("core.jl")
if VERSION < v"0.4.0-dev"
    using Docile, Compat
    immutable Pair{A,B}
        first::A
        second::B
    end
    Base.first(p::Pair) = p.first
    Base.last(p::Pair) = p.second
end

export DataFormat,
File,
Formatted,
Stream,

@format_str,

add_format,
del_format,
add_loader,
add_saver,
filename,
info,
load,
magic,
query,
save,
skipmagic,
stream,
unknown

include("query.jl")
include("loadsave.jl")
include("registry.jl")

@doc """
- `load(filename)` loads the contents of a formatted file, trying to infer
the format from `filename` and/or magic bytes in the file.
- `load(strm)` loads from an `IOStream` or similar object. In this case,
the magic bytes are essential.
- `load(File(format"PNG",filename))` specifies the format directly, and bypasses inference.
- `load(f; options...)` passes keyword arguments on to the loader.
""" ->
function load(s::Union(AbstractString,IO); options...)
    q = query(s)
    check_loader(q)
    load(q; options...)
end

@doc """
- `save(filename, data...)` saves the contents of a formatted file,
trying to infer the format from `filename`.
- `save(Stream(format"PNG",io), data...)` specifies the format directly, and bypasses inference.
- `save(f, data...; options...)` passes keyword arguments on to the saver.
""" ->
function save(s::Union(AbstractString,IO), data...; options...)
    q = query(s)
    check_saver(q)
    save(q, data...; options...)
end

# Fallbacks
load{F}(f::Formatted{F}; options...) = error("No load function defined for format ", F, " with filename ", filename(f))
save{F}(f::Formatted{F}, data...; options...) = error("No save function defined for format ", F, " with filename ", filename(f))

end # module

if VERSION < v"0.4.0-dev"
    using Docile
end

@doc """
`FileIO` API (brief summary, see individual functions for more detail):
- `format"PNG"`: specifies a particular defined format
- `File{fmt}` and `Stream{fmt}`: types of objects that declare that a resource has a particular format `fmt`
- `load([filename|stream])`: read data in formatted file, inferring the format
- `load(File(format"PNG",filename))`: specify the format manually
- `save(filename, data...)` for similar operations involving saving data
- `io = open(f::File, args...)` opens a file
- `io = stream(s::Stream)` returns the IOStream from the query object `s`
- `query([filename|stream])`: attempt to infer the format of `filename`
- `unknown(q)` returns true if a query can't be resolved
- `skipmagic(io, fmt)` sets the position of `io` to just after the magic bytes
- `magic(fmt)` returns the magic bytes for format `fmt`
- `info(fmt)` returns `(magic, extensions)` for format `fmt`
- `add_format(fmt, magic, extension)`: register a new format
- `add_loader(fmt, :Package)`: indicate that `Package` supports loading files of type `fmt`
- `add_saver(fmt, :Package)`: indicate that `Package` supports saving files of type `fmt`
""" -> FileIO
42 changes: 0 additions & 42 deletions src/core.jl

This file was deleted.

89 changes: 0 additions & 89 deletions src/core_prototype.jl

This file was deleted.
