PubliSci

Note: this software is under active development! Until it hits v1.0.0, the overall API and usage pattern are subject to change.

Installation

gem install publisci

Usage

DSL

Most of the gem's functions can be accessed through its DSL:

require 'publisci'
include PubliSci::DSL

# Specify input data
data do
  # use local or remote paths
  source 'https://github.com/wstrinz/publisci/raw/master/spec/csv/bacon.csv'

  # specify datacube properties
  dimension 'producer', 'pricerange'
  measure 'chunkiness'

  # set parser specific options
  option 'label_column', 'producer'
end

# Describe dataset
metadata do
  dataset 'bacon'
  title 'Bacon dataset'
  creator 'Will Strinz'
  description 'some data about bacon'
  date '1-10-2010'
end

# Send output to an RDF::Repository
#  can also use 'generate_n3' to output a turtle string
repo = to_repository

# run SPARQL queries on the dataset
PubliSci::QueryHelper.execute('select * where {?s ?p ?o} limit 5', repo)

# export in other formats
PubliSci::Writers::ARFF.new.from_store(repo)
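
To make the dimension and measure settings above more concrete, here is a minimal plain-Ruby sketch of the underlying Data Cube idea: each input row becomes one observation, dimensions identify the observation, and measures hold the observed values. It does not use the gem or reproduce its internals. The `to_observations` helper and the sample rows are hypothetical and exist only for illustration.

```ruby
# Hypothetical helper, not part of PubliSci: splits each row into the
# columns that identify an observation (dimensions) and the columns that
# carry observed values (measures).
def to_observations(rows, dimensions, measures)
  rows.map do |row|
    {
      dimensions: row.slice(*dimensions),
      measures: row.slice(*measures)
    }
  end
end

# Made-up rows standing in for a parsed CSV file.
rows = [
  { 'producer' => 'hormel',   'pricerange' => 'low',  'chunkiness' => 1 },
  { 'producer' => 'newskies', 'pricerange' => 'high', 'chunkiness' => 6 }
]

observations = to_observations(rows, %w[producer pricerange], %w[chunkiness])

observations.each do |obs|
  key = obs[:dimensions].values.join('/')
  values = obs[:measures].map { |name, value| "#{name}=#{value}" }.join(', ')
  puts "#{key}: #{values}"
end
```

Columns that are named as neither a dimension nor a measure are simply dropped in this sketch; how the gem treats such columns is not covered here.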

Gem executable

Running the gem using the publisci executable will attempt to find and run a triplifier for your input.

For example, the following command

publisci https://github.com/wstrinz/publisci/raw/master/spec/csv/bacon.csv

is equivalent to the DSL code

require 'publisci'
include PubliSci::DSL

data do
  source 'https://github.com/wstrinz/publisci/raw/master/spec/csv/bacon.csv'
end

generate_n3

The API doc is online. For more code examples see the test files in the source tree.

Custom Parsers

Building a parser only requires you to implement a generate_n3 method, at either the class or the instance level. Then register it with PubliSci::Dataset.register_reader(extension, class), passing your reader's preferred file extension and its class. After that, calling Dataset.for on a file with that extension will use your reader class.
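
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gem's reference implementation: the TsvReader class, the `generate_n3(input, options)` argument list, the `:dataset_name` option, the example.org URIs, and the leading dot in the '.tsv' extension are all hypothetical. Only the generate_n3 method name and the register_reader(extension, class) call come from the description above.

```ruby
# Hypothetical reader for tab-separated text. It takes the file contents
# as a string and returns a Turtle string, keeping no state between calls.
class TsvReader
  # Placeholder namespace, chosen only for this example.
  BASE = 'http://example.org/tsv'

  # Class-level generate_n3. The arguments are an assumption; check the
  # gem's existing readers for the signature they actually use.
  def self.generate_n3(input, options = {})
    dataset = options.fetch(:dataset_name, 'dataset')
    lines = input.lines.map(&:chomp).reject(&:empty?)
    header, *rows = lines.map { |line| line.split("\t") }
    return '' if header.nil?

    # One subject per row, one predicate per column. Column names are
    # assumed to be simple tokens that are safe to embed in a URI.
    rows.each_with_index.map do |cells, i|
      pairs = header.zip(cells).map do |column, value|
        "<#{BASE}/#{dataset}/prop/#{column}> #{literal(value)}"
      end
      "<#{BASE}/#{dataset}/row/#{i + 1}> #{pairs.join(' ; ')} ."
    end.join("\n")
  end

  # Quote a value as a plain Turtle string literal, escaping backslashes
  # and double quotes.
  def self.literal(value)
    escaped = value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }
    %("#{escaped}")
  end
end

# Register the reader when the gem is loaded. The guard lets this sketch
# run on its own; the '.tsv' form of the extension is an assumption.
PubliSci::Dataset.register_reader('.tsv', TsvReader) if defined?(PubliSci::Dataset)

puts TsvReader.generate_n3("producer\tchunkiness\nhormel\t1\n", dataset_name: 'bacon')
```

Every value is emitted as a plain string literal here; a real reader would likely want typed literals and vocabulary terms appropriate to the data.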

Including or extending PubliSci::Readers::Base will give you access to many helpful methods for triplifying your data. There is a post on the project blog with further details about how to design and implement a parser.

The interface is in the process of being more rigidly defined to separate parsing, generation, and output, so it is advisable to make your parsing code as stateless as possible for better handling of large inputs. That said, pull requests with parsers for new formats are greatly appreciated!

Project home page

For information on the source tree, documentation, examples, issues, and how to contribute, see

https://github.com/wstrinz/publisci

The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.

Cite

If you use this software, please cite

The Ruby Science Foundation. 2013. SciRuby: Tools for scientific computing in Ruby. http://sciruby.com.

and one of

Biogems.info

This Biogem is published at http://biogems.info/index.html#publisci
