Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

README & Markdown files #50

Merged
merged 18 commits into from
Aug 26, 2017
Merged

README & Markdown files #50

merged 18 commits into from
Aug 26, 2017

Conversation

athityakumar
Copy link
Member

No description provided.

daru-io.gemspec Outdated
| SciRuby Team |
*************************************************

EOF
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is how daru-io looks on installation. Added a straw to the booze to literally show "Export" from "daru". 😉

image

@athityakumar athityakumar force-pushed the documentation branch 2 times, most recently from e300e0e to aa983c8 Compare August 10, 2017 00:52
CONTRIBUTING.md Outdated

# Developing the gem

1. Install the extensions dependencies : Mongo, R and Redis. For any issue related to these, kindly refer to the steps mentioned in the `.travis.yml` file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As those dependencies are not always easy to install, maybe we can soften this requirement? Like, "if you want to run entire specs suite... But to work on some particular feature..." or something like that.

README.md Outdated

*Note - Each IO module has it's own set of dependencies. Have a look at the [Importers](#importers) and [Exporters](#exporters) section for dependency-specific information.*

# Importers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically, it is only one first-level heading in Markdown ;)
And, also: maybe provide some generic info on importers? Like "When included, each importer provide DataFrame.from_<format> method, which is blah-blah"

README.md Outdated
Daru::IO::Importers::ActiveRecord.new(args).call

#! Usage from Daru::DataFrame
Daru::DataFrame.from_activerecord(args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that two manners of usage could be placed only in importers preface above, here you can just show the most obvious way (the second one)

README.md Outdated
```

- Have a look at [this documentation](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/ActiveRecord) to know more about this Importer.
- Dependency : activerecord gem.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of repeating "Have a look...." do something short and formal, like this:

ActiveRecord

  • Dependencies: activerecord
  • Usage:
require 'daru/io/importers/active_record'
Daru::DataFrame.from_activerecord(connection) # and maybe some useful options example

WDYT?

README.md Outdated
```

- Have a look at [this documentation](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/Avro) to know more about this Importer.
- Dependency : avro gem.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And snappy?..

README.md Outdated
- Have a look at [this documentation](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/Redis) to know more about this Importer.
- Dependency : Standard library JSON gem, redis gem and a running `redis-server`.

### SQL Importer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is in fact "sqlite file importer" or something like that?..

README.md Outdated

## Contributing
Contributions are always welcome. Please have a look at the [contribution guidelines](CONTRIBUTING.md) first. :tada:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, that the section "creating your own importer/exporter" (as a separate lib/gem) is necessary. As well as (at the very beginning of README) a statement that it is not only a bunch of them, but also the "framework for easy creation of new importers/exporters".

daru-io.gemspec Outdated
| SciRuby Team |
*************************************************

EOF
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please-please-please don't do this!
As a commercial programmer, who does dozens of bundle install per day, I TOTALLY HATE this "nice stuff". It takes the screen place and instead of looking at list of some updated gems I should look at some irrelevant ASCII-art.
We'll remove post-install message from daru eventually.

This option (post_install_message) is for important stuff, like on ver. 2.0.1 it can say "If you are updating from 1.5, please do ...", not for fun!!!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, never saw this from this perspective. Acknowledged. 👍

@@ -3,6 +3,8 @@
module Daru
module IO
module Exporters
# RDS Exporter Class, that extends +to_rds+ method to +Daru::DataFrame+
# instance variables
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is wise to set YARD to use Markdown markup, and use it in comments.

@athityakumar
Copy link
Member Author

Made the proposed changes. Please have a look. 😄

@zverok @v0dro - Also, from a user's perspective, I feel that YARD docs should first show information about the @param and @return before @example. But I've not been able to find a way of reordering these sections. Any insights regarding this?

@zverok
Copy link
Collaborator

zverok commented Aug 12, 2017

But I've not been able to find a way of reordering these sections. Any insights regarding this?

I believe the only way is to make your own YARD templates, but I am not sure that rubydoc.info would respect them.

README.md Outdated

## Installation
While supporting various IO modules, daru-io also provides an easier way of adding more Importers / Exporters
by means of monkey-patching. **It's strongly recommended to have a look at [this section](#tweaking-io-modules), if you're interested in tweaking and creating custom Importers / Exporters.**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that mentioning of monkey-patching is appropriate here. In fact, we "patch" almost nothing. It is a framework for defining importers/exporters and (optionally) "plugging" them as DataFrame methods.

It is important, because "monkey-patching" is considered quite questionable practice in modern Ruby, and start the explanation with proud statement about monkey-patching is bad for gem's reputation.

README.md Outdated
Daru::IO::Importers::ActiveRecord.new(args).call

#! Usage from Daru::DataFrame
Daru::DataFrame.from_activerecord(args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use more realistic examples, with meaningful arg.names. The idea that for simplest usage it should be possible to just copy-paste the example from README and go with it.

README.md Outdated

- **Docs**: [rubydoc.info](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/ActiveRecord)
- **Gem Dependencies**: `activerecord` gem
- **Other Dependencies**: None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the meaning of it? If the idea is "you should not install anything else", it is a lie: to use ActiveRecord, you probably should have PostgreSQL or MySQL installed. But I believe that this line is not necessary (in form of **Other Dependencies**: None) at all.

README.md Outdated
Imports a **Daru::DataFrame** from a **.csv** or **.csv.gz** file.

- **Docs**: [rubydoc.info](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/CSV)
- **Gem Dependencies**: Standard library `csv`, `open-uri` and `zlib` gems
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove the line. It communicates no information at all, just says something about internals. User definitely already has standard library and absolutely doesn't need to know which part of it you plan to require.

README.md Outdated
Imports a **Daru::DataFrame** from a **.json** file / response.

- **Docs**: [rubydoc.info](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/JSON)
- **Gem Dependencies**: Standard library `json` gem and `jsonpath` gem
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only list jsonpath here.

README.md Outdated

- **Docs**: [rubydoc.info](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Importers/SQL)
- **Gem Dependencies**: `dbd-sqlite3`, `activerecord`, `dbi` and `sqlite3` gems
- **Other Dependencies**: SQL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What?..

README.md Outdated
df.to_sql(args)
```

# Tweaking IO modules
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe "Tweaking" is a good word here. Let's do a section named "Creating your own importers/exporters"

README.md Outdated
TODO: Write usage instructions here
Behaviour of existing IO modules can also be changed according to your needs, similar to the above example.
For example, if the CSV Importer has to be tweaked with a faster processing gem, simply replace the above
(psuedo)code with `class Daru::IO::Importers::CSV`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a good suggestion? I believe it is better to suggest something like

class MyNamespace::CustomCSV < Daru::IO::Importers::CSV
  Daru::DataFrame.register_io_module :custom_csv, self

WDYT?..


Please proceed with a Pull Request only after you're assigned. It'd be sad if your Pull Request (and your hardwork) isn't accepted just because it isn't idealogically compatible.

# Developing the gem
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good!

LICENSE.md Outdated
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) 2017 athityakumar
Copyright (c) 2017 [Athitya Kumar](https://github.com/athityakumar/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll be trasferring the repo to SciRuby post GSOC so make it Athithya and SciRuby (with links).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added SciRuby into the LICENSE. 😄

Copy link
Collaborator

@zverok zverok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. Sorry for pedantism.

README.md Outdated
Daru::IO::Importers::CSV.new('path/to/csv.gz/file', skiprows: 10, compression: :gzip).call

#! Usage from Daru::DataFrame
Daru::DataFrame.from_csv('path/to/csv.gz/file', skiprows: 10, compression: :gzip)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks a bit confuzing (like file csv.gz, and some sub-part of it named file), let's use path/to/file.csv.gz

README.md Outdated

#! Usage from Daru::DataFrame
require 'daru/io/importers/excel'
Daru::DataFrame.from_excel('path/to/xlsx/file', sheet: 2, skiprows: 10, skipcols: 2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and here and everywhere in likewise cases path/to/file.xlsx

README.md Outdated
Daru::IO::Importers::HTML.new('website/domain/url', match: 'market', name: 'Shares analysis').call

#! Usage from Daru::DataFrame
Daru::DataFrame.from_html('website/domain/url', match: 'market', name: 'Shares analysis')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say

# local
Daru::DataFrame.from_html('path/to/file.html', match: 'market', name: 'Shares analysis')
# or remote
Daru::DataFrame.from_html('https://some/url/with/tables', match: 'market', name: 'Shares analysis')

...this showcase is MUCH more useful than adding obvious Daru::IO::Importers::HTML.new().call to each importer, which I am against.

README.md Outdated

[(Go to Table of Contents)](#table-of-contents)

Imports a **Daru::DataFrame** from a **.dat** plaintext file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some link/explanation about expected format of the file?

README.md Outdated
Daru::IO::Importers::SQL.new('path/to/sqlite.db/file', 'SELECT * FROM test').call

#! Usage from Daru::DataFrame
Daru::DataFrame.from_sql('path/to/sqlite.db/file', 'SELECT * FROM test')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... or DBI connection ;)

README.md Outdated

[(Go to Table of Contents)](#table-of-contents)

Exports a **Daru::DataFrame** into an SQL table.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "To database table through DBI connection". And, don't we export to SQLite files?..

README.md Outdated

- **Docs**: [rubydoc.info](http://www.rubydoc.info/github/athityakumar/daru-io/master/Daru/IO/Exporters/SQL)
- **Gem Dependencies**: `dbd-sqlite3`, `dbi` and `sqlite3` gems
- **Other Dependencies**: SQL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "SQL dependency"?.. "You should have SQL installed"?

daru-io.gemspec Outdated
Daru-IO is a plugin-gem to Daru gem, which stands for Data Analysis in RUby. Daru-IO extends support for many Import and Export methods of Daru::DataFrame. This gem is intended to help Rubyists who are into Data Analysis or Web Development, by serving as a general purpose conversion library that takes input in one format (say, JSON) and converts it another format (say, Avro) while also making it incredibly easy to getting started on analyzing data with daru.

While supporting various IO modules, daru-io also provides an easier way of adding new Importers / Exporters.
MSG
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is over-complication. Why?

@@ -3,15 +3,41 @@

module Daru
module IO
# Base IO Class that contains generic helper methods, to be
# used by other **Importers::Base** and **Exporters::Base** via inheritence
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{Importers::Base} would be auto-linked.

class Avro < Base
Daru::DataFrame.register_io_module :to_avro, self

# Exports +Daru::DataFrame+ to an Avro file.
# Exports **Daru::DataFrame** to an Avro file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`Daru::DataFrame`

would be prettier, probably. And above too.

athityakumar added a commit that referenced this pull request Aug 25, 2017
The to_format method is pending, and is quite ambigous as in , returning which format instance would make more sense. Also, the YARD docs is yet to be updated as it's been updated for  markdown support in PR #50 and would be better to proceed after that PR is merged.
Copy link
Collaborator

@zverok zverok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge it! :allthethings:

@athityakumar athityakumar merged commit e8de525 into master Aug 26, 2017
@athityakumar athityakumar mentioned this pull request Aug 26, 2017
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants