Inconsistencies in behaviour of DataFrame. #515

Open
4 tasks
weqopy opened this issue Jul 5, 2019 · 2 comments

Comments

@weqopy
Contributor

weqopy commented Jul 5, 2019

EDIT (@v0dro):
The following is a list of methods that should be implemented or corrected for more consistent behaviour:

  • Vector#last.
  • DataFrame#last.
  • Return type of DataFrame#[] must be consistent when using a timeseries. It currently returns either a numerical value or another Vector or DataFrame, depending on what you pass into #[].
  • Return nil when an element is not present in the DataFrame (currently raises an error).

Ideally these should be split into separate issues and tackled one at a time.
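
As a rough illustration of the first two items, here is a minimal sketch of a `last` method that mirrors `first`. `MiniVector` is a hypothetical stand-in written for this example, not Daru::Vector, and the sketch assumes the asymmetry arises because Ruby's Enumerable supplies `first` but not `last` (this has not been checked against the daru source).

```ruby
# Hypothetical stand-in for a vector; this is NOT Daru::Vector.
class MiniVector
  include Enumerable

  def initialize(values)
    @values = values.dup
  end

  # Enumerable builds #first (and many other methods) on top of #each.
  def each(&block)
    return enum_for(:each) unless block

    @values.each(&block)
    self
  end

  # Enumerable has no #last, so it must be defined explicitly.
  # Mirrors Array#last: no argument gives one element, n gives an Array.
  def last(n = nil)
    n.nil? ? @values.last : @values.last(n)
  end
end

v = MiniVector.new([1.00000001, 0.9999, 0.8397, 0.8788])
v.first    # => 1.00000001
v.last     # => 0.8788
v.last(2)  # => [0.8397, 0.8788]
```

A real implementation would presumably return a Vector rather than an Array for `last(n)`, so that the result keeps its index.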


I'd like to use this data to show the situations that confused me:

[25] pry(main)> dates=["2018-03-30", "2018-04-02", "2018-04-27", "2018-05-31", "2018-06-29", "2018-07-31", "2018-08-31", "2018-09-28", "2018-10-31", "2018-11-30"]
=> ["2018-03-30",
 "2018-04-02",
 "2018-04-27",
 "2018-05-31",
 "2018-06-29",
 "2018-07-31",
 "2018-08-31",
 "2018-09-28",
 "2018-10-31",
 "2018-11-30"]
[26] pry(main)> val=[1.00000001, 0.9999, 0.9908, 1.0885, 1.0586, 1.0374, 0.9456, 0.9638, 0.8397, 0.8788]
=> [1.00000001, 0.9999, 0.9908, 1.0885, 1.0586, 1.0374, 0.9456, 0.9638, 0.8397, 0.8788]
[27] pry(main)> id=Daru::DateTimeIndex.new(dates)
=> #<Daru::DateTimeIndex(10) 2018-03-30T00:00:00+00:00...2018-11-30T00:00:00+00:00>
[28] pry(main)> df = Daru::DataFrame.new({val: val}, index: id)
=> #<Daru::DataFrame(10x1)>
                   val
 2018-03-30 1.00000001
 2018-04-02     0.9999
 2018-04-27     0.9908
 2018-05-31     1.0885
 2018-06-29     1.0586
 2018-07-31     1.0374
 2018-08-31     0.9456
 2018-09-28     0.9638
 2018-10-31     0.8397
 2018-11-30     0.8788
  • first & last
[29] pry(main)> df.val.first
=> 1.00000001
[30] pry(main)> df.val.last 
NoMethodError: undefined method `last' for #<Daru::Vector:0x00007f43dbc591f0>
from /usr/local/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/vector.rb:1420:in `method_missing'
# I expected this to return 0.8788
  • The return type
[31] pry(main)> df.val['2018-03-30','2018-04-30']
=> #<Daru::Vector(3)>
                                       val
 2018-03-30T00:00:00+           1.00000001
 2018-04-02T00:00:00+               0.9999
 2018-04-27T00:00:00+               0.9908
[32] pry(main)> df.val['2018-04']
=> #<Daru::Vector(2)>
                                       val
 2018-04-02T00:00:00+               0.9999
 2018-04-27T00:00:00+               0.9908
[33] pry(main)> df.val['2018-03-30','2018-04-01']
=> 1.00000001
[34] pry(main)> df.val['2018-03']
=> 1.00000001
# I expected [33] and [34] to both return:
# => #<Daru::Vector(1)>
#                                        val
#  2018-03-30T00:00:00+           1.00000001
  • errors and a non-error
[48] pry(main)> df.val['2018']
=> #<Daru::Vector(10)>
                                       val
 2018-03-30T00:00:00+           1.00000001
 2018-04-02T00:00:00+               0.9999
 2018-04-27T00:00:00+               0.9908
 2018-05-31T00:00:00+               1.0885
 2018-06-29T00:00:00+               1.0586
 2018-07-31T00:00:00+               1.0374
 2018-08-31T00:00:00+               0.9456
 2018-09-28T00:00:00+               0.9638
 2018-10-31T00:00:00+               0.8397
 2018-11-30T00:00:00+               0.8788
[49] pry(main)> df.val['2017']
ArgumentError: Key 2017 is out of bounds
from /usr/local/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/date_time/index.rb:362:in `[]'
[50] pry(main)> df.val['2019']
ArgumentError: Key 2019 is out of bounds
from /usr/local/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/date_time/index.rb:362:in `[]'
[52] pry(main)> df.val['2018-12']
ArgumentError: bad value for range
from /usr/local/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/date_time/index.rb:547:in `slice_between_dates'
[53] pry(main)> df.val['2018-02']
=> #<Daru::Vector(10)>
                                       val
 2018-03-30T00:00:00+           1.00000001
 2018-04-02T00:00:00+               0.9999
 2018-04-27T00:00:00+               0.9908
 2018-05-31T00:00:00+               1.0885
 2018-06-29T00:00:00+               1.0586
 2018-07-31T00:00:00+               1.0374
 2018-08-31T00:00:00+               0.9456
 2018-09-28T00:00:00+               0.9638
 2018-10-31T00:00:00+               0.8397
 2018-11-30T00:00:00+               0.8788
# I expected all of those errors, and [53], to return #<Daru::Vector(0)>
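
The behaviour expected above can be sketched in plain Ruby. This is only an illustration under stated assumptions: a Hash keyed by `Date` stands in for a vector with a DateTimeIndex, and `slice_by_prefix` and `slice_between` are hypothetical helpers, not daru methods. Both always return a collection, which is empty when nothing matches, so the result type never depends on how many entries were found.

```ruby
require 'date'

# Hypothetical helper, not part of daru: select entries whose ISO date
# starts with the given prefix ("2018", "2018-04", "2018-03-30").
# Always returns a Hash, which is empty when nothing matches.
def slice_by_prefix(series, prefix)
  series.select { |date, _value| date.iso8601.start_with?(prefix) }
end

# Hypothetical helper: select entries between two ISO dates, inclusive.
# Also always returns a Hash, even for zero or one match.
def slice_between(series, from, to)
  range = Date.iso8601(from)..Date.iso8601(to)
  series.select { |date, _value| range.cover?(date) }
end

series = {
  Date.new(2018, 3, 30) => 1.00000001,
  Date.new(2018, 4, 2)  => 0.9999,
  Date.new(2018, 4, 27) => 0.9908
}

slice_by_prefix(series, '2018-04').size                 # => 2
slice_by_prefix(series, '2018-03').size                 # => 1 (still a Hash)
slice_by_prefix(series, '2017')                         # => {}
slice_between(series, '2018-03-30', '2018-04-01').size  # => 1
```

Note that the edited list at the top of the issue proposes returning nil for a missing element instead; the sketch follows the empty-collection expectation stated in this comment.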
@kojix2
Member

kojix2 commented Aug 23, 2019

I think the gods left room for you to contribute. Sorry, I'm just kidding.

@v0dro
Member

v0dro commented May 30, 2020

I'm editing the issue comment to make an itemized list of issue items that can be tackled by a group of volunteers.

@v0dro changed the title from "Why Daru has these inconsistent situations?" to "Inconsistencies in behaviour of DataFrame." on May 30, 2020