unstack in master expects sorted groups #1

Open
wants to merge 23 commits into base: enhance_join

Commits on Nov 13, 2015

  1. c1540a5
  2. fix typo

    alyst committed Nov 13, 2015
    d0e3081
  3. 49f28b0
  4. f4507c8
  5. test empty frames joins

    alyst committed Nov 13, 2015
    35f007f
  6. f7c9b61
  7. test empty frames groupby()

    alyst committed Nov 13, 2015
    66fc915
  8. 3d0b312
  9. nonunique(): use concrete types

    alyst committed Nov 13, 2015
    7d2219e
  10. simplify Sort.lt()

    use colordering(DFPerm) and getindex(DFPerm) to squeeze multiple lt()
    methods into one
    alyst committed Nov 13, 2015
    0bcda39
  11. 39af873
  12. more stable join()

    - don't encode the indexing columns, use DataFrameRow hashes instead
    - do only the parts of left-right rows matching that are required for a
      particular join kind
    - avoid vcat() that is very slow for PooledDataVector
    - now join respects left-frame order for all join kinds, so the
      tests/data.jl tests were updated
    alyst committed Nov 13, 2015
    bfc4d24
  13. 1ea31c0
  14. 799eb35
  15. DFRowIterator: cache nrow

    alyst committed Nov 13, 2015
    9a05eaa
  16. 7148b97
  17. groupby(): use _group_rows()

    sorting order is changed from NA first to NA last (it matches
    the default data frame sorting)
    alyst committed Nov 13, 2015
    618dc09
  18. hash() and isequal() that require DF and row ix

    so that DataFrameRow object doesn't need to be created
    alyst committed Nov 13, 2015
    9b4d4fc
  19. _RowGroupDict: reimplement with dedicated hashing

    instead of using Dict{DataFrameRow,Int}, implement its own that
     - doesn't require DataFrameRow objects
     - calculates hashes in memory-efficient manner
     - keeps row hashes for efficient comparison
    
    use it for join(), groupby(), nonunique()
    
    disable DataFrameRowTests that use _RowGroupDict methods no longer
    available
    alyst committed Nov 13, 2015
    573931c
  20. delete nonuniquekey() method

    it's not used and it is no longer faster than nonunique()
    alyst committed Nov 13, 2015
    7c263e7
  21. c4972fe
  22. make sorting of row groups optional

    by default no sorting is applied; this preserves the original ordering
    (groups appear in the order of their first rows) and makes things faster
    alyst committed Nov 13, 2015
    50f3154
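
The row-grouping idea described in commits 19 and 22 above (one hash lookup per row, groups kept in first-appearance order, sorting only on request, NA last as in commit 17) can be sketched roughly as follows. This is an illustrative Python sketch, not the Julia code from this pull request: `group_rows`, its `sort` flag, and the use of `None` for NA are hypothetical names and conventions chosen for the example. It builds a key tuple per row for simplicity, which is exactly the kind of per-row object the real `_RowGroupDict` is said to avoid.

```python
def _missing_last(key):
    # Sort helper: None (standing in for NA) orders after every real value.
    return tuple((v is None, 0 if v is None else v) for v in key)


def group_rows(columns, sort=False):
    """Assign each row to a group based on its key columns.

    columns: list of equal-length key columns (hypothetical input layout).
    Returns (group_keys, group_of_row).
    """
    nrows = len(columns[0]) if columns else 0
    index = {}          # key tuple -> group id
    keys = []           # group id -> key tuple, in first-appearance order
    group_of_row = []   # row number -> group id
    for i in range(nrows):
        key = tuple(col[i] for col in columns)
        gid = index.get(key)
        if gid is None:
            gid = len(keys)
            index[key] = gid
            keys.append(key)
        group_of_row.append(gid)
    if sort:
        # Optional step: renumber the groups in sorted key order.
        order = sorted(range(len(keys)), key=lambda g: _missing_last(keys[g]))
        rank = {old: new for new, old in enumerate(order)}
        keys = [keys[g] for g in order]
        group_of_row = [rank[g] for g in group_of_row]
    return keys, group_of_row
```

With the default `sort=False`, the groups come out in the order their first rows appear, which is the behaviour commit 22 describes as the new default.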

Commits on Nov 23, 2015

  1. unstack in master expects sorted groups

    Johan Gustafsson committed Nov 23, 2015
    3cdb6f0
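
The title of this last commit suggests that unstack relied on groups arriving in sorted order, which no longer holds once sorting is optional. One way to make a pivot independent of group order is to map each row key and each variable to an explicit output position. The sketch below is a Python illustration under that assumption only; `unstack_unsorted` and the `(id, variable, value)` input layout are hypothetical, and it is not the fix contained in commit 3cdb6f0.

```python
def unstack_unsorted(rows):
    """Pivot (id, variable, value) triples into a wide table.

    Makes no assumption about the order of ids or variables: each one is
    assigned an output position the first time it is seen.
    """
    row_pos = {}   # id -> output row index
    col_pos = {}   # variable -> output column index
    cells = {}     # (row index, column index) -> value
    for rid, var, val in rows:
        r = row_pos.setdefault(rid, len(row_pos))
        c = col_pos.setdefault(var, len(col_pos))
        cells[(r, c)] = val
    # Cells that never appear in the input stay None (standing in for NA).
    table = [[None] * len(col_pos) for _ in row_pos]
    for (r, c), val in cells.items():
        table[r][c] = val
    return list(row_pos), list(col_pos), table
```

Because positions come from the lookup tables rather than from the input order, shuffling the input rows changes only the order of the output rows and columns, never which value lands in which cell.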