Contributor

jreback commented Mar 16, 2017

on top of #15694

xref #2770

Here's an example of what we could do with this:

In [1]: df = pd.DataFrame({'value': [1, 2, 3, 4]}, index=pd.MultiIndex(
   ...:              levels=[['a', 'b'], ['bb', 'aa']],
   ...:              labels=[[0, 0, 1, 1], [0, 1, 0, 1]]))

In [2]: df
Out[2]: 
      value
a bb      1
  aa      2
b bb      3
  aa      4

In [14]: df.index.is_lexsorted()
Out[14]: True

In [15]: df.index.is_monotonic
Out[15]: False

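Sorting makes this monotonic and usually lexsorted (but not always). Here the sorted index is monotonic but no longer lexsorted, because the second level, ['bb', 'aa'], is itself stored unsorted: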

In [3]: df2 = df.sort_index()

In [4]: df2
Out[4]: 
      value
a aa      2
  bb      1
b aa      4
  bb      3

In [12]: df2.index.is_lexsorted()
Out[12]: False

In [13]: df2.index.is_monotonic
Out[13]: True
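
To make the distinction in the two sessions above concrete: lexsortedness is a property of the integer labels, while monotonicity is a property of the actual values they point to. The following is a minimal pure-Python sketch under that reading. The helper names `is_lexsorted` and `is_monotonic` are hypothetical stand-ins written for illustration, not the pandas implementation, and the label arrays for the sorted frame are inferred from the output shown above.

```python
# Hypothetical stand-ins for the two MultiIndex checks; plain Python, not pandas code.

def is_lexsorted(labels):
    """True if the per-row tuples of integer labels are in non-decreasing order."""
    rows = list(zip(*labels))
    return all(a <= b for a, b in zip(rows, rows[1:]))


def is_monotonic(levels, labels):
    """True if the per-row tuples of actual values are in non-decreasing order."""
    rows = [tuple(level[code] for level, code in zip(levels, row))
            for row in zip(*labels)]
    return all(a <= b for a, b in zip(rows, rows[1:]))


levels = [['a', 'b'], ['bb', 'aa']]   # the second level is stored unsorted

# Original frame: the labels ascend, but 'bb' comes before 'aa' in the rows.
orig_labels = [[0, 0, 1, 1], [0, 1, 0, 1]]
print(is_lexsorted(orig_labels), is_monotonic(levels, orig_labels))

# After sort_index(): the values ascend, so the second-level labels descend
# within each group (assumed labels, inferred from the displayed frame).
sorted_labels = [[0, 0, 1, 1], [1, 0, 1, 0]]
print(is_lexsorted(sorted_labels), is_monotonic(levels, sorted_labels))
```

With unsorted levels the two properties can disagree in either direction, which is why sorting alone does not guarantee a lexsorted index.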

We could expose a method .remove_unused_labels() (or even just do this under the hood on certain operations):

In [5]: df3 = df2.copy()

In [6]: df3.index._reconstruct(sort=True)
Out[6]: 
MultiIndex(levels=[['a', 'b'], ['aa', 'bb']],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]])

In [7]: df3.index = df3.index._reconstruct(sort=True)

In [8]: df3
Out[8]: 
      value
a aa      2
  bb      1
b aa      4
  bb      3

In [9]: df3.index.is_lexsorted()
Out[9]: True

In [11]: df3.index.is_monotonic
Out[11]: True
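
The reconstruction step can be sketched without pandas. This is only an illustration of the idea of rebuilding each level in sorted order and remapping the labels to match. The function name `reconstruct_sorted` is hypothetical, it is not the code behind `_reconstruct`, and it assumes there are no missing values (no -1 labels).

```python
def reconstruct_sorted(levels, labels):
    """Return (new_levels, new_labels) with each level sorted and labels remapped.

    Level values that no label refers to are dropped along the way.
    Assumes every label is a valid position in its level (no -1 for missing).
    """
    new_levels, new_labels = [], []
    for level, codes in zip(levels, labels):
        used = sorted({level[c] for c in codes})          # values actually in use
        position = {value: i for i, value in enumerate(used)}
        new_levels.append(used)
        new_labels.append([position[level[c]] for c in codes])
    return new_levels, new_labels


# Levels and labels of the sorted frame (labels inferred from its display).
levels = [['a', 'b'], ['bb', 'aa']]
labels = [[0, 0, 1, 1], [1, 0, 1, 0]]

new_levels, new_labels = reconstruct_sorted(levels, labels)
print(new_levels)   # [['a', 'b'], ['aa', 'bb']]
print(new_labels)   # [[0, 0, 1, 1], [0, 1, 0, 1]]
```

The row values are unchanged; only the internal representation is rewritten, so the labels ascend again whenever the values do.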


codecov bot commented Mar 16, 2017

Codecov Report

Merging #15700 into master will decrease coverage by <.01%.
The diff coverage is 91.66%.

@@            Coverage Diff             @@
##           master   #15700      +/-   ##
==========================================
- Coverage   91.01%   91.01%   -0.01%     
==========================================
  Files         143      143              
  Lines       49400    49448      +48     
==========================================
+ Hits        44963    45006      +43     
- Misses       4437     4442       +5

Impacted Files             Coverage Δ
pandas/core/groupby.py     95.48% <100%> (+0.51%) ⬆️
pandas/core/frame.py       97.86% <100%> (-0.1%) ⬇️
pandas/core/reshape.py     99.27% <100%> (-0.01%) ⬇️
pandas/core/sorting.py     97.81% <100%> (+0.03%) ⬆️
pandas/core/series.py      94.79% <85.71%> (-0.08%) ⬇️
pandas/indexes/multi.py    96.37% <90.69%> (-0.22%) ⬇️
pandas/io/gbq.py           25% <0%> (-58.34%) ⬇️
pandas/indexes/base.py     96.08% <0%> (-0.06%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 94720d9...f3ec8ac.

jreback force-pushed the unused branch 2 times, most recently from dbf1c94 to aa6190f, March 22, 2017 13:58

Contributor Author

jreback commented Mar 22, 2017

going to roll this into #15694

jreback closed this Mar 22, 2017