ENH: Implement MultiIndex.insert_level for inserting levels at specified positions #62610

Chiwendaiyue · 2025-10-07T13:03:07Z

What does this PR do?

Adds insert_level method to MultiIndex for inserting new levels at specified positions.

Motivation

This addresses the feature request in issue #62558 for a simple way to add levels to MultiIndex at given positions.

Key changes

Implement insert_level method in pandas/core/indexes/multi.py
Add comprehensive test suite in pandas/tests/indexes/multi/test_insert_level.py
Handle various edge cases and error conditions

Testing

All new tests pass
Existing functionality remains unchanged
Handles scalar values and array-like inputs
Proper error handling for invalid inputs

Example usage

idx = pd.MultiIndex.from_tuples([('A', 1), ('B', 2)])
result = idx.insert_level(1, 'new_level')

- Implement insert_level method for MultiIndex to insert new levels at specified positions
- Add comprehensive test cases for the new functionality
- Fix level names handling to match expected behavior

Resolves: MultiIndex level insertion feature request

Alvaro-Kothe

Thanks for the PR. Here are some suggestions against the test suite.

It seems that you committed fastparquet and pyarrow accidentally.

Also, can you add an entry to whatsnew?

pandas/core/indexes/multi.py

Alvaro-Kothe · 2025-10-12T13:20:21Z

pandas/tests/indexes/multi/test_insert_level.py

+    def setup_method(self):
+        self.simple_idx = pd.MultiIndex.from_tuples(
+            [("A", 1), ("B", 2), ("C", 3)], names=["level1", "level2"]
+        )
+        self.empty_idx = pd.MultiIndex.from_tuples([], names=["level1", "level2"])


NIT: I would prefer that you define this in the test body that should use it.

Alvaro-Kothe · 2025-10-12T13:21:52Z

pandas/tests/indexes/multi/test_insert_level.py

+        )
+        self.empty_idx = pd.MultiIndex.from_tuples([], names=["level1", "level2"])
+
+    def test_insert_level_basic(self):


Can you parametrize this test?

Alvaro-Kothe · 2025-10-12T13:23:09Z

pandas/tests/indexes/multi/test_insert_level.py

+        result = self.simple_idx.insert_level(0, "new_val", name="new_level")
+        assert result.names[0] == "new_level"
+
+    def test_insert_level_edge_positions(self):


This can go into the first test.

Alvaro-Kothe · 2025-10-12T13:23:58Z

pandas/tests/indexes/multi/test_insert_level.py

+        result_end = self.simple_idx.insert_level(2, "end")
+        assert result_end.nlevels == 3
+
+    def test_insert_level_error_cases(self):


Can you parametrize this?

Alvaro-Kothe · 2025-10-12T13:25:36Z

pandas/tests/indexes/multi/test_insert_level.py

+        with pytest.raises(ValueError, match="Length of values must match"):
+            self.simple_idx.insert_level(1, ["too", "few"])
+
+    def test_insert_level_with_different_data_types(self):


These tests could go into the first test too.

Alvaro-Kothe · 2025-10-12T13:27:08Z

pandas/tests/indexes/multi/test_insert_level.py

+        def test_debug_names():
+            idx = pd.MultiIndex.from_tuples(
+                [("A", 1), ("B", 2), ("C", 3)], names=["level1", "level2"]
+            )
+            print("Original names:", idx.names)
+
+            result = idx.insert_level(0, "new_value")
+            print("Result names:", result.names)
+
+            expected = pd.MultiIndex.from_tuples(
+                [("new_value", "A", 1), ("new_value", "B", 2), ("new_value", "C", 3)],
+                names=[None, "level1", "level2"],
+            )
+            print("Expected names:", expected.names)


Why does this exist?

That was a leftover debug function from earlier development. I've deleted it from the test file.

Alvaro-Kothe · 2025-10-12T13:37:22Z

pandas/tests/indexes/multi/test_insert_level.py

+
+        tm.assert_index_equal(original, self.simple_idx)
+
+        assert result.nlevels == original.nlevels + 1


This assertion feels redundant against the first test.

Alvaro-Kothe · 2025-10-12T13:39:40Z

pandas/tests/indexes/multi/test_insert_level.py

+    def test_insert_level_with_name(self):
+        result = self.simple_idx.insert_level(0, "new_val", name="new_level")
+        assert result.names[0] == "new_level"


You could also test adding names in the first test and remove this test.

Thanks for your patience! I've made all the requested improvements, such as moving the test data into the individual test methods. All tests continue to pass, and the implementation is ready for review.

Co-authored-by: Álvaro Kothe <[email protected]>

Alvaro-Kothe · 2025-10-14T17:03:17Z

pandas/tests/indexes/multi/test_insert_level.py

+            (2, "end"),
+        ],
+    )
+    def test_insert_level_edge_positions(self, position, value):


This test would also be better in the first test. Asserting the whole index is better than asserting its size.

Alvaro-Kothe · 2025-10-14T17:06:37Z

pandas/tests/indexes/multi/test_insert_level.py

+        "value",
+        [100, 1.5, None],
+    )
+    def test_insert_level_with_different_data_types(self, value):


Same here, you are only asserting its size. Move it to the first test to verify with tm.assert_index_equal(result, expected)
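To illustrate the reviewer's point, here is a minimal sketch of why a full comparison is stricter than a size check. It uses the public `pandas.testing.assert_index_equal` rather than the internal `tm` alias from the test file, and the helper name `raises_assertion` and the two sample indexes are made up for this example.

```python
import pandas as pd
import pandas.testing as pdt

# Expected result of inserting a constant level at position 1.
expected = pd.MultiIndex.from_tuples(
    [("A", "new", 1), ("B", "new", 2)], names=["x", None, "y"]
)
# A wrong result: the new level ended up at the end instead.
wrong = pd.MultiIndex.from_tuples(
    [("A", 1, "new"), ("B", 2, "new")], names=["x", "y", None]
)


def raises_assertion(left, right):
    """Return True if assert_index_equal rejects the pair (hypothetical helper)."""
    try:
        pdt.assert_index_equal(left, right)
    except AssertionError:
        return True
    return False


# The size-only check cannot tell the two apart...
same_size = wrong.nlevels == expected.nlevels and len(wrong) == len(expected)
# ...while the full comparison does.
full_check_fails = raises_assertion(wrong, expected)
```

A test that only asserts `nlevels` would accept `wrong`, which is why the reviewer asks for the whole index to be compared.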

I've consolidated all appropriate tests into the main parametrized test to provide comprehensive validation with tm.assert_index_equal. If there are any other problems, I will fix them immediately. Thank you!

Alvaro-Kothe · 2025-10-15T15:29:32Z

pandas/tests/frame/test_query_eval.py

The changes in this file seem unrelated.

Sorry, these might be remnants of resolving a merge conflict. I have made the file the same as on the main branch.

Alvaro-Kothe · 2025-10-15T15:30:49Z

fastparquet

I think you committed this by accident

Alvaro-Kothe · 2025-10-15T15:30:55Z

pyarrow

I think you committed this by accident

git rm pyarrow

Alvaro-Kothe · 2025-10-15T15:31:52Z

doc/source/whatsnew/v3.0.0.rst

-
+

You don't need to remove the -

Sorry, I fixed it

Alvaro-Kothe · 2025-10-19T12:47:56Z

pandas/core/indexes/multi.py

+                new_tuple = list(tup)
+                new_tuple.insert(position, value[i])
+                new_tuples.append(tuple(new_tuple))


Is it really necessary, this conversion of tuple -> list -> tuple?

Alvaro-Kothe · 2025-10-19T12:58:46Z

pandas/core/indexes/multi.py

+                new_tuple = list(tup)
+                new_tuple.insert(position, value[i])
+                new_tuples.append(tuple(new_tuple))
+            else:


Why do you need this branch? Do any of your test cases fall here?

Alvaro-Kothe · 2025-10-19T13:14:13Z

pandas/core/indexes/multi.py

+        if self.names is not None:
+            new_names = list(self.names)
+        else:
+            new_names = [None] * self.nlevels
+
+        new_names.insert(position, name)


Suggested change

-        if self.names is not None:
-            new_names = list(self.names)
-        else:
-            new_names = [None] * self.nlevels
-
-        new_names.insert(position, name)
+        if self.names is not None:
+            new_names = self.names[:position] + [name] + self.names[position + 1:]
+        else:
+            new_names = [None] * (position) + [name] + [None] * (self.nlevel - position)

Is there a case where self.names is None?

I looked into the constructors. If the user has not supplied names, result._names = [None] * len(levels) makes it a list. Can we take it that self.names would never be None? Maybe this could be new_names = self.names[:position] + [name] + self.names[position:]

Can we take it that self.names would never be None?

The MultiIndex constructor contains these lines

pandas/pandas/core/indexes/multi.py

Lines 329 to 332 in a329dc3

        result._names = [None] * len(levels)
        if names is not None:
            # handles name validation
            result._set_names(names)

That indicates it would never be None. So I think it's safe to remove the branching.
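The two slicing variants discussed above differ in one index, and the difference is easy to miss. A small sketch with plain Python lists standing in for `self.names` (the sample names are invented for this example):

```python
names = ["level1", "level2", "level3"]
position, name = 1, "new"

# Variant from the first suggestion: skips the element at `position`,
# so "level2" is silently dropped.
dropped = names[:position] + [name] + names[position + 1:]

# Variant proposed in the reply: keeps every existing name.
kept = names[:position] + [name] + names[position:]

print(dropped)  # ['level1', 'new', 'level3']
print(kept)     # ['level1', 'new', 'level2', 'level3']
```

Only the second variant grows the list by exactly one entry, which is what inserting a level requires.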

Alvaro-Kothe · 2025-10-19T13:15:33Z

pyarrow

git rm pyarrow

Alvaro-Kothe · 2025-10-19T16:58:07Z

pandas/core/indexes/multi.py

+            if position == 0:
+                new_tuple = (value[i],) + tup
+            elif position == len(tup):
+                new_tuple = tup + (value[i],)
+            else:


You don't need this if-else conditions. The slicing in the else branch handles the edges.
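A quick sketch of why the special cases are unnecessary, using a standalone function (`insert_into_tuple` is a name chosen for this example, not part of the PR):

```python
def insert_into_tuple(tup, position, value):
    """Return a new tuple with `value` placed at `position`."""
    # tup[:0] and tup[len(tup):] are both the empty tuple, so inserting
    # at the front or at the end needs no dedicated branch.
    return tup[:position] + (value,) + tup[position:]


row = ("A", 1)
print(insert_into_tuple(row, 0, "new"))  # ('new', 'A', 1)
print(insert_into_tuple(row, 1, "new"))  # ('A', 'new', 1)
print(insert_into_tuple(row, 2, "new"))  # ('A', 1, 'new')
```

The same expression also avoids the tuple -> list -> tuple round trip questioned earlier in the review.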

Alvaro-Kothe · 2025-10-19T16:59:42Z

pandas/core/indexes/multi.py

-        >>> idx.insert_level(1, ["L1", "L2"], name="z")
-        MultiIndex([('A', 'L1', 1), ('B', 'L2', 2)],
-                   names=['x', 'z', 'y'])
-        """


I think you removed the documentation accidentally.

Alvaro-Kothe

LGTM

Chiwendaiyue · 2025-10-21T10:56:32Z

@Alvaro-Kothe Just checking if there's anything else needed before this can be merged? All checks are passing and the feature is ready. Thanks!

Alvaro-Kothe · 2025-10-21T11:07:33Z

It needs the approval of a core member. Just wait.

mroeschke · 2025-10-21T17:23:57Z

doc/source/whatsnew/v3.0.0.rst

 - :meth:`pandas.concat` will raise a ``ValueError`` when ``ignore_index=True`` and ``keys`` is not ``None`` (:issue:`59274`)
 - :py:class:`frozenset` elements in pandas objects are now natively printed (:issue:`60690`)
 - Add ``"delete_rows"`` option to ``if_exists`` argument in :meth:`DataFrame.to_sql` deleting all records of the table before inserting data (:issue:`37210`).
+- Added :meth:`MultiIndex.insert_level` to insert new levels at specified positions in a MultiIndex (:issue:`62558`)


Can you put insert_level in the API reference?

mroeschke · 2025-10-21T17:24:29Z

pandas/core/indexes/multi.py

+            Value(s) to use for the new level. If scalar, broadcast to all items.
+            If array-like, length must match the length of the index.


Can you make this accept only an array-like?

mroeschke · 2025-10-21T17:26:08Z

pandas/core/indexes/multi.py

        result = self._reorder_ilevels(order)
        return result

+    def insert_level(self, position: int, value, name=None):


Suggested change

-    def insert_level(self, position: int, value, name=None):
+    def insert_level(self, position: int, value, name: Hashable = lib.no_default) -> MultiIndex:
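The point of a sentinel default such as `lib.no_default` is that `None` is itself a valid level name, so it cannot double as "argument not supplied". A minimal sketch of the pattern in plain Python; `_NO_DEFAULT` and `describe_name` are hypothetical stand-ins, and nothing here claims how `insert_level` resolves the sentinel:

```python
# Hypothetical stand-in for pandas' lib.no_default sentinel.
_NO_DEFAULT = object()


def describe_name(name=_NO_DEFAULT):
    """Report whether the caller supplied a name, even if that name is None."""
    if name is _NO_DEFAULT:
        return "not supplied"
    return f"supplied: {name!r}"


print(describe_name())        # not supplied
print(describe_name(None))    # supplied: None
print(describe_name("z"))     # supplied: 'z'
```

With `name=None` as the default, the first two calls would be indistinguishable.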

mroeschke · 2025-10-21T17:29:07Z

pandas/core/indexes/multi.py

+        for i, tup in enumerate(self):
+            new_tuple = tup[:position] + (value[i],) + tup[position:]
+            new_tuples.append(new_tuple)
+
+        new_names = self.names[:position] + [name] + self.names[position:]
+
+        return MultiIndex.from_tuples(new_tuples, names=new_names)


Instead of using from_tuples (which will be slow), can you calculate the level and codes of the new values (you can find examples of this in this file) and interpose that with the existing self.levels and self.codes?

Hi @mroeschke
I'm working on this and have a question about the optimal approach.
I started with a levels/codes based implementation:

    def insert_level(self, position: int, value, name: Hashable = lib.no_default) -> MultiIndex:
        # ...
        new_level = Index(value)
        new_codes_for_level = new_level.get_indexer(value)
        new_levels = self.levels[:position] + [new_level] + self.levels[position:]
        new_codes = self.codes[:position] + [new_codes_for_level] + self.codes[position:]
        new_names = self.names[:position] + [name] + self.names[position:]
        return MultiIndex(levels=new_levels, codes=new_codes, names=new_names, verify_integrity=False)

However, I'm encountering issues with None value handling where new_level.get_indexer(value) fails when the new level contains duplicate values (like [None, None, None]). Should I:

- Continue with the levels/codes optimization and implement custom codes mapping to handle duplicates, similar to how from_arrays handles it internally?
- Or just use the simpler from_tuples approach that builds new tuples and delegates to the well-tested MultiIndex.from_tuples method?

The from_tuples approach would be more reliable for edge cases but potentially less performant for large indices.
Thanks for your guidance!
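For readers following the question, one possible way to get codes for values that contain duplicates is to factorize them, so the level holds only unique entries. The sketch below is an illustration under that assumption, not the implementation that landed in the PR; `insert_level_sketch` is a hypothetical free function, it performs no validation of `position`, and missing values such as `None` are not exercised here.

```python
import pandas as pd


def insert_level_sketch(mi, position, values, name=None):
    """Hypothetical helper: insert a level by interposing levels and codes."""
    if len(values) != len(mi):
        raise ValueError("Length of values must match the length of the index")

    # factorize() maps each value to an integer code and returns the unique
    # values, so duplicates in `values` do not end up in the new level.
    new_codes, new_level = pd.Index(values).factorize()

    levels = list(mi.levels)
    codes = list(mi.codes)
    names = list(mi.names)

    # list.insert places the new entry before the element currently at `position`.
    levels.insert(position, new_level)
    codes.insert(position, new_codes)
    names.insert(position, name)

    return pd.MultiIndex(levels=levels, codes=codes, names=names)


idx = pd.MultiIndex.from_tuples([("A", 1), ("B", 2), ("C", 3)], names=["x", "y"])
result = insert_level_sketch(idx, 1, ["g", "h", "g"], name="z")
```

Because the existing `levels` and `codes` are reused as they are, only the new values need to be processed, which is the performance benefit the reviewer is after compared with rebuilding every tuple.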

mroeschke · 2025-10-21T17:30:12Z

pandas/tests/indexes/multi/test_constructors.py

+def test_insert_level_integration():
+    idx = MultiIndex.from_tuples([("A", 1), ("B", 2)])
+
+    df = pd.DataFrame({"data": [10, 20]}, index=idx)
+    new_idx = idx.insert_level(0, "group1")
+    df_new = df.set_index(new_idx)
+
+    assert df_new.index.nlevels == 3
+    assert len(df_new) == 2


Could you remove this test? I don't think it's necessarily testing anything new

mroeschke · 2025-10-21T17:30:26Z

pandas/tests/indexes/multi/test_insert_level.py

+import pandas._testing as tm
+
+
+class TestMultiIndexInsertLevel:


Could you remove this class and make all the tests free functions?

cloudboat added 2 commits October 7, 2025 20:43

Chiwendaiyue marked this pull request as draft October 7, 2025 13:05

Chiwendaiyue marked this pull request as ready for review October 7, 2025 13:18

Chiwendaiyue changed the title ~~multiindex-insert-level~~ ENH: Implement MultiIndex.insert_level for inserting levels at specified positions Oct 7, 2025

Chiwendaiyue marked this pull request as draft October 7, 2025 13:29

cloudboat added 4 commits October 7, 2025 21:57

ENH: Add insert_level method to MultiIndex with formatting fixes

5f0caf0

STYLE: Format code with ruff

97a98e5

FIX: Remove undefined pd reference

1a9ddc5

Chiwendaiyue marked this pull request as ready for review October 7, 2025 15:49

Chiwendaiyue added 2 commits October 8, 2025 17:59

Merge branch 'main' into shiny-new-feature

2199e6e

Merge branch 'main' into shiny-new-feature

44985ad

Alvaro-Kothe reviewed Oct 12, 2025

Chiwendaiyue and others added 3 commits October 12, 2025 22:54

Update pandas/core/indexes/multi.py

9e8676d

Co-authored-by: Álvaro Kothe <[email protected]>

Merge branch 'main' into shiny-new-feature

094958d

DOC: Add whatsnew entry for MultiIndex.insert_level

77f3af8

Alvaro-Kothe reviewed Oct 14, 2025

cloudboat and others added 2 commits October 15, 2025 10:58

TEST: Comprehensive consolidation of all test cases into parametrized test

7bf3067

Merge branch 'main' into shiny-new-feature

00a346f

Alvaro-Kothe reviewed Oct 15, 2025

cloudboat and others added 3 commits October 16, 2025 00:15

FIX: Remove accidental binary file and update whatsnew

c4ecf7a

FIX: Revert accidental changes to test_query_eval.py

8e0068a

Merge branch 'main' into shiny-new-feature

87bd44b

Alvaro-Kothe reviewed Oct 19, 2025

PERF: Optimize MultiIndex.insert_level to avoid unnecessary type conversions & REF: Remove unnecessary else branch in MultiIndex.insert_level & REF: Simplify names handling in MultiIndex.insert_level

79e04b6

Alvaro-Kothe reviewed Oct 19, 2025

REF: Simplify tuple construction in MultiIndex.insert_level

1447886

Alvaro-Kothe approved these changes Oct 19, 2025

Chiwendaiyue added 2 commits October 20, 2025 12:35

Merge branch 'main' into shiny-new-feature

e2917d0

Merge branch 'main' into shiny-new-feature

471c2d6

mroeschke reviewed Oct 21, 2025

mroeschke requested changes Oct 21, 2025

mroeschke added the MultiIndex label Oct 21, 2025

Chiwendaiyue and others added 3 commits October 22, 2025 19:23

Merge branch 'main' into shiny-new-feature

e2334ac

Add API reference documentation & Implement insert_level using levels/codes operations

eb320fb

all changes without levels/codes operations

9f80ad1

