@@ -5454,64 +5454,69 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
54545454 limit=limit, downcast=downcast)
54555455
54565456 _shared_docs['replace'] = ("""
5457- Replace values given in 'to_replace' with 'value'.
5457+ Replace values given in `to_replace` with `value`.
5458+
5459+ Values of the %(klass)s are replaced with other values dynamically.
5460+ This differs from updating with ``.loc`` or ``.iloc``, which require
5461+ you to specify a location to update with some value.
54585462
54595463 Parameters
54605464 ----------
5461- to_replace : str, regex, list, dict, Series, numeric, or None
5465+ to_replace : str, regex, list, dict, Series, int, float, or None
5466+ How to find the values that will be replaced.
54625467
54635468 * numeric, str or regex:
54645469
5465- - numeric: numeric values equal to ``to_replace`` will be
5466- replaced with ``value``
5467- - str: string exactly matching ``to_replace`` will be replaced
5468- with ``value``
5469- - regex: regexs matching ``to_replace`` will be replaced with
5470- ``value``
5470+ - numeric: numeric values equal to `to_replace` will be
5471+ replaced with `value`
5472+ - str: string exactly matching `to_replace` will be replaced
5473+ with `value`
5474+ - regex: regexes matching `to_replace` will be replaced with
5475+ `value`
54715476
54725477 * list of str, regex, or numeric:
54735478
5474- - First, if ``to_replace`` and ``value`` are both lists, they
5479+ - First, if `to_replace` and `value` are both lists, they
54755480 **must** be the same length.
54765481 - Second, if ``regex=True`` then all of the strings in **both**
54775482 lists will be interpreted as regexs otherwise they will match
5478- directly. This doesn't matter much for ``value`` since there
5483+ directly. This doesn't matter much for `value` since there
54795484 are only a few possible substitution regexes you can use.
54805485 - str, regex and numeric rules apply as above.
54815486
54825487 * dict:
54835488
54845489 - Dicts can be used to specify different replacement values
54855490 for different existing values. For example,
5486- {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and
5487- 'y' with 'z'. To use a dict in this way the ``value``
5488- parameter should be ``None``.
5491+ ``{'a': 'b', 'y': 'z'}`` replaces the value 'a' with 'b' and
5492+ 'y' with 'z'. To use a dict in this way the `value`
5493+ parameter should be `None`.
54895494 - For a DataFrame a dict can specify that different values
54905495 should be replaced in different columns. For example,
5491- {'a': 1, 'b': 'z'} looks for the value 1 in column 'a' and
5492- the value 'z' in column 'b' and replaces these values with
5493- whatever is specified in ``value``. The ``value`` parameter
5496+ ``{'a': 1, 'b': 'z'}`` looks for the value 1 in column 'a'
5497+ and the value 'z' in column 'b' and replaces these values
5498+ with whatever is specified in `value`. The `value` parameter
54945499 should not be ``None`` in this case. You can treat this as a
54955500 special case of passing two lists except that you are
54965501 specifying the column to search in.
54975502 - For a DataFrame nested dictionaries, e.g.,
5498- {'a': {'b': np.nan}}, are read as follows: look in column 'a'
5499- for the value 'b' and replace it with NaN. The ``value``
5503+ ``{'a': {'b': np.nan}}``, are read as follows: look in column
5504+ 'a' for the value 'b' and replace it with NaN. The `value`
55005505 parameter should be ``None`` to use a nested dict in this
55015506 way. You can nest regular expressions as well. Note that
55025507 column names (the top-level dictionary keys in a nested
55035508 dictionary) **cannot** be regular expressions.
55045509
55055510 * None:
55065511
5507- - This means that the ``regex`` argument must be a string,
5508- compiled regular expression, or list, dict, ndarray or Series
5509- of such elements. If ``value`` is also ``None`` then this
5510- **must** be a nested dictionary or ``Series``.
5512+ - This means that the `regex` argument must be a string,
5513+ compiled regular expression, or list, dict, ndarray or
5514+ Series of such elements. If `value` is also ``None`` then
5515+ this **must** be a nested dictionary or Series.
55115516
55125517 See the examples section for examples of each of these.
55135518 value : scalar, dict, list, str, regex, default None
5514- Value to replace any values matching ``to_replace`` with.
5519+ Value to replace any values matching `to_replace` with.
55155520 For a DataFrame a dict of values can be used to specify which
55165521 value to use for each column (columns not in the dict will not be
55175522 filled). Regular expressions, strings and lists or dicts of such
@@ -5521,45 +5526,50 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
55215526 other views on this object (e.g. a column from a DataFrame).
55225527 Returns the caller if this is True.
55235528 limit : int, default None
5524- Maximum size gap to forward or backward fill
5525- regex : bool or same types as ``to_replace``, default False
5526- Whether to interpret ``to_replace`` and/or ``value`` as regular
5527- expressions. If this is ``True`` then ``to_replace`` *must* be a
5529+ Maximum size gap to forward or backward fill.
5530+ regex : bool or same types as `to_replace`, default False
5531+ Whether to interpret `to_replace` and/or `value` as regular
5532+ expressions. If this is ``True`` then `to_replace` *must* be a
55285533 string. Alternatively, this could be a regular expression or a
55295534 list, dict, or array of regular expressions in which case
5530- ``to_replace`` must be ``None``.
5531- method : string, optional, {'pad', 'ffill', 'bfill'}
5532- The method to use when for replacement, when ``to_replace`` is a
5533- scalar, list or tuple and ``value`` is None.
5535+ `to_replace` must be ``None``.
5536+ method : {'pad', 'ffill', 'bfill', `None`}
5537+ The method to use for replacement, when `to_replace` is a
5538+ scalar, list or tuple and `value` is ``None``.
55345539
5535- .. versionchanged:: 0.23.0
5536- Added to DataFrame
5540+ .. versionchanged:: 0.23.0
5541+ Added to DataFrame.
5542+ axis : None
5543+ .. deprecated:: 0.13.0
5544+ Has no effect and will be removed.
55375545
55385546 See Also
55395547 --------
5540- %(klass)s.fillna : Fill NA/NaN values
5548+ %(klass)s.fillna : Fill NA values
55415549 %(klass)s.where : Replace values based on boolean condition
5550+ Series.str.replace : Simple string replacement.
55425551
55435552 Returns
55445553 -------
5545- filled : %(klass)s
5554+ %(klass)s
5555+ Object after replacement.
55465556
55475557 Raises
55485558 ------
55495559 AssertionError
5550- * If ``regex`` is not a ``bool`` and ``to_replace`` is not
5560+ * If `regex` is not a ``bool`` and `to_replace` is not
55515561 ``None``.
55525562 TypeError
5553- * If ``to_replace`` is a ``dict`` and ``value`` is not a ``list``,
5563+ * If `to_replace` is a ``dict`` and `value` is not a ``list``,
55545564 ``dict``, ``ndarray``, or ``Series``
5555- * If ``to_replace`` is ``None`` and ``regex`` is not compilable
5565+ * If `to_replace` is ``None`` and `regex` is not compilable
55565566 into a regular expression or is a list, dict, ndarray, or
55575567 Series.
55585568 * When replacing multiple ``bool`` or ``datetime64`` objects and
5559- the arguments to ``to_replace`` does not match the type of the
5569+ the arguments to `to_replace` do not match the type of the
55605570 value being replaced
55615571 ValueError
5562- * If a ``list`` or an ``ndarray`` is passed to ``to_replace`` and
5572+ * If a ``list`` or an ``ndarray`` is passed to `to_replace` and
55635573 `value` but they are not the same length.
55645574
55655575 Notes
@@ -5573,10 +5583,15 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
55735583 numbers *are* strings, then you can do this.
55745584 * This method has *a lot* of options. You are encouraged to experiment
55755585 and play with this method to gain intuition about how it works.
5586+ * When a dict is passed as `to_replace`, its keys act as the
5587+ values to find and its values act as the replacements, as if
5588+ they had been passed through the `value` parameter.
55765589
55775590 Examples
55785591 --------
55795592
5593+ **Scalar `to_replace` and `value`**
5594+
55805595 >>> s = pd.Series([0, 1, 2, 3, 4])
55815596 >>> s.replace(0, 5)
55825597 0 5
@@ -5585,6 +5600,7 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
55855600 3 3
55865601 4 4
55875602 dtype: int64
5603+
55885604 >>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
55895605 ... 'B': [5, 6, 7, 8, 9],
55905606 ... 'C': ['a', 'b', 'c', 'd', 'e']})
@@ -5596,20 +5612,24 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
55965612 3 3 8 d
55975613 4 4 9 e
55985614
5615+ **List-like `to_replace`**
5616+
55995617 >>> df.replace([0, 1, 2, 3], 4)
56005618 A B C
56015619 0 4 5 a
56025620 1 4 6 b
56035621 2 4 7 c
56045622 3 4 8 d
56055623 4 4 9 e
5624+
56065625 >>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
56075626 A B C
56085627 0 4 5 a
56095628 1 3 6 b
56105629 2 2 7 c
56115630 3 1 8 d
56125631 4 4 9 e
5632+
56135633 >>> s.replace([1, 2], method='bfill')
56145634 0 0
56155635 1 3
@@ -5618,20 +5638,24 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
56185638 4 4
56195639 dtype: int64
56205640
5641+ **dict-like `to_replace`**
5642+
56215643 >>> df.replace({0: 10, 1: 100})
56225644 A B C
56235645 0 10 5 a
56245646 1 100 6 b
56255647 2 2 7 c
56265648 3 3 8 d
56275649 4 4 9 e
5650+
56285651 >>> df.replace({'A': 0, 'B': 5}, 100)
56295652 A B C
56305653 0 100 100 a
56315654 1 1 6 b
56325655 2 2 7 c
56335656 3 3 8 d
56345657 4 4 9 e
5658+
56355659 >>> df.replace({'A': {0: 100, 4: 400}})
56365660 A B C
56375661 0 100 5 a
@@ -5640,45 +5664,87 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
56405664 3 3 8 d
56415665 4 400 9 e
56425666
5667+ **Regular expression `to_replace`**
5668+
56435669 >>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],
56445670 ... 'B': ['abc', 'bar', 'xyz']})
56455671 >>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
56465672 A B
56475673 0 new abc
56485674 1 foo new
56495675 2 bait xyz
5676+
56505677 >>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)
56515678 A B
56525679 0 new abc
56535680 1 foo bar
56545681 2 bait xyz
5682+
56555683 >>> df.replace(regex=r'^ba.$', value='new')
56565684 A B
56575685 0 new abc
56585686 1 foo new
56595687 2 bait xyz
5688+
56605689 >>> df.replace(regex={r'^ba.$':'new', 'foo':'xyz'})
56615690 A B
56625691 0 new abc
56635692 1 xyz new
56645693 2 bait xyz
5694+
56655695 >>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
56665696 A B
56675697 0 new abc
56685698 1 new new
56695699 2 bait xyz
56705700
56715701 Note that when replacing multiple ``bool`` or ``datetime64`` objects,
5672- the data types in the ``to_replace`` parameter must match the data
5702+ the data types in the `to_replace` parameter must match the data
56735703 type of the value being replaced:
56745704
56755705 >>> df = pd.DataFrame({'A': [True, False, True],
56765706 ... 'B': [False, True, False]})
56775707 >>> df.replace({'a string': 'new value', True: False}) # raises
5708+ Traceback (most recent call last):
5709+ ...
56785710 TypeError: Cannot compare types 'ndarray(dtype=bool)' and 'str'
56795711
56805712 This raises a ``TypeError`` because one of the ``dict`` keys is not of
56815713 the correct type for replacement.
5714+
5715+ Compare the behavior of ``s.replace({'a': None})`` and
5716+ ``s.replace('a', None)`` to understand the peculiarities
5717+ of the `to_replace` parameter:
5718+
5719+ >>> s = pd.Series([10, 'a', 'a', 'b', 'a'])
5720+
5721+ When a dict is passed as `to_replace`, the value(s) in the dict
5722+ play the role of the `value` parameter.
5723+ ``s.replace({'a': None})`` is equivalent to
5724+ ``s.replace(to_replace={'a': None}, value=None, method=None)``:
5725+
5726+ >>> s.replace({'a': None})
5727+ 0 10
5728+ 1 None
5729+ 2 None
5730+ 3 b
5731+ 4 None
5732+ dtype: object
5733+
5734+ When ``value=None`` and `to_replace` is a scalar, list or
5735+ tuple, `replace` uses the `method` parameter (default 'pad') to do the
5736+ replacement. This is why the 'a' values are replaced by 10
5737+ in rows 1 and 2 and by 'b' in row 4 in this case.
5738+ The command ``s.replace('a', None)`` is actually equivalent to
5739+ ``s.replace(to_replace='a', value=None, method='pad')``:
5740+
5741+ >>> s.replace('a', None)
5742+ 0 10
5743+ 1 10
5744+ 2 10
5745+ 3 b
5746+ 4 b
5747+ dtype: object
56825748 """)
56835749
56845750 @Appender(_shared_docs['replace'] % _shared_doc_kwargs)
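
The three dict forms of `to_replace` described in the docstring are easy to confuse, so a short standalone check may help when reviewing this change. The snippet below is only a sketch, not part of the patch: it assumes a pandas installation in which these calls behave as the docstring examples show, and the variable names `flat`, `per_column` and `nested` are illustrative. It deliberately avoids the `method` parameter and the `value=None` scalar case, since their behavior may vary between pandas versions.

```python
import pandas as pd

# Same frame as in the docstring examples.
df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
                   'B': [5, 6, 7, 8, 9],
                   'C': ['a', 'b', 'c', 'd', 'e']})

# Flat dict, no `value`: keys are the values to find anywhere in the
# frame, dict values are their replacements.
flat = df.replace({0: 10, 1: 100})

# Column-keyed dict plus a scalar `value`: look for 0 in column 'A'
# and 5 in column 'B', and replace both with 100.
per_column = df.replace({'A': 0, 'B': 5}, 100)

# Nested dict, no `value`: within column 'A' only, map 0 -> 100 and
# 4 -> 400.
nested = df.replace({'A': {0: 100, 4: 400}})

print(flat['A'].tolist())        # [10, 100, 2, 3, 4]
print(per_column['B'].tolist())  # [100, 6, 7, 8, 9]
print(nested['A'].tolist())      # [100, 1, 2, 3, 400]
```

The printed values match the outputs already shown in the Examples section of the docstring, which is why these three calls were chosen.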