Fix (E712) changing `==`/`!=` to `is`/`is not` is not correct for some types #4560

zanieb · 2023-05-21T15:51:41Z

Summary

Generally, comparisons to True, False, and None singletons should use obj is True instead of obj == True.

However, it is common for libraries to override the ==/__eq__ operator to create simple APIs for filtering data. In these cases, correcting == to is changes the meaning of the program and breaks the user's code. The same applies for != and is not.
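As a minimal sketch of why the rewrite changes behavior, consider a made-up `Column` class (a hypothetical stand-in, not any real library's API) whose `__eq__` returns an expression object rather than a bool. `==` dispatches to `__eq__`, while `is` only checks object identity and can never be overridden:

```python
class Expression:
    """Hypothetical filter expression produced by a comparison."""

    def __init__(self, text):
        self.text = text


class Column:
    """Hypothetical stand-in for a query-builder column (not a real library class)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return an expression object instead of a bool, as filtering APIs often do.
        return Expression(f"{self.name} = {other!r}")

    def __ne__(self, other):
        return Expression(f"{self.name} != {other!r}")


col = Column("active")

with_eq = col == True  # noqa: E712  (calls Column.__eq__)
with_is = col is True  # identity check; __eq__ is never called

print(with_eq.text)  # active = True
print(with_is)       # False
```

Passing `with_is` to a filtering function would hand it a plain `False` instead of an expression, which is the failure mode this issue tracks.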

This is a tracking issue for all invalid corrections from this rule.

Types with the issue

pandas.DataFrame often used with DataFrame.mask, DataFrame.where
pandas.Series often used with Series.mask, Series.where
numpy.ndarray often used with numpy.where
sqlalchemy.Column often used with Query.having, Query.filter, Query.where

If an issue with an unlisted type is encountered please reply and I will edit to add it here.

Resolution

Eventually, ruff is likely to detect these cases by inferring the datatype involved and exclude them from the suggested fix.

In the meantime, you may:

Disable rule E712
Use an alternative comparison method that is not ambiguous (e.g. pandas.Series.eq)
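Two workarounds can be sketched in plain Python. The `Flags` class below is a hypothetical element-wise container standing in for an array or Series. The first option suppresses the rule on a single line with a `noqa` comment; the second calls `operator.eq`, which invokes `__eq__` through a function call rather than a comparison expression (the assumption here, worth verifying against your ruff version, is that E712 only inspects comparison expressions):

```python
import operator


class Flags:
    """Hypothetical element-wise container, standing in for an array or Series."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Compare each element and return a new container, not a single bool.
        return Flags(v == other for v in self.values)


flags = Flags([True, False])

# Option 1: keep the operator and suppress the diagnostic on this line only.
suppressed = flags == False  # noqa: E712

# Option 2: spell the comparison as a function call.
via_operator = operator.eq(flags, False)

print(suppressed.values)    # [False, True]
print(via_operator.values)  # [False, True]
```

Library-specific methods such as pandas.Series.eq remain the clearer choice where they exist.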

Examples

import numpy

numpy.array([True, False]) == False
# array([False,  True])

numpy.array([True, False]) is False
# False

import pandas

pandas.Series([True, False]) == False
# 0    False
# 1     True
# dtype: bool

pandas.Series([True, False]) is False
# False

# Alternative safe syntax
pandas.Series([True, False]).eq(False)

pandas.DataFrame({"x": [True, False]}) == False
#       x
# 0  False
# 1   True

pandas.DataFrame({"x": [True, False]}) is False
# False

# Alternative safe syntax
pandas.DataFrame({"x": [True, False]}).eq(False)

import sqlalchemy
c = sqlalchemy.Column("foo", sqlalchemy.Boolean)

c == True
# <sqlalchemy.sql.elements.BinaryExpression object at 0x12ed532e0>

c is True
# False

# Alternative safe syntax
c.is_(True)
c.is_not(False)

Related issues

ndevenish · 2023-06-19T14:36:53Z

I just spent a while tracking down this exact issue, an error introduced by ruff and reduced to the almost identical:

import numpy as np

arr = np.array([False, True, False, True])
print(repr(arr == False))
# array([ True, False,  True, False])
print(repr(arr is False))
# False

Reading the other thread, it sounds like this autofix can't be made safe. I would suggest disabling it completely then, because using truthiness in numpy comparison isn't a rare operation, and expecting everyone to "know" not to do this seems to defeat the point of an autocorrecting linter.

charliermarsh · 2023-06-19T14:44:14Z

Reading the other thread, it sounds like this autofix can't be made safe. I would suggest disabling it completely then, because using truthiness in numpy comparison isn't a rare operation, and expecting everyone to "know" not to do this seems to defeat the point of an autocorrecting linter.

I think making this a suggested fix (as it is now) will have the same effect, once we introduce --fix and --fix-unsafe (the former of which will only make automatic fixes, while the latter will include suggested fixes, which may include changes in behavior).

The problem with removing the autofix entirely is that it doesn't really reduce the burden or expectation on the user, because this diagnostic will still be raised, and so users will still be required to look at the code and understand whether or not to change it. Performing the fix automatically has the downside of silently breaking the code, but requiring users to opt-in to the change explicitly seems (to me) as safe as pointing them to the diagnostic without including any possible fix.

nicornk · 2023-07-18T07:10:30Z

Any conclusion on this one? This issue broke a bunch of our pyspark code by converting code similar to:

df = df.where(F.col("colName") == True)

to

df = df.where(F.col("colName") is True)

which leads to TypeError: condition should be string or Column

Thank you

dstoeckel · 2023-07-18T07:15:11Z

Adding to @nicornk: PEP-8 actually strongly discourages using is with boolean constants:

Don’t compare boolean values to True or False using ==:
# Correct:
if greeting:
# Wrong:
if greeting == True:
Worse:

# Wrong:
if greeting is True:

So while the autofix may be unsafe, I would argue the diagnostic itself is harmful. The correct suggestion would be to drop == True entirely (though PySpark is certainly another special case, as == False would need a conversion to the unary not ~ operator ...)

zanieb · 2023-07-18T13:53:20Z

@nicornk you can use the unambiguous alternate syntax e.g. df = df.where(F.col("colName").is_(True)) or disable the rule. Additionally, once Ruff has type inference, we will avoid suggesting a change to col is True.

@dstoeckel while I agree that if foo is definitely better than if foo == True, there are entirely valid uses for checking if something is True or False without having a __bool__ cast occur, and there are cases outside of if statements where the user must perform the cast. Unfortunately, the diagnostic must try to guess the intent of the user when suggesting a fix, which puts us in a tough spot. I'm not sure this issue is the correct place to discuss the validity of the rule as a whole, though; I'd like to keep this scoped to a discussion of false positives based on type inference.

conbrad · 2023-07-18T15:38:04Z

Additionally, once Ruff has type inference

Is there an existing ongoing effort for this that's public?

zanieb · 2023-10-17T20:32:21Z

Note as of v0.1.0 we do not apply unsafe fixes by default — so this fix will not be applied by default.

NeilGirdhar · 2023-11-06T13:36:33Z

there are entirely valid uses for checking if something is True or False without having a __bool__ cast occur

Of course this is a matter of opinion, but if you want to check this, I think the Pythonic way to do so is to use:

if isinstance(x, bool) and x:
    ...
# or
match x:
    case True:
        ...

is True is far more often misused in my opinion.

zanieb · 2023-11-06T16:05:18Z

Please let's not make this issue a debate about how if _ == True, if _ is True, and if _ should be used or whether E712 is valid in general.

The libraries that are the focus of this issue have designed APIs where specific comparisons are necessary. This issue is intended to track support for patterns in those APIs. I'd recommend creating a new discussion if you want to discuss broader concerns.

VictorGob · 2023-12-22T10:23:38Z

Just to add an example of an error that took me a while to fix.

import pandas as pd

# Example dataframe
df = pd.DataFrame({"id": [1, 2, 3, 4, 5], "col_2": [True, False, True, False, True]})

# This works, but ruff raises: Comparison to `False` should be `cond is False`
a = df[df["col_2"] == False]
print(a)

# This does not work, pandas will raise 'KeyError: False'
b = df[df["col_2"] is False]
print(b)

simonpanay · 2024-02-16T09:43:40Z

Another example with SQLAlchemy 2 (where the syntax has changed a lot compared with versions 1.x):

    query = request.dbsession.execute(
        select(Station, func.min(Check.result))  # pylint: disable=E1102
        .join(Check.channel)
        .join(Channel.station)
        .where(
            Station.triggered == False,
        )
    ).all()

Here, Station.triggered == False raises E712.
If it is replaced by Station.triggered is False, the result is not what is expected.

psychedelicious · 2024-06-12T23:44:23Z

Same thing with E711.

Would be nice to have a brief mention in the rule's docs calling out common situations where this is unsafe and why

zanieb · 2024-06-13T00:48:23Z

Thanks @psychedelicious. Would you be willing to open a pull request?

The fixes for rules E711 `NoneComparison` and E712 `TrueFalseComparison` are marked unsafe due to possible runtime behavior changes with libraries that override the `__eq__` and `__ne__` methods.

- Add a "Fix safety" section to each rule explaining why the fixes are unsafe, plus a link to a GH issue with more detail. The sections are identical for each rule.
- Minor formatting tweak for E711's docs.

torzsmokus · 2024-08-02T12:04:26Z

But why do we change ==/!= to is/is not at all?? PEP8 says it is even worse.

torzsmokus · 2024-08-02T12:07:44Z

oh, checking #8164 that seems to deal with the same question…

NeilGirdhar · 2024-08-02T12:25:06Z

Now that Ruff is moving towards having type information, this issue may eventually warrant some refinement? If x has Boolean type, then if x is appropriate and if x is True is inappropriate. If x has a broader type, then either could be fine.

jbcpollak · 2024-08-02T12:54:06Z

If x has a broader type, then either could be fine.

I would argue that if x is not Boolean type, "is True" is always wrong - for example if x is numpy.bool_, comparing it with is will be wrong.

dangotbanned · 2024-08-03T07:40:51Z

I ran into this when writing this example for the next version of altair - based on upstream example

Would be applicable to:

All of the types in altair.expr.core
Derived in altair.vegalite.v5.api

NeilGirdhar · 2024-08-03T08:37:04Z

I would argue that if x is not Boolean type, "is True" is always wrong - for example if x is numpy.bool_, comparing it with is will be wrong.

By broader, I mean something like Any or bool | int or np.bool | bool, etc.

subnix · 2024-09-18T12:07:37Z

Note about SQLAlchemy:

The .is_ and .is_not methods may not be safe alternatives to the equality (==/!=) operators, because IS and = are different SQL operators. Consider the following example in MySQL:

SELECT 10 IS TRUE;
/* 1 */
SELECT 10 = TRUE;
/* 0 */
SELECT NULL IS NOT FALSE;
/* 1 */
SELECT NULL != FALSE;
/* NULL */

Thus, the safe alternative for comparing to booleans is:

c == True
# <sqlalchemy.sql.elements.BinaryExpression object at 0x1026203e0>
c == sqlalchemy.true()
# <sqlalchemy.sql.elements.BinaryExpression object at 0x1026222a0>
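The NULL half of this point can be reproduced without a MySQL server. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in (an assumption: other engines may differ in details), with a made-up `station` table and the literal `0` in place of `FALSE`. A row whose flag is NULL is dropped by `!=` but kept by `IS NOT`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, triggered INTEGER)")
conn.executemany(
    "INSERT INTO station (id, triggered) VALUES (?, ?)",
    [(1, 1), (2, 0), (3, None)],  # None is stored as SQL NULL
)


def ids(where_clause):
    """Return the ids of rows matching a hard-coded WHERE clause."""
    rows = conn.execute(f"SELECT id FROM station WHERE {where_clause} ORDER BY id")
    return [row[0] for row in rows]


# NULL != 0 evaluates to NULL, so the NULL row is filtered out.
not_equal = ids("triggered != 0")

# NULL IS NOT 0 evaluates to true, so the NULL row is kept.
is_not = ids("triggered IS NOT 0")

print(not_equal)  # [1]
print(is_not)     # [1, 3]
```

So swapping `!=` for an `IS NOT` style method can change which rows a query returns whenever the column is nullable.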

zanieb mentioned this issue May 21, 2023

[ruff --fix] (v0.0.239) When working with numpy arrays mask == True is not the same as mask is True #2443

Closed

charliermarsh added the bug (Something isn't working) and type-inference (Requires more advanced type inference) labels May 21, 2023

zanieb changed the title ~~Fix changing ==/!= to is/is not is not correct for some types~~ Fix (E712) changing ==/!= to is/is not is not correct for some types May 21, 2023

charliermarsh mentioned this issue Jul 5, 2023

False Positive for E712 with np.where #5526

Closed

Seltyk mentioned this issue Aug 2, 2023

[auth] Fix session token refresh chaoss/augur#2474

Merged

zanieb mentioned this issue Oct 17, 2023

Pandas boolean if statement correction bug #8023

Closed

zanieb mentioned this issue Dec 9, 2023

E712 fix results in unsafe behavior #9063

Closed

paulf81 mentioned this issue Dec 15, 2023

Fix is None issue and re-run examples NREL/flasc#154

Merged

charliermarsh mentioned this issue Feb 8, 2024

E712 fix silently breaks Pandas/Polars queries #9883

Closed

charliermarsh mentioned this issue Mar 11, 2024

E712 "true-false-comparison" incorrectly raises on array like #10344

Closed

psychedelicious mentioned this issue Jun 13, 2024

Update docs for E711, E712 (#4560) #11859

Merged

psychedelicious added a commit to psychedelicious/ruff that referenced this issue Jun 18, 2024

Update docs for literal-comparisons (astral-sh#4560)

70a9270

- Link to the relevant GH issue instead of copying examples/alternatives from the GH issue.

AlexWaygood pushed a commit that referenced this issue Jun 18, 2024

Update docs for E711, E712 (#4560) (#11859)

104608b

zanieb mentioned this issue Jul 11, 2024

ruff doesn't find the same E721 violations as new flake8 does #12290

Closed

This was referenced Aug 9, 2024

E721 suggestion to use "is" rather than "==" results in broken code if using pandas/numpy #12765

Closed

Add known problems warning to type-comparison rule #12769

Merged

charliermarsh added a commit that referenced this issue Aug 9, 2024

Add known problems warning to type-comparison rule (#12769)

c906b01

## Summary See: #4560

cclauss mentioned this issue Nov 21, 2024

ruff check --fix --unsafe-fixes blacklanternsecurity/bbot#1997

Merged

cclauss mentioned this issue Dec 15, 2024

PEP8: Never use equality operators to compare to singletons darrenburns/posting#158

Open
