Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested modification to use format conversion field #106

Open
mwtoews opened this issue Aug 10, 2021 · 2 comments
Open

Suggested modification to use format conversion field #106

mwtoews opened this issue Aug 10, 2021 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@mwtoews
Copy link

mwtoews commented Aug 10, 2021

The Format String Syntax includes a conversion field, which can be one of:

  • !s to call str()
  • !r to call repr()
  • !a to call ascii()

E.g. take a look at the following:

import numpy as np  # ensure this is version >=1.14

x = np.float32(101.1)
# bad output: 101.0999984741211
print("{}".format(x))
# good output: 101.1
print("{!s}".format(x))
print("{}".format(str(x)))
print("%s" % x)
print("%s" % str(x))
print("{!r}".format(x))
print("{}".format(repr(x)))
print("%r" % x)
print("%s" % repr(x))

(side note about the "bad output" is that is due to conversion from float32 to Python's native 64-bit representation, and we never want to see this as it's a bad conversion).

Currently, flynt v.0.65 doesn't attempt to modify the conversion field of the format string. For example:

  • "{}".format(str(x)) (and "%s" % str(x)) should be translated to f"{x!s}" rather than f"{str(x)}"
  • "{}".format(repr(x)) (and "%s" % repr(x)) should be translated to f"{x!r}" rather than f"{repr(x)}".
  • "{}".format(ascii(x)) (and "%s" % ascii(x)) should be translated to f"{x!a}" rather than f"{ascii(x)}".

The motivation behind this suggestion is that the resulting conversion is shorter, preserves output intent (i.e. no unexpected conversions, demonstrated with numpy), and tests with %timeit consistently show measurable efficiencies with format strings that directly use the conversion field. This is shown with:

In [1]: from dis import dis

In [2]: dis(compile('f"{x!s}"', '', 'exec'))
  1           0 LOAD_NAME                0 (x)
              2 FORMAT_VALUE             1 (str)
              4 POP_TOP
              6 LOAD_CONST               0 (None)
              8 RETURN_VALUE

In [3]: dis(compile('f"{str(x)}"', '', 'exec'))
  1           0 LOAD_NAME                0 (str)
              2 LOAD_NAME                1 (x)
              4 CALL_FUNCTION            1
              6 FORMAT_VALUE             0
              8 POP_TOP
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE

As you can see, fewer processing steps are required when format conversions are embedded.

@ikamensh
Copy link
Owner

Hi, thanks for this suggestion.

TLDR: I'm open to having this in flynt, but unlikely to find time to implement it as of now. Unless it's provably safe, this should be under --aggressive flag or some new flag and not a default behavior.

The main concern of flynt is to convert .format calls and % formatting to f-strings. As I understand, any code that has .format() call could together with e.g. ascii() was not using the possible formatting notation before.

So this is like an opportunity to solve one more problem, namely not using the !r, !s, !a formatting modifiers. I suspect that many programmers actually don't know about them at all.

Implementation

  • Implementation-wise, this could be a part of NodeTransformer visit to all AST Call nodes.
  • Putting it under some flag or not: this can be default behavior of flynt iff under the hood exactly the same mechanisms are guaranteed to be called.
  • !s formatting specifier - is it any different from not having !s? I thought that everything without a specifier will be treated as !s; if it is so, maybe it should always be omitted?

@ikamensh ikamensh added enhancement New feature or request help wanted Extra attention is needed labels Aug 16, 2021
@mwtoews
Copy link
Author

mwtoews commented Aug 16, 2021

I'd imagine transferring simple str(x) their equivalent !s conversion field is safe, as this is the expected behaviour since Python 3.0 (and possibly Python 2.6, but this only had !r and !s). But yes, f{x!s} is different than f{x}.

The numpy demo above is not obvious, so here is a clearer example of the class behaviour:

class Foo:
    def __format__(self, format_spec):
        return f"format with spec: {format_spec!r}"
    def __str__(self):
         return "str"
    def __repr__(self):
         return "repr–"

obj = Foo()
print(f"{obj}")
print(f"{obj:some things}")
print(f"{obj!s}")
print(f"{obj!r}")
print(f"{obj!a}")
print(f"{obj!a:-^30}")

prints

format with spec: ''
format with spec: 'some things'
str
repr–
repr\u2013
----------repr\u2013----------

I'll admit that I thought !s was the same as without too, but internally it calls either __str__ or __format__ methods, where implemented. And the content of format_spec (after :) is a whole other thing that has nothing to do with this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants