feat(features) Use dataclasses for flagpole instead of pydantic #75859
Conversation
Early timing measurements are looking promising.
This schema was generated with pydantic and will be used in CI to validate that the schema and dataclasses are kept compatible, and could be part of a validation flow for flagpole.
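A rough sketch of what that CI validation step could look like, assuming a jsonschema dependency and hypothetical file paths (this is not the PR's actual validation code):

```python
import json

import jsonschema  # assumed dependency; any JSON Schema validator would do

# Hypothetical paths: the schema generated from the pydantic models and a
# feature payload that the dataclass implementation also needs to parse.
with open("flagpole-schema.json") as f:
    schema = json.load(f)

with open("feature.json") as f:
    feature_data = json.load(f)

# CI fails if the feature payload stops matching the generated schema.
jsonschema.validate(instance=feature_data, schema=schema)
```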
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and made forwards or backwards compatible, so that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the
Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time. ❌ Failed Test Results: Completed 21719 tests. View the full list of failed pytest tests.
"""The value to compare against the condition's evaluation context property.""" | ||
|
||
operator: str = dataclasses.field(default="") |
I don't love this, but it is less gross than having to always provide operator
to every condition constructor throughout tests.
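For illustration, a minimal sketch of the tradeoff being discussed; the subclass name and fields here are assumptions rather than the PR's exact code. Defaulting operator on the base class (and overriding the default per subclass) keeps condition constructors terse in tests:

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class ConditionBase:
    property: str = ""
    value: Any = None
    operator: str = dataclasses.field(default="")


@dataclasses.dataclass
class InCondition(ConditionBase):
    # Each concrete condition can pin its own operator default.
    operator: str = dataclasses.field(default="in")


# Tests no longer need to spell out the operator on every construction:
condition = InCondition(property="organization_slug", value=["sentry"])
```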
Minor nits, but changes look good. Nice work!
description="A brief description or identifier for the segment" | ||
def condition_from_dict(data: Mapping[str, Any]) -> ConditionBase:
    operator_kind = ConditionOperatorKind(data.get("operator", "invalid"))
    condition_cls = OPERATOR_LOOKUP[operator_kind]
Do we want to raise explicitly here if we get an invalid operator type? Otherwise, won't this be an attribute error on NoneType?
I can make the raise explicit. Currently it would raise KeyError, which isn't as informative as it could be.
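A sketch of what the explicit raise could look like, reusing the names from the snippet above; the exception type, message, and construction details are placeholders, not the PR's final code:

```python
def condition_from_dict(data: Mapping[str, Any]) -> ConditionBase:
    operator_kind = ConditionOperatorKind(data.get("operator", "invalid"))
    condition_cls = OPERATOR_LOOKUP.get(operator_kind)
    if condition_cls is None:
        # Fail with context instead of letting a bare KeyError escape.
        raise ValueError(
            f"Unknown operator {data.get('operator')!r} for flagpole condition"
        )
    # Illustrative construction; the real implementation may map fields differently.
    return condition_cls(**data)
```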
with pytest.raises(ConditionTypeMismatchException):
    not_condition.match(
        context=EvaluationContext({"foo": attr_val}), segment_name="test"
    )
Main reason I went with the assertion util function approach before was because it made it explicit which invalid input failed when modifying condition logic. This is mostly fine, although it does make debugging just a bit harder.
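For comparison, a sketch of the kind of assertion helper being described, using the names from the test above; the failure message spells out which input regressed:

```python
def assert_type_mismatch(condition, attr_val):
    try:
        condition.match(
            context=EvaluationContext({"foo": attr_val}), segment_name="test"
        )
    except ConditionTypeMismatchException:
        return
    pytest.fail(f"expected ConditionTypeMismatchException for input {attr_val!r}")
```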
These changes switch flagpole from using pydantic as the base classes to stdlib dataclasses. During the pydantic 2 upgrade the time to parse features shot up unexpectedly. We have also not been impressed with feature flag match times and suspected that pydantic might be contributing overhead.

This pull request re-implements flagpole with basic Python dataclasses. The new implementation reduces the time to build feature flags, which should improve the overall performance of our feature flagging. Improvements were measured with both cProfile and local micro-benchmarks (scripts provided below).
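As a rough illustration of the approach (the class and field names below are simplified assumptions, not the PR's exact models), plain dataclasses plus explicit dict traversal replace pydantic's parsing and validation machinery:

```python
from __future__ import annotations

import dataclasses
from typing import Any, Mapping


@dataclasses.dataclass
class Segment:
    name: str
    rollout: int = 0


@dataclasses.dataclass
class Feature:
    name: str
    owner: str = ""
    segments: list[Segment] = dataclasses.field(default_factory=list)

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> Feature:
        # Hand-rolled parsing: no per-field validators or model machinery to
        # run, which is where pydantic was spending construction time.
        segments = [Segment(**segment) for segment in data.get("segments", [])]
        return cls(name=data["name"], owner=data.get("owner", ""), segments=segments)
```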
cProfile results
The cProfile script builds a feature 1000 times and collects profiling data from those operations.
current master (pydantic)
after (dataclasses)
While significantly more functions were called, the overall runtime is much better.
flagpole-profile script
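The flagpole-profile script itself is collapsed above; as a rough sketch of that kind of harness (reusing the hypothetical Feature.from_dict from the earlier sketch, with a made-up payload):

```python
import cProfile
import pstats

FEATURE_DATA = {"name": "example-feature", "owner": "flagpole", "segments": []}

profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    Feature.from_dict(FEATURE_DATA)
profiler.disable()

# Report the functions that dominate cumulative runtime.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```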
Micro benchmark
In the micro-benchmark I looked at 3 feature flags (the largest, approximately median, and smallest features). Each feature was parsed 10000 times, and I looked at the min, mean, and max duration for building a Feature object from dictionary data loaded from options seeded with the current feature flag inventory.
before (pydantic)
after (dataclasses)
flagpole-timing script
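The flagpole-timing script is also collapsed above; a minimal sketch of an equivalent measurement, again reusing the hypothetical Feature.from_dict and FEATURE_DATA from the sketches above:

```python
import statistics
import time

durations = []
for _ in range(10000):
    start = time.perf_counter()
    Feature.from_dict(FEATURE_DATA)
    durations.append(time.perf_counter() - start)

# Min / mean / max build time for a single feature, in seconds.
print(min(durations), statistics.mean(durations), max(durations))
```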
Again we see significant improvements in runtime without any impact on memory usage.
What we lose?
While moving to dataclasses gives us some gains in performance, it has a few drawbacks: