Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Include missing related drafts in IPR searches #7836

Merged
merged 6 commits into from
Sep 10, 2024

Conversation

microamp
Copy link
Collaborator

Draft PR to get some feedback.


Fixes #7824 as well as two UI bugs (see below).

  1. Currently, the first item in the search results in https://datatracker.ietf.org/ipr/search/?submit=draft&id=rfc8955 is

    Results for RFC 5575 ("Dissemination of Flow Specification Rules")

    (note no "was replaced by" or "was obsoleted by")

    It should have been

    Results for RFC 5575 ("Dissemination of Flow Specification Rules"), which was obsoleted by RFC 8955 ("Dissemination of Flow Specification Rules")

  2. https://datatracker.ietf.org/ipr/search/?submit=draft&id=rfc8955

    currently lists

    Results for draft-ietf-idr-rfc5575bis ("Dissemination of Flow Specification Rules"), which was became rfc draft-ietf-idr-rfc5575bis ("Dissemination of Flow Specification Rules")

    (note "<draft>, which was became rfc <draft>")

    It should have been

    Results for draft-ietf-idr-rfc5575bis ("Dissemination of Flow Specification Rules"), which became rfc RFC 8955 ("Dissemination of Flow Specification Rules")


In the case of RFC 8955, it should now list 3 IPR disclosures

  • 2009-02-26, 1106, Juniper's Statement of IPR related to draft-ietf-idr-flow-spec-05
  • 2009-02-09, 1088, Juniper's Statement of IPR related to draft-ietf-idr-flow-spec-03
  • 2008-12-23, 1052, Juniper Networks Inc.'s Statement of IPR claimed in draft-ietf-idr-flow-spec-03 (Removed)

and 6 documents

  • Results for RFC 5575 ("Dissemination of Flow Specification Rules"), which was obsoleted by RFC 8955 ("Dissemination of Flow Specification Rules")

    • no direct IPR disclosures
  • Results for draft-ietf-idr-flow-spec ("Dissemination of Flow Specification Rules"), which became rfc RFC 5575 ("Dissemination of Flow Specification Rules")

    • the same three IPR disclosures as above
  • Results for RFC 8955 ("Dissemination of Flow Specification Rules")

    • no direct IPR disclosures
  • Results for draft-ietf-idr-rfc5575bis ("Dissemination of Flow Specification Rules"), which became rfc RFC 8955 ("Dissemination of Flow Specification Rules")

    • no direct IPR disclosures
  • Results for draft-ietf-idr-flowspec-redirect-rt-bis ("Clarification of the Flowspec Redirect Extended Community"), which became rfc RFC 7674 ("Clarification of the Flowspec Redirect Extended Community")

    • no direct IPR disclosures
  • Results for RFC 7674 ("Clarification of the Flowspec Redirect Extended Community"), which was obsoleted by RFC 8955 ("Dissemination of Flow Specification Rules")

    • no direct IPR disclosures

@microamp microamp requested a review from rjsparks August 19, 2024 08:28
@microamp
Copy link
Collaborator Author

Re: the first bug above, the current solution is to replace

{% if not forloop.first %}

with

{% if d != doc %}

in the template.

Another option is to preserve the order of items in the backend code, so that the RFC selected will always be the first item in the doc list. That way, {% if not forloop.first %} will work as expected.

Code like

list(set(results))  # items in a set are unordered

from the related_docs function can be replaced with

sorted(set(results), key=results.index)  # use set to discard duplicates, but preserve the order of items

for instance.

Copy link

codecov bot commented Aug 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.82%. Comparing base (c7f6bde) to head (a4fc313).
Report is 61 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7836      +/-   ##
==========================================
+ Coverage   88.78%   88.82%   +0.04%     
==========================================
  Files         296      304       +8     
  Lines       41320    41495     +175     
==========================================
+ Hits        36687    36860     +173     
- Misses       4633     4635       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@microamp
Copy link
Collaborator Author

microamp commented Aug 19, 2024

Re: the second bug above, the current solution is to introduce an additional if-else statements in the template to handle both relationship and reverse relationship. (Currently, it's not handling the latter correctly.)

{% if d != d.related.source %}, which was ...
{% else %}, which ...
{% endif %}

So, for a draft that has become one of the related RFCs, it will correctly say

  • <draft>, which became rfc <rfc> instead of
  • <draft>, which was became rfc <draft>

However, I'm looking for a cleaner solution.

@rjsparks
Copy link
Member

I pushed a significant refactor that gets the same documents, but (I think) reads better for future us. I also added a sort of the related documents. If this looks good to @microamp, please take this out of draft and request a review from @jennifer-richards

Copy link
Collaborator Author

@microamp microamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll take the PR out of draft.

for draft in drafts
]
)
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong preference here, but I personally find it easier to read when fors are kept flat like below.

docs.update(
    set(
        draft
        for d in docs
        for draft in related_docs(d, ...)
    )
)

Let it be a team decision though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be whatever Black style says, which I think is what you suggest but not certain

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now that I'm at a keyboard I see that this is more about the brackets than about the styling per se so appealing to Black isn't quite enough...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, what's there is what black produced.

Copy link
Collaborator Author

@microamp microamp Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now that I'm at a keyboard I see that this is more about the brackets than about the styling per se

Correct. I wasn't referring to formatting style - sorry for the confusion. It was more about whether the new code is actually more readable despite having one more for in the expression. My opinion is that it isn't.

Anyway, I'm okay with either as long as you're happy with the logic itself.

d.rfc_number if d.rfc_number is not None else 0,
d.became_rfc().rfc_number if d.became_rfc() else 0,
),
reverse=True,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Food for thought (not requesting a change here) - we can even do something like

key=lambda d: (
    d == first,  # to ensure that the doc selected will always be first (untested)
    # the rest...
)

The expression will return a bool value which is just an int in Python. i.e.

assert isinstance(True, int)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 0 if d == first else 1 would be safer since it doesn't rely on knowing about the Python internals (and because I think it'd have to be d != first because

>>> False < True
True

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and because I think it'd have to be d != first because

>>> False < True
True

We have reverse=True as an argument to the sorted function above. So, I believe d == first is the correct one.

assert sorted([False, True], reverse=True) == [True, False]

Anyway, I'll leave the existing code as-is, leaving the template to handle that logic for now.

updated_docs = related_docs(first, ("updates",))
related_iprs = list(
set(iprs_from_docs(updated_docs, states=states)) - set(iprs)
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Formatting changes can be done in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran black on the unit. I could have done it in a separate commit.

@microamp microamp marked this pull request as ready for review August 30, 2024 04:14
Copy link
Member

@jennifer-richards jennifer-richards left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline, I have one non-essential (but I would encourage) reformatting suggestion for template readability.

The "request changes" is really just to ask whether the gathering of docs deals well enough with a list that contains a document that is related to more than one document in the set. At a quick read, I think it'll wind up with an unpredictable related value "winning" in the list that actually gets rendered and cause incorrect output. However, I'm not sure whether the data are such that this is a real concern. It'd be good to check.

(Not opposed to merging this as an improvement and handling that as a separate issue if it actually is one)

d.rfc_number if d.rfc_number is not None else 0,
d.became_rfc().rfc_number if d.became_rfc() else 0,
),
reverse=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 0 if d == first else 1 would be safer since it doesn't rely on knowing about the Python internals (and because I think it'd have to be d != first because

>>> False < True
True

<tbody>
<tr>
<th scope="col" class="table-info" colspan="3">
Results for {{ doc.name|prettystdname|urlize_ietf_docs }} ("{{ doc.title }}"){% if not forloop.first %}{% if doc.related %}, which was {{ doc.relation|lower }} {{ doc.related.source|prettystdname|urlize_ietf_docs }} ("{{ doc.related.source.title }}"){% endif %}{% endif %}
Results for {{ d.name|prettystdname|urlize_ietf_docs }} ("{{ d.title }}"){% if d != doc %}{% if d.related %}{% if d != d.related.source %}, which was {{ d.relation|lower }} {{ d.related.source|prettystdname|urlize_ietf_docs }} ("{{ d.related.source.title }}"){% else %}, which {{ d.relation|lower}} {{ d.related.target|prettystdname|urlize_ietf_docs }} ("{{ d.related.target.title }}"){% endif %}{% endif %}{% endif %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shudder

I think maybe the following refactor reduces the 🍝 factor a little?

Results for {{ d.name|prettystdname|urlize_ietf_docs }}
("{{ d.title }}"){% if d != doc and d.related %}, which
    {% if d == d.related.source %}
        {{ d.relation|lower }}
        {{ d.related.target|prettystdname|urlize_ietf_docs }}
        ("{{ d.related.target.title }}")
    {% else %}
        was {{ d.relation|lower }}
        {{ d.related.source|prettystdname|urlize_ietf_docs }}
        ("{{ d.related.source.title }}")
    {% endif %}
{% endif %}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It introduces a lot of whitespace in the resulting html - something Lars was pushing back against pretty hard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Readable template seem more valuable to me. Could also flatten the indentation if it's about file size since a CR doesn't actually consume any more bits than a space...

(I'm sympathetic to itching due to awful html whitespace, but not convinced it's a driving concern)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactor suggested above is in. We can either keep it or revert it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6e4c803 shows how many whitespace characters were introduced by the previous commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems ok to me. I don't want to pick a fight with Lars though. :-)

@microamp
Copy link
Collaborator Author

microamp commented Sep 10, 2024

The "request changes" is really just to ask whether the gathering of docs deals well enough with a list that contains a document that is related to more than one document in the set. At a quick read, I think it'll wind up with an unpredictable related value "winning" in the list that actually gets rendered and cause incorrect output. However, I'm not sure whether the data are such that this is a real concern. It'd be good to check.

I don't think that behaviour is possible, but there may be some edge cases that I'm not aware of. Do you have a specific example that I can verify locally?

@microamp
Copy link
Collaborator Author

The unit test failing could be related to #7918.

@jennifer-richards
Copy link
Member

jennifer-richards commented Sep 10, 2024

I don't think that behaviour is possible, but there may be some edge cases that I'm not aware of. Do you have a specific example that I can verify locally?

No, I was just worried because the new structure of the code collects some docs, then adds docs related to those docs, then adds docs related to those docs. However, I looked more closely and in pseudocode I believe it's gathering:

DraftsThatBecameThisRFC(doc + ReplacedBy(doc) + ObsoletedBy(doc))

so a collision would mean something like an RFC came from two drafts, one of which replaced/obsoleted the other, and that can't happen.

Comment on lines +204 to +227
self.assertContains(
r,
f"""Results for <a href="/doc/{rfc_new.name}/">{prettify_std_name(rfc_new.name)}</a>
("{rfc_new.title}")""",
)
self.assertContains(
r,
f"""Results for <a href="/doc/{rfc.name}/">{prettify_std_name(rfc.name)}</a>
("{rfc.title}"), which

was obsoleted by
<a href="/doc/{rfc_new.name}/">{prettify_std_name(rfc_new.name)}</a>
("{rfc_new.title}")""",
)
self.assertContains(
r,
f"""Results for <a href="/doc/{draft.name}/">{prettify_std_name(draft.name)}</a>
("{draft.title}"), which

became rfc
<a href="/doc/{rfc.name}/">{prettify_std_name(rfc.name)}</a>
("{rfc.title}")""",
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a test live at risk of failing because of whitespace change in the html isn't good.
I'll let this in, but we should change this soon to something that is testing for the important content and not the layout details. Looking just to see that the right drafts and RFCs are mentioned should be enough.

@rjsparks rjsparks merged commit 80599f2 into ietf-tools:main Sep 10, 2024
9 checks passed
@microamp microamp deleted the 7824-ipr-search branch September 10, 2024 22:50
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include came_from_draft documents on IPR search results pages with results containing RFCs
3 participants