Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add top-level comparator script #4008

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

judovana
Copy link
Contributor

Hello! I believe the #3991 was closed due to not understanding waht it is doing.

This script is not doing any rebuild, and thus is not comparing on identity, but on binary identically builds only. As it is, it do ot have any similar work present in the toolchain. Can you please reconsider its inclusion?

maybe the in file/main readme:

# This is top level script for comparing two jdks.  It 
# This script expects exactly two args - tag:os.arch/dir/archive, two times
# eg: compare_builds.sh  /usr/lib/jvm/java-21-openjdk /my/home/my_custom_tarball.tar.xz
# eg: compare_builds.sh  /my/custom/dir/with_jdk jdk-21.0.4+7:windows.x64
# if the parameter is tag, the temurin jdk of given tag:os.arch is downloaded
# yo can omit the :os.arch part, then it will be guessed... but you may know better
# It copies and creates all in CWD!!!!
# Pass the files/dirs as absolute values, it should not be that hard
# It is useful to have JAVA_HOME pointing to some other JDK used to compile and run supporting files
# otherwise one of the provided JDKs will be duplicated, and used
# for your own safety, do not run it as root
#
# The JDK you are comparing is launched to get various info about itself
# If you need to compare jdks on different platform then they are for, you need to go manually.

is misleading?

@github-actions github-actions bot added the windows Issues that affect or relate to the WINDOWS OS label Oct 25, 2024
@judovana
Copy link
Contributor Author

judovana commented Oct 31, 2024

Thje label is not correct this is cygwin (thus windows) and linux

@judovana
Copy link
Contributor Author

judovana commented Nov 5, 2024

Hello good people. Anybody around? The #3991 was closed as duplicate, and there is no duplicate toolchain. All the tools are written, but the readme is quite complex, and one must follow it. In addition there are quite a few corenr cases.

This script remains valid contribution, which is putting it all together. I found small intersection with detection of signtool, but thats all.

@karianna
Copy link
Contributor

karianna commented Nov 5, 2024

needs linter fixes to start with.

@judovana
Copy link
Contributor Author

judovana commented Nov 6, 2024

@karianna the linter is all green, but the job never finishes and timeouts:(

@karianna
Copy link
Contributor

I see:

File:[/github/workspace/tooling/compare-builds.sh]
2024-11-10 01:46:57 [WARN] Warnings found in [bash-exec] linter!
2024-11-10 01:46:57 [WARN] Error: File:[/github/workspace/tooling/compare-builds.sh] is not executable

@judovana
Copy link
Contributor Author

judovana commented Nov 11, 2024 via email

@karianna
Copy link
Contributor

OK so the linter fails on JSCPD, ew have a config file for this, have you tried running locally?

@judovana
Copy link
Contributor Author

judovana commented Nov 12, 2024

Even the local npx jscpd -v compare-builds.sh runs for ever. The method of tapsAndJunits is guilty. Up to local totalOnlyIn is ok, but then more and more lines is added, it runs longer^lines so since some 5 lines I had never seen an end. Similarly when cutting from bottom, from aprox printXmlTest "compare" "compared-files-count" "2" "${totalFile}" "../artifact/$(basename "${totalFile}")" >> "${unitFile}" it starts to go longer^lines again.

the

 if [ "${totalDiffs}" -eq 0 ] ; then
      passed=$(("${passed}"+1))
    else
      failed=$(("${failed}"+1))
    fi
    if [ "${totalFiles}" -ne 0 ] ; then
      passed=$(("${passed}"+1))
    else
      failed=$(("${failed}"+1))
    fi
    if [ "${totalOnlyIn}" -eq 0 ] ; then
      passed=$(("${passed}"+1))
    else
      failed=$(("${failed}"+1))
    fi
    total=$(("${passed}"+"${failed}"))
    printXmlHeader "${passed}" "${failed}" "${total}" 0 "compare-comparable-builds" > "${unitFile}"
    tapHeader "${total}"  "$(date)" > "${resultsTapFile}"
    if [ "${totalDiffs}" -eq 0 ] ; then
      printXmlTest "compare" "differences-count" "1" "" "../artifact/$(basename "${differencesFile}")" >> "${unitFile}"
      tapTestStart "ok" "1" "differences-count" >> "${resultsTapFile}"
      echo "-- no differences --"
    else
      printXmlTest "compare" "differences-count" "1" "${differencesFile}" "../artifact/$(basename "${differencesFile}")" >> "${unitFile}"
      tapTestStart "not ok" "1" "differences-count" >> "${resultsTapFile}"
      echo "-- differences count: $totalDiffs --"
    fi
    set +e
      tapFromWholeFile "${differencesFile}" "${differencesFile}" >> "${resultsTapFile}"
      tapTestEnd >> "${resultsTapFile}"
    set -e
    if [ "${totalFiles}" -ne 0 ] ; then
      printXmlTest "compare" "compared-files-count" "2" "" "../artifact/$(basename "${totalFile}")" >> "${unitFile}"
      tapTestStart "ok" "2" "compared-files-count" >> "${resultsTapFile}"
      echo "-- files compared: $totalFiles --"
    else
      printXmlTest "compare" "compared-files-count" "2" "${totalFile}" "../artifact/$(basename "${totalFile}")" >> "${unitFile}"
      tapTestStart "not ok" "2" "compared-files-count" >> "${resultsTapFile}"
      echo "-- no files compared --"
    fi
    set +e
      tapFromWholeFile "${totalFile}" "${totalFile}" >> "${resultsTapFile}"
      tapTestEnd >> "${resultsTapFile}"
    set -e
    if [ "${totalOnlyIn}" -eq 0 ] ; then
      printXmlTest "compare" "onlyin-count" "3" "" "../artifact/$(basename "${differencesFile}")" >> "${unitFile}"
      tapTestStart "ok" "3" "onlyin-count" >> "${resultsTapFile}"
      echo "-- no onlyin files --"
    else
      printXmlTest "compare" "onlyin-count" "3" "${differencesFile}" "../artifact/$(basename "${differencesFile}")" >> "${unitFile}"
      tapTestStart "not ok" "3" "onlyin-count" >> "${resultsTapFile}"
      echo "-- onlyin files count: $totalOnlyIn --"
    fi
    set +e
      tapFromWholeFile "${differencesFile}" "${differencesFile}" >> "${resultsTapFile}"
      tapTestEnd >> "${resultsTapFile}"
    set -e
    if [ $GLOABL_RESULT -eq 0 ] ; then
      printXmlTest "compare" "comparable-builds" "4" "" "../artifact/$(basename "${diffFileParam}")" >> "${unitFile}"
      tapTestStart "ok" "4" "comparable-builds" >> "${resultsTapFile}"
      echo "-- COMPARABLE --"
    else
      printXmlTest "compare" "comparable-builds" "4" "${diffFileParam}" "../artifact/$(basename "${diffFileParam}")" >> "${unitFile}"
      tapTestStart "not ok" "4" "comparable-builds" >> "${resultsTapFile}"
      echo "-- MISMATCH --"
    fi

Is malicious. That code is obviously so nasty it deadlocks the tool. I will decompose it a bit.

@judovana
Copy link
Contributor Author

On contrary, when rest of the file is removed, and only "that" code is left in, it is in few ms finsihed.

@judovana
Copy link
Contributor Author

No success to narrow it more. Filed kucherenko/jscpd#716

Will redactor the tapsAndJunits method to be more human (and thus JSCPD) friendly

@sxa
Copy link
Member

sxa commented Dec 4, 2024

Since there is a lot of code here I haven't gone through everything that it has done yet, but can you clarify what the intent of this is and how it is different from the comparison tools we do after attempting a binary rebuild?

For example we have the Compare_JDK function at

which is executed for comparing a new rebuild with an existing build (Usually one from the Temurin CI). We definitely don't want to have two separate implementations of "compare two builds and report on the differences" but if this is genuinely doing something completely different let us know.

@judovana
Copy link
Contributor Author

judovana commented Dec 4, 2024

Hi! thanx for reply. I definitely agree that there must be no duplicated code. If any "top level comparator" would just call set, or even only one, function(s) from various shared libraries, it would be ok. I was last checking the state of code in shared libraries, in earl October and all I needed for such code was missing. Maybe it already got better?

Primarily, it should be calling the tools from Andrew(which requires some set-up and detection, which it must do too), which are checking not just identical state, but binary comparability - https://github.com/adoptium/temurin-build/blob/master/tooling/reproducible/ReproducibleBuilds.md#comparable-build-tools - optionally, why not. Sometimes you nedd them, sometimes not. As far as I can read it, those are still not called.
Secondarily it have to take care of various corner-cases - like doing the work on copy if you are comparing directories, as the jdk after the check is destroyed. With this hand in hand, it can take good care of third jdk which you need (to run https://github.com/adoptium/temurin-build/tree/master/tooling/src/java/temurin/tools), with unpacking of the tarabll
Then many small tweeks, like eg downloading the reference temurin, which I want to compare my build against.
Minor thing - there should be one script for windows, linux and mac

Maybe there are similar functions already on place, but I do not see them. Quick check:

I found detection of signtool.exe and cl.exe and msvs cBuild tools.

I believe that standalone, reusable script, should be proudly presented, not hidden, and if it exists, it should be called from other work-horses, not the opposite - because the comparison of binary-comparable is standalone and tricky task, and it is super useful for much wider audience.

@sxa
Copy link
Member

sxa commented Dec 4, 2024

because the comparison of binary-comparable is standalone and tricky task, and it is super useful for much wider audience.

Yeah I agree on that. We do have articles about running the stuff we have listed on https://adoptium.net/en-GB/blog/2024/08/adoptium-reproducible-verification-builds/ (and my now a little out of date as things have moved on videos such as this one which talks about the automation we have in place.

On the specifics you mention:

@judovana
Copy link
Contributor Author

judovana commented Dec 5, 2024

because the comparison of binary-comparable is standalone and tricky task, and it is super useful for much wider audience.

Yeah I agree on that. We do have articles about running the stuff we have listed on https://adoptium.net/en-GB/blog/2024/08/adoptium-reproducible-verification-builds/ (and my now a little out of date as things have moved on videos such as this one which talks about the automation we have in place.

I meant more via reusable script then via talk/readme:) But this is nice one.

On the specifics you mention:

* [repro_common.sh](https://github.com/adoptium/temurin-build/blob/master/tooling/reproducible/repro_common.sh) uses BinRepl

Thanx! I had missed that one. So the script I'm proposing probably can lost some part of logic by calling several functions from various repro_comon (and maybe others, as some are in windows only script and so on). It is quite hard to read.

* comparable_patch.sh is mentioned in https://github.com/adoptium/temurin-build/blob/master/tooling/reproducible/ReproducibleBuilds.md for people who want to use it although @andrew-m-leonard can probably comment on whether it's used by anything we specifically run at the project. I suspect not if it's only used for comparing Temurin with other vendors.

* `no detection of  -XshowSettings 2>&1| grep 'java.vendor (and similar) for extraction from binaries` is likely correct since if the binaries are considered the same that may not be too important.

Yup, thas it, The identical logic is moreover indeed around, but the binary-comaprable logic is not.

@judovana
Copy link
Contributor Author

@sxa I just realised I did not asked one clarification: " two separate implementations ". I'm still failing to see the first one.

I'm generally reading it, that there are some function, which have duplication inside this script. So if I remove the existing duplicated "bodies" (in this script) and replace them by existing functions, such a script should become acceptable?

@karianna karianna requested a review from sxa February 12, 2025 22:39
@sxa
Copy link
Member

sxa commented Feb 14, 2025

I'm only holding off on a full formal review here because I want @andrew-m-leonard to have some input in this and confirm that it doesn't duplicate anything. Adding him as a reviewer.

@sxa sxa requested a review from andrew-m-leonard February 14, 2025 11:21
@judovana
Copy link
Contributor Author

I'm only holding off on a full formal review here because I want @andrew-m-leonard to have some input in this and confirm that it doesn't duplicate anything. Adding him as a reviewer.

I think it currently duplicate detection of windows tools. I have to honestly admit that I gave up that it wille never go in, and was not working on the deduplication on meantime.

# the result will be changed via tap plugin or jtreg plugin
exit 0
else
exit ${GLOABL_RESULT}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably ought to spell this?: GLOBAL_RESULT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. sure!

@andrew-m-leonard
Copy link
Contributor

Linter needs fixing

@judovana
Copy link
Contributor Author

judovana commented Feb 14, 2025

Linter needs fixing

It needs more then fixing. The linter is deadlocking - kucherenko/jscpd#716. I will elaborate on it.

export JAVA_HOME="${JAVA_HOME}"
}

function tapsAndJunits() {
Copy link
Contributor

@smlambert smlambert Feb 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original plan for this work (before judovana stepped in) was to run it as an AQAvit test target (like the reproducible tests), in which case, we would not want the test to generate TAP files, since TKG takes care of that. I prefer not to add wrapper scripts that duplicate functionality we already have in place, as it means more code to track and maintain. At very least, will need a flag to disable generation of this TAP file. Our Jenkins jobs and monitoring solutions will not 'appreciate' other tap files being present.

Copy link
Contributor Author

@judovana judovana Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag is - as on all my PRs - is already there. If not, it would be a mistake I would be more then happy to fix. It is enabled by default, but it is single variable to turn it off. I have absolutely no issues to make it off by default . The script is designed as standalone, so it have to be able to produce some machine readable results better then just return value. But I also agree it should be optional and disabled by default. Will do. I did not stepped in with bad intention. I looked into code and found nothing similar. I'm rally sorry for any intrusion it caused.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off by default for tap files please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Am on it. As jtreg results are generated by freshly cloned clone "https://github.com/rh-openjdk/run-folder-as-tests.git", that will be off by default too.

@smlambert smlambert changed the title Added top-level comaprator script Added top-level comparator script Feb 16, 2025
@smlambert smlambert changed the title Added top-level comparator script Add top-level comparator script Feb 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
windows Issues that affect or relate to the WINDOWS OS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants