Fix: mdsdisconnect for MATLAB java bridge no longer uses Java finalize() #2936
Note that this fix disconnects from the archive server immediately, but it does not destroy the MATLAB "connection" objects. They remain dangling objects that are only cleared when the Java garbage collector eventually runs or the user shuts down MATLAB. Nonetheless, this fix made a dramatic difference on the test systems. Prior to the fix, the test virtual machines could only iterate 64 times before MATLAB and/or the archive server choked. With the fix, it easily handled 1,000 iterations of the
joshStillerman
this looks great!
This PR has been tested on GA's Omega cluster and works OK. Here are the details . . .

This simple MATLAB script reproduced the problem reported by the GA researchers in issues #2925 and #2926. If the loop is changed to run many iterations, and run with MDSplus alpha-7-139-59, then each iteration leaves a dangling object in MATLAB and a dangling connection on the MDSplus archive server. As those dangling objects / connections accumulate, eventually MATLAB and/or the server choke, thereby forcing the user to exit MATLAB and start over.

STATUS QUO:
This is the MDSplus version GA has installed on Omega. It is from the "alpha" branch.
This is the output of the MATLAB script when run in MATLAB 2024b. Note that the

PR 2936 FIX:
A "developer" build of MDSplus (current "alpha + PR 2936") was put in a tar file and installed in my home directory on Omega. (This installation has no impact on the cluster's main MDSplus.)
A shell script was run to set environment variables in my session to point to the MDSplus dev build. Also a
The
This shows the output of the MATLAB program when running with the dev build of MDSplus. Although not shown here, the loop was set to 200 iterations and worked fine.

NOTES:
An attempt was made to run GA's
Upgrading MDSplus on a cluster that has many users can be a big task. GA might find it simpler to modify its MATLAB scripts --
06e7aee
Restored the

Note that when a Java process ends, the garbage collector is usually not run (i.e., instead the operating system cleans up all the system resources for the process). The Java garbage collector thus typically runs when the process is under memory pressure. The Java

Thus, it is difficult to create a test case that proves that the MDSplus Connection object's

Nonetheless, this revised version of this PR works fine in manual testing. It compiles OK, immediately disconnects when the
This is a fix for Issue #2925.
To enable MATLAB to access MDSplus, there are two bridges: Python and Java. The Java bridge was leaving dangling "connection" objects in MATLAB and dangling mdsip connections on the MDSplus archive server. This is because the mdsdisconnect provided to MATLAB was not immediately terminating the mdsip connection to the server. It was instead releasing a reference to the "connection" object, which was in turn relying on Java's garbage collector to run the finalize() method to destroy the connection. However, the garbage collector runs infrequently and cannot be controlled by the application program, which means that MATLAB accumulates dangling "connection" objects and the MDSplus archive server accumulates dangling mdsip connections. This was causing MATLAB and/or the archive server to eventually choke. The mdsip connections would only be destroyed when the Java garbage collector ran, MATLAB crashed, or the user shut down MATLAB.

Note that the Java finalize() method was deprecated in Java 9 (in part because of this issue with the garbage collector). Oracle's most recent advice is to remove the finalize() method from classes (see the entry for java.lang.Object.finalize in the following document).

https://docs.oracle.com/en/java/javase/22/docs/api/deprecated-list.html
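
For readers unfamiliar with the pattern, here is a minimal sketch of explicit teardown with a garbage-collector fallback. It is not the code in this PR. The class names, the counter, and the disconnect() method are hypothetical, and the "socket" is only simulated by a counter. The sketch uses java.lang.ref.Cleaner, which the Java API documentation names as an alternative to finalize(); whether MDSplus should adopt it is a separate question.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a connection class; this is NOT the MDSplus code.
final class CleanableConnection implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Simulated count of sockets currently open to the server.
    static final AtomicInteger OPEN_SOCKETS = new AtomicInteger();

    // The cleanup action must not hold a reference to the connection itself,
    // otherwise the connection could never become unreachable.
    private static final class CloseSocket implements Runnable {
        @Override
        public void run() {
            OPEN_SOCKETS.decrementAndGet(); // stands in for closing the socket
        }
    }

    private final Cleaner.Cleanable cleanable;

    CleanableConnection() {
        OPEN_SOCKETS.incrementAndGet(); // stands in for opening the socket
        cleanable = CLEANER.register(this, new CloseSocket());
    }

    // Deterministic teardown. Cleanable.clean() runs the action at most once,
    // so a repeated call, or a later GC-triggered cleanup, does nothing more.
    void disconnect() {
        cleanable.clean();
    }

    @Override
    public void close() {
        disconnect();
    }
}

final class CleanableConnectionDemo {
    public static void main(String[] args) {
        CleanableConnection c = new CleanableConnection();
        c.disconnect();
        c.disconnect(); // harmless second call
        System.out.println("open sockets: " + CleanableConnection.OPEN_SOCKETS.get());
    }
}
```

The point of the design is that the caller's disconnect closes the resource right away, while the garbage collector only acts as a safety net for callers that forget.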

This PR only fixes the MATLAB Java bridge mdsdisconnect issue. Future PRs should remove finalize() from all other Java classes in MDSplus.

This PR was tested manually by using two virtual machines: an archive server and a client. The client ran a MATLAB program that had a loop that iterated 1,000 times, with each iteration doing a pair of mdsconnect and mdsdisconnect calls. When the MATLAB test finished, the archive VM was checked to see how many mdsip connections were present. Prior to implementing this fix, the test showed that there was a dangling mdsip connection for every iteration of the loop. With the fix, all mdsip connections were closed.
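
The logic of that check can be modelled in a few lines. This is only an illustrative sketch: the real test used MATLAB on a client VM and inspected mdsip connections on the archive VM, whereas FakeArchive and ConnectLoopCheck below are hypothetical in-memory stand-ins that talk to no server.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the archive server's connection table.
final class FakeArchive {
    private final Set<Integer> open = new HashSet<>();
    private int nextId = 0;

    int connect() {
        int id = nextId++;
        open.add(id);
        return id;
    }

    void disconnect(int id) {
        open.remove(id);
    }

    int danglingCount() {
        return open.size();
    }
}

final class ConnectLoopCheck {
    // Runs the connect loop and returns how many connections are still open.
    static int run(int iterations, boolean disconnectImmediately) {
        FakeArchive archive = new FakeArchive();
        for (int i = 0; i < iterations; i++) {
            int id = archive.connect();
            if (disconnectImmediately) {
                archive.disconnect(id);
            }
            // Otherwise the id is simply dropped, as when teardown is left
            // to a garbage collector that has not run yet.
        }
        return archive.danglingCount();
    }

    public static void main(String[] args) {
        System.out.println("without immediate disconnect: " + run(1000, false));
        System.out.println("with immediate disconnect:    " + run(1000, true));
    }
}
```

In this model, one dangling entry per iteration corresponds to the behaviour before the fix, and zero corresponds to the behaviour after it.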