Fix: mdsdisconnect for MATLAB java bridge no longer uses Java finalize() #2936

Merged
mwinkel-dev merged 2 commits into MDSplus:alpha from mwinkel-dev:mw-2925-matlab-bridge on Jul 22, 2025

@mwinkel-dev

This is a fix for Issue #2925.

To enable MATLAB to access MDSplus, there are two bridges: Python and Java. The Java bridge was leaving dangling "connection" objects in MATLAB and dangling mdsip connections on the MDSplus archive server. The cause was that the mdsdisconnect provided to MATLAB did not immediately terminate the mdsip connection to the server. Instead, it released a reference to the "connection" object and relied on Java's garbage collector to run the finalize() method, which in turn destroyed the connection. However, the garbage collector runs infrequently and cannot be controlled by the application program. As a result, MATLAB accumulated dangling "connection" objects and the MDSplus archive server accumulated dangling mdsip connections, which eventually caused MATLAB and/or the archive server to choke. The mdsip connections were only destroyed when MATLAB's garbage collector ran, MATLAB crashed, or the user shut down MATLAB.

Note that the Java finalize() method was deprecated in Java 9 (in part because of this issue with the garbage collector). Oracle's most recent advice is to remove the finalize() method from classes (see the entry for java.lang.Object.finalize in the following document).
https://docs.oracle.com/en/java/javase/22/docs/api/deprecated-list.html
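
The difference between the two disconnect strategies can be sketched with a small model. This is not the MDSplus code: `FakeServer`, `FakeConnection`, and `DisconnectDemo` are hypothetical names, and the "server" is just an in-memory set of open connection ids. The sketch deliberately has no finalizer, so it models the period before any garbage-collection cycle has run.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the archive server (not part of MDSplus):
// it only tracks which connection ids are currently open.
class FakeServer {
    private final Set<Integer> open = new HashSet<>();
    private int nextId = 0;

    int accept() {
        int id = nextId++;
        open.add(id);
        return id;
    }

    void close(int id) {
        open.remove(id);
    }

    int openCount() {
        return open.size();
    }
}

// Hypothetical client-side connection object (not the MDSplus class).
class FakeConnection {
    private final FakeServer server;
    private final int id;
    private boolean closed = false;

    FakeConnection(FakeServer server) {
        this.server = server;
        this.id = server.accept();
    }

    // Explicit, immediate teardown; calling it again is a no-op.
    void disconnect() {
        if (!closed) {
            closed = true;
            server.close(id);
        }
    }
}

class DisconnectDemo {
    // Old pattern: only drop the reference and wait for a finalizer.
    // No finalizer exists in this sketch, so nothing is ever closed.
    static int dropReferenceOnly(FakeServer server, int iterations) {
        for (int i = 0; i < iterations; i++) {
            FakeConnection c = new FakeConnection(server);
            c = null; // the server-side entry stays open
        }
        return server.openCount();
    }

    // New pattern: close the connection explicitly, then drop it.
    static int disconnectExplicitly(FakeServer server, int iterations) {
        for (int i = 0; i < iterations; i++) {
            FakeConnection c = new FakeConnection(server);
            c.disconnect();
        }
        return server.openCount();
    }

    public static void main(String[] args) {
        System.out.println("reference dropped only: "
                + dropReferenceOnly(new FakeServer(), 1000));
        System.out.println("explicit disconnect:    "
                + disconnectExplicitly(new FakeServer(), 1000));
    }
}
```

In this model the first loop leaves 1000 entries open on the fake server and the second leaves none. The numbers describe the model only; they are not measurements from a real mdsip server.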

This PR only fixes the MATLAB Java bridge mdsdisconnect issue. Future PRs should remove finalize() from all other Java classes in MDSplus.

This PR was tested manually by using two virtual machines: an archive server and a client. The client ran a MATLAB program that had a loop that iterated 1,000 times, with each iteration doing a pair of mdsconnect and mdsdisconnect calls. When the MATLAB test finished, the archive VM was checked to see how many mdsip connections were present. Prior to implementing this fix, the test showed that there was a dangling mdsip connection for every iteration of the loop. With the fix, all mdsip connections were closed.

@mwinkel-dev added the labels bug, US Priority, and api/matlab on Jul 16, 2025
@mwinkel-dev

The fix for Issue #2925 should also include PR #2897 to turn off a debug statement in the mdstcpip/mdsipshr/MdsIpThreadStatic.c file. That statement was generating the many messages that the users reported with Issue #2925.

@mwinkel-dev

Note that this fix only makes mdsdisconnect disconnect from the archive server immediately; it does not destroy the MATLAB "connection" objects. They remain dangling objects that are cleared only when the Java garbage collector eventually runs or the user shuts down MATLAB.

Nonetheless, this fix made a dramatic difference on the test systems. Prior to the fix, the test virtual machines could only iterate 64 times before MATLAB and/or the archive server choked.

With the fix, it easily handled 1,000 iterations of the mdsconnect / mdsdisconnect pairs.

joshStillerman previously approved these changes Jul 17, 2025

this looks great!

@mwinkel-dev

mwinkel-dev commented Jul 17, 2025

This PR has been tested on GA's Omega cluster and works OK.

Here are the details . . .

This simple MATLAB script reproduced the problem reported by the GA researchers in issues #2925 and #2926. If the loop is changed to run many iterations and the script is run with MDSplus alpha-7-139-59, each iteration leaves a dangling object in MATLAB and a dangling connection on the MDSplus archive server. As those dangling objects / connections accumulate, MATLAB and/or the server eventually choke, forcing the user to exit MATLAB and start over.

% Name:   bug_2925.m
% Purpose:   This tests the PR #2936 fix of Issue #2925.

% Issue 2925 was caused by relying on the Java garbage collector
% to terminate connections and delete connection objects.  However,
% there is no control over when the garbage collector runs, thus
% dangling objects / connections are created -- especially when
% the mdsdisconnect is in a loop. 

% On the development system, this program would drop the mdsip connection
% if the loop ran more than 64 iterations, whereupon it also displayed
% the various error messages reported in Issue #2925.


% Cannot use the Python-bridge because Omega's MATLAB requires
% installing Python 3.9 (or newer), but Omega only has Python 3.7.
% mdsUsePython(true)

for i = 1:5
   status = mdsconnect('atlas');
   if status == -1
      break
   end
   mdsopen('RF', 203131);
   d = mdsvalue('LHCD.LH_COORDS');
   mdsclose();
   mdsdisconnect;
   disp(i)
end

STATUS QUO:

This is the MDSplus version GA has installed on Omega. It is from the "alpha" branch.

$ mdstcl
TCL> show version


MDSplus version: 7.139.59
----------------------
  Release:  HEAD_release_7.139.59
  Browse:   https://github.com/MDSplus/mdsplus/tree/HEAD_release_7.139.59
  Download: https://github.com/MDSplus/mdsplus/archive/HEAD_release_7.139.59.tar.gz

This is the output of the MATLAB script when run in MATLAB 2024b. Note that the quit causes MATLAB's garbage collector to run, which destroys the dangling objects, thereby generating the many messages plus also terminating the associated mdsip connections.

$ matlab -nodisplay

To get started, type doc.
For product information, visit www.mathworks.com.
 
>> bug_2925
     1

     2

     3

     4

     5

>> quit
D, 1752786698.807:  buffer_free()                 Connection(id=4, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
D, 1752786698.807:  buffer_free()                 Connection(id=3, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
D, 1752786698.807:  buffer_free()                 Connection(id=2, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
D, 1752786698.807:  buffer_free()                 Connection(id=1, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
D, 1752786698.807:  buffer_free()                 Connection(id=0, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
$ 

PR 2936 FIX:

A "developer" build of MDSplus (current "alpha + PR 2936") was put in a tar file and installed in my home directory on Omega. (This installation has no impact on the cluster's main MDSplus.) A shell script was run to set environment variables in my session to point to the MDSplus dev build. Also a javaclasspath.txt file was used to add the dev build's Java classes to the beginning of MATLAB's static javaclasspath list.

The Date: field shows that this is indeed the dev build of MDSplus that is being used. (Ignore the version number; it is bogus on this dev build.)

$ mdstcl
TCL> show version


MDSplus version: 7.139.17
----------------------
  Release:  alpha_release-7-139-17
  Date:     Wed Jul 16 19:02:26 UTC 2025
  Browse:   https://github.com/MDSplus/mdsplus/tree/alpha_release-7-139-17
  Download: https://github.com/MDSplus/mdsplus/releases/tag/alpha_release-7-139-17


TCL> 

This shows the output of the MATLAB program when running with the dev build of MDSplus. Although not shown here, the loop was set to 200 iterations and worked fine.

$ matlab -nodisplay

To get started, type doc.
For product information, visit www.mathworks.com.
 
>> bug_2925
     1

     2

     3

     4

     5

>> exit
$ 

NOTES:

An attempt was made to run GA's getalldat.m MATLAB script with the dev build of MDSplus, but it involved too much fiddling with environment variables.

Upgrading MDSplus on a cluster that has many users can be a big task. GA might find it simpler to modify its MATLAB scripts -- getmds.m, getalldat.m, etc. -- to skip the mdsdisconnect call if GA's MDSplus_persistent flag is set. Doing so would reuse the existing connection and eliminate dangling connection objects in MATLAB and dangling mdsip connections on the MDSplus server.
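
The reuse idea is independent of the scripting language. A minimal sketch, written in Java only to match the bridge under discussion: `ReusableSession`, its `persistent` flag, and `fetch` are hypothetical names, and the "fetch" is a placeholder rather than a real MDSplus call. The actual change suggested above would be made in GA's MATLAB scripts.

```java
// Hypothetical illustration of "skip the disconnect when a persistent
// flag is set". Nothing here calls MDSplus; connecting is only counted.
class ReusableSession {
    private final boolean persistent;
    private boolean connected = false;
    private int connectCalls = 0;

    ReusableSession(boolean persistent) {
        this.persistent = persistent;
    }

    // Connect only if there is no live connection to reuse.
    private void ensureConnected() {
        if (!connected) {
            connected = true;
            connectCalls++;
        }
    }

    // Disconnect after each request unless the session is persistent.
    private void finish() {
        if (!persistent) {
            connected = false;
        }
    }

    // Placeholder for a data request; returns the expression length
    // so the sketch has something to return.
    int fetch(String expression) {
        ensureConnected();
        try {
            return expression.length();
        } finally {
            finish();
        }
    }

    int connectCalls() {
        return connectCalls;
    }
}

class ReusableSessionDemo {
    public static void main(String[] args) {
        ReusableSession persistent = new ReusableSession(true);
        ReusableSession oneShot = new ReusableSession(false);
        for (int i = 0; i < 5; i++) {
            persistent.fetch("LHCD.LH_COORDS");
            oneShot.fetch("LHCD.LH_COORDS");
        }
        System.out.println("persistent connects: " + persistent.connectCalls());
        System.out.println("one-shot connects:   " + oneShot.connectCalls());
    }
}
```

With five requests, the persistent session in this sketch connects once and the one-shot session connects five times, which is the reduction in connect / disconnect churn that the suggestion is after.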

@mwinkel-dev

Restored the finalize() method but flagged it as deprecated (i.e., it will be removed in a future Java release).

Note that when a Java process ends, the garbage collector is usually not run; instead, the operating system cleans up all the system resources for the process. The Java garbage collector thus typically runs when the process is under memory pressure. The Java System.gc() method can be used to suggest that the JVM do garbage collection, but there is no guarantee that it will do so.

Thus, it is difficult to create a test case that proves that the MDSplus Connection object's finalize() method has actually been called.

Nonetheless, this revised version of this PR works fine in manual testing. It compiles OK, immediately disconnects when the mdsdisconnect() method is called, and does not leave dangling mdsip processes on the server.
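
The general shape of "explicit disconnect plus a deprecated finalize() as a fallback" might look like the following. This is a sketch under assumptions, not the code merged in this PR: `SketchConnection` and its members are hypothetical, and the socket teardown is replaced by a counter so the behavior can be checked without a server.

```java
// Hypothetical class; the real MDSplus Connection class is not shown here.
class SketchConnection {
    private boolean open = true;
    private int teardownCount = 0;

    // Called directly by the application, e.g. from an mdsdisconnect
    // wrapper. Idempotent: a second call does nothing.
    public synchronized void disconnect() {
        if (open) {
            open = false;
            teardownCount++; // stands in for closing the socket
        }
    }

    public synchronized boolean isOpen() {
        return open;
    }

    public synchronized int teardownCount() {
        return teardownCount;
    }

    // Fallback only: runs if and when the garbage collector finalizes an
    // object that was never disconnected explicitly. The annotations are
    // intended to mark the method as deprecated and quiet compiler warnings.
    @Deprecated
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() throws Throwable {
        try {
            disconnect();
        } finally {
            super.finalize();
        }
    }
}

class SketchConnectionDemo {
    public static void main(String[] args) {
        SketchConnection c = new SketchConnection();
        c.disconnect();
        c.disconnect(); // no effect the second time
        System.out.println(c.isOpen() + " " + c.teardownCount()); // prints "false 1"
    }
}
```

Because disconnect() is idempotent, it does not matter whether the finalizer later runs on an already-disconnected object; the teardown happens exactly once, at the moment the application asks for it.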

@mwinkel-dev merged commit 718794d into MDSplus:alpha on Jul 22, 2025
1 check passed