Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-72371] rework node monitor configuration #8719

Merged
merged 4 commits into from
Dec 18, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 21 additions & 3 deletions core/src/main/java/hudson/model/ComputerSet.java
Original file line number Diff line number Diff line change
Expand Up @@ -352,15 +352,33 @@
BulkChange bc = new BulkChange(MONITORS_OWNER);
try {
Jenkins.get().checkPermission(Jenkins.MANAGE);
monitors.rebuild(req, req.getSubmittedForm(), getNodeMonitorDescriptors());
JSONObject json = req.getSubmittedForm();
monitors.rebuild(req, json, getNodeMonitorDescriptors());

// add in the rest of instances are ignored instances
for (Descriptor<NodeMonitor> d : NodeMonitor.all())
if (monitors.get(d) == null) {
for (Descriptor<NodeMonitor> d : NodeMonitor.all()) {
NodeMonitor monitor = monitors.get(d);
if (monitor == null) {

Check warning on line 361 in core/src/main/java/hudson/model/ComputerSet.java

View check run for this annotation

ci.jenkins.io / Code Coverage

Partially covered line

Line 361 is only partially covered, one branch is missing
NodeMonitor i = createDefaultInstance(d, true);
if (i != null)
monitors.add(i);
} else {

Check warning on line 365 in core/src/main/java/hudson/model/ComputerSet.java

View check run for this annotation

ci.jenkins.io / Code Coverage

Not covered line

Line 365 is not covered by tests
/*
* Some monitors in plugins do not have a DataBoundConstructor
* but a Descriptor that overrides newInstance. For those the ignored
* field is not set, so we have to explicitly set it.
*/
String name = d.getJsonSafeClassName();
JSONObject o = json.optJSONObject(name);
boolean ignored = true;
if (o != null) {

Check warning on line 374 in core/src/main/java/hudson/model/ComputerSet.java

View check run for this annotation

ci.jenkins.io / Code Coverage

Partially covered line

Line 374 is only partially covered, one branch is missing
if (o.containsKey("ignored")) {

Check warning on line 375 in core/src/main/java/hudson/model/ComputerSet.java

View check run for this annotation

ci.jenkins.io / Code Coverage

Partially covered line

Line 375 is only partially covered, one branch is missing
ignored = o.getBoolean("ignored");
}
}
monitor.setIgnored(ignored);
}
}

// recompute the data
for (NodeMonitor nm : monitors) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,25 @@ protected AbstractNodeMonitorDescriptor() {
this(PERIOD);
}

/**
* Indicates if this monitor is capable to take agents offline in case it detects a problem.
* If true, this will enable the configuration option to ignore the monitor.
* Defaults to {@code true} so plugins that do not override this method behave as before.
* Plugins that do implement a monitor that will not take agents offline should override this
* method and return false.
*
* @return true if this monitor might take agents offline
* @since TODO
*/
public boolean canTakeOffline() {
return true;
}

@Override
public String getConfigPage() {
return getViewPage(clazz, "configure.jelly");
}

/**
* @deprecated as of 1.522
* Extend from {@link AbstractAsyncNodeMonitorDescriptor}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,10 +30,8 @@
import hudson.remoting.Callable;
import java.io.IOException;
import jenkins.security.MasterToSlaveCallable;
import net.sf.json.JSONObject;
import org.jenkinsci.Symbol;
import org.kohsuke.stapler.DataBoundConstructor;
import org.kohsuke.stapler.StaplerRequest;

/**
* Discovers the architecture of the system to display in the agent list page.
Expand All @@ -60,8 +58,8 @@ public String getDisplayName() {
}

@Override
public NodeMonitor newInstance(StaplerRequest req, JSONObject formData) throws FormException {
return new ArchitectureMonitor();
public boolean canTakeOffline() {
return false;
}
}

Expand Down
12 changes: 5 additions & 7 deletions core/src/main/java/hudson/node_monitors/ClockMonitor.java
Original file line number Diff line number Diff line change
Expand Up @@ -32,12 +32,10 @@
import hudson.remoting.Callable;
import hudson.util.ClockDifference;
import java.io.IOException;
import net.sf.json.JSONObject;
import org.jenkinsci.Symbol;
import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.NoExternalUse;
import org.kohsuke.stapler.DataBoundConstructor;
import org.kohsuke.stapler.StaplerRequest;

/**
* {@link NodeMonitor} that checks clock of {@link Node} to
Expand Down Expand Up @@ -72,6 +70,11 @@ public DescriptorImpl() {
DESCRIPTOR = this;
}

@Override
public boolean canTakeOffline() {
return false;
}

@Override
protected Callable<ClockDifference, IOException> createCallable(Computer c) {
Node n = c.getNode();
Expand All @@ -84,10 +87,5 @@ protected Callable<ClockDifference, IOException> createCallable(Computer c) {
public String getDisplayName() {
return Messages.ClockMonitor_DisplayName();
}

@Override
public NodeMonitor newInstance(StaplerRequest req, JSONObject formData) throws FormException {
return new ClockMonitor();
}
}
}
2 changes: 2 additions & 0 deletions core/src/main/java/hudson/node_monitors/NodeMonitor.java
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@
import hudson.util.DescriptorList;
import java.util.List;
import jenkins.model.Jenkins;
import org.kohsuke.stapler.DataBoundSetter;
import org.kohsuke.stapler.export.Exported;
import org.kohsuke.stapler.export.ExportedBean;

Expand Down Expand Up @@ -133,6 +134,7 @@ public boolean isIgnored() {
return ignored;
}

@DataBoundSetter
public void setIgnored(boolean ignored) {
this.ignored = ignored;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,10 +34,8 @@
import java.util.Map;
import java.util.logging.Logger;
import jenkins.security.MasterToSlaveCallable;
import net.sf.json.JSONObject;
import org.jenkinsci.Symbol;
import org.kohsuke.stapler.DataBoundConstructor;
import org.kohsuke.stapler.StaplerRequest;
import org.kohsuke.stapler.export.Exported;
import org.kohsuke.stapler.export.ExportedBean;

Expand Down Expand Up @@ -103,11 +101,6 @@ protected Map<Computer, Data> monitor() throws InterruptedException {
public String getDisplayName() {
return Messages.ResponseTimeMonitor_DisplayName();
}

@Override
public NodeMonitor newInstance(StaplerRequest req, JSONObject formData) throws FormException {
return new ResponseTimeMonitor();
}
}

private static final class Step1 extends MasterToSlaveCallable<Data, IOException> {
Expand Down
12 changes: 5 additions & 7 deletions core/src/main/java/hudson/node_monitors/SwapSpaceMonitor.java
Original file line number Diff line number Diff line change
Expand Up @@ -33,12 +33,10 @@
import java.io.IOException;
import jenkins.model.Jenkins;
import jenkins.security.MasterToSlaveCallable;
import net.sf.json.JSONObject;
import org.jenkinsci.Symbol;
import org.jvnet.hudson.MemoryMonitor;
import org.jvnet.hudson.MemoryUsage;
import org.kohsuke.stapler.DataBoundConstructor;
import org.kohsuke.stapler.StaplerRequest;
import org.kohsuke.stapler.export.Exported;
import org.kohsuke.stapler.export.ExportedBean;

Expand Down Expand Up @@ -103,6 +101,11 @@ public DescriptorImpl() {
DESCRIPTOR = this;
}

@Override
public boolean canTakeOffline() {
return false;
}

@Override
protected MonitorTask createCallable(Computer c) {
return new MonitorTask();
Expand All @@ -113,11 +116,6 @@ protected MonitorTask createCallable(Computer c) {
public String getDisplayName() {
return Messages.SwapSpaceMonitor_DisplayName();
}

@Override
public NodeMonitor newInstance(StaplerRequest req, JSONObject formData) throws FormException {
return new SwapSpaceMonitor();
}
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,8 @@ THE SOFTWARE.
<f:form method="post" action="configSubmit" name="config">
<f:descriptorList title="${%Preventive Node Monitoring}"
descriptors="${it.nodeMonitorDescriptors}"
instances="${it.nonIgnoredMonitors}" />
instances="${it.monitors}"
forceRowSet="true"/>

<l:hasAdministerOrManage>
<f:bottomButtonBar>
Expand Down
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
description=Jenkins monitors each attached node for disk space, free temp space, free swap, clock time/sync, and response time. Nodes will be taken offline if any of these values go outside of the configured threshold.
description=Jenkins monitors each attached node for various metrics like free disk space, free temp space, free swap, clock time/sync, response time and others provided by plugins. \
Some of these monitors are able to mark nodes temporarily offline, when the values go outside the configured thresholds.
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<div>
This monitor just shows the architecture of the agent for your information. It
never marks the agent offline.
never marks an agent offline.
</div>
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
systems, version control activities and distributed file access (such as NFS,
Windows file shares) tend to have strange problems when systems involved have
different clocks.
<br />
It never marks an agent offline.

<p>To keep clocks in sync, refer to NTP.</p>
</div>
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
<?jelly escape-by-default='true'?>
<j:jelly xmlns:j="jelly:core" xmlns:st="jelly:stapler" xmlns:f="/lib/form">
<j:set var="help" value="${descriptor.helpFile}"/>
<div class="jenkins-form-label help-sibling tr">
${descriptor.displayName}
<f:helpLink url="${help}" featureName="${descriptor.displayName}"/>
</div>
<f:helpArea />
<div class="optionalBlock-container jenkins-form-item">
<div class="form-container">
<j:if test="${descriptor.canTakeOffline()}">
<f:entry title="${%Don't mark agents temporarily offline}" field="ignored">
<f:checkbox checked="${instance.ignored}" />
</f:entry>
</j:if>
<j:if test="${!descriptor.canTakeOffline()}">
<f:invisibleEntry>
<f:checkbox name="ignored" checked="false" />
</f:invisibleEntry>
</j:if>
<st:include page="config.jelly" class="${descriptor.clazz}" optional="true"/>
</div>
</div>
</j:jelly>
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
<div>
When checked the agent will
<i>not</i>
be marked temporarily offline in case the monitor detects a problem. But the
monitor will still run to collect data and the corresponding field will be
highlighted in case of a problem.
</div>
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
<div>
This monitors the round trip network response time from the controller to the
agent, and if it goes above a threshold repeatedly, it marks the agent
offline.
agent, and if it goes above a threshold repeatedly, it disconnects the agent.

<p>
This is useful for detecting unresponsive agents, or other network problems
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
<div>
This monitors the available virtual memory (also known as
<code>swap space</code>
).
<p>The exact definition of swap space is platform dependent.</p>
<ul>
<li>
On Windows this is the available space of the page file. As Windows can
automatically increase the page file size, this value doesn't mean much.
</li>
<li>
On Linux this value is retrieved from
<code>/proc/meminfo</code>
.
</li>
<li>
For other Unix systems, the value is obtained from the
<code>top</code>
command.
</li>
</ul>
</div>
Loading