HDDS-7248. Recon: Expand the container status page to show all unhealthy container states #3837

smitajoshi12 · 2022-10-14T12:16:16Z

What changes were proposed in this pull request?

To show Under Replicated ,Over Replicated,Mis Replicated and All Replicas UnHealthy from Overview and Left Side Panel.
(Please fill in changes proposed in this fix)

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7248

How was this patch tested?

Manually

…y container states

…hy ACOLUMNS Defined

kerneltime · 2022-10-17T16:09:06Z

cc @dombizita

dombizita

Thank you for the patch @smitajoshi12, this will be a great new feature added to Recon. Currently the ALL_REPLICAS_UNHEALTHY is not populated correctly on the Recon side, so we shouldn't add things related to that as it won't work correctly. I think we should refactor the missing containers part, so we would use that in the same pattern as we are using other unhealthy containers right now (eg. same path and responses). Also we can rename the file from missingContainers.tsx to unhealthyContainer.tsx. Let me know what you think about these two ideas @smengcl

hadoop-ozone/recon/src/main/resources/webapps/recon/ozone-recon-web/api/db.json

hadoop-ozone/recon/src/main/resources/webapps/recon/ozone-recon-web/api/routes.json

dombizita · 2022-10-17T22:05:02Z

...in/resources/webapps/recon/ozone-recon-web/src/views/missingContainers/missingContainers.tsx

  }
 ];

+const UCOLUMNS = [


I checked and the UCOLUMNS, OCOLUMNS, MCOLUMNS and ACOLUMNS are exactly the same. Why do we have it copied 4 times?
If we want to handle the MISSING state the same as the others we could have only the COLUMNS (starts on line 97) and adjust it to the parameters.

I have used another REPLICATEDCOLUMNS as one columns differs Unhealthy Since.

We can use the Unhealthy Since column for the MISSING state as well.

Done Code Changes

...in/resources/webapps/recon/ozone-recon-web/src/views/missingContainers/missingContainers.tsx

…thy container Review Changes

hadoop-ozone/recon/src/main/resources/webapps/recon/ozone-recon-web/api/routes.json

...zone/recon/src/main/resources/webapps/recon/ozone-recon-web/src/components/navBar/navBar.tsx

...in/resources/webapps/recon/ozone-recon-web/src/views/missingContainers/missingContainers.tsx

smengcl

Thanks @smitajoshi12 for the patch. Overall looks good. I have posted some comments above.

…y container states Review Changes

Change-Id: I9eff56d24bce5884cbc77cca67e840b6330aa546

smengcl · 2022-10-19T16:32:48Z

...in/resources/webapps/recon/ozone-recon-web/src/views/missingContainers/missingContainers.tsx

+    axios.all([
+      axios.get('/api/v1/containers/unhealthy/MISSING'),
+      axios.get('/api/v1/containers/unhealthy/UNDER_REPLICATED'),
+      axios.get('/api/v1/containers/unhealthy/OVER_REPLICATED'),
+      axios.get('/api/v1/containers/unhealthy/MIS_REPLICATED')


Let's try to get all unhealthy container information at once with:

Suggested change

axios.all([

axios.get('/api/v1/containers/unhealthy/MISSING'),

axios.get('/api/v1/containers/unhealthy/UNDER_REPLICATED'),

axios.get('/api/v1/containers/unhealthy/OVER_REPLICATED'),

axios.get('/api/v1/containers/unhealthy/MIS_REPLICATED')

axios.all([

axios.get('/api/v1/containers/unhealthy')

then change the logic accordingly. Hopefully this will be a small change.

Getting this Response.

{"missingCount":0,"underReplicatedCount":0,"overReplicatedCount":0,"misReplicatedCount":0,"containers":[]} but not sure How will be Containers data and states so Can we change this things in next JIRA I am not sure about changes and logic unless I get proper data from cluster.

In what kind of cluster did you get this response? Did you change everything accordingly to this change? If you don't pass any state as a parameter to the endpoint it will return all the containers (javadoc on the @Path("/unhealthy/{state}"): Passing null returns all containers.) . Based on the javadoc on the @Path("/unhealthy"):

* Return * {@link org.apache.hadoop.ozone.recon.api.types.UnhealthyContainerMetadata} * for all unhealthy containers.

I am fine if we ask every states endpoint individually for now, but please change the MISSING state ask accordingly to the others (/api/v1/containers/unhealthy/MISSING).

Can we change this things in next JIRA I am not sure about changes and logic unless I get proper data from cluster.

Alright. Let's move the refactoring work to a follow-up JIRA.

smengcl · 2022-10-19T16:33:27Z

...in/resources/webapps/recon/ozone-recon-web/src/views/missingContainers/missingContainers.tsx

-      const missingContainers: IMissingContainerResponse[] = missingContainersResponse.containers;
+
+    axios.all([
+      axios.get('/api/v1/containers/unhealthy/MISSING'),


TODO: Try with docker compose cluster response.

Tried with Docker Cluster Response but not getting actual response.

http://localhost:9888/api/v1/containers/unhealthy/MISSING

{"missingCount":0,"underReplicatedCount":1,"overReplicatedCount":0,"misReplicatedCount":0,"containers":[]}

What kind of response did you expect from the endpoint? What is the state of the cluster? If we don't have missing containers in the cluster this is a good response, this is the scheme we are looking for.

Change-Id: I3488d231c80d42e15a469d10df414a6f59dfb841

…hy container states Review Comments

smengcl

+1

@smitajoshi12 Please open a new JIRA for the refactoring (that combines 4 axios gets into one. ref)

* master: (718 commits) HDDS-7342. Move encryption-related code from MultipartCryptoKeyInputStream to OzoneCryptoInputStream (apache#3852) HDDS-7413. Fix logging while marking container state unhealthy (apache#3887) Revert "HDDS-7253. Fix exception when '/' in key name (apache#3774)" HDDS-7396. Force close non-RATIS containers in ReplicationManager (apache#3877) HDDS-7121. Support namespace summaries (du, dist & counts) for legacy FS buckets (apache#3746) HDDS-7258. Cleanup the allocated but uncommitted blocks (apache#3778) HDDS-7381. Cleanup of VolumeManagerImpl (apache#3873) HDDS-7253. Fix exception when '/' in key name (apache#3774) HDDS-7182. Add property to control RocksDB max open files (apache#3843) HDDS-7284. JVM crash for rocksdb for read/write after close (apache#3801) HDDS-7368. [Multi-Tenant] Add Volume Existence check in preExecute for OMTenantCreateRequest (apache#3869) HDDS-7403. README Security Improvement (apache#3879) HDDS-7199. Implement new mix workload Read/Write Freon command (apache#3872) HDDS-7248. Recon: Expand the container status page to show all unhealthy container states (apache#3837) HDDS-7141. Recon: Improve Disk Usage Page (apache#3789) HDDS-7369. Fix wrong order of command arguments in Nonrolling-Upgrade.md (apache#3866) HDDS-6210. EC: Add EC metrics (apache#3851) HDDS-7355. non-primordial scm fail to get signed cert from primordial SCM when converting an unsecure cluster to secure (apache#3859) HDDS-7356. Update SCM-HA.zh.md to match the English version (apache#3861) HDDS-6930. SCM,OM,RECON should not print ERROR and exit with code 1 on successful shutdown (apache#3848) ... Conflicts: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/LegacyReplicationManager.java hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestLegacyReplicationManager.java

HDDS-7248 Recon Expand the container status page to show all unhealth…

1ced653

…y container states

smitajoshi12 changed the title ~~HDDS-7248 Recon Expand the container status page to show all unhealth…~~ HDDS-7248. Recon Expand the container status page to show all unhealth… Oct 17, 2022

HDDS-7248. Recon Expand the container status page to show all unhealt…

a07bd96

…hy ACOLUMNS Defined

smitajoshi12 changed the title ~~HDDS-7248. Recon Expand the container status page to show all unhealth…~~ HDDS-7248. Recon: Expand the container status page to show all unhealthy container states Oct 17, 2022

dombizita requested changes Oct 17, 2022

View reviewed changes

HDDS-7248. Recon: Expand the container status page to show all unheal…

4080679

…thy container Review Changes