Update documentation of Batch data plane swagger. #2885
Automation for azure-libraries-for-java: Nothing to generate for azure-libraries-for-java
@matthchr @darylmsft @dlepow Please review the doc change.
Automation for azure-sdk-for-python: The initial PR has been merged into your service PR:
Automation for azure-sdk-for-node: The initial PR has been merged into your service PR:
Automation for azure-sdk-for-go: The initial PR has been merged into your service PR:
"and be": replace by "and will be" or "and"
We don't use PaaS/IaaS in public documentation.
Can you say this instead?
On CloudServiceConfiguration pools, this user is logged in with the INTERACTIVE flag. On Windows VirtualMachineConfiguration pools, this user is logged in with the BATCH flag.
We DO use PaaS/IaaS in the documentation, specifically for cloudServiceConfiguration/virtualMachineConfiguration.
"All tasks should be idempotent, it means..." replace by "All tasks should be idempotent. This means tasks need to tolerate being... etc".
I don't think the email on this text and/or placement has reached a conclusion yet.
Remove the comma in "... being interrupted and restarted,without causing any corruption" #nitpick
My attempt:
"Note that this value specifically controls the number of retries for the task executable due to nonzero exit code. The Batch service will try the task once, and may then retry up to this limit. For example, if the maximum retry count is 3, Batch tries the task up to 4 times (one initial try and 3 retries). If the maximum retry count is 0, the Batch service does not retry the task after the first attempt. If the maximum retry count is -1, the Batch service retries the task without limit. Resource files and application packages are only downloaded again if the task is retried on a new compute node. Batch will also retry tasks when a recovery operation is triggered on a compute node. Examples of recovery operations include (but are not limited to) when an unhealthy compute node is rebooted or a compute node disappeared due to host failure. Retries due to recovery operations are independent of and are not counted against the maxTaskRetryCount. Even if the maxTaskRetryCount is 0, an internal retry due to a recovery operation may occur. Because of this, all tasks should be idempotent. This means tasks need to tolerate being interrupted and restarted without causing any corruption or duplicate data. Best practices recommended for long running tasks is to use checkpointing."
I think some/all of this text (specifically this bit: "Batch will also retry tasks when a recovery operation is triggered on a compute node. Examples of recovery operations include (but are not limited to) when an unhealthy compute node is rebooted or a compute node disappeared due to host failure. Retries due to recovery operations are independent of and are not counted against the maxTaskRetryCount. Even if the maxTaskRetryCount is 0, an internal retry due to a recovery operation may occur. Because of this, all tasks should be idempotent. This means tasks need to tolerate being interrupted and restarted without causing any corruption or duplicate data. Best practices recommended for long running tasks is to use checkpointing.", which is about internal retries) should be moved to the root of the task object rather than buried down here on maxTaskRetryCount, which is actually about user retries, not system/internal ones.
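To make the proposed wording concrete, here is a minimal sketch of the retry arithmetic it describes. The function name `max_attempts` is hypothetical and not part of any Batch SDK, and rejecting values below -1 is an assumption of this sketch rather than something stated in the thread. It only counts attempts caused by a nonzero exit code; internal retries from recovery operations are not counted, as the proposed text says.

```python
import math


def max_attempts(max_task_retry_count: int) -> float:
    """Upper bound on task executions caused by nonzero exit codes.

    Mirrors the proposed description: one initial try plus up to
    max_task_retry_count retries, with -1 meaning "retry without limit".
    Retries triggered by recovery operations are NOT included here.
    """
    if max_task_retry_count < -1:
        # Assumption for this sketch only: treat other negatives as invalid.
        raise ValueError("max_task_retry_count must be -1 or greater")
    if max_task_retry_count == -1:
        return math.inf
    return 1 + max_task_retry_count
```

With a retry count of 3 this gives 4 attempts, and with 0 it gives exactly 1, matching the examples in the proposed text.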
Are we able to include new lines in the description documentation? The internal retries topic is really a separate concept from the user retries, and it would be helpful to make a clear distinction when we branch off into it to prevent confusion.
I agree with Matt's point that idempotent tasks should be addressed closer to the root as well because not everyone will care enough to read into retry policy. Generally when reading documentation people will start at the base and then delve deeper into the options as needed.
Did you already ship client SDK with the current api version? Will they still work without this parameter being sent to server?
@hovsepm This is the server response, not the request.
@xingwu1 why do you mark server response fields as required then? IMHO they should be marked as "readOnly": true to prevent the client from modifying them. A required field means that the parameter should be present in the request, not the response - https://swagger.io/docs/specification/describing-parameters/#required-and-optional-parameters-7
readOnly and required can't co-exist on the same field. However, without required, all the int properties are generated as Nullable, which makes them difficult for users to work with. So for now we will keep the required fields, and we will reconsider readOnly later.
I'm not sure that "Tasks should be idempotent" is the most important information in this description; perhaps move it to the end? (Same comment also applies elsewhere.)
Avoid this style of "reference" since it doesn't look nice in any generated language. I think we need to think of a better way to do this.
This comment applies down the line.
To all reviewers: please explicitly approve the PR so that I can merge it.
Hi there, I am the AutoRest Linter Azure bot. I am here to help. My task is to analyze the situation from the AutoRest linter perspective. Please review the below analysis result: File: AutoRest Linter Guidelines | AutoRest Linter Issues | Send feedback. Thanks for your co-operation.
Waiting for all reviewers to sign off.
Due to a nonzero exit code (missing the "a")
If the command line refers to file paths, it should use a relative path (relative to the task working directory), or use the Batch provided environment variables (https://docs.microsoft.com/en-us/azure/batch/batch-compute-node-environment-variables)
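As a rough illustration of that guidance, a task script might resolve its file paths against the task working directory instead of hard-coding absolute paths. This is only a sketch: the helper name `resolve_task_path` is hypothetical, and falling back to the current directory when `AZ_BATCH_TASK_WORKING_DIR` is unset (for example, when running locally) is an assumption made here for convenience.

```python
import os
from pathlib import Path


def resolve_task_path(relative_path: str) -> Path:
    """Resolve a path relative to the Batch task working directory.

    AZ_BATCH_TASK_WORKING_DIR is one of the environment variables Batch
    sets on compute nodes. The fallback to the current directory is an
    assumption of this sketch so the script also runs outside Batch.
    """
    base = os.environ.get("AZ_BATCH_TASK_WORKING_DIR", os.getcwd())
    return Path(base) / relative_path
```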
This applies to all copies of this description, since there are a few
At the end:
"The best practice for long running tasks is to use some form of checkpointing."
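For readers unsure what "some form of checkpointing" could look like, here is one possible shape, offered as a sketch rather than a recommended implementation. All names (`load_checkpoint`, `save_checkpoint`, `run_task`) and the JSON checkpoint file format are hypothetical. The task records the index of the next unprocessed item after each step, so a restarted run resumes where the previous one stopped.

```python
import json
import os


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process (0 if no checkpoint)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint via a temp file and an atomic rename."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)  # avoids leaving a half-written checkpoint


def run_task(items, checkpoint_path, process) -> None:
    """Process items in order, resuming from the last saved checkpoint."""
    start = load_checkpoint(checkpoint_path)
    for i in range(start, len(items)):
        process(items[i])
        save_checkpoint(checkpoint_path, i + 1)
```

Note that an interruption between `process` and `save_checkpoint` causes that one item to be processed again on restart, so `process` itself still needs to be idempotent. Checkpointing limits how much work is repeated; it does not remove the idempotency requirement.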
@hovsepm you can merge it
This checklist is used to make sure that common issues in a pull request are addressed. This will expedite the process of getting your pull request merged and avoid extra work on your part to fix issues discovered during the review process.
PR information
api-version in the path should match the api-version in the spec).
Quality of Swagger