General payu docs for Run a Model section #1107
ccarouge wants to merge 18 commits into development
anton-seaice
left a comment
I started reviewing, and have now realised that this is a Draft. Let me know when you want a review.
jo-basevi
left a comment
Sorry for the late initial review! I've just added a couple comments so far. Overall it seems to read well to me
|
@jo-basevi I have incorporated your feedback, can you please have a look and see if everything is resolved? |
|
Sorry to chime in on this. I haven't followed the whole discussion and haven't reviewed this page yet, but I think it's great that discussions are happening and this page is getting a lot of input! I just wanted to give my opinion on the "trial and error" logic here.
I personally don't think this should be our main goal. While it's true that we can't explain everything and some trial and error is inevitable (that's probably how everyone learns in the first place), I think we should aim to make the information on our documentation pages as clear as possible, rather than relying on users to figure things out through trial and error. |
@aidanheerdegen Reading the consultation, it means we don't yet have a logo, but it doesn't mean we can't design one. Or am I misunderstanding your conversation with Marshall? |
Yes you are correct. I think he would be happy for us to make something. |
|
@anton-seaice I've redesigned this page. I'm putting just you as a reviewer for now to see if you think it's better and worth bothering others for review. I'll also let you mark your comments as resolved if you think I've addressed them appropriately. |
anton-seaice
left a comment
This looks great - thanks @ccarouge . From a first look I find it much clearer.
docs/models/run_a_model/payu.md
Outdated
| There is also [technical documentation](https://payu.readthedocs.io/en/latest/) for how to configure _payu_. | ||
| {: class="example-img" loading="lazy"} | ||
|
|
||
| This design was chosen to separate the small files that define the configuration and the larger binary output and input files needed for a realisation of a configuration. This ensures the configuration definition is easy to back up and share. It also optimises the use of different filesystems on high-performance computers. Finally, this layout ensures several experiments that share common executables and input data to be run simultaneously. |
There was a problem hiding this comment.
I think I would put this paragraph before the diagram
docs/models/run_a_model/payu.md
Outdated
| - The _control_ directory contains the model configuration and is where the model is run from. | ||
| - The _laboratory_ directory contains all data from _payu_ experiments using a same model. By default, it is `/scratch/$PROJECT/$USER/<model_name>`. `$PROJECT` and `$USER` are environment variables on _Gadi_ that points to your default project and your username respectively. See the section on [modifying the PBS resources](#modify-pbs-resources) to learn how to change the _laboratory_ location. | ||
|
|
||
| On _Gadi_, the _control_ directory can be in your `$HOME` directory (as it is the only filesystem actively backed-up on _Gadi_). The quotas for `$HOME` are low and strict, which limits what can be stored there, so it is not suitable for larger files. |
There was a problem hiding this comment.
| On _Gadi_, the _control_ directory can be in your `$HOME` directory (as it is the only filesystem actively backed-up on _Gadi_). The quotas for `$HOME` are low and strict, which limits what can be stored there, so it is not suitable for larger files. | |
| On _Gadi_, the _control_ directory can be in your `$HOME` directory (as it is the only filesystem actively backed-up on _Gadi_). The _control_ directory only contains text files and symlinks, and therefore fits easily within the 10GB limit placed on home directories. The _laboratory_ directory is on `/scratch` where there is lots more space available for large model output. |
docs/models/run_a_model/payu.md
Outdated
| The `archive` and `work` directories for an experiment are most easily accessed through the symbolic links created in the _control_ directory. | ||
|
|
||
| !!! warning | ||
| Files on the `/scratch` drive, such as the _laboratory_ directory, might get deleted if not accessed for several days and the `/scratch` drive is limited in space. For these reasons, all model runs which are to be kept should be moved to `/g/data/` by enabling the `sync` step in _payu_. To know more refer to [Syncing output data](#syncing-output-data-to-long-term-storage). |
There was a problem hiding this comment.
| Files on the `/scratch` drive, such as the _laboratory_ directory, might get deleted if not accessed for several days and the `/scratch` drive is limited in space. For these reasons, all model runs which are to be kept should be moved to `/g/data/` by enabling the `sync` step in _payu_. To know more refer to [Syncing output data](#syncing-output-data-to-long-term-storage). | |
| Files on the `/scratch` drive, such as the _laboratory_ directory, might get deleted if not accessed for several days. All experiments which are to be kept should be moved to `/g/data/` by enabling the `sync` step in _payu_. To know more refer to [Syncing output data](#syncing-output-data-to-long-term-storage). |
There was a problem hiding this comment.
might get deleted
might be deleted
docs/models/run_a_model/payu.md
Outdated
| - [modifying a _payu_-based configuration for the most commonly customised aspects](#edit-a-payu-configuration) | ||
|
|
||
| !!! info | ||
| This page is to be used in conjunction with the [Run a Model][Run a Model] page for the chosen configuration. The Run a Model page will give information specific to that model (for example, additional requirements or configuration names and locations) as well as any information on any configurations customisation that is particular to that model. |
There was a problem hiding this comment.
| This page is to be used in conjunction with the [Run a Model][Run a Model] page for the chosen configuration. The Run a Model page will give information specific to that model (for example, additional requirements or configuration names and locations) as well as any information on any configurations customisation that is particular to that model. | |
| This page is to be used in conjunction with the [Run a Model][Run a Model] page for the chosen configuration. The Run a Model page will give information specific to that model (for example, additional information on configuration names and locations) as well as information on configuration customisations that are particular to that model. |
docs/models/run_a_model/payu.md
Outdated
| - identify the `<repository>` and `<branch>` name the configuration is stored under on GitHub. See the information on the [Run a Model][Run a Model] page of your chosen model for this step. | ||
| - decide where on Gadi to store all your _payu_ experiments, `<configurations-directory>`, typically a folder under $HOME. This directory must exist before running _payu_. | ||
| - decide on a name for your experiment, `<experiment-name>`. It is recommended to choose a descriptive name. | ||
| - decide on a directory name to store the experiment, `<control-directory>` (created by _payu_). The `control` directory is a git repository. Experiments are saved as branches in this repository, making it possible to use the same `control` directory for several experiments. For this reason, we recommend to always set the `<experiment-name>`. For more information refer to this [payu tutorial](https://forum.access-hive.org.au/t/access-om2-payu-tutorial/1750#select-experiment-12). |
There was a problem hiding this comment.
I think this means we recommend setting the branch name.
The experiment name is constructed by payu, as described in the link (https://forum.access-hive.org.au/t/access-om2-payu-tutorial/1750#p-5861-experiment-naming-6)
docs/models/run_a_model/payu.md
Outdated
|
|
||
| - `<repository>` and `<branch>`: base your experiment off the branch, `release-1deg_jra55_ryf`, from the repository, `https://github.com/ACCESS-NRI/access-om2-configs` | ||
| - `<configurations-directory>`: store the all your ACCESS-OM2 configurations under `~/access-om2/` | ||
| - `<experiment-name>`: name your experiment `diffuse_test1-1deg_jra55_ryf` |
There was a problem hiding this comment.
this should be branch-name I think
|
@chrisb13 @aidanheerdegen @jo-basevi @atteggiani , this is the payu docs page for the ACCESS-Hive docs. @anton-seaice has provided quite a few reviews now and I think we have a promising version, ready for more eyes. Please have a read. If you want to pass on reviewing, that's ok with me, just let us know. |
|
Thank you all for the massive work! |
aidanheerdegen
left a comment
I've suggested quite a bit ... hope it is helpful.
| <!-- Diagram created from Lucid chart: https://lucid.app/users/login#/login --> | ||
| <!-- It can be edited by any Lucid's member (free account), at this link: https://lucid.app/lucidchart/ccebf957-8915-4344-a832-426427451c00/edit?viewport_loc=-159%2C129%2C2067%2C1113%2C0_0&invitationId=inv_1c8cccfd-b20e-4b2f-977a-a74b0b8355ae --> | ||
|
|
||
| {: class="example-img" loading="lazy"} |
There was a problem hiding this comment.
Is there a way to indicate the work directory isn't always present? Also the archive directory is only present once payu is run.
|
|
||
| #### Submodels {: .no-toc } | ||
|
|
||
| Coupled models deploy multiple submodels, a.k.a. the model components. |
There was a problem hiding this comment.
| Coupled models deploy multiple submodels, a.k.a. the model components. | |
| Coupled models typically deploy multiple submodels, a.k.a. the model components. |
|
|
||
| Coupled models deploy multiple submodels, a.k.a. the model components. | ||
|
|
||
| This section of the _payu_ configuration file specifies the submodels, the configuration options required to execute the model correctly and the location of all inputs required for this submodel. |
There was a problem hiding this comment.
| This section of the _payu_ configuration file specifies the submodels, the configuration options required to execute the model correctly and the location of all inputs required for this submodel. | |
| This section of the _payu_ configuration file specifies the submodels, the configuration options required to execute the model component correctly and the location of all inputs required for this submodel. |
| ```yaml | ||
| runlog: true | ||
| ``` | ||
| When running a new configuration, _payu_ automatically commits changes with _git_ if `runlog` is set to `true`. |
There was a problem hiding this comment.
| When running a new configuration, _payu_ automatically commits changes with _git_ if `runlog` is set to `true`. | |
| When running an experiment, if `runlog` is set to `true`, _payu_ saves a history of the experiment. It does this using _git_, by automatically committing changes to the control directory repository. |
| A dictionary to run scripts or subcommands at various stages of a _payu_ submission: | ||
|
|
||
| - `error` gets called if the model does not run correctly and returns an error code. | ||
| - `run` gets called after the model successful execution, but prior to model output archive. |
There was a problem hiding this comment.
| - `run` gets called after the model successful execution, but prior to model output archive. | |
| - `run` gets called after the model successful execution, but prior to archiving model output. |
| Each of the model components contains additional configuration options that are read in when the model component is running.<br> | ||
| These options are typically useful to modify the physics used in the model, the input data, or the model variables saved in the output files. | ||
|
|
||
| These configuration options are specified in files located inside a subfolder of the _control_ directory, named according to the submodel's `name` specified in the `config.yaml` `submodels` section (e.g., configuration options for the _ocean_ component are in the `ocean` sub-directory).<br> | ||
| To modify these options please refer to the configurations documentation of the respective model component, found on the [Run a Model][Run a Model] page for your chosen model. |
There was a problem hiding this comment.
I think this is important information that is buried here and not really mentioned anywhere else. I'd advocate moving or copying this up to the structure bit at the start.
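To make the naming link concrete, here is a minimal sketch of how a submodel `name` in `config.yaml` would map to a sub-directory of the _control_ directory. Only the `ocean` name comes from the text above; the `model` and `input` values are illustrative assumptions, not taken from a real configuration.

```yaml
# config.yaml (sketch): one entry per model component.
submodels:
    - name: ocean                      # options for this component live in <control-directory>/ocean/
      model: mom                       # assumed payu model driver, for illustration only
      input: /path/to/ocean/inputs     # hypothetical input location
```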
atteggiani
left a comment
Thank you all for your contributions to this page.
Overall I really like it and I think it explains things clearly.
I added quite a lot of suggestions, but they are mostly related to formatting or phrasing.
| [PBS job]: https://opus.nci.org.au/display/Help/4.+PBS+Jobs | ||
| [Run a Model]: /models/run_a_model | ||
|
|
||
| # Run models using payu |
There was a problem hiding this comment.
| # Run models using payu | |
| # Run models using Payu |
There was a problem hiding this comment.
payu is always lowercase in the text. Why would we put it uppercase on the page title? Or do you want to use Title Case: "Run Models Using Payu"?
There was a problem hiding this comment.
Yes, Title Case would make sense in my opinion.
However, it is currently not consistent across other pages, so feel free to leave this all lower case.
There was a problem hiding this comment.
My tomorrow self will decide on what happens here 😄
| sync: /g/data/vk83/apps/om2-scripts/concatenate_ice/concat_ice_daily.sh | ||
| ``` | ||
|
|
||
| A dictionary to run scripts or subcommands at various stages of a _payu_ submission: |
There was a problem hiding this comment.
| A dictionary to run scripts or subcommands at various stages of a _payu_ submission: | |
| Used to run scripts or subcommands at various stages of a _payu_ submission: |
|
|
||
| A dictionary to run scripts or subcommands at various stages of a _payu_ submission: | ||
|
|
||
| - `error` gets called if the model does not run correctly and returns an error code. |
There was a problem hiding this comment.
| - `error` gets called if the model does not run correctly and returns an error code. | |
| - `error` gets called if the model does not run correctly and exits with an error. |
| A dictionary to run scripts or subcommands at various stages of a _payu_ submission: | ||
|
|
||
| - `error` gets called if the model does not run correctly and returns an error code. | ||
| - `run` gets called after the model successful execution, but prior to model output archive. |
There was a problem hiding this comment.
Is this after each model run or after the whole experiment (e.g., payu run -n ... )?
| - `run` gets called after the model successful execution, but prior to model output archive. | |
| - `run` gets called after the entire model experiment completes successfully, but before the model output is archived. |
There was a problem hiding this comment.
It's run after the model run command (e.g. `mpirun ...`) completes.
Currently there isn't an option to run a script after a whole series of payu runs started with `payu run -n ...` has completed.
There was a problem hiding this comment.
Alright thank you.
Then, what is the difference between a run userscript and a postscript?
There was a problem hiding this comment.
Postscript submits a separate PBS job after payu finishes archiving outputs, or at the end of the collation PBS job if collation is enabled.
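As a summary of this thread, the sketch below places the three `userscripts` entries next to `postscript` and notes when each one fires. Only the `sync` script path comes from the snippet under review; every other path is a hypothetical placeholder.

```yaml
# Sketch only: paths marked "hypothetical" are placeholders, not real scripts.
userscripts:
    # Called if the model does not run correctly and exits with an error.
    error: /path/to/on_error.sh        # hypothetical
    # Called after the model run command (e.g. mpirun ...) completes,
    # but before the model output is archived.
    run: /path/to/after_run.sh         # hypothetical
    # Called at the start of the sync PBS job.
    sync: /g/data/vk83/apps/om2-scripts/concatenate_ice/concat_ice_daily.sh

# Submitted as a separate PBS job after payu finishes archiving outputs,
# or at the end of the collation PBS job if collation is enabled.
postscript: /path/to/post_process.sh   # hypothetical
```

Neither hook runs once at the very end of a `payu run -n ...` series; as noted above, there is currently no option for that.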
|
|
||
| - `error` gets called if the model does not run correctly and returns an error code. | ||
| - `run` gets called after the model successful execution, but prior to model output archive. | ||
| - `sync` gets called at the start of the sync pbs job. For more information refer to [Syncing output data](#syncing-output-data-to-long-term-storage). |
There was a problem hiding this comment.
| - `sync` gets called at the start of the sync pbs job. For more information refer to [Syncing output data](#syncing-output-data-to-long-term-storage). | |
| - `sync` gets called at the start of the sync PBS job. For more information, refer to [Syncing output data](#syncing-output-data-to-long-term-storage). |
| These configuration options are specified in files located inside a subfolder of the _control_ directory, named according to the submodel's `name` specified in the `config.yaml` `submodels` section (e.g., configuration options for the _ocean_ component are in the `ocean` sub-directory).<br> | ||
| To modify these options please refer to the configurations documentation of the respective model component, found on the [Run a Model][Run a Model] page for your chosen model. |
There was a problem hiding this comment.
| These configuration options are specified in files located inside a subfolder of the _control_ directory, named according to the submodel's `name` specified in the `config.yaml` `submodels` section (e.g., configuration options for the _ocean_ component are in the `ocean` sub-directory).<br> | |
| To modify these options please refer to the configurations documentation of the respective model component, found on the [Run a Model][Run a Model] page for your chosen model. | |
| These configuration options are specified in files located inside a subfolder of the _control_ directory, named according to the submodel's `name` specified in the `config.yaml` [`submodels` section](#submodels) (e.g., configuration options for the _ocean_ component are in the `ocean` sub-directory).<br> | |
| To modify these options, please refer to the configurations documentation of the respective model component, found on the [Run a Model][Run a Model] page for your chosen model. |
Should we also add any references?
<custom-references>
- [First reference](first-reference-url)
- [Second reference](second-reference-url)
</custom-references>
jo-basevi
left a comment
Just had a re-read through and it all looks good to me. No extra comments sorry!
| project: ol01 | ||
| ``` | ||
|
|
||
| For model configurations and output to be saved to a `/scratch` storage allocation other than `project` (or your default if `project` is not set) then also set `shortpath` to the desired path. |
There was a problem hiding this comment.
Yes, `shortpath` will need to be a full path, e.g. `/scratch/PROJECT_CODE`.
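A minimal sketch of the two settings together, assuming `ol01` (from the snippet above) is the project set in the configuration and `PROJECT_CODE` stands in for the project whose `/scratch` allocation should hold the data:

```yaml
# Project set in the configuration (value taken from the snippet above).
project: ol01
# Full path to the /scratch allocation to use instead of /scratch/ol01.
# PROJECT_CODE is a placeholder, not a real project code.
shortpath: /scratch/PROJECT_CODE
```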
| - restart002: 01/01/2006 (first restart date on or after 01/01/2005) | ||
| - restart004: 01/01/2012 (first restart date on or after 01/01/2011) | ||
| - restart005: 01/01/2015 (keeps immediate restarts before 01/01/2017) |
There was a problem hiding this comment.
The time difference is always added to the last saved restart - so there will be at least 5 years between each saved restart. Each permanently saved restart is like a checkpoint.
Payu doesn't always know the model start time for an experiment and previous restarts may get deleted (maybe from scratch limits). For example if restart000 and restart002 were deleted, it could still follow on from restart004.
Does that make sense?
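A sketch of the setting this example appears to describe, assuming a date-based `restart_freq` of five years. The value `5YS` is my assumption and is not shown in the snippet above. The comments restate the checkpoint logic explained in this thread, using the dates from the example.

```yaml
# Assumed value: permanently keep a restart roughly every 5 years of model time.
restart_freq: 5YS
# Each permanently kept restart acts as a checkpoint. The next restart kept is
# the first one dated on or after (date of last kept restart + 5 years):
#   restart002 kept at 01/01/2006 -> next kept: first restart on or after 01/01/2011
#   restart004 kept at 01/01/2012 -> next kept: first restart on or after 01/01/2017
```

Because the offset is measured from the last kept restart and not from the experiment start, pruning can still continue from `restart004` even if earlier restarts have been deleted.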
|
@ccarouge Just checking on the status of this PR. The Hive Docs team is happy to discuss any questions if helpful. |
ACCESS-Hive Docs
Thank you for submitting a pull request to the ACCESS-Hive Docs Project. 🎉
More assistance is available in the how to contribute section on ACCESS-Hive Docs
Description
Write a general payu doc page for the Run a Model section.
Fixes #978
❓ The issue #978 includes the issue #934 on hosting payu experiments on GitHub. However, the discussion on that issue does not seem to be finalised as to where to include this information. @atteggiani seems to prefer a tutorial page, which would not be the general payu page included in this PR. So for now, this PR does not address issue #934. ❓
Type of change
Please delete options that are not relevant.
Checklist:
When your pull request is ready please request a review.
Unless there is a specific person you want your PR to be reviewed by, please select the Hive Docs Team:
ACCESS-NRI/hivedocsteam. This ensures the load for reviewing pull requests is shared equitably.