Automating convergence testing

mkhorton · April 27, 2017, 8:52pm

Hi all,

I was chatting to Shyam the other week about possibly automating convergence of parameters inside atomate + would welcome thoughts on this.

At the moment, I have something like a powerup that takes a workflow and generates a set of workflows, varying one parameter: https://github.com/mkhorton/atomate/blob/convergence/atomate/vasp/powerups.py#L488

This is basically just a convenience function for doing convergence tests. However, I wondered whether we could do something a bit more clever using dynamic FWs.

I’m thinking about two types of workflow:

Convergence of a parameter towards a stable solution: this would increase an input parameter (e.g. ENCUT) and track an output parameter (e.g. total energy), automatically generating additional workflows until either the output parameter is converged or a maximum iteration count is reached. It could use pymatgen.util.convergence to test for convergence, or simply stop once the difference between successive output values falls below a threshold (e.g. 1e-4).
Convergence of a parameter towards a target value: this would vary an input parameter (e.g. LDAUU) and track an output parameter (e.g. band gap), automatically iterating to find the best input value. This could make use of some of scipy’s built-in optimizers, which are quite robust even with low iteration counts (I imagine you’d want a max of e.g. 10 iterations). A proof-of-concept workflow for this could be trying to find the square root of 2 or something along those lines.
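
As a rough illustration of the two loop types described above, here is a minimal plain-Python sketch. Everything in it is an assumption made for illustration: `toy_energy` stands in for a real calculation, the function names are hypothetical, and a hand-written bisection is used in place of the scipy optimizers mentioned so that the example has no dependencies.

```python
from typing import Callable, Tuple


def toy_energy(encut: float) -> float:
    """Toy stand-in for a total-energy calculation (not a real VASP run)."""
    return -10.0 + 100.0 / encut**2


def converge_by_increasing(
    run: Callable[[float], float],
    start: float,
    step: float,
    tol: float = 1e-4,
    max_iter: int = 10,
) -> Tuple[float, float, bool]:
    """Raise the input until two successive outputs differ by less than tol.

    Returns (last input, last output, converged flag). max_iter counts the
    additional runs made after the initial one.
    """
    value = start
    previous = run(value)
    for _ in range(max_iter):
        value += step
        current = run(value)
        if abs(current - previous) < tol:
            return value, current, True
        previous = current
    return value, previous, False


def seek_target(
    run: Callable[[float], float],
    target: float,
    low: float,
    high: float,
    max_iter: int = 10,
) -> float:
    """Bisect [low, high] for the input whose output matches target.

    Assumes run is monotonically increasing on the bracket and that target
    lies between run(low) and run(high). Each iteration costs one run.
    """
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        if run(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    # Type 1: increase ENCUT until the toy energy stops changing.
    print(converge_by_increasing(toy_energy, start=200, step=100))
    # Type 2: the square-root-of-2 proof of concept, i.e. find x with x*x == 2.
    print(seek_target(lambda x: x * x, target=2.0, low=0.0, high=2.0))
```

In a real workflow each call to `run` would be a full calculation, so the iteration cap matters far more than it does here. With ten bisection steps on a bracket of width 2, the result is only guaranteed to within about 1e-3, which is one reason a smarter optimizer may be preferable.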

Any thoughts on if this would be worth pursuing, or general ideas for implementation?

All the best,

Matt

Anubhav_Jain · April 28, 2017, 4:28pm

Hi Matt

Background

There are two places one can do convergence:

inside custodian
as part of the workflow, using dynamic workflows

For an example of the custodian way, see the “full_opt_run” in custodian, which keeps running structural optimization VASP jobs until the volume change from the previous run converges. I think you probably have a decent mental picture of how the FWS/atomate version would work using dynamic FWs - e.g., using detours or additions in FWAction until convergence is reached.

Opinions

Basically, there are two advantages of doing things with FireWorks instead of custodian:

you can run each individual step of a convergence as a separate queue submission. So, if you don’t think you can complete the entire job in a single queue submission (e.g., due to walltime restrictions), the FWS way will be a solution to that.
if a single job in the convergence fails, e.g. in the parameter-optimization workflow you mentioned earlier, the FWS way makes it a little easier to restart from the failed job rather than restarting the entire optimization from scratch.

Although that seems to strongly favor FWS, pretty much every other advantage goes to custodian. In particular, the implementation is cleaner and easier to interpret (and therefore easier to debug) with custodian. The custodian implementation is pretty clear, whereas workflows that use complex FWActions and convergence logic are difficult to understand, and their operation is not completely clear from just looking at the code. There is a greater chance of bugs, and debugging and interpreting failed workflows is more of a pain.

Thus my general opinion is to try to do it with custodian wherever possible. Only when it’s clear that custodian is not going to work, e.g. an NEB job that might take a month but you only have 7 days walltime, would I go the FWS route.

As for trying to do a more general optimizer for a parameter (i.e., not simply increasing the parameter until convergence), we are trying to build a general framework for this in FireWorks with Alex Dunn (currently an undergrad researcher in my group working remotely at UCLA, who will be a UC Berkeley grad student in my group in the Fall). Basically, any time you have an optimization problem and can define (i) the forward workflow that gives a score and (ii) the parameter space to test, the infrastructure will allow you to optimize over the space simply and automatically. The current state of the code (in progress) is here:

https://github.com/ardunn/turboworks

Sorry for the lack of a clear answer, but just wanted to make you aware of the options and give my opinion.


mkhorton · April 28, 2017, 4:52pm

Hi Anubhav,

Thanks for this – this is exactly the information I was looking for. I basically didn’t know if it was worth spending the time to develop extra convergence functionality or not. In particular, my knowledge of custodian is weaker, so I don’t think I appreciated the trade-offs on doing this in custodian over atomate.

I started looking at this because there were parameters (specifically kwargs in atomate workflows) that I wanted to vary, but I appreciate this might be more of a niche case, since it is likely more useful for the development of the workflows themselves than for end-user functionality.

Will look forward to seeing TurboWorks as it develops, because that sounds ideal for a lot of this.

Best,

Matt


Takeshi_Miyake · March 20, 2021, 12:20pm

Hi. I learned a lot from this discussion. Are there any working examples of using rocketsled together with atomate? I have successfully installed rocketsled in the atomate virtual environment. Thank you very much.