How to specify the number of workers?

Stumbled upon Fireworks today, and have a newbie question. Thanks in advance.

I have eight cores on my machine, and my workflow has a diamond shape: I first split the work into 1000 tasks, run those 1000 tasks in parallel, and then join the results together. I don’t quite understand where to specify the number of concurrent workers; I want to set it to 8.

If someone can point me to an example in Python, that would be great.

-Roman

Hi Roman,

If you are running on a single machine, typically you want to just use a single Worker unless you have different job types and want the same machine to handle two different categories of jobs differently. In your case, I think you want to keep a single Worker but use the multi-launch:

https://pythonhosted.org/FireWorks/multi_job.html
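To make the shape in the question concrete, here is a minimal sketch of the diamond as a parent-to-children mapping. It is plain Python with no FireWorks imports: `diamond_links` and the integer ids are hypothetical names chosen for illustration. As far as I know, the FireWorks `Workflow` class accepts a comparable dictionary (`links_dict`), but check the documentation linked above before relying on that. The round count assumes every task takes about the same time.

```python
import math


def diamond_links(n_parallel):
    """Build a parent -> children mapping: split -> n_parallel tasks -> join."""
    split, join = 0, n_parallel + 1          # hypothetical ids for the two end points
    middle = list(range(1, n_parallel + 1))  # ids of the independent tasks
    links = {split: middle}                  # the split step fans out to every task
    links.update({task: [join] for task in middle})  # every task feeds the join
    return links


links = diamond_links(1000)
num_workers = 8

# With 8 parallel workers, the 1000 independent tasks need this many rounds
# (assuming equal task durations).
rounds = math.ceil(1000 / num_workers)
print(len(links), "nodes have children;", rounds, "rounds for the middle layer")
```

The point of the sketch is that the number of workers is not part of the workflow definition at all; it is chosen when the jobs are launched.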

Let me know if it doesn’t solve your problem

Best,

Anubhav

···

On Wed, Mar 30, 2016 at 9:44 PM, Roman M [email protected] wrote:

Stumbled upon Fireworks today, and have a newbie question. Thanks in advance.

I have eight cores on my machine, my workflow has a diamond shape. I first split work into 1000 tasks, calculate 1000 in parallel and then join the results together. I don’t quite understand, where I specify the number of concurrent workers, I want to set it to 8.

If someone can point me to an example in python, that would be great.

-Roman

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

To view this discussion on the web visit https://groups.google.com/d/msgid/fireworkflows/11f9fa81-6d95-4273-b1ab-e8661aa72cd8%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Hi Roman,

With the help of Xiaohui Qu, we noticed one important follow-up to this conversation. The multi-launch script works by spawning multiple parallel workers (e.g., one on each core when parallelizing over a single processor). Each worker functions as normal, but sometimes this causes behavior that looks strange.

For example, if you have a diamond-shaped workflow:

1 -> 2,3 -> 4

and you run multi-launch with num_jobs = 2, you would expect Fireworks 2 and 3 to run in parallel over the 2 workers. However, what happens in practice is:

  • 2 parallel workers start in the beginning

  • Worker A starts Firework 1

  • Worker B sees nothing to run (since Firework 1 is not yet finished, and other Fireworks depend on Firework 1)

  • With nlaunches set to 0 (the default), a worker stops as soon as it finds nothing available to run, so Worker B quits.

  • Now, only Worker A is left, and things do not run in parallel since Worker B has already quit.

The easiest fix is to set nlaunches to “infinity” or equal to a large number. In the future it might be nice to have other options, e.g. to have Worker B quit only if no jobs can be found for N minutes and not to immediately quit if there are no jobs (or better, if there is nothing left waiting to run within the constraints of Worker B).
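The sequence above can be reproduced with a small toy model. This is a sketch under stated assumptions, not FireWorks code: `simulate`, `quit_when_idle`, and the worker names are hypothetical, every task is assumed to take exactly one tick, and `quit_when_idle=True` stands in for the nlaunches=0 behavior described above while `False` stands in for a worker that keeps polling.

```python
def simulate(links, num_workers, quit_when_idle):
    """Toy tick-based model of workers pulling READY tasks from a shared pool.

    links maps each task to the tasks that depend on it.
    Returns (started, ticks): started maps task -> (worker, start_tick).
    """
    tasks = set(links) | {c for kids in links.values() for c in kids}
    parents = {t: set() for t in tasks}
    for parent, kids in links.items():
        for kid in kids:
            parents[kid].add(parent)

    done, started, running = set(), {}, {}
    active = [f"worker-{chr(65 + i)}" for i in range(num_workers)]
    tick = 0
    while len(done) < len(tasks) and active:
        # Tasks started in the previous tick finish now.
        done.update(running.values())
        running.clear()
        if len(done) == len(tasks):
            break
        for worker in list(active):
            # READY means: not yet started and all parents completed.
            ready = sorted(t for t in tasks
                           if t not in started and parents[t] <= done)
            if ready:
                started[ready[0]] = (worker, tick)
                running[worker] = ready[0]
            elif quit_when_idle:
                active.remove(worker)  # nothing READY: this worker exits for good
        tick += 1
    return started, tick


diamond = {1: [2, 3], 2: [4], 3: [4]}  # 1 -> 2,3 -> 4
_, ticks_quit = simulate(diamond, num_workers=2, quit_when_idle=True)
schedule, ticks_wait = simulate(diamond, num_workers=2, quit_when_idle=False)
print("quit when idle:", ticks_quit, "ticks; keep polling:", ticks_wait, "ticks")
print("keep polling schedule:", schedule)
```

In this toy model the quitting variant needs 4 ticks because one worker ends up running everything serially, while the polling variant needs 3 ticks because tasks 2 and 3 run side by side on different workers.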

I hope this helps address some of the issues we were seeing in a private conversation.

Best,

Anubhav

···

On Wednesday, March 30, 2016 at 10:52:28 PM UTC-7, ajain wrote:

Hi Roman,

If you are running on a single machine, typically you want to just use a single Worker unless you have different job types and want the same machine to handle two different categories of jobs differently. In your case, I think you want to keep a single Worker but use the multi-launch:

https://pythonhosted.org/FireWorks/multi_job.html

Let me know if it doesn’t solve your problem

Best,

Anubhav

Hi,

I am a bit confused about what can and cannot be parallelized through the mlaunch application:

The documentation mentions that independent workflows can be parallelized, whereas the reply by Anubhav suggests that individual Fireworks are run in parallel, when the structure of the workflow permits it.

Should I interpret the term workflow and firework as interchangeable here?

Thanks!

Hi Nick,

Sorry for the confusion. Usually we are careful in the docs to properly distinguish Firework vs workflow. In this case, the docs were written a bit sloppily.

The parallelization is over Fireworks, which can be either within the same Workflow or across several Workflows. Each parallel worker just pulls available (=READY) Fireworks and runs them regardless of what Workflow they are in.
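As a rough illustration of that point, the sketch below collects runnable items from two workflows into one pool. It is plain Python rather than the FireWorks API: `ready_fireworks`, the workflow names, and the ids are hypothetical, and for simplicity it only tracks completed items, without modeling items that are currently running.

```python
def ready_fireworks(workflows, completed):
    """Return (workflow, firework) pairs whose parents have all completed.

    workflows maps a workflow name to its parent -> children links.
    completed is a set of (workflow, firework) pairs that already finished.
    """
    ready = []
    for wf_name, links in workflows.items():
        fws = set(links) | {c for kids in links.values() for c in kids}
        for fw in sorted(fws):
            if (wf_name, fw) in completed:
                continue
            parents = {p for p, kids in links.items() if fw in kids}
            if all((wf_name, p) in completed for p in parents):
                ready.append((wf_name, fw))
    return ready


workflows = {
    "diamond": {1: [2, 3], 2: [4], 3: [4]},  # 1 -> 2,3 -> 4
    "chain": {1: [2]},                        # 1 -> 2
}

# After item 1 of the diamond has finished, a worker can pick from both workflows.
queue = ready_fireworks(workflows, completed={("diamond", 1)})
print(queue)
```

Here the pool holds items 2 and 3 of the diamond together with item 1 of the chain, so two or three parallel workers would be busy at once even though the items belong to different workflows.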

I will update the docs soon to be more clear.

Best,

Anubhav

···

On Fri, Apr 15, 2016 at 9:04 AM, Nick Vandewiele [email protected] wrote:

Hi,

I am a bit confused about what can be parallelized and what not through the mlaunch application:

The documentation mentions that independent workflows can be parallelized, whereas the reply by Anubhav suggests that individual Fireworks are run in parallel, when the structure of the workflow permits it.

Should I interpret the term workflow and firework as interchangeable here?

Thanks!

OK, I took a stab at updating the docs:

http://pythonhosted.org/FireWorks/multi_job.html

Let me know if anything is still unclear.

Best,

Anubhav

···

On Friday, April 15, 2016 at 9:09:31 AM UTC-7, ajain wrote:

Hi Nick,

Sorry for the confusion. Usually we are careful in the docs to properly distinguish Firework vs workflow. In this case, the docs were written a bit sloppily.

The parallelization is over Fireworks, which can be either within the same Workflow or across several Workflows. Each parallel worker just pulls available (=READY) Fireworks and runs them regardless of what Workflow they are in.

I will update the docs soon to be more clear.

Best,

Anubhav
