mlaunch leads to deadlock state after "UserWarning: MongoClient opened before fork."?

Hi,

I am testing the mlaunch application to process FireWorks in parallel.

The test case I am running is a slightly modified version of the workflow tutorial, in which the two Fireworks that can run in parallel (the middle of a diamond-shaped workflow) each execute a sleep command for 10s:

fws:
- fw_id: 1
  spec:
    _tasks:
    - _fw_name: ScriptTask
      script: echo 'Ingrid is CEO.'
- fw_id: 2
  spec:
    _tasks:
    - _fw_name: ScriptTask
      script: sleep 10
- fw_id: 3
  spec:
    _tasks:
    - _fw_name: ScriptTask
      script: sleep 10
- fw_id: 4
  spec:
    _tasks:
    - _fw_name: ScriptTask
      script: echo 'Kip is an intern.'
links:
  1:
  - 2
  - 3
  2:
  - 4
  3:
  - 4
metadata: {}

When I run mlaunch as follows on an Ubuntu 12.04 (Vagrant) VM with 4 assigned processors:

mlaunch 4 --nlaunches infinite

It quickly prints a warning message to the screen:

[email protected]:/vagrant_data/Code/fireworks/fw_tutorials/workflow$ mlaunch 4 --nlaunches infinite

/home/vagrant/anaconda/lib/python2.7/site-packages/pymongo/topology.py:75: UserWarning: MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking. See PyMongo’s documentation for details: http://api.mongodb.org/python/current/faq.html#using-pymongo-with-multiprocessing>

"MongoClient opened before fork. Create MongoClient "

2016-04-15 12:46:12,488 INFO Created new dir /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-486964

2016-04-15 12:46:12,489 INFO Launching Rocket : (Process-2)

2016-04-15 12:46:12,541 INFO RUNNING fw_id: 1 in directory: /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-486964

2016-04-15 12:46:12,612 INFO Sleeping for 60 secs : (Process-3)

2016-04-15 12:46:12,711 INFO Task started: ScriptTask.

Ingrid is CEO.

2016-04-15 12:46:12,715 INFO Task completed: ScriptTask

2016-04-15 12:46:12,760 INFO Sleeping for 60 secs : (Process-4)

2016-04-15 12:46:12,903 INFO Rocket finished : (Process-2)

2016-04-15 12:46:12,909 INFO Created new dir /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-908343

2016-04-15 12:46:12,910 INFO Launching Rocket : (Process-2)

2016-04-15 12:46:12,926 INFO Created new dir /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-925160

2016-04-15 12:46:12,928 INFO Launching Rocket : (Process-5)

2016-04-15 12:46:12,945 INFO RUNNING fw_id: 2 in directory: /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-908343

2016-04-15 12:46:12,981 INFO RUNNING fw_id: 3 in directory: /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-12-925160

2016-04-15 12:46:13,114 INFO Task started: ScriptTask.

2016-04-15 12:46:13,146 INFO Task started: ScriptTask.

2016-04-15 12:46:33,119 INFO Task completed: ScriptTask

2016-04-15 12:46:33,151 INFO Task completed: ScriptTask

2016-04-15 12:46:33,301 INFO Rocket finished : (Process-2)

2016-04-15 12:46:33,340 INFO Rocket finished : (Process-5)

2016-04-15 12:46:33,346 INFO Created new dir /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-33-345288

2016-04-15 12:46:33,347 INFO Launching Rocket : (Process-5)

2016-04-15 12:46:33,368 INFO RUNNING fw_id: 4 in directory: /vagrant_data/Code/fireworks/fw_tutorials/workflow/launcher_2016-04-15-16-46-33-345288

2016-04-15 12:46:33,454 INFO Sleeping for 60 secs : (Process-2)

2016-04-15 12:46:33,534 INFO Task started: ScriptTask.

Kip is an intern.

2016-04-15 12:46:33,539 INFO Task completed: ScriptTask

2016-04-15 12:46:33,716 INFO Rocket finished : (Process-5)

2016-04-15 12:46:33,870 INFO Sleeping for 60 secs : (Process-5)

2016-04-15 12:47:12,628 INFO Checking for FWs to run… : (Process-3)

2016-04-15 12:47:12,629 INFO Sleeping for 60 secs : (Process-3)

2016-04-15 12:47:12,780 INFO Checking for FWs to run… : (Process-4)

2016-04-15 12:47:12,782 INFO Sleeping for 60 secs : (Process-4)

The two targeted Fireworks are indeed run in parallel. So far so good. However, once the final Firework has completed, the workers seemingly continue to look for Fireworks to execute, even though the workflow has finished. I cannot exit the application; it appears to have slipped into some kind of deadlock state.

My versions:

pymongo: 3.2.2

fireworks: 1.2.7

Any suggestions?

Hi Nick,

I don’t see the “MongoClient opened before fork” message on my system (pymongo 3.0.2, Mac OS X). You could check whether passing connect=False when the MongoClient is initialized in LaunchPad makes the warning go away, as suggested in the docs you linked to.
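As a rough sketch of what that suggestion could look like (this is not the actual LaunchPad code; client_kwargs and make_client are hypothetical helper names, and the host and port are placeholder values):

```python
def client_kwargs(host="localhost", port=27017):
    """Keyword arguments for a fork-friendly MongoClient (hypothetical helper)."""
    # connect=False defers the connection until the first operation,
    # so nothing is opened in the parent before the workers fork.
    return {"host": host, "port": port, "connect": False}


def make_client(**overrides):
    """Build the client from the kwargs above (hypothetical helper)."""
    # Imported lazily so the sketch loads even without pymongo installed.
    from pymongo import MongoClient

    kwargs = client_kwargs()
    kwargs.update(overrides)
    return MongoClient(**kwargs)
```

The other option named in the warning, creating the client only after forking, would amount to calling something like make_client() inside each worker process instead of in the parent.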

That said, regardless of the warning, what you are seeing is expected rather than a deadlock: with nlaunches=infinite the workers keep looking for jobs even after the workflow has completed. If you set nlaunches=4 they would stop after hitting 4 jobs. I usually quit them using the keyboard combination Ctrl+\ (which sends SIGQUIT).
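To illustrate the difference, here is a toy polling loop. It is only a sketch of the general pattern, not FireWorks' implementation; run_worker, nlaunches=None (standing in for "infinite"), and max_idle_polls are all made up for the example, and max_idle_polls exists only so the demo terminates.

```python
import time
from collections import deque


def run_worker(queue, nlaunches=None, sleep_secs=0.0, max_idle_polls=3):
    """Toy worker loop; nlaunches=None stands in for 'infinite'."""
    launched = 0
    idle_polls = 0
    # With a finite nlaunches the loop ends once enough jobs have run.
    # With None it never ends on its own; it just keeps polling.
    while nlaunches is None or launched < nlaunches:
        if queue:
            job = queue.popleft()
            job()
            launched += 1
            idle_polls = 0
        else:
            idle_polls += 1
            if idle_polls >= max_idle_polls:
                break  # demo-only escape hatch
            time.sleep(sleep_secs)  # "Sleeping for N secs" between checks
    return launched, idle_polls


def make_jobs(n):
    """Return a queue of n do-nothing jobs."""
    return deque((lambda: None) for _ in range(n))


finite = run_worker(make_jobs(4), nlaunches=4)
endless = run_worker(make_jobs(4), nlaunches=None)
```

In the finite case the loop returns as soon as the fourth job is done; in the "infinite" case it finishes the same four jobs and then only keeps checking and sleeping, which matches the "Checking for FWs to run" lines in the log above.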

Best,

Anubhav

(In reply to Nick Vandewiele's message of Fri, Apr 15, 2016, 9:59 AM.)
