Hello,
I’m currently running a Python script on my laptop that submits jobs to a database on another computer (let’s call it “Base”). I then connect to Base through ssh and run “qlaunch -rh worker -ru michael rapidfire” on the command line, which sends the jobs to a third computer called “worker”. I have FireWorks installed on all 3 computers, and this works fine when my fireworks contain only built-in firetasks. However, when I include custom firetasks (in this case the RunQECalc task stored in run_qe_calc_task_v2.py) and try to run qlaunch, I get the following error:
michael@Base:~/FireworkFiles/rocketruns$ qlaunch -rh worker -rc /home/michael/FireworkFiles/rocketruns/ -ru michael rapidfire --nlaunches 1
[worker] run: qlaunch rapidfire --nlaunches 1
[worker] out: 2016-07-07 09:56:59,786 INFO getting queue adapter
[worker] out: 2016-07-07 09:56:59,786 INFO Created new dir /home/michael/FireworkFiles/rocketruns/block_2016-07-07-15-56-59-786446
[worker] out: 2016-07-07 09:56:59,803 INFO The number of jobs currently in the queue is: 0
[worker] out: 2016-07-07 09:56:59,803 INFO 0 jobs in queue. Maximum allowed by user: 10
[worker] out: 2016-07-07 09:56:59,820 ERROR ----|vvv|----
[worker] out: 2016-07-07 09:56:59,820 ERROR Error with queue launcher rapid fire!
[worker] out: 2016-07-07 09:56:59,821 ERROR Traceback (most recent call last):
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/queue/queue_launcher.py", line 192, in rapidfire
[worker] out: while jobs_in_queue < njobs_queue and launchpad.run_exists(fworker) \
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/core/launchpad.py", line 511, in run_exists
[worker] out: return bool(self._get_a_fw_to_run(query=q, checkout=False))
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/core/launchpad.py", line 663, in _get_a_fw_to_run
[worker] out: m_fw = self.get_fw_by_id(m_fw['fw_id'])
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/core/launchpad.py", line 316, in get_fw_by_id
[worker] out: return Firework.from_dict(self.get_fw_dict_by_id(fw_id))
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 147, in _decorator
[worker] out: new_args[0] = {k: _recursive_load(v) for k, v in args[0].items()}
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 147, in <dictcomp>
[worker] out: new_args[0] = {k: _recursive_load(v) for k, v in args[0].items()}
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 108, in _recursive_load
[worker] out: return {k: _recursive_load(v) for k, v in obj.items()}
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 108, in <dictcomp>
[worker] out: return {k: _recursive_load(v) for k, v in obj.items()}
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 111, in _recursive_load
[worker] out: return [_recursive_load(v) for v in obj]
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 103, in _recursive_load
[worker] out: return load_object(obj)
[worker] out: File "/usr/local/lib/python2.7/dist-packages/fireworks/utilities/fw_serializers.py", line 306, in load_object
[worker] out: mod = __import__(modname, globals(), locals(), [classname], 0)
[worker] out: ImportError: No module named run_qe_calc_task_v2
[worker] out:
[worker] out: 2016-07-07 09:56:59,821 ERROR ----|^^^|----
[worker] out:
Disconnecting from worker... done.
The error tells me that it cannot find the module run_qe_calc_task_v2, but here’s what I don’t understand. Base and worker share the same home directory, so the same .bashrc file is read whenever I connect to either one using ssh. In .bashrc I’ve included the line export PYTHONPATH="${PYTHONPATH}:/home/michael/FireworkFiles/my_firetasks/", which points to where I store all of my custom firetasks, so if I run echo $PYTHONPATH on either computer, the my_firetasks directory is listed. I’ve also added the my_firetasks directory to my laptop’s PATH variable. As a result, there is no problem when I run qlaunch rapidfire directly on Base, or directly on worker when a my_launchpad.yaml file connects it to Base’s database. The problem only appears when I send the firework through remote qlaunch.
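To make the failure concrete, here is a minimal sketch of the kind of dynamic import shown in the last FireWorks frame of the traceback. It is a simplified stand-in, not FireWorks' actual load_object code, and the names load_class, demo, demo_custom_task and DemoTask are made up for illustration. As far as I can tell, the import only succeeds once the directory holding the module is on sys.path for the process doing the import, which is what PYTHONPATH normally arranges:

```python
import os
import shutil
import sys
import tempfile

MODULE_NAME = "demo_custom_task"  # hypothetical module, stands in for run_qe_calc_task_v2
CLASS_NAME = "DemoTask"           # hypothetical class, stands in for RunQECalc


def load_class(modname, classname):
    """Import a class by module and class name, in the style of the traceback's last frame."""
    mod = __import__(modname, globals(), locals(), [classname], 0)
    return getattr(mod, classname)


def demo():
    """Return (found_before, loaded_name) for a module in a directory not yet on sys.path."""
    task_dir = tempfile.mkdtemp()
    try:
        # Write a tiny module into a directory that Python does not know about yet.
        with open(os.path.join(task_dir, MODULE_NAME + ".py"), "w") as f:
            f.write("class %s(object):\n    pass\n" % CLASS_NAME)

        # First attempt: the directory is not on sys.path, so the import fails.
        try:
            load_class(MODULE_NAME, CLASS_NAME)
            found_before = True
        except ImportError:
            found_before = False

        # Second attempt: same call, but with the directory on sys.path.
        sys.path.append(task_dir)
        loaded_name = load_class(MODULE_NAME, CLASS_NAME).__name__
        return found_before, loaded_name
    finally:
        # Undo every side effect so the demo can be run more than once.
        if task_dir in sys.path:
            sys.path.remove(task_dir)
        sys.modules.pop(MODULE_NAME, None)
        shutil.rmtree(task_dir, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

If this matches what FireWorks does, then the remote process on worker must be starting without my_firetasks on its import path, even though an interactive ssh session has it.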
I tried replacing the custom firetask with task2 = ScriptTask(script='echo $PYTHONPATH ; python -c "import os; print(os.environ)"') to see whether the PYTHONPATH variable was set while the firework was running. Again, /home/michael/FireworkFiles/my_firetasks/ was present for qlaunch rapidfire run directly on both Base and worker, but it was missing for qlaunch -rh worker -ru michael rapidfire. This leads me to wonder why .bashrc isn’t read on worker when it starts up to run the job.
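For anyone who wants to reproduce the check, this is a small standalone sketch of the same diagnostic. The function name describe_import_env is my own and is not part of FireWorks. It reports the PYTHONPATH seen by the current process and whether a given module can actually be imported from it:

```python
import importlib
import os
import sys


def describe_import_env(module_name):
    """Report PYTHONPATH, sys.path and whether module_name imports in this process."""
    report = {
        "pythonpath": os.environ.get("PYTHONPATH", ""),
        "sys_path": list(sys.path),
    }
    try:
        importlib.import_module(module_name)
        report["importable"] = True
    except ImportError:
        report["importable"] = False
    return report


if __name__ == "__main__":
    # Default to the module from the traceback; pass another name as the first argument.
    name = sys.argv[1] if len(sys.argv) > 1 else "run_qe_calc_task_v2"
    info = describe_import_env(name)
    print("PYTHONPATH = %r" % info["pythonpath"])
    print("%s importable: %s" % (name, info["importable"]))
```

Running the script in an interactive ssh session and again through a remotely launched job should show whether the two environments differ.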
The only solution I found was to add run_qe_calc_task_v2.py to the fireworks.user_objects directory on each computer. That let me change the import statement in the Python script from “from run_qe_calc_task_v2 import RunQECalc” to “from fireworks.user_objects.run_qe_calc_task_v2 import RunQECalc”, after which remote qlaunch worked fine. The problem is that I will later be using other computing clusters where I don’t have access to Python’s site-packages directory. The only workaround I’ve thought of is to create a virtual environment on each computer where I can store all custom firetasks in fireworks.user_objects, but I’d like to know if there’s a simpler solution. How can I get remotely launched fireworks to remember the PYTHONPATH, or how can I make sure remote fireworkers read their .bashrc file when sent a remote firework?
Michael