How can I query the database inside a firetask?

felipe_zapata · November 16, 2015, 2:33pm

Dear all,
Assume that I have the following Firetask,

    class FooTask(FireTaskBase):
        _fw_name = "footask"

        def run_task(self, fw_spec):
            var_foo = "var_foo"
            PyTask(func="math.exp", args=[3.2], stored_data_varname=var_foo)


    class BarTask(FireTaskBase):
        _fw_name = "footask"

        def run_task(self, fw_spec):
            previous_foo = query_db("var_foo")

Does FireWorks have a query_db function? In other words, can I do some work in one task using PyTask and then retrieve the result in a later computation?

Anubhav_Jain · November 16, 2015, 5:00pm

Hi,

I have a couple of comments here.

First, you should give BarTask a different _fw_name than footask (or don’t set the _fw_name at all, in which case it defaults to package_name.class_name, or use the @explicit_serialize decorator instead).

Second, I am not sure that you need to call PyTask inside the “run_task” method of FooTask. The run_task method can contain any Python code, not just FireTasks. Perhaps you are using PyTask so that you can use the stored_data_varname. In that case, I would still suggest just calling the Python routines you want directly, and then storing the desired variables at the end using a FWAction (see example #1 below).

Currently, the stored_data_varname is meant to be more “archival” (look it up later for reference) than a variable that is meant to affect the execution of the workflow. In order to do the latter, I would suggest one of the following:

(recommended) Don’t use PyTask, and just return a FWAction at the end that both stores and passes on the variable:

    def run_task(self, fw_spec):
        var_foo = math.exp(3.2)
        return FWAction(update_spec={"foo": var_foo}, stored_data={"foo": var_foo})

Then the next FireTask will be able to access the variable as:

    fw_spec['foo']

This is due to the update_spec part. Actually, you might not need the stored_data at all in this case. The value of foo will be stored in the spec of the next FireWork due to update_spec. A few more details about this are in the following tutorials:

http://pythonhosted.org/FireWorks/dynamic_wf_tutorial.html

http://pythonhosted.org/FireWorks/guide_to_writing_firetasks.html
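As a rough illustration of why update_spec alone is enough here, the following plain-Python sketch mimics the hand-off. It is not FireWorks code: foo_task, bar_task and run_in_sequence are hypothetical stand-ins, and plain dictionaries take the place of FWAction objects. It only models the behaviour described above, where update_spec entries reach the spec seen by the next task while stored_data is merely archived.

```python
import math


def foo_task(fw_spec):
    """Stand-in for FooTask.run_task: compute a value and describe the action."""
    var_foo = math.exp(3.2)
    # In FireWorks this would be FWAction(update_spec=..., stored_data=...).
    return {"update_spec": {"foo": var_foo}, "stored_data": {"foo": var_foo}}


def bar_task(fw_spec):
    """Stand-in for BarTask.run_task: read the value passed on by foo_task."""
    previous_foo = fw_spec["foo"]
    return {"stored_data": {"log_foo": math.log(previous_foo)}}


def run_in_sequence(tasks, initial_spec):
    """Toy runner: merge each action's update_spec into the next task's spec."""
    spec = dict(initial_spec)
    archive = []  # plays the role of the archival stored_data records
    for task in tasks:
        action = task(spec)
        archive.append(action.get("stored_data", {}))
        spec.update(action.get("update_spec", {}))
    return spec, archive


final_spec, archive = run_in_sequence([foo_task, bar_task], {})
print(final_spec["foo"], archive[1]["log_foo"])
```

In this model, dropping the "stored_data" entry from foo_task changes nothing for bar_task, which matches the remark that stored_data may not be needed when update_spec is used.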

(not recommended). Remember, there is no simple way to access stored_data_varname. You will need to query the launches database to get this information. You can get access to a launchpad object by setting the following in your fw_spec:

{"_add_launchpad_and_fw_id": True}

Then, your firetask will have access to the “launchpad” and “fw_id” internal variables, i.e. inside of run_task() you will be able to access self.launchpad and self.fw_id. Then you can do something like:

    stored_data = self.launchpad.launches.find_one({"fw_id": self.fw_id}, {"action.stored_data": 1})["action"]["stored_data"]

after storing your data. Again, I don’t recommend this technique.

Finally, if there is a certain way you’d want things to work, feel free to suggest it. We can always look into adding it if it makes sense in a general way.

Best,

Anubhav

felipe_zapata · November 17, 2015, 5:41pm

Thank you for the nice explanation!

Now, assuming that I pass the arguments as described in solution 1, can I get the fw_spec from the last firework?

Suppose I have a script like:

    launchpad = LaunchPad()
    launchpad.reset('', require_password=False)
    fireworks_WF = create_tasks(workflow)
    launchpad.add_wf(fireworks_WF)
    rapidfire(launchpad, nlaunches=0)
    fw_spec = query_lpad(launchpad)

Is there a way to query the lpad once the workflow is done, asking for some variables stored in the fw_spec?

Best,

Felipe

Anubhav_Jain · November 18, 2015, 6:40pm

Hi Felipe,

Once the workflow is completed, you can query and display things inside the FW spec.

See this link for more info:

https://pythonhosted.org/FireWorks/query_tutorial.html

The “-d all” option shows you the entire spec, and there is also an example that shows you how to query by the spec values.

There is also a web site that you can use to click and browse results:

https://pythonhosted.org/FireWorks/basesite_tutorial.html

Finally, if you are familiar with MongoDB, you can directly query the database. The collections are available as LaunchPad.fireworks, LaunchPad.workflows, and LaunchPad.launches. For example:

    my_launchpad = LaunchPad.from_file(MY_FILE)
    my_launchpad.fireworks.find({"fw_id": 1}, {"spec": 1})

See docs for MongoDB and pymongo for more on how to use the find() command and related commands.
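A minimal sketch of how such a direct query could be wrapped to pull a few values out of a finished Firework's spec. The helper name read_spec_values is hypothetical, and FakeCollection and FakeLaunchPad are in-memory stand-ins added only so the sketch runs without a MongoDB server. The helper itself relies only on the fireworks collection mentioned above and on pymongo's find_one(filter, projection).

```python
def read_spec_values(launchpad, fw_id, keys):
    """Fetch selected spec entries of one Firework straight from MongoDB."""
    # launchpad.fireworks is the pymongo collection holding Firework documents.
    doc = launchpad.fireworks.find_one({"fw_id": fw_id}, {"spec": 1})
    if doc is None:
        raise KeyError(f"no Firework with fw_id={fw_id}")
    spec = doc.get("spec", {})
    return {key: spec[key] for key in keys if key in spec}


# In-memory stand-ins (hypothetical) so the sketch runs without a database.
class FakeCollection:
    def __init__(self, docs):
        self.docs = docs

    def find_one(self, query, projection=None):
        # Return the first document whose fields match the query exactly.
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


class FakeLaunchPad:
    def __init__(self, fireworks_docs):
        self.fireworks = FakeCollection(fireworks_docs)


lpad = FakeLaunchPad([{"fw_id": 2, "spec": {"foo": 24.5, "_tasks": []}}])
values = read_spec_values(lpad, 2, ["foo"])
print(values)
```

With a real database, the same helper would presumably be called with LaunchPad.from_file(MY_FILE) in place of the stand-in, after rapidfire has finished.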

Best

Anubhav
