Error running VASP in offline mode

Hi,

I’m trying to run atomate in offline mode because the compute nodes at our center cannot connect to the database. I have everything set up and running, but when I try to run any VASP workflow I get the following error in the FW_job.err file:

Traceback (most recent call last):
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/core/rocket.py", line 246, in run
    Rocket.update_checkpoint(lp, launch_dir, launch_id, checkpoint)
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/core/rocket.py", line 443, in update_checkpoint
    f_out.write(json.dumps(d, ensure_ascii=False))
  File "/home5/shonrao/miniconda3/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type VaspJob is not JSON serializable
Traceback (most recent call last):
  File "/home5/shonrao/miniconda3/bin/rlaunch", line 10, in <module>
    sys.exit(rlaunch())
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/scripts/rlaunch_run.py", line 155, in rlaunch
    launch_rocket(launchpad, fworker, args.fw_id, args.loglvl, pdb_on_exception=args.pdb)
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/core/rocket_launcher.py", line 58, in launch_rocket
    rocket_ran = rocket.run(pdb_on_exception=pdb_on_exception)
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/core/rocket.py", line 415, in run
    d = json.loads(f_in.read())
  File "/home5/shonrao/miniconda3/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home5/shonrao/miniconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The VASP job itself seems to have run properly, but the FW_offline.json file is empty at the end of the run. Running lpad recover_offline gives the error below, and the job fizzles out.

Traceback (most recent call last):
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/fireworks/core/launchpad.py", line 1691, in recover_offline
    offline_data = loadfn(offline_loc)
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/monty/serialization.py", line 83, in loadfn
    return json.load(fp, *args, **kwargs)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/home5/shonrao/miniconda3/lib/python3.7/site-packages/monty/json.py", line 255, in decode
    d = json.JSONDecoder.decode(self, s)
  File "/home5/shonrao/miniconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home5/shonrao/miniconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2019-07-22 16:01:13,056 INFO FINISHED recovering offline runs. 0 job(s) recovered: []
2019-07-22 16:01:13,056 INFO FAILED to recover offline fw_ids: [2, 1]

I was reading through some of the previous threads, but mine doesn’t seem to be a gzip-related error. I also tried changing the “db_file” parameter in my_fworker.yaml to null, and I still get the same errors. Any idea why this might be happening?

Thanks,

Shreyas

Hi Shreyas

We don’t really use or actively support atomate in offline mode. That said, can you print out the contents of the “checkpoint” variable just before the Rocket.update_checkpoint call is executed (line 245 of rocket.py in the latest FireWorks)?

For offline mode to work, the contents of the checkpoint need to be JSON-serializable. It seems, though, that the checkpoint includes a VaspJob object, which is not. I just want to see what the checkpoint is supposed to look like and whether we can fix it.

Best,

Anubhav
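
The two errors in FW_job.err look like one failure followed by its consequence, and the chain can be reproduced with the standard library alone. The sketch below is only an illustration: `FakeJob` and `write_checkpoint` are hypothetical stand-ins, and it assumes (as the traceback suggests, but without checking the FireWorks source) that the offline file is opened for writing before the checkpoint is serialized.

```python
import json
import os
import tempfile


class FakeJob:
    """Hypothetical stand-in for an object such as custodian's VaspJob."""


def write_checkpoint(path, data):
    # Assumption: the file is opened in "w" mode first, which truncates it,
    # and only then is the data serialized.
    with open(path, "w") as f_out:
        f_out.write(json.dumps(data, ensure_ascii=False))


checkpoint = {"_task_n": 3, "_all_stored_data": {"custodian": [{"job": FakeJob()}]}}
path = os.path.join(tempfile.mkdtemp(), "FW_offline.json")

# Start from a valid, non-empty JSON file.
with open(path, "w") as f:
    f.write(json.dumps({"launch_id": 1}))

# First error: json.dumps cannot handle the custom object.
try:
    write_checkpoint(path, checkpoint)
    first_error = None
except TypeError as exc:
    first_error = str(exc)

# The file was truncated on open and nothing was written to it.
size_after_failure = os.path.getsize(path)

# Second error: reading the now-empty file back fails at character 0.
try:
    with open(path) as f_in:
        json.loads(f_in.read())
    second_error = None
except json.JSONDecodeError as exc:
    second_error = str(exc)

print(first_error)
print(size_after_failure)
print(second_error)
```

If that assumption holds, the empty FW_offline.json and the "Expecting value: line 1 column 1 (char 0)" message from both rlaunch and lpad recover_offline are side effects of the first TypeError rather than separate problems.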

Same problem here. Please fix offline mode. I will gladly provide my configuration if it helps.

FW_job.error:

Traceback (most recent call last):
  File "/home/sugon/venvs/atomate/lib/python3.7/site-packages/fireworks/core/rocket.py", line 246, in run
    Rocket.update_checkpoint(lp, launch_dir, launch_id, checkpoint)
  File "/home/sugon/venvs/atomate/lib/python3.7/site-packages/fireworks/core/rocket.py", line 443, in update_checkpoint
    f_out.write(json.dumps(d, ensure_ascii=False))
  File "/home/sugon/anaconda3/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/home/sugon/anaconda3/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/sugon/anaconda3/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/sugon/anaconda3/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type VaspJob is not JSON serializable
Traceback (most recent call last):
  File "/home/sugon/venvs/atomate/bin/rlaunch", line 8, in <module>
    sys.exit(rlaunch())
  File "/home/sugon/venvs/atomate/lib/python3.7/site-packages/fireworks/scripts/rlaunch_run.py", line 154, in rlaunch
    launch_rocket(launchpad, fworker, args.fw_id, args.loglvl, pdb_on_exception=args.pdb)
  File "/home/sugon/venvs/atomate/lib/python3.7/site-packages/fireworks/core/rocket_launcher.py", line 58, in launch_rocket
    rocket_ran = rocket.run(pdb_on_exception=pdb_on_exception)
  File "/home/sugon/venvs/atomate/lib/python3.7/site-packages/fireworks/core/rocket.py", line 415, in run
    d = json.loads(f_in.read())
  File "/home/sugon/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/sugon/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/sugon/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Dear Anubhav Jain,
Here is the checkpoint content you asked for above.
It seems that <custodian.vasp.jobs.VaspJob at 0x2afeddaddc50> is the object that is not JSON-serializable.
I’ll try to fix it somehow and report the results here.

In [1]: import pickle

In [2]: with open('block_2021-05-28-13-04-50-066054/launcher_2021-05-28-13-04-51-570345/error.pkl', 'rb') as f:
   ...:     checkpoint = pickle.load(f)
   ...:

In [3]: checkpoint
Out[3]:
{'_task_n': 3,
 '_all_stored_data': {'custodian': [{'job': <custodian.vasp.jobs.VaspJob at 0x2afeddaddc50>,
    'corrections': [],
    'handler': None,
    'validator': None,
    'max_errors': False,
    'max_errors_per_job': False,
    'max_errors_per_handler': False,
    'nonzero_return_code': False}]},
 '_all_update_spec': {},
 '_all_mod_spec': []}

In [9]: checkpoint['_all_stored_data']['custodian'][0]['job']
Out[9]: <custodian.vasp.jobs.VaspJob at 0x2b3d1be74358>

In [10]: job = checkpoint['_all_stored_data']['custodian'][0]['job']

In [11]: job.as_dict()
Out[11]: 
{'@module': 'custodian.vasp.jobs',
 '@class': 'VaspJob',
 '@version': '2021.2.8',
 'vasp_cmd': ['mpirun', '-np', '36', '/home/sugon/bin/vasp_std'],
 'output_file': 'vasp.out',
 'stderr_file': 'std_err.txt',
 'suffix': '',
 'final': True,
 'backup': True,
 'auto_npar': False,
 'auto_gamma': True,
 'settings_override': None,
 'gamma_vasp_cmd': None,
 'copy_magmom': False,
 'auto_continue': False}
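
Since the job object exposes `as_dict()`, one possible workaround is to give `json.dumps` a fallback that calls it. The sketch below is not the FireWorks or custodian implementation: `FakeJob` and `to_jsonable` are hypothetical names that only mimic the `as_dict()` convention shown in the output above, so that the example runs without either package installed.

```python
import json


class FakeJob:
    """Hypothetical stand-in mimicking the as_dict() convention shown above."""

    def __init__(self, vasp_cmd, output_file="vasp.out"):
        self.vasp_cmd = vasp_cmd
        self.output_file = output_file

    def as_dict(self):
        return {
            "@module": self.__class__.__module__,
            "@class": self.__class__.__name__,
            "vasp_cmd": self.vasp_cmd,
            "output_file": self.output_file,
        }


def to_jsonable(obj):
    """Fallback for json.dumps: use as_dict() when the object provides it."""
    if hasattr(obj, "as_dict"):
        return obj.as_dict()
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")


job = FakeJob(["mpirun", "-np", "36", "/home/sugon/bin/vasp_std"])
checkpoint = {
    "_task_n": 3,
    "_all_stored_data": {"custodian": [{"job": job, "corrections": []}]},
    "_all_update_spec": {},
    "_all_mod_spec": [],
}

# json.dumps calls to_jsonable only for objects it cannot encode itself.
encoded = json.dumps(checkpoint, default=to_jsonable, ensure_ascii=False)
decoded = json.loads(encoded)
print(decoded["_all_stored_data"]["custodian"][0]["job"]["@class"])
```

monty ships a `MontyEncoder` built around the same `as_dict()` convention, so `json.dumps(checkpoint, cls=MontyEncoder)` may be the more natural choice here; whether the checkpoint writer in FireWorks should use it is a question for the maintainers.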