Radial Distribution Function

Are there any citations for the RadialDistributionFunction featurizer? I’m looking to use the dictionary returned by this featurizer to predict band gap values. However, I’m not sure how to reduce the values under the ‘distances’ and ‘distribution’ keys (each is a list of length 200) to a single number for use in prediction.

Any pointers would be helpful.

Thank you in advance.

Hello,

We don’t have any citations for the RadialDistributionFunction because there isn’t a paper that emphasizes its use for ML, and I’m not sure which paper introduced it in general.

If you use the featurize function to compute the distribution function, it will produce a vector of values you can use as an input to an ML algorithm. Unless I misunderstand your question, it should not be necessary to reduce it to a single value to train an ML model with scikit-learn.

I have an example using the Partial Radial Distribution Function, but I won’t be able to send it to you until I’m back from travel. Would that be useful for you?

Thanks,

Logan

On Tue, Aug 27, 2019, 9:31 AM genie [email protected] wrote:

Are there any citations for the RadialDistributionFunction featurizer? I’m looking to use the dictionary returned by this featurizer to predict band gap values. However, I’m not sure how to reduce the values under the ‘distances’ and ‘distribution’ keys (each is a list of length 200) to a single number for use in prediction.

Any pointers would be helpful.

Thank you in advance.


The Partial Radial Distribution Function adds several descriptors as columns to the dataframe (tensor). The Radial Distribution Function is slightly different: it adds a single descriptor (feature) that is a dictionary with keys ‘distances’ and ‘distribution’, whose values are lists of length 200. Based on your suggestion, I will ignore the ‘distances’ values and add the ‘distribution’ values as descriptors for the ML model. I will get back to you for the PRDF example if I run into underfitting or overfitting issues in the future.
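For anyone following along, here is a minimal sketch of that unpacking step. It makes no matminer call: `fake_rdf` is a hypothetical stand-in that returns synthetic dictionaries with the structure described above (keys ‘distances’ and ‘distribution’, each a list of 200 numbers), and the `rdf_bin_*` column names are made up for illustration. It also assumes every structure was featurized with the same settings, so the ‘distances’ bins are identical across structures and carry no per-structure information.

```python
# Synthetic stand-in for the per-structure dictionary described in this thread.
# No matminer call is made; the numbers are illustrative only.
N_BINS = 200


def fake_rdf(seed):
    """Return a dict shaped like the featurizer output described above (hypothetical)."""
    distances = [0.1 * (i + 1) for i in range(N_BINS)]  # same bins for every structure
    distribution = [((seed + 1) * (i % 7)) / 10.0 for i in range(N_BINS)]
    return {"distances": distances, "distribution": distribution}


# One dictionary per structure.
rdf_dicts = [fake_rdf(seed) for seed in range(3)]

# Keep only the 'distribution' values: one row per structure, one column per bin.
X = [list(d["distribution"]) for d in rdf_dicts]

# Optional column names, useful if the rows are written back into a dataframe.
column_names = [f"rdf_bin_{i}" for i in range(N_BINS)]

print(len(X), len(X[0]))
```

The resulting `X` is a plain list of rows with 200 values each, which is the two-dimensional, array-like shape that scikit-learn estimators accept as input alongside a list of target band gaps.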

Thanks for the hint.

On Thursday, August 29, 2019 at 11:56:46 PM UTC+5:30, Logan Ward wrote:

Hello,

We don’t have any citations for the RadialDistributionFunction because there isn’t a paper that emphasizes its use for ML, and I’m not sure which paper introduced it in general.

If you use the featurize function to compute the distribution function, it will produce a vector of values you can use as an input to an ML algorithm. Unless I misunderstand your question, it should not be necessary to reduce it to a single value to train an ML model with scikit-learn.

I have an example using the Partial Radial Distribution Function, but I won’t be able to send it to you until I’m back from travel. Would that be useful for you?

Thanks,

Logan
