
Thursday, April 27, 2023

Atomistic Calculations using GNN

Disclaimer

I'm still very much working through the M3GNet, CHGNet, and MACE papers. As a result, I may get some of the GNN concepts and descriptions incorrect. Apologies in advance. I will provide updates as I continue learning.

I'm in the process of better understanding the recently released graph neural network (GNN) interatomic potentials. I'm primarily focused on the M3GNet and CHGNet potentials, which include 3-body effects by updating the GNN structure [1,2]. CHGNet adds the ability to capture dynamic charge-transfer effects, although the predictive improvement over M3GNet/MEGNet seems marginal based on a recent Matbench Discovery preprint [3]. I will say that, in general, I'm fairly optimistic about this class of ML potential and its use as a first pass at MD simulations across the periodic table. For reference, here are some figures of merit for the GNNs from ref. [3]:

Table showing a comparison of GNN models using Matbench Discovery [3].

Based on the results, it's clear that M3GNet and CHGNet perform better overall. CHGNet is at the top and does especially well in the discovery acceleration factor (DAF), which measures how much more often the model correctly identifies a stable structure than a random/dummy model would.
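As I understand the metric from ref. [3] (this is my reading, so treat it as an assumption), the DAF compares the precision of the model's stability predictions to the prevalence of stable structures in the test set:

$$\mathrm{DAF} = \frac{\text{fraction of model-flagged structures that are truly stable}}{\text{fraction of all test structures that are stable}}$$

So if, hypothetically, half of the structures a model flags as stable really are, while only 15% of the test set is stable, the DAF would be $0.5/0.15 \approx 3.3$.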

So what allows these potentials to work in the first place? The answer is carefully constructed GNNs. I'm going to do my best here to describe a GNN as I understand it. I'll probably get some points wrong, but here's my go at it.

We first need to define what a graph is. A graph is a connected network with three major components: a set of vertices or nodes, $v \in V$; a set of edges, $e \in E$, where each edge connects a pair of vertices; and a global state, $U$. Okay, this seems reasonable, and at first glance you can see how an atomic representation of a material maps to a graph. The vertices/nodes are the atoms, the edges are the bonds, and the global state holds system properties like density, pressure, and temperature. There are, however, two additional needs for an atomic system represented by a graph (see the sketch after this list):

  1. Atoms have defined features: mass, atomic number, charge, and valency.
  2. Bonds can be characterized by binding energies and lengths, but bond pairs also matter because bond angles and dihedrals are characteristic of the chemistry.
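To make this concrete, here is a minimal toy sketch (my own, not from the M3GNet/CHGNet codebases) of mapping a small non-periodic molecule to graph ingredients, with atomic numbers as node features and bond lengths within an assumed cutoff as edge features:

```python
# Toy mapping of an atomic configuration to graph ingredients.
import numpy as np

def structure_to_graph(numbers, positions, cutoff=3.0):
    """Return node features and edge list for a (non-periodic) cluster.

    numbers   : (N,) atomic numbers, used here as the only node feature
    positions : (N, 3) Cartesian coordinates in Angstrom
    cutoff    : bond cutoff distance in Angstrom (a modeling choice)
    """
    node_features = np.asarray(numbers, dtype=float).reshape(-1, 1)
    edges, edge_features = [], []
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            if d < cutoff:
                edges.append((i, j))
                edge_features.append(d)  # bond length as the edge feature
    return node_features, np.array(edges), np.array(edge_features)

# Example: a water-like molecule (O at the origin, two H atoms)
numbers = [8, 1, 1]
positions = np.array([[0.00, 0.00, 0.00],
                      [0.96, 0.00, 0.00],
                      [-0.24, 0.93, 0.00]])
nodes, edges, lengths = structure_to_graph(numbers, positions, cutoff=1.2)
print(edges, lengths)  # only the two O-H pairs fall within the cutoff
```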

So what do you do? You assign features to the vertices and edges corresponding to these. If you break this down, you really have a large set of graphs, each with scalar values assigned to the nodes and edges; these act something like weights and biases, maybe? Great, I think, but now you need to feed this into a neural network, and for a potential you need it to predict energies, forces, and stresses. Additionally, physics is invariant to the orientation of the system, meaning if we rotate the entire system by $\pi/2$ we shouldn't see anything change, so our NN needs to handle such inputs (i.e., if we rotate a molecule the prediction is the same). I'll touch on this invariance condition at the end.
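Here is a quick numerical illustration of that invariance condition: a toy pairwise energy built only from interatomic distances (a hypothetical Lennard-Jones-like form, not any of the GNN models) doesn't change when the whole system is rotated by $\pi/2$:

```python
# Numerical check: a distance-only energy is invariant under rigid rotation.
import numpy as np

def toy_energy(positions):
    """Hypothetical pairwise energy depending only on interatomic distances."""
    e = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            e += 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)
    return e

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))  # 5 atoms at arbitrary positions

# Rotate the whole system by pi/2 about the z axis
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
print(np.isclose(toy_energy(pos), toy_energy(pos @ R.T)))  # True
```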

Alright, so we have some additional details about the graph of an atomic system and constraints for the NN. We may then want to ask about the feature dimensionality of the graph, or collection of graphs if we want to think of it that way. This can easily become very large, so one may want to find a latent-space representation that is easier to train to match the energies and forces. To do this we can use a convolutional NN that finds a single graph representation. Then we can feed this representation into a multi-layer perceptron network to make predictions for the energies, forces, and stresses.
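As a cartoon of that pipeline (entirely made-up sizes and random weights, just to show the shape of the computation), one can mean-pool node features into a single permutation-invariant graph vector and pass it through a small MLP to get a scalar energy:

```python
# Cartoon of "pool the graph, then read out a scalar" with made-up sizes.
import numpy as np

def mlp_readout(graph_vector, W1, b1, W2, b2):
    """Two-layer perceptron mapping a pooled graph vector to a scalar energy."""
    h = np.tanh(graph_vector @ W1 + b1)  # hidden layer
    return float(h @ W2 + b2)            # predicted energy

rng = np.random.default_rng(1)
node_features = rng.normal(size=(4, 8))    # 4 atoms, 8 latent features each
graph_vector = node_features.mean(axis=0)  # permutation-invariant pooling

W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)  # random stand-in weights
W2, b2 = rng.normal(size=16), 0.0
print(mlp_readout(graph_vector, W1, b1, W2, b2))
```

As far as I can tell, the real potentials then obtain forces and stresses by differentiating the predicted energy with respect to atomic positions and strain, rather than predicting them as separate outputs.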

So what do M3GNet and CHGNet do here? Based on how I understand the papers, they add something akin to an attention mechanism to the graph CNN. What is attention? I'm still learning this as well, but essentially it is a way to have the network consider what the environment looks like and use that information to update the subsequent graph. In other words, say an edge corresponding to a bond is passed through the NN but doesn't know anything about the other edges or nodes; then it will simply learn what it means to be that type of edge in the grand scheme of the network. However, if the edge is informed, via an attention-like mechanism (?), about its environment, then the network updates what it means to be an edge/bond in that context.
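A toy version of that idea (my paraphrase, not the actual M3GNet/CHGNet layers) might update each edge feature using the features of the two nodes it connects, so that "being a bond" now depends on the local context:

```python
# Toy environment-informed edge update: each edge is refreshed using the
# features of its two endpoint atoms.
import numpy as np

def update_edges(node_feats, edges, edge_feats, W):
    """Concatenate (sender, receiver, edge) features and mix with weights W."""
    new_edge_feats = []
    for (i, j), e in zip(edges, edge_feats):
        msg = np.concatenate([node_feats[i], node_feats[j], e])
        new_edge_feats.append(np.tanh(msg @ W))  # a learned update; random here
    return np.array(new_edge_feats)

rng = np.random.default_rng(2)
node_feats = rng.normal(size=(3, 4))  # 3 atoms, 4 features each
edges = [(0, 1), (0, 2)]
edge_feats = rng.normal(size=(2, 4))  # 2 bonds, 4 features each
W = rng.normal(size=(12, 4))          # maps (4 + 4 + 4) -> 4
print(update_edges(node_feats, edges, edge_feats, W).shape)  # (2, 4)
```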

There is also the addition of more physics-informed aspects for informing/updating the edge features (i.e., bonds) in the graph. This is done by including many-body interactions: given sets of nodes and edges, the edge features can be updated based on terms like bond angles. This is very similar to the Tersoff-style interatomic potential, which uses a bond-order parameter to modify the pairwise interaction between atoms. In M3GNet this step seems to occur before the update (attention) in the CNN. Basically, the network takes in nodes and edges where the edges have features representing the bond distance as an expansion in radial basis functions, and these are what get updated and passed on to the graph CNN. Again, this is what I think is going on, but I could be wrong.
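For the radial-basis part, here is a generic sketch of expanding a scalar bond distance over Gaussian basis functions; the papers use their own particular bases and cutoffs, so the form and parameters below are just illustrative assumptions:

```python
# Generic Gaussian radial basis expansion of a bond distance.
import numpy as np

def gaussian_rbf(r, r_min=0.0, r_max=5.0, n_basis=8, gamma=10.0):
    """Expand distance r (Angstrom) over n_basis Gaussians on [r_min, r_max]."""
    centers = np.linspace(r_min, r_max, n_basis)
    return np.exp(-gamma * (r - centers) ** 2)

print(np.round(gaussian_rbf(1.5), 3))  # 8-component edge feature for r = 1.5
```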

If I get anything wrong here, please leave a comment to correct me. Also, on the topic of invariance, I believe graph CNNs can be structured so that they are invariant to translation and rotation of the inputs [4].

References

[1] C. Chen, S.P. Ong, A universal graph deep learning interatomic potential for the periodic table, Nat Comput Sci. 2 (2022) 718–728. https://doi.org/10.1038/s43588-022-00349-3.

[2] B. Deng, P. Zhong, K. Jun, K. Han, C.J. Bartel, G. Ceder, CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling, (2023). https://doi.org/10.48550/arXiv.2302.14231.

[3] J. Riebesell, R. Goodall, A. Jain, K. Persson, A. Lee, Can machine learning identify stable crystals? [Preprint], Matbench Discovery, (Date TBD). https://matbench-discovery.materialsproject.org/preprint.

[4] N. Keriven, G. Peyré, Universal Invariant and Equivariant Graph Neural Networks, (2019). https://doi.org/10.48550/arXiv.1905.04943.





1 comment:

  1. Will come back to update this post, but I've been spending a lot of my free time trying to understand GNN architecture and implementation. I'll probably have other posts on GNNs in the meantime.

