
Tuesday, October 15, 2024

Making KMC Cool

If you've ever tried to simulate a deposition process for film growth, you'll quickly find that many different events take place. Moreover, these events occur on very different timescales. In a deposition process, for example, the particles/species arrive at the surface on a much faster timescale than, say, surface diffusion proceeds. To put it in context, say we have a 1 mm$^2$ area, our particle flux is $10^{21} \frac{\#}{\text{m}^2 \cdot \text{s}}$, and the diffusion coefficient is $10^{-4} \frac{\text{cm}^2}{\text{s}}$. If we assume some desired number of particles must deposit to create a film on this area, then the timescales to see events are something like microseconds for deposition and thousands of seconds for diffusion.

These timescales are vastly different and hence need to be handled differently. We could treat the deposition process directly with atomistic simulations, but it turns out that for most practical fluxes, the timescales are just out of reach. To solve this problem, we can turn to a technique called kinetic Monte Carlo (KMC). For diffusion, it's clear that we need to either use KMC or solve mass transport equations.

Kinetic Monte Carlo (KMC)

Most people are familiar with Monte Carlo methods. These are stochastic approaches to solving problems where, instead of computing the solution directly, we sample from probable events, with the probabilities determined by a probability mass or density function. Monte Carlo methods can be used to solve all kinds of problems, for example, numerical integration. Say we have a function we want to integrate but know no deterministic algorithm or analytical solution. We can use Monte Carlo to randomly sample the area under the curve and estimate the definite integral. The ultimate accuracy depends on how much sampling you decide to perform.
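
As a minimal sketch of the idea (Python with NumPy; the integrand and sample count are arbitrary choices of mine), here is mean-value Monte Carlo integration:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def mc_integrate(f, a, b, n_samples=100_000):
    """Mean-value Monte Carlo: estimate the integral of f on [a, b]
    as (b - a) times the average of f at uniformly sampled points."""
    x = rng.uniform(a, b, n_samples)
    return (b - a) * np.mean(f(x))

# Example: the integral of sin(x) over [0, pi] is exactly 2.
print(mc_integrate(np.sin, 0.0, np.pi))
```

The error of such an estimate shrinks as $1/\sqrt{N}$ with the number of samples $N$, which is why the accuracy comes down to how much sampling you perform.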

Monte Carlo methods let us perform random sampling to estimate the behavior of a system or process; thinking back to our integral example, we can think of the system or process as being described by a high-dimensional probability density or mass function that we are trying to integrate to get the expected behavior (i.e., value).

So what does it mean when we add the prefix "kinetic" to Monte Carlo? It simply indicates that the process evolves in time. Think of a chemical reaction or diffusive dynamics. In KMC, the events are described by rates, $\nu$, which usually have inverse time units. The rate determines the probability of an event, and this in turn informs the elapsed time. To expand, let's look at the math.

Say we have some event, $\Omega_i$, that corresponds to a diffusive hop. There is some energy barrier that must be overcome for the diffusive hop to occur, given as $E_{a}^i$. We can describe the rate as $\nu_i = \nu_0^i \exp\left(-\frac{E_{a}^i}{k_B T}\right)$, where $k_B$ is the Boltzmann constant, $T$ is the temperature, and $\nu_0^i$ is the prefactor (attempt frequency) for event $i$. Now assume we have $N$ such events, each describing a diffusive hop to a different site/location. We can describe the probability of any one event occurring by:

\begin{equation} P(\Omega_i; E_a^i, \nu_0^i) = \frac{\nu_0^i \exp\left(-\dfrac{E_{a}^i}{k_B T}\right)}{\sum\limits_{j=1}^N \nu_0^j \exp\left(-\dfrac{E_{a}^j}{k_B T}\right)} \label{eq:probability} \end{equation}

I intentionally write the probability as parameterized by the choice of activation energy and prefactor. However, we could just as easily treat it as a conditional probability if we have some knowledge of the range of values these terms can take.

Thus, we now have a way to sample from the event catalog. So how do we define time based on the sampling? If an event $\Omega_i$ is selected, then we can calculate the time increment associated with this event. In kinetic Monte Carlo, we generate a second random number to calculate the time increment. Specifically, the time increment $\Delta t$ is given by:

\begin{equation} \Delta t = -\dfrac{1}{R_N} \ln r \end{equation}

where $r$ is a uniform random number between 0 and 1, and $R_N$ is the cumulative rate for all events/processes, i.e., $R_N = \sum\limits_{j=1}^N \nu_j$. This time increment reflects the stochastic time elapsed before the occurrence of event $\Omega_i$.
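
To make the recipe concrete, here is a minimal sketch of one rejection-free KMC step in Python. Function and variable names are my own, and the barriers and attempt frequency in the example are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng()
kB = 8.617333e-5  # Boltzmann constant in eV/K

def arrhenius_rates(prefactors, barriers_eV, temperature):
    """Rates nu_i = nu0_i * exp(-Ea_i / (kB * T)) for each event."""
    return np.asarray(prefactors) * np.exp(
        -np.asarray(barriers_eV) / (kB * temperature)
    )

def kmc_step(rates):
    """One rejection-free KMC step: pick event i with probability
    nu_i / R_N, then advance time by dt = -ln(r) / R_N."""
    cumulative = np.cumsum(rates)
    R_N = cumulative[-1]
    event = int(np.searchsorted(cumulative, rng.uniform(0.0, R_N)))
    dt = -np.log(1.0 - rng.random()) / R_N  # 1 - r avoids log(0)
    return event, dt

# Example: three hops with a 1e13 1/s attempt frequency at 600 K.
rates = arrhenius_rates([1e13, 1e13, 1e13], [0.5, 0.7, 0.9], 600.0)
event, dt = kmc_step(rates)
print(f"selected event {event}, elapsed time {dt:.3e} s")
```

Selecting the event through the cumulative sum implements the probability equation above, and the logarithm of a uniform random number gives the exponentially distributed waiting time.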

It's a pretty simple formalism but a very powerful one for simulating kinetic processes at the right timescales. The challenge is always two-fold:

  1. Treating a global event table with all process timescales and rates can require very large sampling trials. In other words, if we put the deposition and diffusive processes on the same event table, diffusive events will have a much lower chance of being sampled. The easiest way to address this is simply to decouple them and let two KMC clocks evolve simultaneously (see the naive sketch after this list).

  2. A priori knowledge of what should happen, i.e., this atom deposits at this site, and reaction $\alpha$ occurs if it involves species A but reaction $\beta$ if it involves species B. For well-understood systems with simple species this is manageable, but it should be clear that the combinatorics of events for multi-species deposition and diffusion becomes prohibitive, and, more importantly, we are left with many model parameters to decide on.
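
To show what I mean by two clocks, here is one naive way to realize the decoupling, reusing `kmc_step` from the sketch above. This is my own toy scheme, assuming the two event tables don't interact, not an established algorithm:

```python
def two_clock_kmc(dep_rates, diff_rates, t_end):
    """Naive decoupled KMC: keep separate event tables and clocks for
    deposition and diffusion, always advancing whichever clock lags."""
    t_dep = t_diff = 0.0
    while min(t_dep, t_diff) < t_end:
        if t_dep <= t_diff:
            event, dt = kmc_step(dep_rates)   # deposition event
            t_dep += dt
            # ... apply the deposition move to the system here ...
        else:
            event, dt = kmc_step(diff_rates)  # diffusion event
            t_diff += dt
            # ... apply the diffusive hop to the system here ...
    return t_dep, t_diff
```

Advancing whichever clock lags keeps the two processes roughly synchronized in simulated time while letting each be sampled at its own rate scale.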

Lattice KMC

One way to simplify the KMC approach and simulate crystalline materials is to use a lattice formalism. This means we describe lattice points where events can occur, which restricts what can happen; in particular, diffusion hops are constrained to occur between neighboring lattice sites. This works well for crystalline materials with a single basis, but one may need to fall back on mesh methods to treat more complicated crystals.
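
As a toy illustration of the constraint (a simple cubic lattice tracked with an occupancy array; all names are mine), enumerating the allowed hop events for one occupied site looks like:

```python
import numpy as np

def hop_events(occupied, site):
    """Enumerate nearest-neighbor hops on a simple cubic lattice:
    a hop from `site` is allowed only if the target site is empty.
    `occupied` is a boolean array with periodic boundaries."""
    L = occupied.shape[0]
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
             (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    events = []
    for d in moves:
        target = tuple((np.array(site) + d) % L)  # periodic wrap
        if not occupied[target]:
            events.append((site, target))
    return events

occupied = np.zeros((8, 8, 8), dtype=bool)
occupied[4, 4, 4] = True
print(hop_events(occupied, (4, 4, 4)))  # six allowed hops
```

Each enumerated event would then get a rate from a barrier assigned to its local environment, and the event table feeds straight into the KMC step above.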

🧠 Thinking out Loud

I haven't really thought too much about meshing approaches, but the idea would be to define nested lattices, one for each basis point. One could then define the events that are allowed within a lattice and between lattices. Take a cesium chloride crystal: we would have two interpenetrating simple cubic lattices.

Some of the challenges with lattice KMC are that you can't treat amorphous materials or grain boundaries, and, in my view, it gets complicated for multiple species.

You're also still plagued by the fact that you need to enumerate all the events and assign the rates. This is maybe not too difficult for a monatomic species, where you can find a good amount of information on surface deposition/diffusion and bulk diffusion.

On-the-Fly Off-Lattice KMC

First, imagine we remove the lattice constraint and let points (representing atoms or molecules) occupy any position in space. Then, say we have a way to determine, for a given species, the value of $E_a^i$ for some motion it could undergo; think of a surface atom hopping on top of another surface atom. If you could find and enumerate all the possible motions such an atom could make, you would come up with a list of activation energies and therefore rates. Thus, you would have cataloged an event table you could then sample from to perform KMC. This is on-the-fly KMC.

Basically, you need a model/description of the energy barrier associated with different dynamical processes. We get this through the potential energy surface (PES) model, which describes, for a given configuration of atoms in space, the potential energy and the associated gradients (i.e., forces).
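
Concretely, with ASE this is just an energy-and-forces call on a configuration. Here I use the toy EMT potential as a stand-in for whatever PES model you actually trust for your system:

```python
from ase.build import fcc100
from ase.calculators.emt import EMT

# A small Cu(100) slab as an example configuration.
slab = fcc100("Cu", size=(3, 3, 3), vacuum=8.0)
slab.calc = EMT()  # toy PES model; swap in the one you trust

energy = slab.get_potential_energy()  # PES value at this configuration
forces = slab.get_forces()            # negative PES gradients
print(energy, forces.shape)
```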

Once we have a way to describe the PES, we need a methodology to sample different pathways (e.g., reaction coordinates) to determine which paths through configuration space an atom (or group of atoms) could take and the associated energetic barrier. Then, through transition state theory (TST), we can determine the rate of each pathway; that is, we get an Arrhenius expression for the rate along a reaction coordinate:

\begin{equation} \nu_i = \nu_0^i \exp\left(-\dfrac{E_{a}^i}{k_B T}\right) \end{equation}

Challenges with Transition State Theory

The two major hurdles with transition state theory are:

  1. You need to have an accurate PES description for the chemical system you're trying to describe. This usually requires force-field or interatomic potential development (see my posts discussing such topics).

  2. The pathway along each reaction coordinate either needs an intelligent way to search around an initial point or requires an initial and a final point to be connected. The former are called single-ended methods, and techniques like the dimer method or the growing string method are used; these essentially find the saddle points of the PES, so the final state is just the "downhill" path from the saddle point. The latter are dual-ended methods, where the initial and final coordinates are known but the possible paths are not; we then find the possible barriers and usually tabulate only the lowest one (see the sketch below).
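
As a hedged sketch of the dual-ended approach, here is a nudged elastic band (NEB) calculation with ASE and its toy EMT potential for a Cu adatom hopping between neighboring hollow sites on Cu(100). The site coordinates, image count, and convergence settings are illustrative guesses, not converged choices:

```python
from ase.build import fcc100, add_adsorbate
from ase.calculators.emt import EMT
from ase.neb import NEB  # ase.mep.NEB in newer ASE versions
from ase.optimize import BFGS

def make_slab(adatom_xy):
    """Cu(100) slab with one Cu adatom at the given (x, y) position."""
    slab = fcc100("Cu", size=(3, 3, 3), vacuum=8.0)
    add_adsorbate(slab, "Cu", height=1.7, position=adatom_xy)
    slab.calc = EMT()
    return slab

# Initial and final states: adatom in two neighboring hollow sites
# (~2.55 Angstrom apart on Cu(100); coordinates are approximate).
initial = make_slab((2.55, 2.55))
final = make_slab((5.10, 2.55))
for endpoint in (initial, final):
    BFGS(endpoint, logfile=None).run(fmax=0.05)  # relax endpoints

# Interpolate intermediate images and optimize the band.
images = [initial] + [initial.copy() for _ in range(3)] + [final]
for image in images[1:-1]:
    image.calc = EMT()
neb = NEB(images)
neb.interpolate()
BFGS(neb, logfile=None).run(fmax=0.05)

energies = [image.get_potential_energy() for image in images]
barrier = max(energies) - energies[0]
print(f"approximate barrier E_a = {barrier:.3f} eV")
```

The barrier from the highest image then plugs directly into the Arrhenius expression above to give the rate for this hop.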

Most of the effort is always centered on a proper PES description and on setting up the TST calculations, which can be truly challenging. This is probably why we don't see many on-the-fly KMC codes in practice. The only one I came across was EON from the Henkelman group at UT Austin [1]. It uses a client-server approach in which the server runs the TST calculations and then hands the event table back to the KMC client.

Putting It All Together

In practice, I think the outline for an on-the-fly off-lattice KMC code that simulates deposition and diffusion would be something like:

```mermaid
flowchart TD
    Start{Start Simulation} --> Init[Initialize Deposition Events]
    Init --> OuterLoop{Max steps?}
    OuterLoop -->|No| Deposition([Sample deposition event])
    Deposition --> InnerKMC
    InnerKMC --> Update[Update KMC times]
    Update --> Stats[(Store Statistics)]
    Stats --> OuterLoop
    OuterLoop -->|Yes| End(((Terminate)))

    subgraph InnerKMC [Inner KMC Loop]
        TST(Transition State Calculation)
        TST --> SampleTS(Sample from transition state table)
    end
```

As this post was more a discussion of KMC, just so I could jot down my thoughts, I'll leave it at this for now. If you want to read more on lattice KMC, the chapter on it in LeSar's book is a pretty good starting point [2].

References

[1] S.T. Chill, M. Welborn, R. Terrell, L. Zhang, J.-C. Berthet, A. Pedersen, H. Jónsson, G. Henkelman, EON: software for long time simulations of atomic scale systems, Modelling Simul. Mater. Sci. Eng. 22 (2014) 055002.

[2] R. LeSar, Introduction to Computational Materials Science: Fundamentals to Applications, Cambridge University Press, Cambridge; New York, 2013.




Thursday, September 19, 2024

Some Updates

I haven't been writing many blog posts over the last two months—just playing catch-up with daily life. However, I came across a few tools during my last post that I wanted to share.

ML Potentials

The pace of new universal ML potentials for atomistic simulations is ramping up, with proprietary models currently leading the way. It seems like much of the activity is following the trend set by large language models (LLMs), 🤷‍♂️. Earlier this month, I spent an evening setting up the Orbital Materials pre-trained potential models to test some property predictions on specific systems and compare them to other physics-informed potentials. I haven't fully benchmarked this yet but plan to include it in my draft pre-print¹.

As I mentioned in a previous post, if you're looking for a good overall comparison of different ML potentials, check out the Matbench Discovery paper [1]. There is also a recent effort, MLIP Arena [2], aimed more specifically at benchmarking MD simulations, though it's still under development and hence sparse in detail.

The Orbital pre-trained model(s) seem to perform reasonably well, on par with the MACE-MP² model [3]. One thing I wanted to achieve was to make the Orbital models' ASE calculator available for use with LAMMPS via a Python wrapper. Fortunately, the skeleton code for this had already been written by the AdvancedSoft Corp team for M3GNet, making it straightforward to implement a similar wrapper. I went ahead and did so; you can find it here. It seems to work correctly, but I have yet to fully stress-test the implementation. The benefit of having the LAMMPS interface is that you can now utilize many of the compute and fix commands available in LAMMPS.
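
For reference, plain ASE usage of the Orbital calculator (which is what the LAMMPS wrapper drives under the hood) looks roughly like the following. The import paths and model name reflect my reading of the orb-models README at the time of writing and may have changed, so treat them as assumptions rather than a stable API:

```python
from ase.build import bulk

# NOTE: names below are my assumption from the orb-models README;
# check the current package documentation before relying on them.
from orb_models.forcefield import pretrained
from orb_models.forcefield.calculator import ORBCalculator

orbff = pretrained.orb_v1()              # pre-trained universal potential
calc = ORBCalculator(orbff, device="cpu")

atoms = bulk("Cu", "fcc", a=3.6)
atoms.calc = calc
print(atoms.get_potential_energy())      # standard ASE calculator call
```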

One thing I want to refactor in the Orbital LAMMPS wrapper is to eliminate the use of the Python driver script. While this approach makes it easy to modify the ASE Calculator interface, I'm unsure it's the most efficient solution. It seems more robust to instantiate the Python class and call its methods directly from within the C++ code via pybind11, particularly for better memory management and reduced overhead.

Turning Papers into Podcasts

Reading research papers is something I spend a lot of time on—it's part of being a scientist or engineer. The challenge is the sheer volume of papers being published, which makes it practically impossible to keep up by reading them all. You need to find ways to efficiently digest papers and select the ones worth your time³. One of the best outcomes of the recent advancements in generative AI and LLMs is the ability to consume information in one medium and convert it into another. In other words, you can now turn papers into podcasts or lectures and listen to them during your commute to school or work. I'm not referring to mere dictation (although you can do that too), but to using AI to distill the important details of a paper into a script and then having it narrated.

There are a few tools out there that do this. The two I've been experimenting with are PDF to Audio Converter [4] from the Buehler LAMM Group at MIT and PDF to Podcast⁴. So what kind of quality do you get? I'd say it's pretty impressive. Here's an audio lecture based on the tutorial paper "How to train a neural network potential" by Tokita and Behler [5]:

It definitely captures the university lecture style, although it may not convey all the nuances of the tutorial paper. Still, it’s a great way to decide if you want to dive deeper into the paper. I'm not sure how it handles figures or equations, but it would be interesting if it tried to describe them. If you prefer a conversational style (e.g., question-answer format), you can switch the profile.

Setting up PDF to Podcast locally is fairly straightforward, but you can also use the HuggingFace web app. You’ll need your OpenAI API key, and the cost for the example above was about $0.30 USD using GPT-4o. You also get the transcript of the paper, so you can review it afterward if you want to find a specific detail.

My goal is to use this method to go through papers while I'm in the car, at least to determine if they are of interest and worth my limited time to read. After all, nothing beats reading a good paper in depth.

Footnotes


  1. Yes, I know it's kind of funny to say "draft pre-print," but I just haven't had the time to turn this into an arXiv-uploadable preprint yet.

  2. This is just the model parameters; the actual model code is here.

  3. Most papers are a waste of time and really aren't making an impact in a meaningful way.

  4. This was actually the code the Buehler LAMM group used for their project. 

References

[1] J. Riebesell, R. Goodall, A. Jain, K. Persson, A. Lee, Can machine learning identify stable crystals? [Preprint], Matbench Discovery, n.d. https://matbench-discovery.materialsproject.org/preprint.

[2] Y. Chiang, MLIP Arena, https://github.com/atomind-ai/mlip-arena.

[3] I. Batatia, D.P. Kovacs, G.N.C. Simm, C. Ortner, G. Csanyi, MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields, in: A.H. Oh, A. Agarwal, D. Belgrave, K. Cho (Eds.), Advances in Neural Information Processing Systems, 2022. https://openreview.net/forum?id=YPpSngE-ZU.

[4] A. Ghafarollahi, M.J. Buehler, SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning, (2024). https://arxiv.org/abs/2409.05556.

[5] A.M. Tokita, J. Behler, How to train a neural network potential, The Journal of Chemical Physics 159 (2023) 121501. https://doi.org/10.1063/5.0160326.


