
Sunday, March 26, 2023

Research hype seems problematic

The recent news [1,2] about room-temperature superconductors (RTSC) at relatively low pressures has been getting a lot of attention, both negative and positive. The negative press comes from the fact that the PI of the study has had questions raised about past research. The PI has also been resistant to requests to share data and the samples that were synthesized.

If it turns out that the nitrogen-doped lutetium hydride is indeed a superconductor at ambient temperature and pressures around 1 GPa, this would be a real waypoint on the journey toward practical superconducting materials. For many, 1 GPa may seem fairly high compared to ambient pressure, which is around 0.0001 GPa. However, engineering a coating or conduit that wraps an RTSC and applies a suitable compressive pressure seems feasible. You can think of how Corning's Gorilla glass works: it is an alkali-aluminosilicate material that uses some clever surface composition engineering to create strain gradients, i.e., compressive stresses in the material, that arrest the microcracks/pits that form on the surface. The difference here is that you wouldn't modify the RTSC material directly.

Going back to the press on RTSC, I'm glad to see there is a lot of debate going on. One thing that appears very clear is that the peer review process at these high-impact journals is not very good. It seems to me that such an impactful article should have received the same criticism that is being displayed in the public discourse. Such criticism would probably have made it much more challenging for the authors to publish their findings, as they would have had to satisfy many requests for raw data from the reviewers, although I'm not sure journal editors entirely support such requests since a reviewer could be a potential competitor. What I like is that there is a lot of community review going on. Independent researchers and groups are eagerly trying to reproduce the findings, and we will probably know the outcome shortly. Some early preprints/papers [3,4] indicate they aren't observing the same resistivity behavior as reported in the original work; not looking too good for the controversial PI.

A parallel event going on in quantum computing is the ongoing debate and coverage of the quantum computing wormhole publication [5]. I've worked a bit on quantum algorithms for NISQ devices, so I'm a little familiar with what can be done using them. I know nothing about research in quantum gravity or ER=EPR, but I can tell you that the initial coverage by Quanta magazine (it seems they've since updated it) was awfully misleading. The main message that should have been conveyed is that the simplified model being simulated realizes the mathematical relation between the dynamics of entanglement in quantum systems and the wormholes predicted by general relativity. It does not mean that running the quantum device creates spacetime wormholes in the physical lab; however, anyone reading the original article or related popular stories would be inclined to think that is what happened.

This leads me to think about what is going on with hype in research. Why is it that science is becoming about how much hype one can generate around the research? I've seen a lot of good posts on LinkedIn commenting on this. It appears to be strongly linked with the prospect of securing more funding. I assume the thinking is that if funding agencies and program managers get excited, they won't want to miss out on all the fun! Other comments indicate that it's mostly because much of the science being done is actually not that impactful and very incremental, so things get overblown in importance and meaning. Whatever the cause, it seems this is going to create huge issues in the future, because popular articles that overhype scientific research will eventually feed into serious decisions that affect social and economic life for everyone.

References

[1] N. Dasenbrock-Gammon, E. Snider, R. McBride, H. Pasan, D. Durkee, N. Khalvashi-Sutter, S. Munasinghe, S.E. Dissanayake, K.V. Lawler, A. Salamat, R.P. Dias, Evidence of near-ambient superconductivity in a N-doped lutetium hydride, Nature. 615 (2023) 244–250. https://doi.org/10.1038/s41586-023-05742-0.

[2] H. Pasan, E. Snider, S. Munasinghe, S.E. Dissanayake, N.P. Salke, M. Ahart, N. Khalvashi-Sutter, N. Dasenbrock-Gammon, R. McBride, G.A. Smith, F. Mostafaeipour, D. Smith, S.V. Cortés, Y. Xiao, C. Kenney-Benson, C. Park, V. Prakapenka, S. Chariton, K.V. Lawler, M. Somayazulu, Z. Liu, R.J. Hemley, A. Salamat, R.P. Dias, Observation of conventional near room temperature superconductivity in carbonaceous sulfur hydride, (2023). https://doi.org/10.48550/arXiv.2302.08622.

[3] P. Shan, N. Wang, X. Zheng, Q. Qiu, Y. Peng, J. Cheng, Pressure-induced color change in the lutetium dihydride LuH2, Chinese Phys. Lett. (2023). https://doi.org/10.1088/0256-307X/40/4/046101.

[4] X. Ming, Y.-J. Zhang, X. Zhu, Q. Li, C. He, Y. Liu, B. Zheng, H. Yang, H.-H. Wen, Absence of near-ambient superconductivity in LuH$_{2\pm\text{x}}$N$_y$, (2023). https://doi.org/10.48550/arXiv.2303.08759.

[5] D. Jafferis, A. Zlokapa, J.D. Lykken, D.K. Kolchmeyer, S.I. Davis, N. Lauk, H. Neven, M. Spiropulu, Traversable wormhole dynamics on a quantum processor, Nature. 612 (2022) 51–55. https://doi.org/10.1038/s41586-022-05424-3.


Edited 27 Mar, 2023: In the original version it was stated that Gorilla glass is a borosilicate glass, this is incorrect and the post has been updated to reflect that Gorilla glass is an alkali-aluminosilicate.  Corning does produce a product called Willow glass which is a borosilicate.




Thursday, March 23, 2023

Parsing research articles with a Zotero workflow

I started thinking that I should document my reference and journal-reading process, since I may want to overhaul it later based on the rapidly evolving tools coming out, such as elicit.org or scispace.com. So here is how I do things for the most part.

My go-to reference manager: Zotero


If you are conducting research in an academic, government, or even industrial lab setting, you are most likely familiar with reference managers. The list of options is extensive nowadays, but for me there is only one I've consistently kept going back to, and that is Zotero. What I like about Zotero is that it's multiplatform, open-source, free, easy to use, and has a good number of features integrated. The biggest issue is the cloud storage cost. This is really only a problem if you want all your attached PDFs stored on the cloud service so that you can access them using the browser interface.

To get around this limitation you can use the ZotFile extension, which allows you to change how Zotero stores and renames files. If you're using Google Drive or Dropbox, this means you can create a folder on your cloud storage and then use ZotFile to store all linked PDFs there. If you have multiple devices with the same OS and file structure, opening the attached files on any of your devices with Zotero desktop will just work. If that is not possible in your working environment, for example if you have both Windows and Linux machines, then you will want to make sure the links in each Zotero entry point to your local cloud storage path. You could also share the folder on your cloud storage so that anyone with the links can access the files, and then add these as links in the Zotero entry. That way your file is always accessible no matter where you try to grab it from.

There is one other Zotero add-on I like to use since I do a lot of my technical writing in LaTeX. The Better BibTeX extension makes generating .bib files extremely easy. Furthermore, it makes applying your preferred citation-key naming schema consistently across all references straightforward, and you don't have to worry about keeping a .bib file updated by hand.

Grabbing Literature


There are several ways to get journal articles, so I'm not going to list all of them. For me, the easiest and most natural is to just use Google Scholar. I like Google Scholar mainly because it grabs any PDFs that have been posted on the internet and ties them to the reference.

Parsing Literature


Once I've added an entry to my Zotero library, which includes attaching relevant PDFs and GitHub repo links and tagging it (grabbing the relevant links and tags up front makes things easier later on), I then go about creating a note item for each entry. The nice thing with recent versions of Zotero notes is that they support markdown plus rich text, meaning you can be pretty detailed. So how do I do an initial parse without having to read all the papers I've added? I add three sections to the note. The first is a summary section; for this I use ChatGPT, SciSpace, or Paper Digest. Then I look through the paper/document and screengrab any figures that stand out to me for whatever reason. Finally, I mark the priority level: do I think this is a high-priority paper to read or not? The reason to do that in the note and not with a tag is that reading priority is a transient state of an entry; it will change over time and eventually become null once the paper is read.

Once I'm done creating the notes, I can generate a Zotero report for a specific folder, which may represent a topic or a specific project, that compiles all the metadata and note text for its entries. I can then go through it again to see what stands out to me. This, in my opinion, is a really nice feature, because it lets me visually go through the titles and my notes and see what I want to focus on first. There are some fancy tools out there that can create graphs of connectivity and other relationships between papers, which would probably also be very useful if you're trying to narrow down papers to read.

Reading Literature


Once I'm done selecting which papers I want to read from the Zotero report I generated, I print out the papers. Yes, I know printing doesn't make much sense in our multi-monitor research setups, but I can't seem to shake the desire to read a paper in physical form. There is something about being able to flip back and forth between different sections and how I represent concepts and results in my mind's eye. I believe there is some strong evidence for better information recall when reading in physical form.

For reading the papers, there is no real best approach in my opinion; you just have to sit down and read in the way that works best for you. I personally try not to spend too much time marking up the document. I will typically just add some kind of marking to indicate a passage or piece of content I find interesting or important. Once I'm done reading, I go to the digital PDF of the paper in Zotero and make annotations and highlights of the parts I've marked on the physical copy. This is important because when I'm writing and want to reference the document, I use these annotations and highlights as a guide for why I wanted to reference the paper in the first place.

Referencing


If you use MS Word with the Zotero plugin, then it's pretty straightforward to create in-text citations and a bibliography. If you are a $\LaTeX$ user, as mentioned above, I've found the best way to set up your documents is to use a cloud storage service and the Better BibTeX plugin. This lets you automatically keep the .bib file for a given folder in your Zotero library up to date and saved on your cloud storage. This way, if you use something like Overleaf, you just have to create a shared link for the .bib file in your cloud storage and then add it to an Overleaf project based on that link. For Google Drive, you can follow the steps here.




Thursday, March 16, 2023

Bayes Rule: Visual Refresher

Bayes' rule is a natural outcome for most people familiar with probability theory. In words, it tells us how to update the probability of a random variable (or variables) given that some event has occurred and that we have some prior knowledge or belief about the probability of that random variable from earlier events. The algebra to get to Bayes' rule is simple, but I find it always best to have a more spatial perspective on what Bayes' rule is really stating.

I'll begin with a 2D square sample space, $\it{S}$. This space is discrete, and we can represent each outcome as a tiny square, $\it{s}$. In this case, we will have a total of 16 tiny squares in $\it{S}$. This means there is a 1/16 chance that any given square is randomly selected, hence $\mathrm{P}(\it{s}_i)$:

$$\begin{array}{|c|c|c|c|}\hline \it{s_1} & \it{s_2} & \it{s_3} & \it{s_4} \\ \hline \it{s_5} & \it{s_6} & \it{s_7} & \it{s_8} \\ \hline \it{s_9} & \it{s_{10}} & \it{s_{11}} & \it{s_{12}} \\ \hline \it{s_{13}} & \it{s_{14}} & \it{s_{15}} & \it{s_{16}} \\ \hline \end{array}$$

$$\mathrm{P}(\it{s}_i) = 1/16$$

Now say we have the scenario where we are only interested in two subspaces of $\it{S}$: $\it{S}_A$ and $\it{S}_B$. More specifically, we want to know the probability of a square randomly occurring in each of these subspaces given that it occurs in $\it{S}$, and the probability of a square occurring in the intersection, or stated differently, the probability of a square occurring in both $\it{S}_A$ and $\it{S}_B$.

With this we have the following quantities: $\mathrm{P}(\it{s}_A)$, $\mathrm{P}(\it{s}_B)$, and $\mathrm{P}(\it{s}_A \cap \it{s}_B)$. The updated image would look like:

The probability $\mathrm{P}(\it{s}_A)$ is shown in red, $\mathrm{P}(\it{s}_B)$ in blue, and the overlap is $\mathrm{P}(\it{s}_A \cap \it{s}_B)$. Keep in mind that $\mathrm{P}(\it{s}_A \cap \it{s}_B) = \mathrm{P}(\it{s}_B \cap \it{s}_A) = 1/8$.

The question we usually want to ask is not about the joint probability, i.e., the probability of a square being in both $\it{S}_A$ and $\it{S}_B$, but rather: what is the probability of a square in $\it{S}_A$ given that a square in $\it{S}_B$ has been picked/occurred, or vice versa? So what does this mean? We want to compare the probability of the joint space to that of the given space where the event has occurred:

\begin{equation} \mathrm{P}(\it{s}_A | \it{s}_B) = \frac{\mathrm{P}(\it{s}_A  \cap \it{s}_B)}{\mathrm{P}(\it{s}_B)}\label{eq:bayes1}   \end{equation} 

and

\begin{equation} \mathrm{P}(\it{s}_B| \it{s}_A) = \frac{\mathrm{P}(\it{s}_A  \cap \it{s}_B)}{\mathrm{P}(\it{s}_A)} \label{eq:bayes2}\end{equation}

Notice how these two equations are not the same, but the probability of the joint space is: $\mathrm{P}(\it{S}_A \cap \it{S}_B) = \mathrm{P}(\it{S}_B \cap \it{S}_A)$. This has to be the case just by looking at the illustration with the colored cells above.

The key is that we can now relate the conditional probabilities, that is, the probability of a cell in one subspace given that a cell in the other subspace has been picked or occurred, by rearranging eq. \ref{eq:bayes1} and eq. \ref{eq:bayes2} for the joint probability and then substituting terms to get:

\begin{equation*} \mathrm{P}(\it{s}_A | \it{s}_B) \mathrm{P}(\it{s}_B) = \mathrm{P}(\it{s}_B | \it{s}_A) \mathrm{P}(\it{s}_A)\end{equation*}

which is rearranged to get the typical Bayes formula:

\begin{equation}\mathrm{P}\left(\it{s}_A | \it{s}_B\right)  = \frac{\mathrm{P}\left(\it{s}_B | \it{s}_A\right) \mathrm{P}\left(\it{s}_A\right)}{\mathrm{P}\left(\it{s}_B\right)} \label{eq:bayesformula}\end{equation}.

At first, eq. \ref{eq:bayesformula} might seem like an expected result; it is just an outcome of analyzing the probabilities of subspaces. The impact, though, is in how one can use this equation to update knowledge. Let us break down the terms in eq. \ref{eq:bayesformula}.

The first term in the numerator is called the likelihood. It indicates how probable an event in $\it{S}_B$ is given that an event in $\it{S}_A$ occurs. It can also represent the probability of the observed data given the model and its parameters. The second term in the numerator, the prior, encodes previous knowledge about the observations or parameters. Finally, the denominator can be interpreted as the probability of observing a cell in $\it{S}_B$, or you can think of it as the probability of the data averaged over all possible values of the model parameters.
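In the more common model-and-data notation, the same structure reads (this is just a restatement of eq. \ref{eq:bayesformula} in standard form, not something specific to the grid example):

$$\mathrm{P}(\theta \mid D) = \frac{\mathrm{P}(D \mid \theta)\,\mathrm{P}(\theta)}{\mathrm{P}(D)}, \qquad \mathrm{P}(D) = \int \mathrm{P}(D \mid \theta)\,\mathrm{P}(\theta)\, d\theta$$

where $\theta$ denotes the model parameters and $D$ the observed data; the denominator is exactly the "data averaged over all possible values of the model parameters" mentioned above.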

An important aspect of eq. $\ref{eq:bayesformula}$ is that, in the case of probability density functions, the posterior must integrate to one; the denominator is the normalization constant that ensures this. This just means that over the whole space of possible outcomes, something must have happened.

In the example given, the probabilities are just uniform discrete values, so we obtain a posterior probability that is just a number representing our updated knowledge about the probability of a cell being in $\it{S}_{A}$ given that the cell is in $\it{S}_{B}$. This is a particularly simple and maybe intuitive outcome. What is typically more useful is when we have a probability density function that represents our prior knowledge about an event/outcome and we want to determine the posterior distribution. We then choose a likelihood that encodes information about what has been observed and make inferences by sampling the constructed posterior distribution.
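To make the grid example concrete, here is a minimal numerical check. The 2-cell overlap (1/8) is taken from the text above, but the specific cells assigned to $\it{S}_A$ and $\it{S}_B$ are assumptions on my part (they come from the colored figure), chosen here as 4 cells each:

from fractions import Fraction

# The 16 equally likely cells of the sample space S.
S = {f"s{i}" for i in range(1, 17)}

# Assumed subspaces: 4 cells each with a 2-cell overlap (so 1/8, as stated above).
S_A = {"s6", "s7", "s10", "s11"}   # "red" subspace (assumption)
S_B = {"s7", "s8", "s11", "s12"}   # "blue" subspace (assumption)

def prob(subset):
    return Fraction(len(subset), len(S))

p_A, p_B = prob(S_A), prob(S_B)        # 1/4 and 1/4
p_AB = prob(S_A & S_B)                 # 2/16 = 1/8

# Conditional probability directly from the definition ...
p_A_given_B = p_AB / p_B
# ... and via Bayes' rule, P(A|B) = P(B|A) P(A) / P(B).
p_B_given_A = p_AB / p_A
p_A_given_B_bayes = p_B_given_A * p_A / p_B

print(p_A_given_B, p_A_given_B_bayes)  # both print 1/2

With these assumed subspaces, both routes give $\mathrm{P}(\it{s}_A | \it{s}_B) = 1/2$, as they must.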



Thursday, March 9, 2023

Book Review: Quantum Entanglement, Jed Brody

I just finished reading this short monograph on quantum entanglement. The approach taken by the author is to convey what quantum entanglement is through conceptual examples; there are no wave functions or quantum states discussed in this book. At first, the reader is introduced to two very important concepts in the philosophy of physics: realism and locality. In realism, the assumption is that physical objects have properties regardless of whether another object with agency (i.e., a person) is observing them. A typical example of this concept is the following question:

Does a falling tree in the forest make a sound when no one is listening? 

Realism says yes, it does. In the case of the tree, it has a center of mass that gives it gravitational potential energy, which upon falling is converted to kinetic energy and then generates sound waves in the air once it hits the ground. The tree had mass, potential energy, and kinetic energy, which according to realism exist objectively. The opposing view is that the sound wave only came into existence because an agent was listening. This seems absurd, and it is in classical physics, but not necessarily in quantum physics.

Locality refers to the fact that observing, measuring, or disturbing objects in a region of a space does not affect other objects at arbitrary distances in that space. Here, I'm using space in an abstract sense, not necessarily a Euclidean 3D space. I do note that in the book the discussion of locality is with regard to distances in 3D Euclidean geometry, but I think I'm correct that locality also applies to non-Euclidean spaces, and that would be the more general statement. Locality is a pretty important concept in physics and is one of the reasons we got the famous EPR paper from Einstein.

After the book presents these two concepts, it gradually moves into the concept of hidden variables, that is, properties of objects that can determine the outcome of an observation yet are never observed themselves. Hidden variables satisfy realism. Much of the subsequent chapters present examples that lead to the famous Bell inequality, which arises from correlations in probabilities. The Bell inequality needs to be satisfied for a theory to be local; if it is violated, the theory is non-local. As it turns out, at least to the extent that we can experimentally test it, quantum mechanics is a non-local theory without hidden variables: all experiments that have been conducted to date violate Bell's inequality and suggest that correlations are instantaneous within the quantum mechanical framework. It should be noted that you could have a non-local quantum hidden-variable theory (i.e., Bohmian mechanics) that describes the experimental results, but I guess the argument against this is: why introduce hidden variables if they don't provide any additional clarity beyond satisfying realism?

It is pretty well documented, or at least we are made to think, that Einstein had serious issues with the non-local (dubbed "spooky action at a distance") behavior of quantum theory, as well as with the mainstream interpretations not satisfying a realist philosophical perspective. More specifically, the Copenhagen interpretation posits that the wavefunction/quantum state is more of a mathematical tool and is not necessarily a physical object, since it only provides a way to extract probabilities of observable properties.

Going back to the book, chapters 3 and 4 provide different, simple experimental setups that look at probabilities and their correlations to arrive at Bell's inequality. The author then reminds the reader that quantum mechanics violates this inequality. Chapters 1-4 are written in a direct and comprehensible manner, but the truth is, I find it easier to understand the Bell inequality and its violation by following the simple linear algebra of quantum theory. Trying to think through all the words describing the setups and outcomes can become burdensome. Given the current focus on quantum computing, there are a lot of good books that go through the same results using simple linear algebra. I think it would have been easy to introduce most readers interested in this book to the basics of a qubit, Hilbert space, and the corresponding operations, which could help them understand these concepts more easily.
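The book stays entirely with words, but as a small sketch of what the linear-algebra route looks like, here is a short numpy calculation of the CHSH form of Bell's inequality for the two-qubit singlet state. The measurement angles below are the standard choices that maximize the quantum violation; none of this is taken from the book.

import numpy as np

# Pauli matrices and the two-qubit singlet state (|01> - |10>)/sqrt(2).
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)

def spin_measurement(theta):
    """Spin measurement along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * Z + np.sin(theta) * X

def correlation(theta_a, theta_b):
    """Expectation value E(a, b) = <psi| A(a) (x) B(b) |psi> for the singlet."""
    op = np.kron(spin_measurement(theta_a), spin_measurement(theta_b))
    return singlet @ op @ singlet

# Standard CHSH measurement settings.
a0, a1 = 0.0, np.pi / 2
b0, b1 = np.pi / 4, 3 * np.pi / 4

S = (correlation(a0, b0) - correlation(a0, b1)
     + correlation(a1, b0) + correlation(a1, b1))
print(f"|S| = {abs(S):.3f}")  # 2.828, exceeding the local hidden-variable bound of 2

Any local hidden-variable theory is constrained to |S| <= 2, while this little calculation gives 2*sqrt(2), which is exactly the violation the book spends two chapters building up in words.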

Chapter 5 goes through the potential inconsistencies of quantum mechanics with special relativity. Personally, I found this chapter was not delivered in the most impactful way, but it does address the original concerns of physicists. The concluding sentence, which indicates everything is okay in the end, is:

"... the linkage between entangled particles conveys neither mass nor messages"

The author ends the book with a chapter regarding realism and its validity.  I think this is the best section of the book. The author gives their thinking on the topic of local realism by stating:

"The only fact that's (almost) certain is local realism cannot account for measured results"

Thus, local realism is a dead concept in the author's eyes. As frustrating as that feels, I would agree with the author. The remainder of the chapter deals with interpretations of quantum theory from philosophical perspectives, and you get a nice concrete quadrant table to decide what path to take, namely:

Find falsehoods in assumptions
Abandon locality & realism
Abandon locality & keep realism
Abandon realism & keep locality

I'm not going to go through and explain each of these because I want to leave some excitement, but I think this is the most interesting part of the book. 

I recommend reading this book if you're going to be studying quantum mechanics in any way, because it will help with some of the philosophical thinking behind the theory. The reading is extremely accessible to any background, and it is very short, making for a good weekend read. Here's the book:

MIT Press Store


Thursday, March 2, 2023

Fine-tuning GPT-3 for a LAMMPS or VASP AI chatbot

The GPT API enables fine-tuning of the GPT model for your specific application. I'm interested in utilizing this to create a new tool that would allow a user to query a software user manual to generate macros or scripts that perform operations. The idea is to put together a series of prompts and completions extracted from user forums like Stack Overflow/Exchange, Discourse, etc., as well as from domain users who are willing to contribute. For example, I'm curious about creating a GPT chatbot that can provide users with LAMMPS or VASP scripts based on text prompts describing the problem. At the moment, ChatGPT tries to do this but fails to get enough of the specific commands and parameters correct.

What I'm thinking is that if you have a dataset with prompts and completions like:


  {"prompt": "What is the command for\n 
computing thermal conductivity in LAMMPS",\n
"completion": "In order to calculate the\n
thermal conductivty using the Green-Kubo formulas,\n
the heat flux needs to be calculated.\n
The command to do so is:\n
compute ID group-ID heat/flux ke-ID pe-ID stress-ID"}

My hope is that if you fine-tune the GPT model with these examples, the user can just ask an AI chatbot, more broadly, something like:

Please create a LAMMPS input script to calculate the thermal conductivity of graphite at 300K.

Would this approach work for fine-tuning a GPT model? I don't really know; I'm planning on giving it a go. I also need to be cognizant that the number of tokens in the fine-tuning dataset doesn't make it a costly disaster.
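For reference, a minimal sketch of what kicking off such a job looks like with the openai Python package's legacy fine-tuning endpoints, assuming the dataset has already been written to a JSONL file (the file name and base model are placeholders, and exact call signatures may differ between package versions):

import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Upload the JSONL dataset of prompt/completion pairs (placeholder file name).
training_file = openai.File.create(
    file=open("lammps_finetune.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tune job on a base GPT-3 model (placeholder model choice).
job = openai.FineTune.create(
    training_file=training_file.id,
    model="davinci",
)
print(job.id)

# Very rough cost sanity check: roughly 4 characters per token on average.
n_chars = os.path.getsize("lammps_finetune.jsonl")
print(f"~{n_chars / 4:.0f} tokens in the training file (rough estimate)")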

What I'm wondering is if there is a way to grab the questions and answers in a JSON format from the LAMMPS Discourse community and similar sources to create the fine-tuning dataset; a rough sketch of this is given below. If not, it would be very time-consuming to gather this domain knowledge from individuals, although I guess I could create some kind of community input form where users provide it. I would do the same for VASP and hopefully most of the other mainstream atomistic packages. I have a name for a LAMMPS AI chatbot but need to ask the person first if it's okay to eponymize the chatbot after them.
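As a starting point, here is a minimal sketch of pulling a single thread. Discourse forums generally return a topic as JSON if you append .json to the topic URL; the URL below is a made-up placeholder, and treating the first reply as the answer is a simplification:

import json
import re
import requests

# Hypothetical topic URL; appending .json to a Discourse topic URL
# typically returns the whole thread as JSON.
TOPIC_URL = "https://matsci.org/t/example-lammps-question/12345.json"

def strip_html(html):
    """Crudely strip HTML tags from a Discourse 'cooked' post body."""
    return re.sub(r"<[^>]+>", " ", html).strip()

resp = requests.get(TOPIC_URL, timeout=30)
resp.raise_for_status()
posts = resp.json()["post_stream"]["posts"]

# Simplification: first post is the question, first reply is the answer.
question = strip_html(posts[0]["cooked"])
answer = strip_html(posts[1]["cooked"]) if len(posts) > 1 else ""

# Append the pair to a JSONL fine-tuning dataset, using the separator and
# stop-sequence conventions from the legacy prompt/completion fine-tuning guide.
record = {"prompt": question + "\n\n###\n\n", "completion": " " + answer + " END"}
with open("lammps_finetune.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")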

