
Thursday, December 16, 2021

The Framework of Quantum Theory

This post is my rewording of the postulates of quantum mechanics/theory. I'm trying to do it first without using mathematical notation; we will see how well I do. Then I'll add the math.

Many texts have slightly different ways of presenting the postulates of quantum theory, and it usually depends on which domain the framers are coming from. I particularly like the quantum computing perspective because it is abstracted from any particular physical system and is discussed strictly in terms of the mathematics. References are provided at the end of the post.

What is meant by quantum theory?

To answer this question I would first like to understand what is meant by theory. I'll use the basic definition:
An uncertain belief or a system of ideas intended to explain something, usually using a general and abstract framework.

You may ask how this differs from a law. To be honest, the difference between a "law" and a "theory" seems mostly semantic, but usually a law is based strictly on factual observations, whereas by the definition above a theory is a hypothesis or concept used to explain observations.

So how does quantum theory get formed? Well, we invoke a "system of ideas" in the form of postulates. A postulate is defined as:

Something assumed to be true (i.e., an axiom) or factual, for use as a foundation for reasoning.

Therefore, what is done in quantum theory is to put forth postulates that act as the foundation of the theory and are applied to every system we are interested in describing within the theory. Now let's list what the pioneers of quantum theory took as the postulates.

Postulates of quantum theory 

Depending on which text you reference, a different number of postulates (5-7) may be presented, sometimes in a different order; here I'll present 5.

Below I will first state the postulates in general language and then reiterate them in a stronger mathematical sense.

  1. A physical system can be completely described by a quantum state existing in a vector space (i.e. Hilbert space). The information extracted from the quantum state about the physical system is given in terms of probability.
  2. To interpret a classical physical quantity (e.g. momentum) associated with a quantum state, a corresponding operator exists that acts upon the quantum state.
  3. The realization of observing an operator, that is measurement of a physical system described by a quantum state acted upon by an operator, gives a measurable outcome (real-valued). In addition, the quantum state collapses to a specific point in vector space corresponding to the measured value.
  4. The average value measured of a physical quantity associated with the quantum state is provided by the probabilistic expected value (i.e., expectation) of the operator.
  5. The deterministic time evolution of the quantum state is given by the Schrödinger equation.

The framework is not too crazy: you just have to buy into the fact that there exists an abstract object called the quantum state (p-1) that evolves according to the Schrödinger equation (p-5), and that any information in the classical physical sense is obtained through operators that can be measured.

One side note: we probably don't say "the laws of quantum mechanics" because, as mentioned, laws are derived from factual observations, and we never actually observe the quantum state directly. Remember, postulates 1-3 tell us we only see real-valued outcomes with some probability. As a matter of fact, we don't even know whether the quantum state is epistemic (i.e., it represents the knowledge we have) or ontological (i.e., it represents the nature of reality). Stated more succinctly: does the quantum state correspond to information about reality or to reality itself?

A more mathematical form

Here I'll restate the postulates but with a more mathematical signature, but first I'm going to define the core concept in quantum theory, the quantum state:
$|\Psi\rangle$ is the quantum state, a vector in a Hilbert space (a finite- or infinite-dimensional vector space equipped with an inner product), and the amplitudes of the components of the quantum state vector are complex-valued. The specific notation, the Greek letter $\Psi$ placed between a vertical line and a right angle bracket, is called a "ket" and indicates a vector in Hilbert space. In many physics- and chemistry-oriented problems the quantum state is represented in the position basis, i.e., the amplitudes of the quantum state vector depend on position, so we can define a continuous function $\Psi(\mathbf{r}) = \langle \mathbf{r} | \Psi \rangle$, ubiquitously dubbed the "wave function". The term $\langle \mathbf{r} |$ is the dual vector (the "bra") of the position basis state $|\mathbf{r}\rangle$; taking its inner product with $|\Psi\rangle$ picks out the amplitude of the quantum state at position $\mathbf{r}$.

And now the postulates with some mathematical notation:

  1. A physical system is described by a quantum state, $|\Psi\rangle$, that exists in a Hilbert space $\mathcal{H}$ (for an $N$-dimensional system, $\mathcal{H} = \mathbb{C}^N$) and can be written as a linear combination of elements of $\mathcal{H}$, for example, $|\Psi\rangle = \sum_i c_i |\psi_i\rangle$. The quantum state carries information about the physical system, which is extracted in terms of probability: for an orthonormal basis $\{|\psi_i\rangle\}$, the probability associated with $|\psi_i\rangle$ is $|c_i|^2 = |\langle \psi_i | \Psi \rangle|^2$, and normalization requires $\langle \Psi | \Psi \rangle = \sum_i |c_i|^2 = 1$.
  2. To interpret a classical physical quantity associated with a quantum state, $|\Psi\rangle$, a corresponding operator, $\hat{O}$, exists that acts upon the quantum state linearly, $\hat{O}|\Psi\rangle$. For a measurable quantity the operator is Hermitian, so its eigenvalues are real. The operator is "moving" the quantum state around in Hilbert space.
  3. The realization of observing an operator acting on a quantum state is framed here in terms of a projective measurement (a projection-valued measure, which is a special case of the more general positive operator-valued measure, POVM). If we have the operator, $\hat{O}$, it can be written as a weighted sum of projection operators, $\hat{O}=\sum_i \lambda_i \Phi_i$, where the $\lambda_i$ are the real eigenvalues of $\hat{O}$ and the projectors $\Phi_i$ each act on a subspace, are mutually orthogonal, and sum to the identity, $\sum_i \Phi_i = \hat{I}$. Measurement yields the outcome $\lambda_i$ with probability $\langle \Psi | \Phi_i | \Psi \rangle$, and the quantum state then "collapses" to $|\Psi^{\ast}\rangle=\frac{\Phi_i |\Psi\rangle}{\sqrt{\langle \Psi | \Phi_i | \Psi \rangle}}$. Keep in mind that what we measure physically are the eigenvalues, with outcomes distributed according to these probabilities.
  4. The expected measured value of an operator, $\hat{O}$, given a quantum state $|\Psi\rangle$, is the expectation value, $\langle O \rangle = \langle \Psi| \hat{O} | \Psi \rangle$.
  5. The evolution of the quantum state is governed by the time-dependent Schrödinger equation, $i \hbar \frac{d}{dt} |\Psi(t)\rangle = \hat{H}|\Psi(t)\rangle$, where $\hat{H}$ is the Hamiltonian operator. This form reflects the requirement that the time evolution of the quantum state be unitary (i.e., it preserves inner products); for a time-independent Hamiltonian the solution is $|\Psi(t)\rangle = \hat{U}(t)|\Psi(0)\rangle$ with $\hat{U}(t) = e^{-i  \hat{H} t / \hbar}$.
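
To make the five postulates concrete, here is a minimal sketch for a single two-level system (a qubit) using only the Python standard library. Everything specific in it is my own assumption for illustration: the amplitudes of the state, the choice of the Pauli-Z matrix as the observable, the Hamiltonian $\hat{H} = \frac{\omega}{2}\hat{Z}$ with $\hbar = 1$, and the helper names `inner`, `apply`, and `measure`.

```python
import cmath
import math

def inner(a, b):
    """Inner product <a|b>; the first argument is complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply(op, v):
    """Apply an operator (matrix stored as a list of rows) to a state vector."""
    return [sum(o * x for o, x in zip(row, v)) for row in op]

# Postulate 1: a normalized state |Psi> = c_0|0> + c_1|1> (illustrative amplitudes).
psi = [math.sqrt(0.8), 1j * math.sqrt(0.2)]
norm = inner(psi, psi).real            # <Psi|Psi>, should equal 1
probs = [abs(c) ** 2 for c in psi]     # probabilities |c_i|^2

# Postulates 2-3: observable Z = (+1) P0 + (-1) P1, built from orthogonal projectors.
eigenvalues = [1, -1]
projectors = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
Z = [[1, 0], [0, -1]]

def measure(state, projector):
    """Return (probability, collapsed state) for one projective outcome."""
    projected = apply(projector, state)
    p = inner(state, projected).real   # <Psi|Phi_i|Psi>
    collapsed = [x / math.sqrt(p) for x in projected]
    return p, collapsed

outcomes = [measure(psi, P) for P in projectors]

# Postulate 4: <Z> = <Psi|Z|Psi>, which equals sum_i lambda_i * p_i.
expectation_Z = inner(psi, apply(Z, psi)).real
weighted_sum = sum(lam * p for lam, (p, _) in zip(eigenvalues, outcomes))

# Postulate 5: H = (omega/2) Z with hbar = 1, so U(t) = exp(-i H t) is diagonal.
omega, t = 1.0, 0.7
U = [[cmath.exp(-0.5j * omega * t), 0], [0, cmath.exp(0.5j * omega * t)]]
psi_t = apply(U, psi)
norm_t = inner(psi_t, psi_t).real      # unitary evolution preserves the norm

print(probs, expectation_Z, weighted_sum, norm_t)
```

The expectation value computed directly from $\langle \Psi | \hat{Z} | \Psi \rangle$ agrees with the eigenvalue-weighted sum of the outcome probabilities, and the norm of the state is unchanged by the evolution. A diagonal Hamiltonian was chosen only so the exponential can be written down by hand; a general $\hat{H}$ would need a matrix exponential.
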
Finally, I'll conclude with a quote from Paul Dirac, an interesting physicist with tremendous mathematical acumen.
“God is a mathematician of a very high order and He used advanced mathematics in constructing the universe.” — Paul A. M. Dirac

 

References


Reuse and Attribution

Thursday, July 29, 2021

Bayesian Parameter Inference

Let me first start with how I understand what Bayes' theorem is about; a more detailed introduction can be found in the book by D.S. Sivia [1]. The main goal is to assess the probability of a guess, hypothesis, or model (the posterior probability). To do so you combine two ingredients: a probability function, chosen using the information you have, for the data given the guess (the likelihood), and a probability function describing the guess, hypothesis, or model before seeing the data (the prior). In mathematical form this is written as:

 $$ \underbrace{\text{prob}(H|X,I)}_{\text{posterior}} \propto \underbrace{\text{prob}(X|H,I)}_{\text{likelihood}} \times \underbrace{\text{prob}(H|I)}_{\text{prior}} $$,

where $H$ is the guess, hypothesis, or model parameters, $X$ is the actual observed outcome, results, or data, and $I$ is ancillary information that we know but that may not be key. The term on the left-hand side of the proportionality is called the posterior probability density function (pdf); it provides the probability information about $H$ given $X$ (and $I$). A proportionality is used rather than an equality because I haven't included the normalizing term (the marginal likelihood, or evidence), which is $\text{prob}(X|I)$ and is not always trivially known since we may not have sufficient $I$ to describe it.
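
As a small illustration of the proportionality above, the posterior can be evaluated on a grid and normalized numerically. The sketch below is only one way to do it, under assumptions of mine that are not part of the discussion above: a coin-flip problem where $H$ is the unknown probability of heads, made-up data (7 heads in 10 tosses), a flat prior, and a binomial likelihood.

```python
import math

# Hypothetical data: 7 heads in 10 tosses; H is the unknown probability of heads.
heads, tosses = 7, 10

# Discretize H on a uniform grid over [0, 1].
n_grid = 1001
grid = [i / (n_grid - 1) for i in range(n_grid)]
dH = grid[1] - grid[0]

def prior(h):
    """Flat prior prob(H|I): every bias is equally plausible beforehand."""
    return 1.0

def likelihood(h):
    """Binomial likelihood prob(X|H,I) for the observed number of heads."""
    return math.comb(tosses, heads) * h ** heads * (1 - h) ** (tosses - heads)

# Posterior is proportional to likelihood times prior.
unnormalized = [likelihood(h) * prior(h) for h in grid]

# The normalizing term prob(X|I) is the integral of likelihood * prior over H,
# approximated here by a simple sum over the grid.
evidence = sum(unnormalized) * dH
posterior = [u / evidence for u in unnormalized]

# Grid point where the posterior pdf is largest.
best = grid[posterior.index(max(posterior))]
print(f"evidence = {evidence:.4f}, most probable H = {best:.3f}")
```

In this one-parameter toy problem the normalizing term is cheap to compute; the difficulty mentioned above shows up when $H$ has many dimensions and the integral can no longer be done on a grid.
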

When applying Bayes' theorem to estimate the best parameters and their reliability (uncertainty), one seeks the optimal parameters for $H$. The general idea would be:

  1. Take the $\log$ of the posterior pdf ($L = \log \;\text{prob}(H|X,I)$) to handle different scales of $H$, and expand it in a Taylor series around the optimal parameter $H_o$, namely: $$ L(H) = L(H_o) + (H-H_o)\frac{dL}{dH}\bigg|_{H_o} + \frac{1}{2}(H-H_o)^2 \frac{d^2L}{dH^2}\bigg|_{H_o} + \cdots $$
  2. From calculus, the optimal value $H_o$ is the point where the derivative vanishes, $\frac{dL}{dH}|_{H_o}=0$, so the linear term in the expansion is zero. Thus when $L$ and $\frac{dL}{dH}$ can be evaluated analytically, it is straightforward to calculate $H_o$.
  3. A key point is that the non-linear terms, more specifically the quadratic term, can be used to determine the uncertainty (error) on the optimal value $H_o$.
So what does this do if I want to know the reliability of the parameters of a hypothesis, guess, or model? Well, if we take the exponential of the expression in item 1 and ignore terms beyond the quadratic, we obtain the probability distribution:

$$ \text{prob}(H|X,I) \propto A \exp \left(\frac{1}{2}\frac{d^2 L}{dH^2}\bigg|_{H_o} \left(H-H_o \right)^2 \right) $$

where $A$ is a normalization constant. Upon comparison with a Gaussian/normal distribution, we find that the terms correspond as

$$ \begin{align} \mu &= H_o \\ \sigma &= \left(-\frac{d^2 L}{dH^2}\bigg|_{H_o}\right)^{-\frac{1}{2}} \end{align} $$,

therefore the optimal parameters occur where $\frac{dL}{dH}|_{H_o} = 0 $ and $\frac{d^2 L}{dH^2}|_{H_o} \lt 0$. This provides the reliability of the parameters of $H$ corresponding to $X$, or more succinctly, we can infer the quality of parameters fitted to observed data by utilizing probability distribution functions and update our knowledge using Bayes' theorem. Ultimately this corresponds to calculating the familiar mean and standard deviation, $\mu$ and $\sigma$, for some guess, hypothesis, or model given some actual observation, data, or results.
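
To see the recipe produce numbers, here is a hedged sketch for a coin-flip problem of my own choosing (7 heads in 10 tosses, flat prior, binomial likelihood; none of this comes from the discussion above). For that model $L(H) = k \log H + (N-k)\log(1-H) + \text{const}$, so setting $\frac{dL}{dH}=0$ gives $H_o = k/N$, and the second derivative gives $\sigma = \sqrt{H_o(1-H_o)/N}$. The code estimates the curvature by finite differences and compares it with that analytic result; `second_derivative` is an illustrative helper, not a library function.

```python
import math

heads, tosses = 7, 10   # hypothetical data; a flat prior is assumed

def log_posterior(h):
    """L(H) up to an additive constant: binomial log-likelihood with a flat prior."""
    return heads * math.log(h) + (tosses - heads) * math.log(1 - h)

# Setting dL/dH = 0 gives H_o = heads / tosses for this model.
h_opt = heads / tosses

def second_derivative(f, x, step=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + step) - 2 * f(x) + f(x - step)) / step ** 2

# Curvature of L at the optimum; it must be negative at a maximum.
curvature = second_derivative(log_posterior, h_opt)
sigma = (-curvature) ** -0.5

# Analytic result for comparison: sigma = sqrt(H_o (1 - H_o) / N).
sigma_exact = math.sqrt(h_opt * (1 - h_opt) / tosses)

print(f"H = {h_opt:.3f} +/- {sigma:.3f} (analytic sigma = {sigma_exact:.3f})")
```

The Gaussian summary is only as good as the quadratic truncation: with few tosses, or with $H_o$ close to 0 or 1, the true posterior is noticeably skewed and $\mu \pm \sigma$ should be read as a rough guide.
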

The quote I picked is from Nate Silver, a political statistician; it captures the essence of Bayes' theorem and why it is such a natural thought process to use in science:

"Under Bayes' theorem, no theory is perfect. Rather, it is a work in progress, always subject to further refinement and testing." 
-Nate Silver 
References
[1]. D.S. Sivia, Data Analysis: A Bayesian Tutorial, 2nd ed, Oxford University Press, 2012. See Ch. 2.2 pp 20-22

