```John Kerl
kerl at math dot arizona dot edu
2008-07-28

Subroutine index:
* MAIN LOOP
* METROPOLIS SWEEP
* COMPUTE DELTA_N2
* UNIFORM RANDOM PERMUTATION
* COMPUTE RHO_L
* GET CYCLE COUNTS
* COMPUTE RHO_INFTY

================================================================
MAIN LOOP:

Input: Given T, alpha, L, d, N_acc.

Output:  rho_infty(T) which is a real number.

Remark:  During a full simulation suite, this program must be called many
times, with differing values of T, alpha, and L.  This program only computes
one data point.

----------------------------------------------------------------
Allocate an L x L x L matrix of points.
Each point contains the following information:

* Its own i/j/k coordinates on the lattice, each an integer between 0 and L-1

* A forward pointer and a backward pointer.  (A pointer is a RAM address of a
data structure.  Pointers are an even more efficient implementation of your
"short vector" concept.  Pointers exist in the C language, but not in
Fortran.  If they existed in Fortran, presumably your Marseilles
collaborators would have used them.)

* The squared distance for the forward permutation jump.  This is a value
that is computed during a Metropolis proposal, so we save it for the
next step instead of recomputing it.

k=5  o----o----o----o----o----o
/    /    /    /    /    /|
k=4  o----o----o----o----o----o o
/    /    /    /    /    /|/|
k=3  o----o----o----o----o----o o o
/    /    /    /    /    /|/| |
k=2  o----o----o----o----o----o o o o
/    /    /    /    /    /|/| | |
k=1 o----o----o----o----o----o o o o o
/    /    /    /    /    /|/| | | |
j=5 k=0 o----o----o----o----o----o o x======> Stored at each node x:
|    |    |    |    |    |/|/|/|/|/   * Lattice coordinates (i,j,k)
j=4     o----o----o----o----o----o o o o o    * Pointer to pi(x)
|    |    |    |    |    |/|/|/|/     * Pointer to piinv(x)
j=3     o----o----o----o----o----o o o o      * |x-pi(x)|^2
|    |    |    |    |    |/|/|/
j=2     o----o----o----o----o----o o o
|    |    |    |    |    |/|/
j=1     o----o----o----o----o----o o
|    |    |    |    |    |/
j=0     o----o----o----o----o----o

i=0  i=1  i=2  i=3  i=4  i=5

At startup, the permutation is either identity, or uniform random.  An
algorithm to generate a uniform-random permutation is well-known; see the
UNIFORM RANDOM PERMUTATION subroutine below.

The initial system energy is computed as follows:
H := 0.25 / beta * (sum over x of |x-pi(x)|^2) + alpha * number of two cycles.

The initial number of two-cycles is computed as follows:
N2 := 0
For i from 0 to L-1:
For j from 0 to L-1:
For k from 0 to L-1:
x := lattice point at (i,j,k)
pi(x) := x forward pointer
pi^2(x) := pi(x) forward pointer
if (x != pi(x) and x == pi^2(x)):
N2 := N2 + 1
N2 := N2 / 2.  (Compensate for double-count.)

rho_L = array from 0 to N of real numbers.

(Thermalization loop)
Thermalized := false.
Until thermalized:
Smooth H over a 200-point window.  Set Thermalized := true if there are
> 40 turning points in the smoothed energy.  (This smoothing computation
takes less than 0.01% of the CPU time.  Therefore, the details of the
smoothing are irrelevant for performance analysis.)

Call the subroutine METROPOLIS SWEEP. (See below.)

(Accumulation loop)
For i := 1 to N_acc:
Call the subroutine COMPUTE RHO_L.  (See below.)
For k := 0 to N:
rho sum[k] := rho sum[k] + rho_L[k].
Call the subroutine METROPOLIS SWEEP. (See below.)

(Scale rho_sum to rho_avg)
For k := 0 to N:
rho_avg[k] := rho_sum[k] / N_acc

Call subroutine COMPUTE RHO_INFTY.  (See below.)

================================================================
SUBROUTINE METROPOLIS SWEEP:

Input:   Lattice points, T, alpha, H (system energy, N2 (number of two cycles).

Output:  Perhaps modify the permutation of the lattice points.
H and N2 are also modified, if the new permutation was accepted.

----------------------------------------------------------------
There are six jump deltas to choose y near pi(x):
( 1, 0, 0 ),
(-1, 0, 0 ),
( 0, 1, 0 ),
( 0,-1, 0 ),
( 0, 0, 1 ),
( 0, 0,-1 )

For i from 0 to L-1:
For j from 0 to L-1:
For k from 0 to L-1:

x := lattice point at (i,j,k).

pi(x) := forward jump pointer from x.

Choose a pseudorandom integer between 0 and 5.  Select the jump
delta from the above table of 6 jump deltas.  This gives a point y.

piinv(y) := backward jump pointer from y.

|x-pi(x)|^2 and |y-piinv(y)|^2 are already known from the previous
Metropolis sweep (or from lattice population, if this is the first
Metropolis sweep).  But, compute |x - y|^2 and |pi(x) - piinv(y)|^2.

Call subroutine COMPUTE DELTA_N2.  (See below.)

Delta_H := 0.25 / beta *
(|x-pi(x)|^2 + |y-piinv(y)|^2 - |x-y|^2 - |pi(x)-piinv(y)|^2)
+ alpha * Delta_N2.

E := exp(-Delta_H).
U := uniform random on [0, 1).

If (U < E) (accept proposed change):
x      forward  permutation pointer := y.
y      backward permutation pointer := x.
piinvy forward  permutation pointer := pi(x).
pi(x)  backward permutation pointer := piinvy.

x forward distance squared := |x-y|^2 (save for the next sweep to
avoid recomputing it).

piinv(y) forward distance squared := |piinv(y)-pi(x)|^2 (save for the
next sweep to avoid recomputing it).

H  := H  + Delta_H.
N2 := N2 + Delta_N2.

================================================================
SUBROUTINE COMPUTE DELTA_N2:

Input:  x, pi(x), piinv(y), y.

Output:  Change in the number of two-cycles if x were to instead point to y
and piinv(y) were to instead point to pi(x).

Remark:  It took me a while to get this logic correct.  I worked some nice
drawings out on paper, and showed them to you in Tucson, but did not bring them
to Vienna.  I will send them to if you wish.  In the meanwhile, I ask you to
trust that this subroutine has been exhaustively tested.  It is correct.

----------------------------------------------------------------

Find pi^2(x) and pi(y) from the permutation forward pointers of pi(x) and y,
respectively.

if (y == pi(x))                         (Case 0)
return 0;

if (x == pi(x))                         (Case 1)
if (piinv(y) == y)
Delta_N2 :=  1;                        (Case 1a.)
else if (piinv(y) == piy)
Delta_N2 := -1;                        (Case 1b.)

else if (piinv(y) == y)                 (Case 2)
if (x == pi(x))
Delta_N2 :=  1;                        (Case 2a; same as 1a.)
else if (x == pi2x)
Delta_N2 := -1;                        (Case 2b.)

else if (x == y)                        (Case 3)
if (pi(x) == piinv(y))
Delta_N2 := -1;                        (Case 3a.)
else if (pi2x == piinv(y))
Delta_N2 :=  1;                        (Case 3b.)

else if (pi(x) == piinv(y))             (Case 4)
if (x == y)
Delta_N2 :=  -1;                       (Case 4a; same as 3a.)
else if (piy == x)
Delta_N2 :=   1;                       (Case 4b.)

else if (pi2x == x)                     (Case 5)
if (piy == piinv(y))
Delta_N2 := -2;                        (Case 5a.)
else
Delta_N2 := -1;                        (Case 5b.)

else if (piy == piinv(y))               (Case 6)
if (pi2x == x)
Delta_N2 :=  2;                        (Case 6a; same as 5a.)
else
Delta_N2 := -1;                        (Case 6b.)

else if (pi2x == piinv(y))              (Case 7)
if (piy == x)
Delta_N2 :=  2;                        (Case 7a.)
else
Delta_N2 :=  1;                        (Case 7b.)

else if (piy == x)                      (Case 8)
if (pi2x == piinv(y))
Delta_N2 :=  2;                        (Case 8a.)
else
Delta_N2 :=  1;                        (Case 8b.)

Return Delta_N2

================================================================
SUBROUTINE UNIFORM RANDOM PERMUTATION:

Input:  Lattice points with identity permutation.

Output:  Lattice points with a different permutation, all pi in S_n
being equally likely.

----------------------------------------------------------------

N := L^d

For purposes of this subroutine, make a list of N points.  This list of points
is indexed from 0 to N-1.  (This is a flattening of the L x L x L matrix of
points.)

Example:  lattice points are

x6--x7
/ |  /|
/   x4--x5
/   / /  /
x2--x3  /
|/  | /
x0--x1

Example:  list is
+--+--+--+--+--+--+--+--+
|x0|x1|x2|x3|x4|x5|x6|x7|
+--+--+--+--+--+--+--+--+

Unused start := 0

For s from 0 to N-1:
s := 0
u := uniform random integer between unused start and N-1
Swap list[u] and list[s]
Unused start := unused start + 1

Example:  after three permutation images have been selected (so s=3):
+--+--+--+--+--+--+--+--+
|x2|x3|x4|x0|x1|x5|x6|x7|
+--+--+--+--+--+--+--+--+
^\           /
|  Select a point uniformly from this unused range
|  and put it here
|       |
|       v
+<------+

For s from 0 to N-1:
Compute i,j,k from s using lexical order
Set x's forward permutation pointer to be list[s]
Set x's forward pointer's backward pointer to be x
Set x's forward distance squared to be |x-pi(x)|^2.

Example:  list is
+--+--+--+--+--+--+--+--+
|x2|x3|x4|x0|x1|x6|x5|x7|
+--+--+--+--+--+--+--+--+

Example:  lattice points are

x5--x7
/ |  /|
/   x1--x6
/   / /  /
x4--x0  /
|/  | /
x2--x3

================================================================
SUBROUTINE COMPUTE RHO_L:

Call the subroutine GET CYCLE COUNTS.

rho_value := 0.0

For k from 0 to N:
rho_value := rho_value + k * counts[k];
rho_L[k] := rho_value / N

================================================================
SUBROUTINE GET CYCLE COUNTS:

Input:   Lattice points.
Output:  Occupation numbers for cycle lengths, in an array named counts.

----------------------------------------------------------------

Counts = array from 0 to N of zeroes.

Marks = L x L x L matrix of zeroes.  (This memory is allocated once at program
start and is reused here.)

For i from 0 to L-1:
For j from 0 to L-1:
For k from 0 to L-1:

If (marks[i][j][k] == 1):
Go on to next (i,j,k) value:  this lattice site was already visited.
It participates in a cycle that was already enumerated.

Cycle length := 0
Cycle start point := points[i][j][k].
Current point := cycle start point.
Marks[i][j][k] := 1.

Loop:
Cycle length  := cycle length + 1
Next point    := current point's forward permutation pointer.
Current point := next point.
If (next point == cycle start point):
Break out of loop.
Mark the next point as visited.
Counts[cycle length] := counts[cycle length] + 1.

================================================================
SUBROUTINE COMPUTE RHO_INFTY:

Input:  rho_avg

Output: rho_fit and rho_infty

----------------------------------------------------------------

(Find the point of greatest convexity on the rho_avg curve.  Do this by
subtracting off the diagonal and finding the maximum value of rho_avg,
and the index k at which that maximum occurs.)

maxk := 0
maxdev := 0.0

For k from 1 to N:
dev := rho_avg[k] - k/N
if (dev > maxdev):
maxdev := dev
maxk := k

(Compute the vertical intercept.)
rho_finite_T := rho_avg[maxk] - maxk/N

(Find the distance from the upper left-hand corner to the intercept.)
rho_infty_T  := 1.0 - rho_finite_T

(Compute the linear fit for display purposes.)
for k from 0 to N:
rho_fit[k] := k/N + rho_finite_T
if (rho_fit[k] > 1.0)
rho_fit[k] := 1.0

return rho_infty_T

Example:

Horizontal axis runs from 0 to N.
Vertical   axis runs from 0 to 1.
Here is rho_avg as computing during Metropolis sweeps:

+------------------------+
|         *              |
|      *                 |
|    *                   |
|  *                     |
|                        |
|*                       |
|                        |
|                        |
|                        |
|                        |
+------------------------+

Subtract off the diagonal:

+------------------------+
|                        |
|                        |
|                        |
|  * * *                 |
|*   ^    *              |
|    |      *            |
|    |        *          |
|    |          *        |
|    |            *      |
|    |              *    |
+----k----------------*--+
Maximum deviation from the diagonal here

+------------------------+       ---
|        /*              |        ^
|      /                 |        | rho_infty(T)
|    *                   |        |
Vertical  |  /                     |        v
intercept >/                       |       ---
|*                       |        ^
|                        |        | rho_finite(T)
|                        |        |
|                        |        |
|                        |        v
+------------------------+       ---

```