An FPGA Implementation of the Smooth Particle Mesh Ewald Reciprocal Sum Compute Engine (RSCE)

Steps to Calculate the SPME Reciprocal Sum



4.5. Steps to Calculate the SPME Reciprocal Sum

This section describes the steps used to calculate the SPME reciprocal sum. They closely follow the steps used in the software SPME program written by A. Toukmaji [8]. For a detailed discussion of the software implementation, refer to Appendix B: Software Implementation of the SPME Reciprocal Sum Calculation.
Table 2 describes the steps that the RSCE takes to calculate the reciprocal energy and force. For easy reference, each step number also appears as a dashed circle in the architectural diagram of the RSCE (Figure 12). In Table 2, K is the grid size, N is the number of particles, and P is the interpolation order. The steps assume that one RSCE works with one host computer. The strategy is to let the host computer perform the complicated calculations that are needed only once at startup, as well as the repeated calculations whose complexity is only O(N).
From the complexity order of each step, shown in the rightmost column of Table 2, it can be concluded that the majority of the computation time is spent in steps 7 (mesh composition), 8 and 11 (the two 3D-FFT passes), and 12 (force computation). The computational complexity of steps 8 and 11 depends on the mesh size (K₁, K₂, and K₃), while that of steps 7 and 12 depends mainly on the number of particles N and the interpolation order P. Both the number of grid points (mesh size) and the interpolation order affect the accuracy of the energy and force calculations: more grid points and a higher interpolation order lead to a more accurate result. Furthermore, the total number of grid points K₁×K₂×K₃ should be directly proportional to the number of particles N.
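To make this comparison concrete, the Order column of Table 2 can be evaluated for a sample configuration. The Python sketch below is illustrative only: the function name step_costs, the base-2 logarithm, and the example values (a 64×64×64 mesh, 20,000 particles, P = 4) are assumptions that do not come from the text, and the counts ignore constant factors and memory traffic. Step numbers are those of Table 2.

```python
import math

def step_costs(k1, k2, k3, n, p):
    """Operation-count estimates copied from the Order column of Table 2.

    Only the per-timestep FPGA steps (6 to 12) are included.
    The logarithm base is not given in the table; base 2 is assumed here.
    """
    grid = k1 * k2 * k3
    return {
        6: 3 * n * p,                  # BCC: B-Spline coefficients
        7: n * p ** 3,                 # MC: mesh composition
        8: grid * math.log2(grid),     # first 3D-FFT pass
        9: grid,                       # EC: energy calculation
        10: 2 * 3 * n * p,             # BCC: coefficients and derivatives
        11: grid * math.log2(grid),    # second 3D-FFT pass
        12: 3 * n * p ** 3,            # FC: force computation
    }

# Hypothetical example configuration (not taken from the text).
costs = step_costs(64, 64, 64, 20000, 4)
total = sum(costs.values())
for step, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"step {step:2d}: {cost:12.0f}  ({100 * cost / total:5.1f}%)")
```

For this sample configuration, the two 3D-FFT passes, mesh composition, and force computation together account for more than 90% of the estimated operations.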
Table 2 - Steps for SPME Reciprocal Sum Calculation

#	Freq.	Where	Operation	Order
1	startup	Host	Computes the reciprocal lattice vectors for x, y, and z directions.	1
2	startup	Host	Computes the B-Spline coefficients and their derivatives for all possible lookup fractional coordinate values and stores them into the BLM memory.	2^{Precision_coord}
3	startup	Host	Computes the energy terms, etm(m₁, m₂, m₃) for all grid points that are necessary in energy calculation and stores them into the ETM memory.	K₁×K₂×K₃
4	repeat	Host	Loads or updates the x, y, and z Cartesian coordinates of all particles.	3×N
5	repeat	Host	Computes scaled and shifted fractional coordinates for all particles and loads them into the upper half of the PIM memory. Also zeros all entries in the QMMR and the QMMI memories.	3×N
6	repeat	FPGA BCC	Performs lookup and computes the B-Spline coefficients for all particles for x, y, and z directions. The value of the B-Spline coefficients depends on the fractional part of the coordinates of the particles.	3×N×P
7	repeat	FPGA MC	Composes the grid charge array using the computed coefficients. The grid point location is derived from the integer part of the coordinate. Calculated values are stored in the QMMR memory.	N×P×P×P
8	repeat	FPGA 3D-FFT	Computes F^-1(Q) by performing the inverse FFT on each row for each direction. The transformed values are stored in the QMMR and the QMMI memories.	K₁×K₂×K₃ × Log(K₁×K₂×K₃)
9	repeat	FPGA EC	Goes through each grid point to compute the reciprocal energy and update the QMM memories. It uses the grid index to look up the values of the energy terms.	K₁×K₂×K₃
10	repeat	FPGA BCC	Performs lookup and computes the B-Spline coefficients and the corresponding derivatives for all particles for all x, y, and z directions.	2×3×N×P
11	repeat	FPGA 3D-FFT	Computes the forward F(Q) and loads the values into grid charge array QMMR. In this step, the QMMI should contain all zeros.	K₁×K₂×K₃ × Log(K₁×K₂×K₃)
12	repeat	FPGA FC	Goes through all particles, identifies their interpolated grid points, and computes the reciprocal forces for x, y, and z directions. The forces will be stored in the lower half of the PIM memory.	3×N×P×P×P
13	N/A	N/A	Repeat steps 4 to 12 until the simulation is done.	N/A
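
Steps 5 to 7 of Table 2 can be sketched in software as follows. This is a minimal Python illustration under stated assumptions, not the RSCE design: it assumes a cubic box of side box with the same grid size K in every direction, omits the coordinate shift mentioned in step 5, and evaluates the cardinal B-Spline recursion directly instead of reading precomputed values from the BLM lookup memory. All function and variable names are hypothetical.

```python
def cardinal_bspline(u, order):
    """Cardinal B-Spline M_order(u) for order >= 2; it is zero outside 0 < u < order."""
    if order == 2:
        return max(0.0, 1.0 - abs(u - 1.0))
    return (u * cardinal_bspline(u, order - 1)
            + (order - u) * cardinal_bspline(u - 1.0, order - 1)) / (order - 1)

def bspline_weights(frac, order):
    """Step 6: the P coefficients determined by the fractional part of a coordinate."""
    return [cardinal_bspline(frac + j, order) for j in range(order)]

def compose_mesh(positions, charges, box, K, order):
    """Step 7: spread each charge onto P x P x P grid points (N x P^3 updates)."""
    Q = [[[0.0] * K for _ in range(K)] for _ in range(K)]
    for (x, y, z), q in zip(positions, charges):
        idx, w = [], []
        for c in (x, y, z):
            u = K * ((c / box) % 1.0)   # step 5: scaled fractional coordinate
            i = int(u)                  # integer part selects the grid point
            idx.append(i)
            w.append(bspline_weights(u - i, order))
        for a in range(order):
            for b in range(order):
                for d in range(order):
                    # Indices wrap around because the mesh is periodic.
                    Q[(idx[0] - a) % K][(idx[1] - b) % K][(idx[2] - d) % K] += (
                        q * w[0][a] * w[1][b] * w[2][d])
    return Q

# Hypothetical two-particle example.
positions = [(1.3, 4.7, 9.9), (5.0, 5.0, 0.2)]
charges = [1.0, -0.5]
Q = compose_mesh(positions, charges, box=10.0, K=8, order=4)
mesh_total = sum(v for plane in Q for row in plane for v in row)
print(mesh_total, sum(charges))
```

Because the P coefficients in each direction sum to one, the total charge on the mesh should match the sum of the particle charges up to rounding, which gives a quick sanity check on a mesh composition routine.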
