Stream User’s Guide




6.6 Control flow constructs

Kernel functions can use most C control flow constructs. In general, a conditional control flow statement must use a scalar control expression so that all lanes follow the same execution path; this is a limitation of the SIMD machine architecture.



  • if (<scalar_expression>) { ... } is converted to a simple branch with the same control flow in every lane; vector control expressions are not allowed, as control flow must be the same for every lane. If the given block executes only sequential read (spi_read) or sequential write (spi_write) operations, spc generates special code to execute the operations without a branch. This allows the use of code such as:

if (<scalar_expression>) { spi_write(s, vi); }

within a software pipelined inner loop.



  • Looping constructs must only use scalar control expressions:

int i;
vec int v_i, d[10];
...
for (i = 0; i < 10; ++i) spi_read(in_str, d[i]); // Legal
while (v_i > 0) { ... } // Illegal - vector expression

  • A switch statement may only have a scalar expression as the switch value.

int8x4 value;
vec int16x2 data;

switch (value)
{
case 0:
    spi_read(in_str, data);
    break;
case 1:
    if (data > 12) data = data + 14;
    break;
default:
    break;
}


  • goto, break, continue and return are supported, provided they appear outside any if-statement that uses a vector expression. return with a value is allowed only within an inline kernel.

if (i > 10) return; // Legal

if (v_i != 16) v_i += 16;
else return; // Illegal - if with vector expression
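
The restriction can be pictured with a small conceptual model in plain C. This is not kernel code and does not claim to show what spc emits. The lanes are modeled as elements of an array: a scalar condition is tested once and steers every lane, while a per-lane condition can only choose which value each lane keeps, so it cannot drive a return. LANES, scalar_step and lane_select are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8  /* illustrative lane count; not tied to any device */

/* Scalar condition: tested once, so every lane takes the same path. */
static void scalar_step(int v[LANES], int i)
{
    assert(v != NULL);
    if (i > 10)
        return;                       /* one decision for all lanes */
    for (size_t lane = 0; lane < LANES; ++lane)
        v[lane] += 1;
}

/* Per-lane condition: every lane runs the same instructions, and the
 * condition only selects the value that each lane keeps. */
static void lane_select(int v[LANES])
{
    assert(v != NULL);
    for (size_t lane = 0; lane < LANES; ++lane) {
        int take = (v[lane] != 16);
        v[lane] = take ? v[lane] + 16 : v[lane];
    }
}
```

In this model lane_select plays the role of the statement if (v_i != 16) v_i += 16; from the example above, and there is no place in it where a single lane could leave the function early.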

6.7 Stream access functions


Kernel functions use Kernel API stream access functions to access stream data. Stream processor hardware supports three different types of stream access from kernel functions: sequential, conditional and array. Sequential access is the most efficient access method, conditional access permits a kernel function to read or write data only to or from selected lanes, and array access permits random stream access.


  • spi_array_read: Read data from an array stream
  • spi_array_write: Write data to an array stream
  • spi_cond_read: Read data from a conditional stream
  • spi_cond_write: Write data to a conditional stream
  • spi_eos: Check for end of stream
  • spi_read: Read data from a sequential stream
  • spi_write: Write data to a sequential stream

To use stream data inside a kernel function, you must pass the stream as a parameter and use stream access functions: you cannot access a data buffer directly. This allows for very high performance execution of kernel functions, in keeping with the architecture of the DPU.

Stream access functions read or write data records. The number of records that can be read from an input stream is determined either from the length of a substream attribute in the kernel function call or from the count of the stream (that is, the number of records written by spi_load_* or by a previous call to a kernel function that used the stream as an output).



The kernel function declaration specifies the type and direction of each stream parameter. There are limitations on which combination of stream access functions can be used within a single kernel function. The allowed combinations of stream access functions are shown in the table below.


Stream type          Modifier
Input sequential     in, seq_in
Output sequential    out, seq_out
Input conditional    cond_in
Output conditional   cond_out
Input array          array_in
Output array         array_out
I/O array            array_io

Stream access function columns: spi_read, spi_write, spi_cond_read, spi_cond_write, spi_array_read, spi_array_write, spi_eos. [The per-row marks showing which functions are allowed are missing from this copy.]

6.7.1 Sequential streams


Sequential streams have the fastest memory performance. spi_read and spi_write read and write data to and from all lanes in a sequential manner. Reading beyond the end of a stream returns zero.

On SP16, three calls to spi_read would read 48 records from the LRF, 16 at a time. The records are striped across the lanes:







                        Lane 0      Lane 1      ...   Lane 14     Lane 15
first spi_read call     record 0    record 1    ...   record 14   record 15
second spi_read call    record 16   record 17   ...   record 30   record 31
third spi_read call     record 32   record 33   ...   record 46   record 47
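
The striping in the table can be reproduced with a small model in plain C. It is only a sketch of the behavior described here, not the Kernel API: seq_stream and model_seq_read are hypothetical names, records are plain int values, and the stream is an ordinary array with a read cursor.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16  /* SP16, as in the table above */

/* Model of a sequential input stream: a record array plus a read cursor. */
typedef struct {
    const int *records;
    size_t count;   /* number of valid records */
    size_t next;    /* index of the next record to hand out */
} seq_stream;

/* One modeled sequential read: lane k receives record next + k.
 * Lanes that fall past the end of the stream receive zero. */
static void model_seq_read(seq_stream *s, int dest[LANES])
{
    assert(s != NULL && dest != NULL);
    for (size_t lane = 0; lane < LANES; ++lane) {
        size_t idx = s->next + lane;
        dest[lane] = (idx < s->count) ? s->records[idx] : 0;
    }
    s->next += LANES;
}
```

Three calls on a 48-record stream hand lane 0 records 0, 16 and 32, matching the first column of the table. With a shorter stream, the lanes past the last record are filled with zero.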

It is possible to conserve space in the LRF by both reading from and writing to the same sequential stream in a kernel function. To do this, pass the same stream to the kernel function twice, as both an input stream and an output stream. It is the programmer’s responsibility to make sure that the number of reads exceeds the number of writes at all times; otherwise, input data may be overwritten, resulting in undefined behavior.
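
The ordering requirement can be illustrated with a plain C model in which one buffer serves as both the input and the output stream. This is a sketch under stated assumptions rather than kernel code: inplace_stream, model_read, model_write and double_in_place are hypothetical names, the lane count is reduced to 4 for brevity, and the check in model_write (a write may never reach a record that has not been read yet) is one interpretation of the rule above. On real hardware nothing performs this check; the result of violating the rule is simply undefined.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* small illustrative lane count */

/* One buffer used as both the input and the output stream. */
typedef struct {
    int *buf;
    size_t count;
    size_t rd;   /* records read so far */
    size_t wr;   /* records written so far */
} inplace_stream;

static void model_read(inplace_stream *s, int dest[LANES])
{
    for (size_t lane = 0; lane < LANES; ++lane) {
        size_t idx = s->rd + lane;
        dest[lane] = (idx < s->count) ? s->buf[idx] : 0;
    }
    s->rd += LANES;
}

/* Returns 0 and writes nothing if the write would overtake the reads. */
static int model_write(inplace_stream *s, const int src[LANES])
{
    if (s->wr + LANES > s->rd)
        return 0;                     /* would clobber unread input */
    for (size_t lane = 0; lane < LANES; ++lane)
        if (s->wr + lane < s->count)
            s->buf[s->wr + lane] = src[lane];
    s->wr += LANES;
    return 1;
}

/* Doubles every record in place: each read stays ahead of its write. */
static void double_in_place(int *buf, size_t count)
{
    inplace_stream s = { buf, count, 0, 0 };
    int v[LANES];
    while (s.rd < count) {
        model_read(&s, v);
        for (size_t lane = 0; lane < LANES; ++lane)
            v[lane] *= 2;
        int ok = model_write(&s, v);
        assert(ok);
        (void)ok;
    }
}
```

Issuing the write before the matching read is the failure case: in the model it is rejected, whereas on the device it would silently overwrite input that has not been consumed.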

6.7.2 Conditional streams


spi_cond_read reads conditional input stream data into a subset of the lanes, based on the value in each lane of a vector flag variable. Similarly, spi_cond_write writes conditional output stream data from a subset of the lanes, based on the value in each lane of a vector flag variable. As with sequential streams, reading beyond the end of a stream returns zero.

Due to the SIMD structure of the DPU, spi_cond_read overwrites the value of the destination variable in all lanes, regardless of the value of the conditional flag variable in the lane. If the conditional read flag is false for a lane, then the value will be a repeat of the last record read from the stream by the conditional read; if no data has been read, then the value will be zero. It is the programmer’s responsibility to ignore data returned by spi_cond_read in lanes where the read flag is false.



On SP8, three calls to spi_cond_read load 0 to 24 records, depending on the condition flags. For example:





                            Lane 0  Lane 1  Lane 2  Lane 3  Lane 4  Lane 5  Lane 6  Lane 7
read flag                   true    true    false   true    false   false   true    false
first spi_cond_read call    r0      r1      r1      r2      r2      r2      r3      r3

read flag                   false   true    true    false   false   true    false   true
second spi_cond_read call   r3      r4      r5      r5      r5      r6      r6      r7

read flag                   true    true    true    false   true    true    true    false
third spi_cond_read call    r8      r9      r10     r10     r11     r12     r13     r13
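
The table values follow from the rule stated earlier: a lane with a true flag takes the next record, and a lane with a false flag repeats the last record delivered. A plain C model makes this concrete. It is a sketch of the described behavior only, not the Kernel API: cond_stream and model_cond_read are hypothetical names, records are plain int values, and the model assumes lanes are served in ascending lane order, which is what the table suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 8  /* SP8, as in the table above */

typedef struct {
    const int *records;
    size_t count;
    size_t next;   /* next unread record */
    int last;      /* last record delivered; zero before any read */
} cond_stream;

/* One modeled conditional read. Every lane of dest is overwritten:
 * flagged lanes consume a record (zero past the end of the stream),
 * unflagged lanes repeat the last record delivered. */
static void model_cond_read(cond_stream *s, const bool flag[LANES],
                            int dest[LANES])
{
    assert(s != NULL && flag != NULL && dest != NULL);
    for (size_t lane = 0; lane < LANES; ++lane) {
        if (flag[lane]) {
            if (s->next < s->count)
                s->last = s->records[s->next++];
            else
                s->last = 0;
        }
        dest[lane] = s->last;
    }
}
```

Feeding the three flag rows of the table into this model, with record n holding the value n, yields the three value rows and consumes 14 records in total. As the text notes, a caller should use dest only in lanes whose flag was true.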

It is possible to conserve space in the LRF by using the same conditional stream as both an input argument and an output argument in a kernel function. It is the programmer’s responsibility to make sure that the number of reads exceeds the number of writes at all times; otherwise, input data may be overwritten, resulting in undefined behavior.

6.7.3 Array streams


Array streams have the slowest memory performance. spi_array_read and spi_array_write read and write data to and from all lanes in a random access manner. Stream data can be reread as many times as desired. Note that even though the stream is accessed in an arbitrary manner, multiple values are still read sequentially from the stream into each lane for each call to spi_array_read.





                               Lane 0      Lane 1      ...   Lane 14     Lane 15
spi_array_read(str, dest, 0)   record 0    record 1    ...   record 14   record 15
spi_array_read(str, dest, 1)   record 16   record 17   ...   record 30   record 31
spi_array_read(str, dest, 2)   record 32   record 33   ...   record 46   record 47

Reading or writing beyond the end of the stream results in undefined behavior.
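
The access pattern in the table can be modeled in plain C. This sketch reads the third argument as a group index, which is how the table presents it; the real function's index semantics may differ. model_array_read is a hypothetical name, records are plain int values, and the explicit range check exists only in the model: the device gives no such protection, which is why out-of-range access is undefined.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16  /* lane count matching the table above */

/* Modeled array read: index selects a group of LANES consecutive records,
 * and lane k receives record index * LANES + k.
 * Returns 0 without touching dest if the group is not fully inside the
 * stream; returns 1 on success. */
static int model_array_read(const int *records, size_t count,
                            size_t index, int dest[LANES])
{
    assert(records != NULL && dest != NULL);
    if (index >= count / LANES)
        return 0;                     /* out of range: refuse in the model */
    size_t base = index * LANES;
    for (size_t lane = 0; lane < LANES; ++lane)
        dest[lane] = records[base + lane];
    return 1;
}
```

Unlike the sequential model, there is no cursor: the same group can be read any number of times and in any order, for example index 2, then 0, then 2 again.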





