Operators: Multi-threaded function support

FISH operators are a special class of function designed to be executed in a multi-threaded environment.

When a function call is repeated through splitting and the symbol was declared as an operator, the separate executions are distributed across all available threads (see the program threads command). On a typical multi-core computer with a large data set, this can yield a very large increase in speed.

Because the functions need to be safe when multiple threads are running simultaneously, they operate in a restricted environment. There are a number of rules that make operators different from normal functions:

  1. Operators must always take at least one argument.

  2. Operators cannot call normal FISH functions. They can call other operators.

  3. Operators can only call FISH library functions if they are tagged as being thread safe. Library functions are tagged as thread safe if the ‘=’ sign is preceded by a ‘:’ in the documentation description.

  4. Operators can only write to global symbols when the lock statement is used.

  5. Operators cannot modify the values passed as arguments.

  6. Operators cannot call library functions on assignment (that is, on the left-hand side of an assignment) and pass a pointer to an object, unless that object was passed in as an argument.

  7. Operators cannot call other operators and pass a pointer to an object, unless that object was passed in as an argument.

Be aware that reading from or writing to global symbols must be synchronized across all executions running simultaneously, and can therefore severely affect overall performance. It is recommended that local variables be used exclusively where possible.

Operators are created using the fish operator command, with arguments following just like fish define. The FISH lines in the definition are the same as for a normal function, subject to the restrictions above.

In order to give examples of operators, we will first generate some data to operate upon. The following FISH function will generate 100,000 data scalars at random points in a 10x10x10 cube and give them random values from 0.0 to 10.0.

model random 12000
fish define generate(n)
    local x = math.random.uniform(n) * 10.0
    local y = math.random.uniform(n) * 10.0
    local z = math.random.uniform(n) * 10.0
    local scalars = data.scalar.create(::vector(::x,::y,::z))
    data.scalar.value(::scalars) =:: math.random.uniform(n) * 10.0
end
[generate(100000)] 

The following example operator tests whether the x-coordinate of a particular scalar object falls within a given range. If it does, the operator assigns the scalar to the group inside in slot mark and returns true; otherwise it assigns the scalar to the group outside and returns false.

fish operator labeldata(data,low,high)
    local x = data.scalar.pos(data)->x
    if math.in.range(low,x,high) then
        data.scalar.group(data,'mark') = 'inside'
        return true
    endif
    data.scalar.group(data,'mark') = 'outside'
    return false
end

To execute the operator, call it as a normal FISH symbol with a split argument. Since it is an operator, the repeated calls will be executed on all available threads.

[labeldata(::data.scalar.list,4.0,6.0)]

Note the use of the return statement in the operator. Since an operator is also a global symbol (just like a function), assigning a value to it would require the use of a lock statement and incur significant synchronization overhead. Instead, operators should always use the return statement to return values from the operation. As with all splitting, the return values are collected into a list.
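
The collected return values can be stored like any other list. The following is a minimal sketch, assuming the scalars created by generate and the labeldata operator defined above already exist. The names inside_flags, xcoord, and xvalues are hypothetical and used only for illustration.

```fish
; Store the list of true/false results returned by labeldata
; (one entry per scalar). inside_flags is a hypothetical symbol name.
[global inside_flags = labeldata(::data.scalar.list,4.0,6.0)]

; Hypothetical operator that only returns a value. No global symbol
; is written inside it, so no lock statement is needed.
fish operator xcoord(data)
    return data.scalar.pos(data)->x
end
[global xvalues = xcoord(::data.scalar.list)]
```

In both cases the global symbol is assigned from the inline FISH expression outside the operator, not from within the individual executions.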

There are occasions when reading and/or writing to a global symbol is unavoidable. In such cases, the lock statement must be used inside the operator to allow a global symbol to be written to. The following example uses a FISH operator to scan the scalar data for values inside a box given by a cartesian extent, and stores the maximum, minimum, accumulated, and last value found in global symbols.

[global maxval = 0.0]
[global minval = 1e30]
[global accval = 0.0]
[global setval = 0.0]
fish operator maxdata(data,low,high)
    local p = data.scalar.pos(data)
    if math.in.range(low,p,high) then
        lock maxval = math.max(maxval,data.scalar.value(data)) ; Max value
        lock minval = math.min(minval,data.scalar.value(data)) ; Min value
        lock accval += data.scalar.value(data)         ; Accumulated value
        lock setval = data.scalar.value(data)   ; Set value (should be
                                                ; last value in list)
    endif
end
[maxdata(::data.scalar.list,(4,4,4),(6,6,6))]
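
Where each execution can return its result instead of accumulating it, the lock statements can be avoided: the operator returns a value, and a normal function reduces the collected list afterwards. The following is a sketch of that idea under stated assumptions, not a tested replacement for the locked version above. The names boxvalue, reducebox, maxval2, and accval2 are hypothetical, and the -1.0 sentinel relies on the generated values lying between 0.0 and 10.0.

```fish
; Hypothetical operator: returns the scalar value if the scalar lies
; inside the box, otherwise a sentinel that cannot be a real value.
fish operator boxvalue(data,low,high)
    if math.in.range(low,data.scalar.pos(data),high) then
        return data.scalar.value(data)
    endif
    return -1.0   ; sentinel: generated values are never negative
end

; Normal (single-threaded) function that reduces the collected list.
; Writing globals here needs no lock because this is not an operator.
fish define reducebox
    local vals = boxvalue(::data.scalar.list,vector(4,4,4),vector(6,6,6))
    global maxval2 = 0.0
    global accval2 = 0.0
    loop foreach local v vals
        if v >= 0.0 then
            maxval2 = math.max(maxval2,v)
            accval2 += v
        endif
    endloop
end
[reducebox]
```

The reduction loop runs on a single thread, so whether this is faster than the locked version will depend on the size of the data set and the cost of the per-object work.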

Finally, one of the most common and important uses of FISH operators (and indeed the primary reason for their creation) is for use during cycling. Without operators, a single-threaded FISH function that checks or changes all objects in a model can easily dominate the run time of the system.

To assign an operator to execute during cycling, use either the fish callback command or the fish-call keyword to the model solve command. Specify the name of the operator, followed by the argument(s) that will be passed for each execution.

Generally, the argument that supplies the list of objects to iterate over is given by a FISH library function, which the command processor does not recognize directly. It must therefore be wrapped in inline FISH. The :: prefix must be used to indicate that the argument is to be split, as always (see Splitting). Note that these values are evaluated when the command is processed, and stored for execution during cycling.

An example of this is below:

;; Persistent fish call entry
fish callback add my_function([::data.scalar.list]) -100
;; fish call added to a single solve command
model solve fish-call -100 my_function([::data.scalar.list])