Splitting: Automatic Execution of Functions on Type Contents

FISH splitting allows a function, operator, or library call to be executed repeatedly on each element of an aggregate type (a list, an array, a container of objects, etc.). If done as part of a get operation, the result is a list containing the result of each execution, in order.

Splitting can be used as an alternative to loop statements to perform actions on many objects in a clear and concise manner.

To make a split call, prefix one or more arguments of the function, operator, or library call with the split operator ‘::’. Arguments that are not split will be the same in every execution of the function. If multiple arguments are split, they must contain the same number of elements.

For example, to build a list of the x-coordinates of all data scalar objects, we could write:

    local a = data.scalar.pos(::data.scalar.list)->x

This splits the container of data scalar objects pointed to by the return value of data.scalar.list, and calls the intrinsic data.scalar.pos once for each entry in that container. Any iterable FISH type may be split. FISH iterable types include: pointers to containers, strings, vectors, tensors, matrices, lists, maps, and arrays. Most often, splitting will be done on lists or on pointers to containers of model objects (such as grid points, balls, or blocks).
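
The calling rules above can be modelled outside FISH. The following is a minimal Python sketch of the semantics only, not of how FISH implements splitting. The names Split, split_call, and scale are hypothetical and exist purely for illustration. The Split wrapper stands in for the ‘::’ prefix.

```python
class Split:
    """Marks an argument as split (stands in for the FISH '::' prefix)."""
    def __init__(self, items):
        self.items = list(items)


def split_call(func, *args):
    """Call func once per element of the split arguments, in order."""
    lengths = {len(a.items) for a in args if isinstance(a, Split)}
    if not lengths:
        return func(*args)  # nothing is split: an ordinary single call
    if len(lengths) > 1:
        raise ValueError("split arguments must have the same number of elements")
    count = lengths.pop()
    results = []
    for i in range(count):
        # Split arguments supply element i; other arguments are reused as-is.
        call_args = [a.items[i] if isinstance(a, Split) else a for a in args]
        results.append(func(*call_args))
    return results


def scale(value, factor):
    """Hypothetical stand-in for an intrinsic taking two arguments."""
    return value * factor


# One split argument: the factor 10.0 is the same in every execution.
scaled = split_call(scale, Split([1.0, 2.0, 3.0]), 10.0)

# Two split arguments of equal length are consumed element by element.
paired = split_call(scale, Split([1.0, 2.0, 3.0]), Split([1.0, 10.0, 100.0]))
```

Passing two Split arguments of different lengths raises an error in this sketch, mirroring the rule that multiple split arguments must contain the same number of elements.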

List Filtering

Splitting in combination with boolean list filtering can be used to quickly find a sub-list of objects selected by a specific criterion. For example, the following code will find the list of all data scalars that are tagged with the group ‘surface’.

    local allpts = list(data.scalar.list) ; A list of pointers to scalars
    local check = data.scalar.isgroup(::data.scalar.list,'surface') 
                                                 ; List of booleans
    global pts = allpts(check) ; List of scalars in 'surface'

This can be done as a single line of FISH as follows:

    global pts = ...
    list(data.scalar.list)(data.scalar.isgroup(::data.scalar.list,'surface'))
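
The filtering step can be pictured as selecting the entries whose matching boolean is true. The Python sketch below models that idea only. The function filter_by_mask and the sample records are hypothetical, and the length check is an assumption of the sketch rather than documented FISH behavior.

```python
def filter_by_mask(items, mask):
    """Keep the items whose corresponding mask entry is true."""
    if len(items) != len(mask):
        # Assumption of this sketch: the list and the mask line up one-to-one.
        raise ValueError("items and mask must have the same length")
    return [item for item, keep in zip(items, mask) if keep]


# Hypothetical stand-ins for data scalars and their group tags.
scalars = [
    {"id": 1, "groups": {"surface"}},
    {"id": 2, "groups": set()},
    {"id": 3, "groups": {"surface", "core"}},
]

# Plays the role of the split isgroup call: one boolean per object.
check = ["surface" in s["groups"] for s in scalars]

# Plays the role of allpts(check): the sub-list tagged 'surface'.
pts = filter_by_mask(scalars, check)
ids = [s["id"] for s in pts]
```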

Splitting on Assignment

Splitting may also be performed on assignment to a library function. In this case, the user must indicate whether the right-hand-side of the assignment (after the equals sign) will be split or not. If not, the same value will be assigned to each split assignment to the function. If split, the elements of the list on the right will be assigned sequentially to each call to the function.

Right-hand splitting is indicated by appending the splitting operator :: to any of the assignment operators =, +=, -=, *=, or /=, giving =::, +=::, -=::, *=::, and /=::.

For example, the following line of FISH will add a random value between 0.0 and 1.0 to the x-coordinate of every data scalar in the model.

    data.scalar.pos(::data.scalar.list)->x ...
                      +=:: math.random.uniform(data.scalar.num)
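
The difference between a plain assignment operator and its :: form can be modelled as broadcasting one value versus pairing values element by element. The Python sketch below illustrates that distinction under stated assumptions. The function split_assign and the dictionary records are hypothetical, and nothing here reflects how FISH performs the assignment internally.

```python
import operator


def split_assign(targets, attr, rhs, op=None, split_rhs=False):
    """Assign to attr on every target.

    split_rhs=False models '=' or '+=': rhs is reused for every target.
    split_rhs=True models '=::' or '+=::': rhs supplies one value per target.
    """
    if split_rhs:
        values = list(rhs)
        if len(values) != len(targets):
            raise ValueError("right-hand side must have one value per target")
    else:
        values = [rhs] * len(targets)
    for target, value in zip(targets, values):
        if op is None:
            target[attr] = value  # plain assignment
        else:
            target[attr] = op(target[attr], value)  # compound assignment


# Hypothetical stand-ins for three data scalar positions.
points = [{"x": 0.0}, {"x": 1.0}, {"x": 2.0}]

# Like '+=': the same 5.0 is added to every x.
split_assign(points, "x", 5.0, op=operator.add)

# Like '+=::': each x receives its own increment, in order.
split_assign(points, "x", [0.5, 0.25, 0.125], op=operator.add, split_rhs=True)

xs = [p["x"] for p in points]
```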

Splitting and Multithreading

If splitting is performed on an operator or a library function that is tagged as thread-safe, the splitting will be done on all available threads automatically. Library functions are tagged as thread-safe if the ‘=’ sign is preceded by a ‘:’ in the documentation description. Otherwise the splitting operation will still be performed, just sequentially on a single thread.
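
As a rough illustration of this dispatch rule, the Python sketch below runs the per-element calls on a thread pool when the function is flagged as thread-safe and falls back to a sequential loop otherwise. The run_split function and its thread_safe flag are hypothetical, and the scheduling FISH actually uses is not described here. The sketch only shows that the result order is the same in both cases.

```python
from concurrent.futures import ThreadPoolExecutor


def run_split(func, items, thread_safe):
    """Apply func to every item, using threads only when it is safe to do so."""
    if thread_safe:
        with ThreadPoolExecutor() as pool:
            # Executor.map returns results in input order,
            # regardless of which call finishes first.
            return list(pool.map(func, items))
    # Not thread-safe: same work, one element at a time on one thread.
    return [func(item) for item in items]


def square(value):
    """Hypothetical stand-in for a thread-safe library function."""
    return value * value


parallel = run_split(square, range(8), thread_safe=True)
sequential = run_split(square, range(8), thread_safe=False)
```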

Conclusion and Tips

The combination of boolean list selection, operators acting on list elements, and splitting allows complex algorithms to be created in relatively few lines of code without ever having to use loop statements.

Using splitting effectively requires a change of perspective and approach from traditional sequential programming. Once the user becomes comfortable with it, the reward is being able to perform operations on large quantities of data quickly and efficiently, using a small amount of code.

Note that while splitting is very convenient, using several split calls on existing intrinsics is in general not as efficient in a multi-threaded environment as a FISH operator that performs multiple calculations on a single object at the same time using a single split. If speed is important (as is generally the case for functions executing during cycling), it is almost always worth the effort to create an operator instead of using multiple split calls on existing intrinsics.