Splitting: Using Multiple Threads in FISH
Definition
FISH splitting is passing an aggregate type (a list, an array, a container of objects, etc.) as a FISH function argument using the splitting operator (::), causing the function to be repeatedly performed on all members of the aggregate type. This is a more syntactically concise alternative to loop statements.
Consider:
local a = list
loop foreach local pnt data.scalar.list
a = list.append(a,data.scalar.pos(pnt)->x)
endloop
This builds a list (a) of \(x\)-positions of all scalars on the global data scalar list. In an equivalent split version of the above, data.scalar.pos is repeatedly called on each member of the data.scalar.list, all in one line:
local a = data.scalar.pos(::data.scalar.list)->x
Significantly, splitting is performed on all available threads automatically. Because looping over lists of model objects and other aggregate types is so common in FISH, splitting offers both shorter code and faster execution.
Implementation
In order to make a split call, affix the split operator :: to one or more arguments of the function (or operator or library call). Arguments that are not split will be the same in every execution of the function. If multiple arguments are split, they must contain the same number of elements.
Any iterable FISH type may be split. FISH iterable types include: pointers to containers, strings, vectors, tensors, matrices, lists, maps, and arrays.
A splitting operation on an intrinsic—regardless of that function’s original return value—returns a list.
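As a minimal sketch of these rules, the function below combines several split arguments of equal length with an argument that is not split. The function name split_basics is hypothetical, and the sketch assumes data scalars already exist in the model; every intrinsic it calls appears elsewhere on this page.

```
fish define split_basics
    ; Hypothetical function name, for illustration only
    local n = 5
    local x = math.random.uniform(n) ; n random values
    local y = math.random.uniform(n)
    local z = math.random.uniform(n)
    ; Three split arguments with the same number of elements:
    ; one vector is built from each (x,y,z) triple
    local pts = vector(::x,::y,::z)
    ; One split argument and one that is not split:
    ; 'surface' is the same in every execution
    local onsurf = data.scalar.isgroup(::data.scalar.list,'surface')
    ; Both results are lists; print the length of the first
    io.out(list.size(pts))
end
[split_basics]
```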
FISH library functions must be thread-safe for a split call to run on multiple threads (these are identified with the notation := in function reference documentation). Otherwise the splitting operation will still be performed, just sequentially on a single thread.
Tip
Splitting intrinsics is fast but generally it is not as efficient in a multi-threaded environment as a FISH operator (defined below). An operator using a single split will provide maximal efficiency (see the Another Comparison example below).
Most often splitting will be done on lists or on pointers to containers of model objects (such as grid points, balls, or blocks). The number of threads made available to splitting can be controlled using the global.threads intrinsic, though by default the program determines the optimal number of threads for maximum performance.
Use
Splitting
Splitting, as defined above and seen in the examples below, is passing a “split” argument to a FISH function that is thread-safe (and will thereby accept such an argument).
To illustrate the concept a bit further, splitting is a kind of abbreviation of a loop statement, which itself is an abbreviation of repeated single operations. The example above, if written as individual lines, might look like:
a = list
a = list.append(a,list.at(0,data.scalar.pos)->x)
a = list.append(a,list.at(1,data.scalar.pos)->x)
a = list.append(a,list.at(2,data.scalar.pos)->x)
; and so on up to n ...
One can readily see the programmatic desirability for loop constructs to handle range-based (e.g., 0-\(n\) as here) repeated function calls. Where loops reduce the repetition of function calls, splitting eliminates the need for loop statements themselves, leaving behind the core elements of concern: the range that governs the repetition, and the function to be repeated. The comparison example below illustrates.
Effectively using splitting requires a certain change of perspective and approach from traditional sequential programming. But once the user becomes comfortable, the reward is quickly and efficiently performing operations on large quantities of data using a relatively small amount of code.
List Filtering
Splitting in combination with boolean list filtering can be used to quickly find a sub-list of objects selected by a specific criterion. For example, the following code will find the list of all data scalars that are tagged with the group ‘surface’.
local allpts = list(data.scalar.list) ; A list of pointers to scalars
local check = data.scalar.isgroup(::data.scalar.list,'surface')
; List of booleans
global pts = allpts(check) ; List of scalars in 'surface'
This can be done as a single line of FISH as follows:
global pts = ...
list(data.scalar.list)(data.scalar.isgroup(::data.scalar.list,'surface'))
Splitting on Assignment
Splitting may also be performed on assignment to a library function. In this case, the user must indicate whether the right-hand-side of the assignment (after the equals sign) will be split or not. If not, the same value will be assigned to each split assignment to the function. If split, the elements of the list on the right will be assigned sequentially to each call to the function.
Right-hand splitting is indicated by appending the splitting operator :: to any of the assignment operators =, +=, -=, *=, or /=. The resulting split assignment operators are =::, +=::, -=::, *=::, and /=::.
For example, the following line of FISH will add a different random value from 0.0 to 1.0 to the x-coordinate of every data scalar in the model.
data.scalar.pos(::data.scalar.list)->x ...
+=:: math.random.uniform(data.scalar.num)
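The difference between a plain and a split right-hand side can be sketched as follows. The function name assign_demo is hypothetical, and the sketch assumes data scalars already exist in the model; it uses only intrinsics that appear elsewhere on this page.

```
fish define assign_demo
    ; Hypothetical function name, for illustration only
    local n = data.scalar.num
    ; Right-hand side not split: every scalar receives the same value
    data.scalar.value(::data.scalar.list) = 1.0
    ; Right-hand side split: each scalar in turn receives the
    ; corresponding element of the list of n random values
    data.scalar.value(::data.scalar.list) =:: math.random.uniform(n)
end
[assign_demo]
```

In the second assignment the right-hand side must supply one element per scalar, which is why its length is taken from data.scalar.num.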
The concepts described above are further explored in the Examples section below.
Operators: Writing Functions for Multiple Threads
A FISH operator is a user-written function designed to be executed in a multi-threaded environment — that is, one that will accept splitting, as described above.
Because a function must be safe when multiple threads are running simultaneously, operators dwell in a restricted environment. A number of rules constrain operators differently from normal FISH functions:
Operators must always take at least one argument.
Operators cannot call normal FISH functions. They may call other operators.
Operators may only call FISH library functions that are thread safe. Library functions are tagged as thread safe if the ‘=’ sign is preceded by a ‘:’ in the reference documentation.
Operators may only write to global symbols when the lock statement is used.
Operators cannot modify the values passed as arguments.
Operators cannot call library functions on assignment (on the left-hand side of an assignment operator) and pass a pointer to an object, unless that object was passed in as an argument.
Operators cannot call other operators and pass a pointer to an object, unless that object was passed in as an argument.
Be aware that reading or writing to global symbols must be synchronized across all executions running simultaneously, and can therefore severely affect overall performance. It is recommended that local variables be used exclusively where possible.
Operators are created using the fish operator command, with arguments following just like fish define. The FISH lines in the definition are the same as for a normal function, subject to the restrictions above.
In order to give examples of operators, we will first generate some data to operate upon. The following FISH function will generate 100,000 data scalars at random points in a 10x10x10 cube and give them random values from 0.0 to 10.0.
model random 12000
fish define generate(n)
local x = math.random.uniform(n) * 10.0
local y = math.random.uniform(n) * 10.0
local z = math.random.uniform(n) * 10.0
local scalars = data.scalar.create(::vector(::x,::y,::z))
data.scalar.value(::scalars) =:: math.random.uniform(n) * 10.0
end
[generate(100000)]
The following example operator determines if the x-coordinate of a particular scalar object falls within a given range, and if so it both assigns it to the group inside in slot mark and returns true, or assigns it to the group outside and returns false.
fish operator labeldata(data,low,high)
local x = data.scalar.pos(data)->x
if math.in.range(low,x,high) then
data.scalar.group(data,'mark') = 'inside'
return true
endif
data.scalar.group(data,'mark') = 'outside'
return false
end
To execute the operator, call it as a normal FISH symbol with a split argument. Since it is an operator, the repeated calls will be executed on all available threads.
[labeldata(::data.scalar.list,4.0,6.0)]
Note the use of the return statement in the operator. Since an operator is also a global symbol (just like a function), assigning a value to it would require the use of a lock statement and incur significant synchronization overhead. Instead, operators should always use the return statement to return values from the operation. As with all splitting, the return values are collected into a list.
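As a brief sketch of how the collected return values might be used, the normal FISH function below captures the list of booleans returned by the labeldata operator defined above and applies it as a boolean filter. The function name count_inside is hypothetical.

```
fish define count_inside
    ; Hypothetical function name; relies on the labeldata operator defined above
    ; One boolean is returned per scalar, collected into a list
    local flags = labeldata(::data.scalar.list,4.0,6.0)
    ; Boolean list filtering keeps only the scalars flagged true
    local inside = list(data.scalar.list)(flags)
    io.out(list.size(inside))
end
[count_inside]
```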
There are occasions when reading and/or writing to a global symbol is unavoidable. In such cases, the lock statement needs to be used inside an operator to allow a global symbol to be written to. The following example uses a FISH operator to scan the scalar data for values inside a box given by a Cartesian extent, and stores the maximum, minimum, accumulated, and last value found in global symbols.
[global maxval = 0.0]
[global minval = 1e30]
[global accval = 0.0]
[global setval = 0.0]
fish operator maxdata(data,low,high)
local p = data.scalar.pos(data)
if math.in.range(low,p,high) then
lock maxval = math.max(maxval,data.scalar.val(data)) ; Max value
lock minval = math.min(minval,data.scalar.val(data)) ; Min value
lock accval += data.scalar.val(data) ; Accumulated value
lock setval = data.scalar.val(data) ; Set value (should be
; last value in list)
endif
end
[maxdata(::data.scalar.list,(4,4,4),(6,6,6))]
Finally, one of the most common and important uses of FISH operators (and indeed the primary reason for their creation) is execution during cycling. Without operators, a single-threaded FISH function that checks or changes all objects in a model will easily dominate the run time of the system.
To assign an operator to execute during cycling, use either the fish callback command or the fish-call keyword to the model solve command. Specify the name of the operator, followed by the argument(s) that will be passed for each execution.
Generally the argument assigning the list of objects to iterate over is indicated using a FISH library function, which is not directly recognized by the command processor. This must therefore be indicated using inline FISH. The :: prefix must be used to indicate that the argument is to be split, as always (see Splitting). Note that these values are evaluated when the command is processed, and stored for execution during cycling.
An example of this is below:
;; Persistent fish call entry
fish callback add my_function([::data.scalar.list]) -100
;; fish call added to a single solve command
model solve fish-call -100 my_function([::data.scalar.list])
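The commands above refer to my_function without defining it. A minimal sketch of what such an operator might look like is given below; the threshold of 5.0 and the group name 'high' are assumptions chosen purely for illustration, and the body follows the same pattern as the labeldata operator shown earlier.

```
; Hypothetical operator matching the my_function name used above
fish operator my_function(data)
    ; Only the object passed in as an argument is modified and
    ; no global symbol is written, so no lock statement is needed
    if data.scalar.value(data) > 5.0 then
        data.scalar.group(data,'mark') = 'high'
    endif
end
```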
Examples
Basic Splitting Operations
model new
zone create brick
fish define dosplitting
; split a container of model objects
local pos = gp.pos(::gp.list) ; pos is a list of positions of
; all the grid points in the model
; create a vector, then split it to get its largest square root
; then print the value
local v = vector(1, 2, 3)
local v2max = list.max(math.sqrt(::v))
io.out(v2max)
; create a matrix and use list.xx functions to convert it
; to a list and obtain its max value
local m = matrix(math.random.uniform(9), 3, 3) ; Convert 9 random values
; to a 3x3 matrix
local lm = list.max(list(m)) ; Find the maximum value of the matrix
; by converting into a list
end
[dosplitting]
Splitting is most often done on containers of model objects, or lists obtained from them. However, it can be done on all iterable types. In all cases these types can be converted easily to lists, which makes them amenable to operations with the list utility functions.
Find all grid points on group ‘Surface’, then sum the reaction force
fish define SurfaceForce(groupName)
local ingroup = gp.isgroup(::gp.list,groupName) ; Boolean list, true if
; on face group Surface
local gpsin = gp.list(ingroup) ; Only those gridpoints that are
; part of face group 'Surface'
local forces = gp.force.unbal(::gpsin)->z
return list.sum(forces)
; Or all in one line
return ...
list.sum(gp.force.unbal(::gp.list(gp.isgroup(::gp.list,groupName)))->z)
end
[SurfaceForce ('Surface')]
Assignment by Splitting
model new
zone create brick
zone cmodel assign elastic
fish define splittingright
; Add the same value to the XX stress of every zone in the model
zone.stress(::zone.list)->xx += 500
; Add a different random value from 0 to 1 to the x position
; of every grid point in the model
gp.pos(::gp.list)->x +=:: math.random.uniform(list.size(gp.list))
end
[splittingright]
Comparison Example
Both functions here calculate and store 24 (\(x\),\(y\)) values of a unit circle in 15 degree increments.
Old-Style
model new
fish define circle
global store = list.create(24)
loop local angle (15,360, 15)
local radian = angle * math.degrad
local v = vector (math.cos(radian), math.sin(radian))
store(angle//15) = v
end_loop
end
[circle]
fish list contents [store]
Splitting Intrinsics
model new
fish define circle
local angles = list.range(0,360,15) * math.degrad
global store = vector(::math.cos(::angles), ::math.sin(::angles))
end
[circle]
fish list contents [store]
Another Comparison: Old-Style vs. Splitting Intrinsics vs. Operators
The following three functions achieve identical results, but are progressively faster in execution. Example runtimes from one computer are shown for a model with two million zones.
Traditional (no splitting) (51 seconds)
fish define ground_freezing
loop foreach local zone zone.list
local porosity = zone.fluid.prop(zone,'porosity') ; Note: A
local expansion = porosity * 0.09 * 1.0; Porosity * water
local bulk = zone.prop(zone,'bulk')
local stress_inc = bulk * expansion ; Amount to increment
local bulk_inc = (8.96/2.16) * porosity ; Ratio of ice/water
zone.prop(zone, 'bulk') = bulk + bulk_inc
zone.stress.xx(zone) = zone.stress.xx(zone) - stress_inc ;
zone.stress.yy(zone) = zone.stress.yy(zone) - stress_inc
zone.stress.zz(zone) = zone.stress.zz(zone) - stress_inc
zone.group(zone,'state') = 'frozen'
endloop
end
Split Intrinsics (11.5 seconds)
fish define freeze_zone
local porosity = zone.fluid.prop(::zone.list,'porosity') ; Note: Assumin
local expansion = porosity * 0.09 * 1.0; Porosity* water expansion * sat
local bulk = zone.prop(::zone.list,'bulk')
local stress_inc = bulk * expansion ; Amount to increment stress
local bulk_inc = porosity * (8.96/2.16) ; Ratio of ice/water bulk * porosi
zone.prop(::zone.list,'bulk') =:: bulk + bulk_inc
zone.stress.xx(::zone.list) =:: zone.stress.xx(::zone.list) - stress_inc ;
zone.stress.yy(::zone.list) =:: zone.stress.yy(::zone.list) - stress_inc
zone.stress.zz(::zone.list) =:: zone.stress.zz(::zone.list) - stress_inc
zone.group(::zone.list,'state') = 'frozen'
end
Operator (6.4 seconds)
fish operator freeze_zone(zone)
local porosity = zone.fluid.prop( zone,'porosity') ; Note:
local expansion = porosity * 0.09 * 1.0; Porosity * water
local bulk = zone.prop(zone,'bulk')
local stress_inc = bulk * expansion ; Amount to increment
local bulk_inc = (8.96/2.16) * porosity ; Ratio of ice/wat
zone.prop(zone, 'bulk') = bulk + bulk_inc
zone.stress.xx(zone) = zone.stress.xx(zone) - stress_inc ;
zone.stress.yy(zone) = zone.stress.yy(zone) - stress_inc
zone.stress.zz(zone) = zone.stress.zz(zone) - stress_inc
zone.group(zone,'state') = 'frozen'
end
[freeze_zone(::zone.list)]
Looking at the version that utilizes split intrinsics, the repeated (and solely appearing) split of zone.list is a strong indicator that refactoring the function as an operator will be advantageous.
Itasca Software © 2024, Itasca | Updated: Dec 14, 2024