Of Models and Workflows
A workflow is a set of tasks linked with each other through transitions. From a high-level point of view, tasks comprise inputs, outputs and optional default values.
Tasks are the atomic computing elements of OpenMOLE: they describe what OpenMOLE should execute and delegate to remote environments. They are also what embeds your own models and/or programs.
Depending on the kind of program (binary executable, Java...) you want to embed in OpenMOLE, you have to choose the appropriate task.
Task execution depends on input variables, and each task produces outputs which are transmitted to the inputs of subsequent tasks. Below is a dummy task to illustrate all this jargon:

// Define a variable i of type Int
val i = Val[Int]
val j = Val[Int]

// Instantiate a task that does nothing.
// This task uses the variable i as input and j as output.
// Any task immediately following this one in the workflow
// (i.e. linked with a transition) will be able to use the
// variable j containing the result of this task.
val t = EmptyTask() set (
  inputs += i,
  outputs += j
)

It is also possible to specify default values which are used by the task in case no input data was provided in the dataflow:

val i = Val[Int]
val j = Val[Int]

val t = EmptyTask() set (
  inputs += i,
  outputs += j,
  // set i's default value to 0
  i := 0
)

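The rule that a default is used only when the dataflow provides no value, and that a task's outputs feed the inputs of the next task, can be modelled in a few lines of plain Scala. This is a toy sketch for intuition only: it does not use the OpenMOLE API, and `ToyTask`, `run`, `chain`, `doubling` and `incrementing` are hypothetical names introduced here for illustration.

```scala
// Toy model in plain Scala; NOT the OpenMOLE API.
// ToyTask, run, chain and the example tasks are hypothetical names.
object DataflowSketch {
  // The dataflow: variable names mapped to their current values.
  type Context = Map[String, Int]

  final case class ToyTask(
    inputs: Set[String],
    outputs: Set[String],
    defaults: Context,
    body: Context => Context
  )

  def run(task: ToyTask, incoming: Context): Context = {
    // Defaults come first, so values already in the dataflow override them.
    val resolved = task.defaults ++ incoming
    val missing = task.inputs -- resolved.keySet
    require(missing.isEmpty, "missing inputs: " + missing.mkString(", "))
    // The body only sees declared inputs; only declared outputs are passed on.
    val visible = resolved.filter { case (name, _) => task.inputs.contains(name) }
    task.body(visible).filter { case (name, _) => task.outputs.contains(name) }
  }

  // A transition: the outputs of the first task become the inputs of the second.
  def chain(first: ToyTask, second: ToyTask)(incoming: Context): Context =
    run(second, run(first, incoming))

  // i has a default of 0; j is computed from i.
  val doubling = ToyTask(Set("i"), Set("j"), Map("i" -> 0), ctx => Map("j" -> ctx("i") * 2))
  // No defaults: j must come from the previous task.
  val incrementing = ToyTask(Set("j"), Set("k"), Map.empty, ctx => Map("k" -> ctx("j") + 1))
}
```

With these definitions, `DataflowSketch.chain(doubling, incrementing)(Map("i" -> 3))` yields `Map("k" -> 7)`, while an empty incoming dataflow falls back to the default `i = 0` and yields `Map("k" -> 1)`.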
OpenMOLE scripts
Writing an OpenMOLE script consists of defining tasks, their inputs and outputs, the transitions between the tasks and the execution environment.
Some of the tasks are made to wrap your program, while others are made to generate inputs (Samplings) or to capture the outputs (Hooks). As you progress into the world of OpenMOLE, you will discover how to define these various elements and build your own workflows. For now, let's look at a dummy example.
Let's say you have a model that takes a string as input and does some work with it, like launching a simulation with the parameters contained in the input string. People from the lab gave you a huge CSV file where each line contains various experimental setup parameters. What you want is to run a simulation for each line of this file, execute it on the lab's cluster, and gather the results. Your OpenMOLE script would look like this:

val inputString = Val[String]
val result = Val[Double]

// crawl the big file and take the lines
val all_the_lines = CSVSampling("EmpiricalData.CSV") set (columns += inputString)

// encapsulate your model in an "execution" task that calls the main routine
val my_model_execution =
  ScalaTask("val result = mainRun(inputString)") set (
    inputs += inputString,
    outputs += result
  )

// a hook to catch the outputs of your model execution and put them in a CSV file
val catch_output = CSVHook("path/to/save/it")

// declare your computing environment
val lab_Cluster = ClusterEnvironment(login, machineIP)

// the workflow. Basically it says: explore the lines and run a model execution for each,
// save the outputs, all that on the cluster (results are not brought back to the local computer yet).
DirectSampling(
  evaluation = my_model_execution on lab_Cluster hook catch_output,
  sampling = all_the_lines
)