Source


Content:

1 - Plug a source
2 - List files in a directory
3 - List directories in a directory
4 - A complete example
5 - Retrieve data from an HTTP URL


Sources are a way to inject data into the dataflow from diverse origins: CSV files, databases, sensors, and so on.
At the moment, only file-based and HTTP(S) URL Sources are available in OpenMOLE. If you need to interface OpenMOLE with another external data source, check the contact information page to see how to reach the OpenMOLE development team.

Plug a source

Sources are plugged into the dataflow in a similar fashion to hooks. Let's consider this simple workflow:
val files = Val[Array[File]]
val result = Val[Double]

val hello =
  ScalaTask("val result = computeFromFiles(files)") set (
    inputs += files,
    outputs += result
  )

val s = ListFilesSource(workDirectory / "directory", files)

(hello source s)
The source s is plugged at the beginning of the task hello. The source is executed prior to each execution of hello. You can also plug multiple sources on the same task using the syntax: hello source (s1, s2, s3).
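
To make the execution order concrete, here is a minimal conceptual sketch in plain Scala. It is not OpenMOLE's implementation: `Context`, `Source`, `Task`, and `plug` are hypothetical names chosen for illustration, and the dataflow is reduced to a simple map from variable names to values. It only shows the idea that every plugged source runs, in order, before the task it is attached to.

```scala
object SourcePluggingSketch {
  // Simplified stand-in for the dataflow: variable names mapped to values (assumption).
  type Context = Map[String, Any]
  type Source = Context => Context
  type Task = Context => Context

  // Run every source, in order, and only then run the task on the enriched context.
  def plug(task: Task, sources: Source*): Task =
    context => task(sources.foldLeft(context)((ctx, source) => source(ctx)))

  // Hypothetical source: injects a list of file names under the key "files".
  val listFiles: Source = ctx => ctx + ("files" -> Seq("a.csv", "b.csv"))

  // Hypothetical task: counts the injected files and stores the count as "result".
  val hello: Task = ctx => {
    val files = ctx("files").asInstanceOf[Seq[String]]
    ctx + ("result" -> files.size.toDouble)
  }

  // Loosely mirrors (hello source s): listFiles runs before each execution of hello.
  val helloWithSource: Task = plug(hello, listFiles)
}
```

Calling `SourcePluggingSketch.helloWithSource(Map.empty)` first fills in "files" and then lets the task compute "result" from it.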

List files in a directory

This source lists the files in a directory and injects them into the dataflow as an array of File objects. The selection can be narrowed by passing a regular expression as the last parameter of the source builder.
val someVariable = Val[String]
val txtFiles = Val[Array[File]]
val files = Val[Array[File]]

val s1 = ListFilesSource(workDirectory / "directory", files)

val s2 =
  ListFilesSource(workDirectory / "/${someVariable}/", txtFiles, ".*\\.txt") set (
    inputs += someVariable
  )
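
As a rough illustration of what such a filtered listing amounts to, the following plain Scala sketch lists the regular files of a directory and keeps those whose names match a regular expression. It is written against the standard `java.io.File` API, not against OpenMOLE. The name `listFiles`, the full-name matching, and the sorting by name are assumptions made for this sketch; the actual source may behave differently on these points.

```scala
import java.io.File

object ListFilesSketch {
  // Return the regular files directly inside `directory` whose names fully match `regex`.
  // With the default regex, every regular file is kept.
  def listFiles(directory: File, regex: String = ".*"): Array[File] =
    Option(directory.listFiles())        // null when `directory` is not a readable directory
      .getOrElse(Array.empty[File])
      .filter(f => f.isFile && f.getName.matches(regex))
      .sortBy(_.getName)                 // deterministic order (an assumption of this sketch)
}
```

For example, `ListFilesSketch.listFiles(new File("directory"), ".*\\.txt")` keeps only the entries ending in `.txt`, in the same spirit as the third argument passed to the source builder in `s2`.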

List directories in a directory

Likewise, you can inject an array of directories into the dataflow. Directories are also represented as File objects. Again, the selection can be made either by passing a complete directory name or a pattern (a regular expression such as "^aaa.*") that is matched against the names of the directories found.
val someVariable = Val[String]
val dirs = Val[Array[File]]
val aaaDirs = Val[Array[File]]

// will fill dirs with all the subdirectories of "directory"
val s1 = ListDirectoriesSource(workDirectory / "directory", dirs)

val s2 =
  // will fill aaaDirs with all the subdirectories of "directory" starting with aaa
  ListDirectoriesSource(workDirectory / "${someVariable}", aaaDirs, "^aaa.*") set (
    inputs += someVariable
  )
Sources store each entry found in an Array. In most cases, you will want each entry to feed a different task. Let's now see how this can be done by reusing what we saw about sampling for data processing.

A complete example

Here, we are collecting all the directories named care_archive. See how they are gathered in an Array[File] container and can be explored by an ExplorationTask using the keyword in. This exploration generates one analysisTask per directory collected by the source.
val directoriesToAnalyze  = Val[Array[File]]

val s = ListDirectoriesSource(workDirectory / "data/care_DoE", directoriesToAnalyze, "care_archive")

val inDir = Val[File]
val myWorkDirectory = "care_archive"

val analysisTask =
  SystemExecTask(s"${myWorkDirectory}/re-execute.sh") set (
    inputFiles += (inDir, myWorkDirectory)
  )

val exploration = ExplorationTask(inDir in directoriesToAnalyze)

(exploration source s) -< analysisTask
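
The fan-out performed by the exploration can be pictured with a small plain Scala sketch. This is only an analogy, not how OpenMOLE schedules tasks: `analyse` is a hypothetical stand-in for analysisTask, and `explore` simply applies it once per collected directory, in the way `inDir in directoriesToAnalyze` binds each array element to inDir in turn.

```scala
import java.io.File

object FanOutSketch {
  // Hypothetical stand-in for analysisTask: it only reports which directory it received.
  def analyse(inDir: File): String = s"analysed ${inDir.getName}"

  // One analysis per collected directory; each element plays the role of inDir once.
  def explore(directoriesToAnalyze: Array[File]): Array[String] =
    directoriesToAnalyze.map(analyse)
}
```

With two directories in the array, `explore` yields two results, one per directory, just as the workflow above launches one analysisTask per entry.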

Retrieve data from an HTTP URL

You can inject data into your workflow from an HTTP/HTTPS URL using the HttpURLSource. This is preferable to making network calls from within your model: when model runs are distributed on a cluster or the grid, such requests may fail. This source retrieves the raw content from the provided URL and assigns it to a String or a File prototype (in the second case, the content is written to the file). This is illustrated in the following example with a String prototype:
val stringOutput = Val[String]
val s = HttpURLSource("http://api.ipify.org",stringOutput)
ScalaTask("println(input.stringOutput)") set ((inputs,outputs) += (stringOutput)) source s hook display
The same can be done with a File prototype:
val fileOutput = Val[File]
val s = HttpURLSource("http://api.ipify.org",fileOutput)
ScalaTask("println(input.fileOutput)") set ((inputs,outputs) += (fileOutput)) source s hook CopyFileHook(fileOutput,"test.txt")
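
The difference between the two prototypes can be sketched in plain Scala using only the standard library. This is not the HttpURLSource implementation: `fetchString` and `fetchFile` are hypothetical helpers, the UTF-8 encoding is an assumption, and the temporary file location is chosen for illustration only. The first helper returns the raw content as a String; the second writes the same content to a file and returns that file.

```scala
import java.io.File
import java.net.URL
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source
import scala.util.Using

object UrlContentSketch {
  // Read the raw content behind `url` as a UTF-8 string, closing the stream afterwards.
  def fetchString(url: URL): String =
    Using.resource(Source.fromURL(url, "UTF-8"))(_.mkString)

  // Retrieve the same content, write it to a fresh temporary file, and return that file.
  def fetchFile(url: URL): File = {
    val target = Files.createTempFile("url-content", ".txt")
    Files.write(target, fetchString(url).getBytes(StandardCharsets.UTF_8))
    target.toFile
  }
}
```

Both helpers accept any `java.net.URL`, so they can be tried against a local `file:` URL before pointing them at a remote address.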