Reproducibility Tasks

The reproducibility tools listed on this website are classified according to different reproducibility tasks: provenance capture, representation, replicability, modifiability, portability, longevity, document linkage, and experiment sharing.

Provenance Capture

Provenance capture is the ability to automatically capture the components involved in an experiment as it is executed (e.g.: its computational steps). These may include the data used and derived (e.g.: input, intermediate, and output data and parameters), detailed documentation of the exact process that was carried out (e.g.: source code and software binaries), and information about the environment (e.g.: OS information, hardware architecture, and library dependencies). For the sake of usability, provenance capture should be as automatic as possible, i.e. the capture should be performed systematically and transparently in a single run of the experiment. As capture may come at different granularities and use different methods, we distinguish the following subcategories:
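As a minimal sketch (not the implementation of any listed tool), environment provenance can be recorded alongside a run's parameters; the parameter names below are hypothetical:

```python
# Sketch: capture OS information, hardware architecture, and interpreter
# version together with the experiment's parameters.
import json
import platform
import sys

def capture_environment(parameters):
    """Record the execution environment together with the run's parameters."""
    return {
        "os": platform.system(),                  # e.g. "Linux", "Darwin"
        "os_release": platform.release(),
        "architecture": platform.machine(),       # e.g. "x86_64"
        "python_version": sys.version.split()[0],
        "parameters": parameters,
    }

# Hypothetical experiment parameters, recorded with the environment.
record = capture_environment({"learning_rate": 0.01, "seed": 42})
print(json.dumps(record, indent=2))
```

A real tool would additionally capture source code, binaries, and library dependencies, as described above.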

OS-Based Capture

This type of capture provides fine-grained provenance information that relies on functionality present at the operating system (OS) level (e.g.: manipulation of system calls and computational processes), and in general does not require modification of existing scripts or programs. Often, the provenance is captured by inspecting OS processes responsible for the execution.

Examples: PASS, ReproZip (see a full list of tools here)
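A rough sketch of the idea, without system-call interception (which tools like ReproZip perform via OS facilities such as ptrace): the experiment runs unmodified as a child process, and the wrapper records process-level facts. The traced command here is a stand-in for a real experiment:

```python
# Sketch: wrap an experiment run and record its command line, PID,
# exit status, and captured output, without changing the experiment's code.
import subprocess
import sys

def run_and_capture(argv):
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    out, err = proc.communicate()
    return {
        "argv": argv,
        "pid": proc.pid,
        "returncode": proc.returncode,
        "stdout": out,
    }

# Stand-in experiment: a one-line script that prints a result.
prov = run_and_capture([sys.executable, "-c", "print('result: 42')"])
print(prov["returncode"], prov["stdout"].strip())
```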

Code-Based Capture

In this type of capture, code is instrumented, automatically or manually through annotations, to gather provenance information. Typically, tools that support this capture mode are language-dependent.

Examples: Sumatra, noWorkflow (see a full list of tools here)
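To illustrate capture through annotation, here is a hypothetical `@provenance` decorator (not the API of Sumatra or noWorkflow) that records each call's function name, arguments, and result:

```python
# Sketch: code-based capture via a decorator annotation.
import datetime
import functools

PROVENANCE_LOG = []  # in a real tool this would be persisted, not in-memory

def provenance(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "function": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "result": result,
            "timestamp": datetime.datetime.now().isoformat(),
        })
        return result
    return wrapper

@provenance
def normalize(x, scale=10.0):   # illustrative experiment step
    return x / scale

normalize(5.0)
print(PROVENANCE_LOG[0]["function"], PROVENANCE_LOG[0]["result"])
```

Note that this approach is tied to the language in which the experiment is written, which is why such tools are typically language-dependent.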

Data-Based Capture

This type of capture keeps track of changes that happen in data files. Typically, this is related to file versioning and version control systems.

Examples: Git, Mercurial (see a full list of tools here)
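The core mechanism can be sketched by hashing file contents and diffing against a recorded manifest; version control systems like Git do this far more efficiently and keep full history:

```python
# Sketch: detect changes to data files by comparing content hashes
# against a previously recorded manifest.
import hashlib
import os
import tempfile

def manifest(paths):
    out = {}
    for p in paths:
        with open(p, "rb") as f:
            out[p] = hashlib.sha256(f.read()).hexdigest()
    return out

def changed_files(old, new):
    return [p for p in new if old.get(p) != new[p]]

with tempfile.TemporaryDirectory() as d:
    data = os.path.join(d, "input.csv")
    with open(data, "w") as f:
        f.write("a,b\n1,2\n")
    before = manifest([data])        # snapshot before the change
    with open(data, "a") as f:
        f.write("3,4\n")             # the data file is modified
    after = manifest([data])         # snapshot after the change
    changes = changed_files(before, after)
print(changes)
```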

Workflow-Based Capture

This mechanism is related to the ability to create, execute, and manage scientific workflows, which orchestrate the execution of an experiment as a graph of computational steps. Most workflow systems provide this type of capture, as they can naturally track and store the steps they execute. Unlike OS-based mechanisms, which are fine-grained, workflow systems capture information at a coarser granularity: the level of workflow steps.

Examples: VisTrails, Kepler, Taverna (see a full list of tools here)

[Back to Top]


Representation

Representation is the ability of a tool to create a specification that reflects the structure of the experiment, including its computational steps. The specification (or representation) can be descriptive-only or executable.

Descriptive-Only Representation

This type of representation does not allow users to execute the experiment. Often, this representation is useful for debugging and documentation.

Examples: ES3 (see a full list of tools here)

Executable Representation

An executable representation, besides being descriptive, allows the experiment to be manipulated and executed (e.g.: ability to perform queries that straddle the different steps, to easily change parameters and data, and to combine steps from different experiments). It also allows the experiment to be portable, as users may re-execute and reproduce the findings at least in the original environment.

Examples: Madagascar, Taverna, VisTrails (see a full list of tools here)
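The contrast between the two representation types can be sketched with a toy specification (the operation names are illustrative): the same spec can be rendered as a description or interpreted and executed, and its parameters can be changed before re-execution.

```python
# Sketch: one specification, two uses (descriptive and executable).
OPS = {
    "scale": lambda x, k: [v * k for v in x],
    "total": lambda x: sum(x),
}

spec = [
    {"op": "scale", "params": {"k": 3}},
    {"op": "total", "params": {}},
]

def describe(spec):
    """Descriptive-only view: useful for documentation and debugging."""
    return " -> ".join(step["op"] for step in spec)

def execute(spec, data):
    """Executable view: interpret the specification over input data."""
    for step in spec:
        data = OPS[step["op"]](data, **step["params"])
    return data

print(describe(spec))                # "scale -> total"
print(execute(spec, [1, 2, 3]))      # 18
spec[0]["params"]["k"] = 10          # change a parameter and re-execute
print(execute(spec, [1, 2, 3]))      # 60
```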

[Back to Top]


Replicability

A tool that supports replicability allows users to re-execute an experiment with the same parameters and data originally used. The main idea of replicability is to repeat the execution and obtain the original results, possibly on a different platform. This task is particularly important for scientific publications, where reviewers and readers want to replicate the findings described in a paper (e.g.: replicate numerical results, figures, and plots). Note that, although a tool may allow an experiment to be replicated, replicability cannot be guaranteed when an experiment has processes that are non-deterministic (e.g.: random number generation) or that are not controlled by the users themselves (e.g.: Web services). Replicability may also fail if library dependencies are updated and their implementation changes.

Examples: VisTrails, Kepler, Sumatra, ReproZip (see a full list of tools here)
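The non-determinism caveat can be illustrated with random number generation: fixing the seed restores replicability for the pseudo-random case, though it cannot help with steps the user does not control, such as calls to external Web services.

```python
# Sketch: an unseeded run is not replicable; a seeded run is.
import random

def experiment(seed=None):
    """Stand-in experiment that draws three pseudo-random values."""
    rng = random.Random(seed)
    return [rng.randint(0, 100) for _ in range(3)]

print(experiment() == experiment())      # unseeded: almost surely False
print(experiment(42) == experiment(42))  # seeded: True, replicable
```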

[Back to Top]


Modifiability

Modifiability is the ability to vary parameters and data, and sometimes to change the structure of the experiment. This is often useful to see how the experiment behaves with different inputs and how consistent and sensible the results are.

Examples: VisTrails, Taverna, Kepler, ReproZip (see a full list of tools here)
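A simple form of modifiability is a parameter sweep: re-running the same experiment over a range of values to see how sensitive the results are. The stand-in experiment below is illustrative:

```python
# Sketch: vary one parameter and compare the results.
def experiment(data, threshold):
    """Stand-in experiment: count values above a threshold."""
    return sum(1 for x in data if x > threshold)

data = [3, 7, 1, 9, 4, 8]
sweep = {t: experiment(data, t) for t in (2, 5, 8)}
print(sweep)   # {2: 5, 5: 3, 8: 1}
```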

[Back to Top]


Portability

A tool supporting portability allows an experiment to be re-executed in an environment different from the one originally used, on either the same operating system or a different one. Ideally, portability should be attained in a transparent and automatic fashion, without the need for the user to do anything special. There are three different levels of portability:

Low Portability

A low portability level guarantees only that an experiment is reproducible in the environment in which it was originally created. This usually comes from the fact that the tools do not capture information about the environment (e.g.: software dependencies).

Examples: VisTrails, Kepler, PASS (see a full list of tools here)

Medium Portability

Medium portability guarantees that an experiment is reproducible in environments similar to the one originally used, i.e. same operating systems, or compatible operating systems and hardware architectures.

Examples: CDE, CARE (see a full list of tools here)

High Portability

When a high portability level is supported, an experiment and its results can be ported to environments different from the original one (different operating systems and possibly hardware architectures). Two different mechanisms are often used to support this level: capturing the complete environment (including the operating system), or providing a Web-based interface that can be remotely accessed, independent from the environment being used.

Examples: Vagrant, ReproZip, Galaxy, crowdLabs (see a full list of tools here)
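One way a tool might decide how an experiment can be re-executed is to compare the environment recorded at capture time with the current one; the levels reported below mirror the taxonomy above and are illustrative only.

```python
# Sketch: compare a recorded environment with the current machine.
import platform

def portability_check(recorded):
    current = {"os": platform.system(), "arch": platform.machine()}
    if recorded == current:
        return "compatible environment: direct re-execution may work"
    if recorded["os"] == current["os"]:
        return "same OS, different hardware: may need a compatible build"
    return "different OS: needs a full environment capture (e.g. a VM or container)"

# Environment recorded at capture time (values are hypothetical).
recorded = {"os": "Linux", "arch": "x86_64"}
print(portability_check(recorded))
```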

[Back to Top]


Longevity

Longevity relates to the ability to reproduce and re-execute experiments long after they were created. This functionality is important because it helps maintain reproducibility over time, preventing parts of the experiment from becoming unreproducible as the software evolves and allowing the experiment to be updated accordingly. Note that longevity is related to the re-execution of the experiment, rather than to data preservation (for data preservation, see experiment sharing). Long-term reproducibility is provided by two different mechanisms:

By Archiving

In this case, longevity is achieved by archiving all the necessary information, i.e. all the provenance information (data, specification, and OS environment), in a self-contained and executable package, either locally or remotely, so that the original environment can be reconstituted and the experiments reproduced.

Examples: ReproZip, CDE (see a full list of tools here)
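The packaging step can be sketched as bundling the experiment's files plus a provenance manifest into one self-contained archive; a real packer such as ReproZip also captures binaries and OS-level dependencies, which this toy version does not.

```python
# Sketch: archive files and a manifest into a single zip package.
import json
import os
import tempfile
import zipfile

def pack(files, manifest, package_path):
    with zipfile.ZipFile(package_path, "w") as z:
        for path in files:
            z.write(path, arcname=os.path.basename(path))
        z.writestr("manifest.json", json.dumps(manifest))

with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, "run.py")
    with open(script, "w") as f:
        f.write("print('hello')\n")
    pkg = os.path.join(d, "experiment.zip")
    # Manifest fields are illustrative of what would be recorded.
    pack([script], {"os": "Linux", "entry_point": "run.py"}, pkg)
    with zipfile.ZipFile(pkg) as z:
        contents = sorted(z.namelist())
print(contents)
```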

By Upgrading

This type of longevity entails upgrading the components of the experiment that have evolved or that no longer work, which can also require the replacement of some of these components. This ensures that the experiment can be re-used and extended with newer software libraries or run on new hardware. Note that, once upgrades are applied, the experiment may change. Thus, replicability is not guaranteed, and the upgraded experiment may output results that do not match the original ones.

Examples: VisTrails, Taverna (see a full list of tools here)
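A first step toward upgrading is detecting which recorded dependencies have drifted; the version numbers below are hypothetical, and a real tool would also attempt the upgrade itself rather than just flag the mismatch.

```python
# Sketch: flag dependencies whose installed version differs from
# the version recorded with the experiment.
def outdated(recorded, installed):
    return {name: (old, installed.get(name))
            for name, old in recorded.items()
            if installed.get(name) != old}

recorded = {"numpy": "1.21.0", "scipy": "1.7.0"}    # hypothetical capture
installed = {"numpy": "1.26.4", "scipy": "1.7.0"}   # hypothetical current state
print(outdated(recorded, installed))
```

Components flagged this way are candidates for upgrade or replacement, after which results should be re-validated against the originals.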

[Back to Top]

Document Linkage

Document linkage is the ability to automatically and systematically connect derived results in a document to their original data, so that they can be directly accessed, verified, and perhaps manipulated. Document linkage helps users understand the relationship between results and their corresponding experiments, data, and dependencies. This often includes the creation of executable documents, which embed the computational objects (such as code and data) used to generate the images, tables, and plots included in the documents. Document linkage may be by reference or inlined.

By Reference

This type of document linkage corresponds to data and code hosted externally, e.g. in a Web server, and referenced from the document.

Examples: Collage, DEEP, VisTrails (see a full list of tools here)


Inlined

Inlined document linkage embeds the data inside the document (e.g.: snippets of source code embedded in a human-readable file, which generate experimental results such as figures and plots). This can be seen as a form of dynamic linkage, where results in a document can be automatically updated on the fly when the experiment and data change.

Examples: Sweave, IPython (see a full list of tools here)
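The dynamic aspect can be sketched as a document fragment that recomputes its reported numbers from the data each time it is rendered (Sweave and IPython/Jupyter do this with real documents; the data values here are made up):

```python
# Sketch: an "executable document" whose reported results are
# regenerated from the embedded computation whenever the data changes.
data = [12.1, 13.4, 11.8, 12.9]

def render_report(data):
    mean = sum(data) / len(data)
    return f"We measured {len(data)} samples with mean {mean:.2f}."

print(render_report(data))
print(render_report(data + [20.0]))   # the text updates when data changes
```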

[Back to Top]

Experiment Sharing

Experiment sharing is the ability to provide an infrastructure to upload, archive, and share data related to an experiment (e.g.: input data, output data, specification, and information about the computational environment) so that others have access to it. This functionality is also related to data preservation, which focuses on ensuring that the data remains accessible, citable, and reusable over time, an important aspect for reproducibility. There are two types of experiment sharing:

By Archival

This type of experiment sharing allows users to download data and experiments from a server, but not to execute them on the server.

Examples: myExperiment, Dataverse (see a full list of tools here)

By Hosted Execution

This type of experiment sharing allows users to execute the experiment on the server itself.

Examples: crowdLabs, Collage (see a full list of tools here)

[Back to Top]