Roles Across Multiple Sentences (RAMS)

RAMS is the dataset associated with the paper Multi-Sentence Argument Linking. It contains 9,124 annotated events from news based on an ontology of 139 event types and 65 roles. In a 5-sentence window around each event trigger, we annotate the closest span for each role. Our code and models are available,* as well as our slides for the paper.

Download RAMS 1.0 [current version, tar.gz (4.5MB)]

Or, view a Single Example.

* We fixed a bug that affects the performance on the Beyond NomBank (BNB) dataset. See the BNB documentation for details.


The data is split into train/dev/test files. Each line in a data file contains a json string. Each json contains:

All other fields are extraneous to allow for future iterations of RAMS.
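Since each line of a data file is a standalone JSON string, a split can be read one line at a time. The following is a minimal sketch of that, not part of the released code; the helper name read_jsonl and the file name in the usage comment are assumptions made for illustration.

```python
import json


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of the file at `path`."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


# Hypothetical usage (the file name is an assumption, not taken from the release):
# for example in read_jsonl("train.jsonlines"):
#     print(sorted(example.keys()))
```

Reading lazily with a generator keeps memory use flat and makes it easy to inspect the keys of the first few examples before committing to a schema.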


A scorer is released alongside the data.

The basic use of the scorer is below:

python --gold_file <GOLD> --pred_file <PRED> --ontology_file <ONTOLOGY> --do_all

Some notes:

  • <PRED> can be in one of two formats. In both cases, it contains one json string per line, and that json blob must contain a doc_key.
    1. It contains a gold_evt_links key, like in the gold data. Add the --reuse_gold_format flag when running the scorer.
    2. It contains a predictions key, as in this example: "predictions": [[[70, 70], [63, 63, "victim", 1.0], [58, 58, "place", 1.0]]]. The value is a list of event-predictions (in RAMS there is only one). Each event-prediction has the [start, end] (inclusive) span of the event at index 0, followed by one [start, end, label, confidence] entry per argument at the subsequent indices.
  • <ONTOLOGY> is used for type-constrained decoding. It is a TSV file in which column 0 is the event name and, for each i starting at 0, column 2i + 1 is a role name and column 2i + 2 is the count of that role permitted by the event.
  • Use -cd to enable type-constrained decoding; it is off by default.
  • --do_all prints metrics (--metrics), metrics by distance (distance), metrics by role (role_table), and a CSV confusion matrix (confusion). Each of these can also be printed on its own with the flag shown in parentheses.
  • The scorer is compatible with both Python 2.7 and Python 3.6.
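The second <PRED> format and the <ONTOLOGY> layout described in the notes above can be sketched in Python as follows. This is an illustrative sketch rather than the released scorer's code: the function names are hypothetical, and it assumes that the permitted counts in the ontology file are integers.

```python
import json


def make_prediction_line(doc_key, trigger_span, arguments):
    """Serialize one example's predictions as a single JSON line.

    trigger_span: (start, end) inclusive indices of the event span.
    arguments: iterable of (start, end, label, confidence) tuples.
    """
    event = [list(trigger_span)] + [
        [start, end, label, float(conf)] for start, end, label, conf in arguments
    ]
    # RAMS has one event per example, so the outer list holds one event-prediction.
    return json.dumps({"doc_key": doc_key, "predictions": [event]})


def parse_ontology_row(row):
    """Parse one TSV row: event name, then alternating (role, count) columns."""
    cols = row.rstrip("\n").split("\t")
    event, rest = cols[0], cols[1:]
    # Column 2i + 1 is the role name and column 2i + 2 its permitted count
    # (assumed here to be an integer); empty trailing cells are skipped.
    roles = {
        rest[i]: int(rest[i + 1])
        for i in range(0, len(rest) - 1, 2)
        if rest[i]
    }
    return event, roles
```

Writing one make_prediction_line result per line produces a file in the predictions format, which is scored without the --reuse_gold_format flag.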


Please contact us if you want to obtain the older versions of the data. We encourage you to use the current version.

RAMS_1.0b.tar.gz [current, 4.5MB] is the current version. We added LICENSE information in July (see below). Please refer to this as RAMS 1.0.

RAMS_1.0.tar.gz [4.5MB] contains the same data as the current version above, but the scorer in that release incorrectly reported precision and recall.

RAMS_0.9.tar.gz [10MB] was used in an earlier version of the paper and contains human-readable files. We found some overlap between its splits, so for the current version the data was re-split and re-released without overlap.

To cite:

  title={Multi-Sentence Argument Linking},
  author={Seth Ebner and Patrick Xia and Ryan Culkin and Kyle Rawlins and Benjamin {Van Durme}},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},

Licensing and Takedown

RAMS 1.0 consists of annotations against paragraph-sized examples drawn from articles distributed publicly on the internet.

We do not own that text nor claim copyright: examples drawn from these articles are meant for research use in algorithmic design.

We release our annotations of the underlying text under CC-BY-SA-4.0.

Notice and take down policy:

Notice: Should you consider that our data contains material that is owned by you and should therefore not be reproduced here, please:

Clearly identify yourself, with detailed contact data such as an address, telephone number or email address at which you can be contacted.

Clearly identify the copyrighted work claimed to be infringed.

Clearly identify the material that is claimed to be infringing and information reasonably sufficient to allow us to locate the material.

And contact the authors.

Take down: We will comply with legitimate requests by removing the affected sources from the next release of the annotations.