$ curl -d "outputFormat=json&apikey=1234&text=Viruses." http://npjoint.com/Cocoa/api/
{
  "doc": {
    "info": {
      "document": "... ",
      "externalMetadata": "",
      "submitter": "127.0.0.1",
      "externalID": "",
      "id": "http://npjoint.com"
    },
    "meta": {
      "submitterCode": "1234",
      "signature": "",
      "messages": []
    }
  },
  "1": {
    "_type": "Organism",
    "_typeGroup": "entities",
    "name": "Viruses",
    "instances": [
      {
        "exact": "Viruses",
        "offset": 0,
        "length": 7
      }
    ]
  }
}
The output above is a minimal subset of the JSON object returned by Open Calais. The Cocoa API also accepts Open Calais-style JSON requests and returns the same output as above; see, e.g., the Open Calais plugin in GATE.
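As a rough illustration of consuming the JSON output, the Python sketch below collects every entity instance from a response. It assumes the full response is valid JSON and uses only the field names visible in the sample above; `extract_entities` and `SAMPLE_RESPONSE` are hypothetical names, and the sample is trimmed by hand rather than captured from the live service.

```python
import json

# Hand-trimmed sample shaped like the response above (an assumption, not live output).
SAMPLE_RESPONSE = """
{
  "doc": {
    "info": {"id": "http://npjoint.com"},
    "meta": {"submitterCode": "1234", "messages": []}
  },
  "1": {
    "_type": "Organism",
    "_typeGroup": "entities",
    "name": "Viruses",
    "instances": [{"exact": "Viruses", "offset": 0, "length": 7}]
  }
}
"""


def extract_entities(response_text):
    """Return a (type, name, offset, length) tuple for every entity instance."""
    data = json.loads(response_text)
    found = []
    for key, value in data.items():
        # Skip the "doc" header; entities are the other top-level objects.
        if key == "doc" or not isinstance(value, dict):
            continue
        if value.get("_typeGroup") != "entities":
            continue
        for inst in value.get("instances", []):
            found.append((value["_type"], value["name"],
                          inst["offset"], inst["length"]))
    return found


entities = extract_entities(SAMPLE_RESPONSE)
print(entities)
```

Keying on `_typeGroup` rather than on the numeric key ("1") keeps the sketch independent of how many entities a response contains.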
GENIA/brat-compatible A1 output is returned by setting the outputFormat parameter to a1:
$ curl -d "outputFormat=a1&apikey=1234&text=Viruses." http://npjoint.com/Cocoa/api/
T1 Organism 0 7 Viruses
Compare this ('a1' output):
$ curl -d "outputFormat=a1&apikey=1234&text=A smorgasbord: Liver cancer, chromatophores and tigers." http://npjoint.com/Cocoa/api/
T1 Physiology 15 27 Liver cancer
T2 Body_part 29 43 chromatophores
T3 Organism 48 54 tigers
with this ('b1' output):
$ curl -d "outputFormat=b1&apikey=1234&text=A smorgasbord: Liver cancer, chromatophores and tigers." http://npjoint.com/Cocoa/api/
T1 Disease 15 27 Liver cancer
T2 Cellular_component 29 43 chromatophores
T3 Organism1 48 54 tigers
T4 Organ 15 20 Liver
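The A1-style lines above can be turned into structured records with a few lines of Python. This is a minimal sketch, not part of the Cocoa API: `Annotation` and `parse_a1` are hypothetical names, and the parser assumes one annotation per line with a single contiguous span (ID, type, start, end, text), separated by spaces or tabs.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    ann_id: str
    ann_type: str
    start: int
    end: int
    text: str


def parse_a1(output: str) -> List[Annotation]:
    """Parse text-bound annotation lines such as 'T1 Organism 0 7 Viruses'."""
    annotations = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("T"):
            continue  # ignore blank lines and anything that is not a T-line
        # Split on any whitespace, keeping the covered text (which may
        # contain spaces) intact as the fifth field.
        parts = line.split(None, 4)
        if len(parts) != 5:
            continue
        ann_id, ann_type, start, end, text = parts
        annotations.append(Annotation(ann_id, ann_type, int(start), int(end), text))
    return annotations


TEXT = "A smorgasbord: Liver cancer, chromatophores and tigers."
B1_OUTPUT = """T1 Disease 15 27 Liver cancer
T2 Cellular_component 29 43 chromatophores
T3 Organism1 48 54 tigers
T4 Organ 15 20 Liver"""

annotations = parse_a1(B1_OUTPUT)
for ann in annotations:
    # Offsets are character positions into the submitted text.
    print(ann.ann_type, repr(TEXT[ann.start:ann.end]))
```

Slicing the submitted text with each start and end offset is a cheap sanity check that the offsets refer to the text exactly as it was sent.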
text :- the URI-encoded text/data; it should be less than 10000 characters.
outputFormat :- 'json', 'a1', 'a1j', 'b1'. The 'a1j' output is specific to the brat annotator. 'b1' returns extended annotations.
apikey :- your API key (use "1234" for now)
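A request body with these three parameters can be assembled and URI-encoded with Python's standard library. The sketch below is illustrative only: `build_request_body` is a hypothetical helper, and it assumes the 10000-character limit applies to the text before encoding, which the description above does not state.

```python
from urllib.parse import parse_qs, urlencode

ALLOWED_FORMATS = {"json", "a1", "a1j", "b1"}
MAX_TEXT_LENGTH = 10000  # assumed to apply to the unencoded text


def build_request_body(text, output_format="json", apikey="1234"):
    """Validate the parameters and return a URI-encoded POST body."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError("unsupported outputFormat: %r" % output_format)
    if len(text) >= MAX_TEXT_LENGTH:
        raise ValueError("text should be less than 10000 characters")
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return urlencode({"outputFormat": output_format,
                      "apikey": apikey,
                      "text": text})


body = build_request_body(
    "A smorgasbord: Liver cancer, chromatophores and tigers.", "a1")
print(body)
# Decoding the body recovers the original text unchanged.
print(parse_qs(body)["text"][0])
```

The resulting string could then be sent as POST data, for example with `urllib.request.urlopen(url, data=body.encode("ascii"))`, where `url` is the API endpoint shown in the curl examples.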
An undefined acronym may be interpreted somewhat randomly by the system. To help processing, you may wish to pre-define acronyms with the acronym parameter.
acronym :- a URI-encoded string consisting of the acronym, its expansion, and a predefined tag (see below), separated by the pipe ("|") symbol. For example: acronym=DC|discus cells|BODY PART
Multiple acronym definitions can be given. Each definition should be separated from the next by a newline ("\n").
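To make the layout of the acronym parameter concrete, here is a small Python sketch that joins several definitions with newlines and URI-encodes the result. `build_acronym_param` is a hypothetical helper; rejecting fields that contain "|" or a newline is an assumption made here, since the description above does not say how such characters would be escaped.

```python
from urllib.parse import parse_qs, urlencode


def build_acronym_param(definitions):
    """Join (acronym, expansion, tag) triples into one acronym value."""
    lines = []
    for acronym, expansion, tag in definitions:
        for field in (acronym, expansion, tag):
            # Assumption: the separators themselves cannot appear in a field.
            if "|" in field or "\n" in field:
                raise ValueError("field contains a separator: %r" % field)
        lines.append("|".join((acronym, expansion, tag)))
    # One definition per line, as described above.
    return "\n".join(lines)


acronym_value = build_acronym_param([
    ("DC", "discus cells", "BODY PART"),
    ("HBr", "humongous breakthrough", "CTAG"),
])

# urlencode percent-encodes the pipes (%7C) and the newline (%0A).
body = urlencode({
    "outputFormat": "a1",
    "apikey": "1234",
    "acronym": acronym_value,
    "text": "DC and HBr.",
})
print(body)
```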
Cocoa detects and tags chemical formulas. However, tagging of formulas may be undesirable for documents with abbreviations which resemble chemical formulas (or protein sequences). For such documents, turn off formula tagging by setting the mode parameter.
mode :- 'noform' - turn off formula detection; 'minform' - do not tag abbreviations in pure CAPS; formulae with a lowercase letter or a numeral will be tagged.
Compare these outputs:
$ curl -d "outputFormat=a1&apikey=1234&text=HBr, H2O, GGGG, and ABCD." http://npjoint.com/Cocoa/api/
T1 Chemical 0 3 HBr
T2 Chemical 5 8 H2O
T3 Protein_part 10 14 GGGG
T4 Unknown 20 24 ABCD
$ curl -d "mode=minform&outputFormat=a1&apikey=1234&text=HBr, H2O, GGGG, and ABCD." http://npjoint.com/Cocoa/api/
T1 Chemical 0 3 HBr
T2 Chemical 5 8 H2O
T3 Unknown 10 14 GGGG
T4 Unknown 20 24 ABCD
$ curl -d "mode=noform&outputFormat=a1&apikey=1234&text=HBr, H2O, GGGG, and ABCD." http://npjoint.com/Cocoa/api/
T1 Unknown 0 3 HBr
T2 Unknown 5 8 H2O
T3 Unknown 10 14 GGGG
T4 Unknown 20 24 ABCD
Use the custom tag CTAG in a pseudo-acronym definition to prevent Cocoa from processing a pre-defined entity. Compare:
$ curl -d "outputFormat=a1&apikey=1234&acronym=HBr|humongous breakthrough|CTAG&text=HBr, H2O, GGGG, and ABCD." http://npjoint.com/Cocoa/api/
T1 Custom_tag 0 3 HBr
T2 Chemical 5 8 H2O
T3 Protein_part 10 14 GGGG
T4 Unknown 20 24 ABCD
$ curl -d "outputFormat=a1&apikey=1234&acronym=ABCD|a blank compact disc|CTAG&text=HBr, H2O, GGGG, and ABCD." http://npjoint.com/Cocoa/api/
T1 Chemical 0 3 HBr
T2 Chemical 5 8 H2O
T3 Protein_part 10 14 GGGG
T4 Custom_tag 20 24 ABCD
Note: Context-sensitive tagging is never turned off, so your mileage with the acronym and mode parameters may vary.
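Because context-sensitive tagging can make the effect of mode hard to predict, it may help to diff two outputs for the same text. The Python sketch below does this for the default, minform, and noform outputs quoted above; `parse_types` and `changed_types` are hypothetical helpers, and the outputs are copied from this page rather than fetched from the service.

```python
def parse_types(a1_output):
    """Map each (start, end, text) span to its annotation type."""
    result = {}
    for line in a1_output.splitlines():
        parts = line.split(None, 4)
        if len(parts) == 5 and parts[0].startswith("T"):
            _, ann_type, start, end, text = parts
            result[(int(start), int(end), text)] = ann_type
    return result


def changed_types(before, after):
    """List (text, old_type, new_type) for spans whose type differs."""
    old, new = parse_types(before), parse_types(after)
    return [(span[2], old[span], new[span])
            for span in sorted(old)
            if span in new and old[span] != new[span]]


DEFAULT = """T1 Chemical 0 3 HBr
T2 Chemical 5 8 H2O
T3 Protein_part 10 14 GGGG
T4 Unknown 20 24 ABCD"""

MINFORM = """T1 Chemical 0 3 HBr
T2 Chemical 5 8 H2O
T3 Unknown 10 14 GGGG
T4 Unknown 20 24 ABCD"""

NOFORM = """T1 Unknown 0 3 HBr
T2 Unknown 5 8 H2O
T3 Unknown 10 14 GGGG
T4 Unknown 20 24 ABCD"""

print(changed_types(DEFAULT, MINFORM))
print(changed_types(DEFAULT, NOFORM))
```

For these samples, minform changes only the all-caps GGGG, while noform also changes HBr and H2O to Unknown.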
200 Ok
404 No text to process
503 Output format unavailable
504 Bad API/license key
505 Processing error
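A client might translate the codes listed above into exceptions. The following Python sketch is one way to do that; `CocoaError` and `check_status` are hypothetical names, and the sketch assumes the codes arrive as the numeric status of the response, which the list above implies but does not state outright.

```python
# Messages copied from the list above.
STATUS_MESSAGES = {
    200: "Ok",
    404: "No text to process",
    503: "Output format unavailable",
    504: "Bad API/license key",
    505: "Processing error",
}


class CocoaError(Exception):
    """Raised for any status other than 200 (hypothetical helper class)."""

    def __init__(self, code, message):
        super().__init__("%d %s" % (code, message))
        self.code = code
        self.message = message


def check_status(code):
    """Return None for 200; raise CocoaError with the documented message otherwise."""
    if code == 200:
        return None
    raise CocoaError(code, STATUS_MESSAGES.get(code, "Unexpected status"))


try:
    check_status(504)
except CocoaError as err:
    print(err)
```

Codes that are not in the list fall back to a generic "Unexpected status" message instead of being treated as success.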