This is part of my series of posts about the PolyAnno project – more here
Storage and API
Setup
To store the annotations I chose MongoDB, a classic NoSQL, non-relational database, because of its native support for handling JSON documents, which suited the JSON-LD files. For simplicity I used Mongoose as a layer on top of it, just to save some time working on this section of the project.
I had chosen to write the backend in Node.js, since the rest of the project was already JavaScript, and because it allowed me to use Express.js to organise the different annotation APIs I needed to create across the project into nicely modular code.
For those not familiar with Node or Express: the general idea is to create a running Express application, conventionally stored in a variable called “app”, and then to define how that “app” responds to the different requests made to the server. You can have requests for particular addresses redirect to, or respond with, the relevant HTML pages of the site, but you can also put rules in place so that requests sent to certain paths are handed off to handler functions defined elsewhere. This means the backend scripts for the different routes can be written in separate files and those functions simply imported in.
For example, you can set up an app that sends the home page HTML in response to a GET request for “/”, while anything sent to an address starting with “/api/” is handed off to the routes defined on the “annoRouter”.
```javascript
////// SETUP
var databaseURL = 'mongodb://localhost:27017/testMongoDB';

var express = require('express');
var bodyParser = require('body-parser');
var mongoose = require('mongoose');
var transcriptions = require('./routes/transcriptions');

var app = express();

//// BASIC APP ROUTES
app.use(express.static(__dirname + '/public'));
app.use(express.static(__dirname + '/views'));
app.use(express.static(__dirname + '/examples'));

// HOME PAGE
app.get('/', function(req, res) {
    res.sendFile(__dirname + "/index.html");
});

app.use(bodyParser.urlencoded({ extended: true }));
app.use(bodyParser.json());

var port = process.env.PORT || 8080;

mongoose.connect(databaseURL, { config: { autoIndex: false } });

// OTHER PAGE ROUTES
app.get('/aboutus', function(req, res) {
    res.redirect("/aboutus.html");
});

////// MIDDLEWARE
var annoRouter = express.Router();

/////// API ROUTES
// TRANSCRIPTION API
annoRouter.get('/transcriptions', transcriptions.findAll);
annoRouter.post('/transcriptions', transcriptions.addNew);
annoRouter.put('/transcriptions/:transcription_id', transcriptions.updateOne);

///////// GET STARTED
app.use('/api', annoRouter);

// tells the server where it should be listening for req and res
app.listen(port);
```
Annotation JSON Design

MongoDB generates an “_id” field for every JSON it stores, which I appended to the relevant route URL to generate the “id” field in the transcription JSON, ensuring a consistent URL for the JSON-LD ids in the Open Annotation Model JSON. This public OAM API means that others can interact with the database of annotations, and possibly do some future cool stuff with it, if they want.
"id" = "route URL" + "_id"
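As a minimal sketch of that construction (the base URL and ObjectId value below are invented for illustration, not the project's actual deployment address):

```javascript
// The JSON-LD "id" is just the annotation's API route plus the "_id"
// that MongoDB generated for the stored document.
function makeAnnotationId(routeUrl, mongoId) {
  return routeUrl + '/' + mongoId;
}

console.log(makeAnnotationId(
  'http://127.0.0.1:8080/api/transcriptions',
  '507f1f77bcf86cd799439011'
));
// http://127.0.0.1:8080/api/transcriptions/507f1f77bcf86cd799439011
```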
I used the NPM body-parser package to make parsing parameters a little easier, and used it with Mongoose to write functions that encode POST requests into the relevant formats. Rather than writing separate route functions for every possible combination of parameters, I created one function to accept a new annotation and one to accept updates to annotation fields; within those functions, checks determine which parameters are present in the request and should be stored or updated. This meant that rather than encoding different parameters into URL formats on the client side, I could just generate a JSON file and send it as one object each time.
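A minimal sketch of that check, with illustrative field names and without the surrounding Express/Mongoose plumbing (the function and collection names here are hypothetical, not the actual PolyAnno code):

```javascript
// Copy across only the fields actually present in the request body, so
// the client can always send a single JSON object and simply omit
// anything it does not want to change.
var UPDATABLE_FIELDS = ['text', 'language', 'type', 'motivation', 'metadata'];

function buildUpdate(body) {
  var update = {};
  UPDATABLE_FIELDS.forEach(function (field) {
    if (body[field] !== undefined) {
      update[field] = body[field];
    }
  });
  return update;
}

// Inside a route handler this would feed into Mongoose along the lines of:
//   Transcription.findByIdAndUpdate(req.params.transcription_id,
//                                   buildUpdate(req.body), callback);

console.log(buildUpdate({ text: 'hello', language: 'en', other: 'ignored' }));
// { text: 'hello', language: 'en' }
```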
The Schema (the format of the JSON file understood by the Mongoose package) for a basic annotation JSON file* is as follows:
```javascript
{
    //// This first part is a requirement for the JSON to be a standard JSON-LD
    "@context": { type: [], default: [ "http://127.0.0.1:8080/context.json" ] },
    "id": { type: String },

    //// The type field needs to specify "Annotation" to comply with the OAM
    //// standard, but can contain other types too
    "type": { type: [] },

    //// The following info is not necessary to use but I felt would be useful to collect
    "text": String,
    "language": String,
    "created": { type: Date, default: Date.now },
    "creator": {
        "id": { type: String },
        "name": { type: String }
    },
    "motivation": { type: String, default: "identifying" },
    "metadata": []
}
```
*This is without the voting components which are discussed later in Verification.
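For illustration, a stored transcription annotation matching this Schema might come back from the API looking something like the following (all values here are invented, and the creator details are placeholders):

```json
{
    "@context": [ "http://127.0.0.1:8080/context.json" ],
    "id": "http://127.0.0.1:8080/api/transcriptions/507f1f77bcf86cd799439011",
    "type": [ "Annotation" ],
    "text": "example transcribed text",
    "language": "la",
    "created": "2016-08-01T12:00:00.000Z",
    "creator": {
        "id": "http://127.0.0.1:8080/users/example",
        "name": "Example User"
    },
    "motivation": "identifying",
    "metadata": []
}
```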
The context.json file separates out the definitions of what each field represents, which are required for the JSON to be valid JSON-LD. For the PolyAnno annotations, the complete context.json can be found on Github here, but an excerpt is shown below:
```
{
    //// The permanent URLs specifying the standard definitions
    "@context": {
        "iiif": "http://iiif.io/api/image/2#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "exif": "http://www.w3.org/2003/12/exif/ns#",
        "dc": "http://purl.org/dc/elements/1.1/",
        "dcterms": "http://purl.org/dc/terms/",
        "doap": "http://usefulinc.com/ns/doap#",
        "svcs": "http://rdfs.org/sioc/services#",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "sc": "http://iiif.io/api/presentation/2#",
        "oa": "http://www.w3.org/ns/oa#",
        "geo": "http://geojson.org/vocab#",
        "as": "http://www.w3.org/ns/activitystreams#"
    },

    /// The individual definitions of each field used in the JSONs
    "Annotation": { "@id": "oa:Annotation", "@type": "oa:Class" },
    "body": { "@id": "oa:body", "@type": "oa:Relationship" },
    "target": { "@id": "oa:target", "@type": "oa:Relationship" },
    "text": { "@id": "oa:text", "@type": "oa:Property" },
    "items": { "@id": "oa:target", "@type": "oa:Relationship" },
    "motivation": { "@id": "oa:motivation", "@type": "oa:Relationship" },
    "purpose": { "@id": "oa:purpose", "@type": "oa:Relationship" },
    "source": { "@id": "oa:source", "@type": "oa:Relationship" },
    "selector": { "@id": "oa:selector", "@type": "oa:Relationship" },
    "created": { "@id": "dcterms:created", "@type": "@id" },
    "creator": { "@id": "dcterms:creator", "@type": "@id" },
    "language": { "@id": "dc:language", "@type": "xsd:string" },
    "formats": { "@id": "iiif:format", "@type": "xsd:string" },
    "tiles": { "@id": "iiif:hasTile", "@type": "@id" },
    "profile": { "@id": "doap:implements", "@type": "@id" },
    "protocol": { "@id": "dcterms:conformsTo", "@type": "@id" },
    "supports": { "@id": "iiif:supports", "@type": "@vocab" },
    "service": { "@type": "@id", "@id": "svcs:has_service" },
    "license": { "@type": "@id", "@id": "dcterms:license" },
    "logo": { "@type": "@id", "@id": "foaf:logo" },
    "attribution": { "@id": "sc:attributionLabel" }
}
```
I found it useful to collect all the definitions for the project into one context.json file, rather than creating separate, varying ones for each different JSON type (transcription, translation, vector, etc.).
Next Up: Transcription