There’s been a bunch of discussion recently about replies to annotations, and specifically about how our users’ experience of replies is currently pretty awful.

I wanted to explain, for everyone’s benefit, what the underlying causes for these issues are, and what I think it would take to fix these issues, or at least put us in a place where fixing them would become easier.


First of all, a quick reminder of how replies work at the moment.

Replies are stored as annotations with an additional references field, which contains an ordered list of the reply’s ancestors in the conversation tree. For example, a direct reply to the annotation with ID A will have a references field containing [A]. A reply to that reply (which we’ll give ID B) would have a references field containing [A, B]. Here’s an example conversation thread:

A
├── B references [A]
├── C references [A]
│   └── D references [A, C]
└── *E is missing*
    └── F references [A, E]
        └── G references [A, E, F]

One thing this shows is that this approach to building a conversation thread is tolerant of “missing” (i.e. deleted) replies in the thread. We can infer their existence and render a tolerable thread regardless. (Here’s an example of what this looks like in practice.)
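As a sketch of how that inference can work (the names here are illustrative, not the actual h implementation), a thread can be rebuilt from a flat mapping of reply ID to references list, creating placeholder nodes for any ancestor, like E above, that appears in a references list but wasn’t returned:

```python
def build_thread(root_id, replies):
    """Rebuild a conversation tree from flat reply data.

    `replies` maps each reply's ID to its ordered ancestor list (the
    references field, whose first element is always the thread root).
    Ancestors that appear in a references list but not in `replies`
    (i.e. deleted replies) get placeholder nodes flagged `missing`,
    so their descendants still render in the right place.
    """
    nodes = {root_id: {"id": root_id, "missing": False, "children": []}}

    def get(id_):
        # Create a placeholder on first sight; real replies clear the flag.
        return nodes.setdefault(id_, {"id": id_, "missing": True, "children": []})

    for reply_id, references in replies.items():
        get(reply_id)["missing"] = False
        # Link each consecutive pair along the ancestry chain,
        # e.g. [A, E, F] + [G] produces edges A->E, E->F, F->G.
        chain = list(references) + [reply_id]
        for parent_id, child_id in zip(chain, chain[1:]):
            parent, child = get(parent_id), get(child_id)
            if child not in parent["children"]:
                parent["children"].append(child)

    return nodes[root_id]
```

Running this over the example thread above (with E absent) yields A with children B, C, and a placeholder E, under which F and G hang as expected.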

Replies are stored in the annotation table in the Postgres database and as documents of type annotation in the Elasticsearch index.

In addition, replies have access controls which are independent of the access control settings of their parent annotation. That means that they can independently be set to either “shared” within a group (members of that group can see the reply) or not (only the reply’s author can see the reply).

Our user interface allows you to make private (non-shared) replies to public annotations, and public replies to (your own) private annotations[2].
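The per-reply visibility rule can be stated as a small predicate. This is a deliberate simplification with hypothetical names (and it ignores moderation and the like), but it captures the independence described above:

```python
def reply_visible_to(reply, userid, group_members):
    """Can `userid` see this reply?

    A shared reply is visible to the members of its group; a private
    (non-shared) reply is visible only to its author. Note that the
    parent annotation's own sharing settings play no part here; that
    independence is exactly what makes private replies to public
    annotations (and vice versa) possible.
    """
    if reply["shared"]:
        return userid in group_members[reply["group"]]
    return userid == reply["userid"]
```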


This approach to storing replies poses a number of problems. Mostly, it makes it hard to do a number of perfectly reasonable things efficiently enough for the latency constraints of a web application request.

  1. Returning a list of the N most recent annotations[3] (for any fixed N) with their replies requires multiple round-trips to the search index. We first have to retrieve the N most recent annotations, and then page through the set of replies to those annotations. We do not know in advance how many replies we are going to find.
  2. Searching for some term without caring whether it’s found in an annotation or a reply is easy to do, but rendering the results in a way that makes sense is difficult, because we don’t know without issuing more queries whether or not the results we found have their own replies.
  3. Fetching the entire conversation thread associated with any single reply using our API is cumbersome due to the decision to treat replies as if they’re annotations. Specifically, this requires three API calls at the moment (1. fetch reply, 2. fetch thread root, 3. fetch all “annotations” referencing the thread root).
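To make problem 3 concrete, here is roughly the sequence a client has to run through today to reconstruct the thread around a single reply. The `fetch` parameter stands in for an HTTP GET against our API; the endpoint paths and the search-by-references filter are sketched from memory rather than quoted from the API docs:

```python
def fetch_thread_for_reply(reply_id, fetch):
    """Reconstruct the full conversation around one reply.

    Three round-trips: (1) fetch the reply itself to learn its
    ancestry, (2) fetch the thread root, and (3) search for every
    "annotation" that references the root. `fetch(path, **params)`
    is a stand-in for an HTTP GET returning parsed JSON.
    """
    reply = fetch(f"/api/annotations/{reply_id}")      # call 1: the reply
    root_id = reply["references"][0]
    root = fetch(f"/api/annotations/{root_id}")        # call 2: the thread root
    result = fetch("/api/search", references=root_id)  # call 3: the whole thread
    return root, result["rows"]
```

Note that the client still has to assemble the conversation tree itself from the flat list returned by the third call.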

There are more problems, but these are the main ones.


So let’s assume we want to fix the two most heinous problems here (the first two in the list above).

How might we go about that? Here’s a sketch of a proposal.

  1. Introduce new API calls for creating/updating/deleting replies. To start with, these would continue to create replies as if they were annotations, but with different validation requirements.

  2. Update the Hypothesis client to use these API calls for creating/updating/deleting replies.

  3. Move replies into their own database table, annotation_replies, which reflects the fact that many of the fields associated with annotations don’t make any sense for replies. For example, the target selectors that anchor an annotation to a particular passage of a document have no meaning for a reply.

    To make a range of queries easy, we’d probably include both a root_id and the references field as an array.
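A sketch of what that table might look like. SQLite is used here so the example runs anywhere; in Postgres the ancestor list would be a native text[] array and root_id a foreign key onto the annotation table. The column names are assumptions, not a real schema (and the references field is spelled refs below because REFERENCES is a reserved word in SQL):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE annotation_replies (
        id       TEXT PRIMARY KEY,
        root_id  TEXT NOT NULL,  -- thread root: whole-thread queries become one lookup
        refs     TEXT NOT NULL,  -- full ancestor list; JSON here, text[] in Postgres
        userid   TEXT NOT NULL,
        shared   INTEGER NOT NULL,
        text     TEXT,
        created  TEXT,
        updated  TEXT
    )
    """
)

# The example thread from earlier: D is a reply to C, itself a reply to A.
conn.execute(
    "INSERT INTO annotation_replies (id, root_id, refs, userid, shared)"
    " VALUES (?, ?, ?, ?, ?)",
    ("D", "A", json.dumps(["A", "C"]), "acct:someone@example.com", 1),
)

# Fetching every reply in a thread no longer involves the search index at all:
rows = conn.execute(
    "SELECT id FROM annotation_replies WHERE root_id = ?", ("A",)
).fetchall()
```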

  4. Update the search indexing routines so that the conversation tree is denormalised into annotation documents, and replies are not indexed in their own right as separate documents.

    This would allow us to fetch an annotation and all of its replies, already in a conversation tree, using a single call to Elasticsearch.

    Similarly, it would allow us to fetch a set of N annotations matching some query, together with all their replies, in a single call to Elasticsearch.

    Updates to replies would then simply result in the reindexing of their thread root annotation.
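As a sketch of what the denormalised document might look like at index time (field names illustrative, not our actual mapping): the thread root carries its entire conversation tree, so one document answers “annotation plus replies”.

```python
def denormalise(annotation, replies):
    """Build the search-index document for a thread root.

    `annotation` is the root; `replies` is the flat list of reply rows,
    each carrying `id` and `references` (ordered ancestor list). The
    whole conversation tree is embedded in the root's document, so
    fetching an annotation with its replies is a single query, and
    updating any reply means reindexing just this one document.
    """
    def children_of(parent_id):
        return [
            {"id": r["id"], "text": r.get("text", ""), "replies": children_of(r["id"])}
            for r in replies
            if r["references"][-1] == parent_id
        ]

    doc = dict(annotation)
    doc["replies"] = children_of(annotation["id"])
    doc["reply_count"] = len(replies)
    return doc
```

(This simple version assumes no deleted ancestors; the indexer would also need the missing-reply inference described earlier.)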

  5. Expose the conversation tree attached to each annotation in the annotations API. (For example, at /api/search).

  6. Update the client to use a pre-assembled conversation tree rather than building its own from the _separate_replies result value.

This approach would move most of the work (assembling conversation threads) off the request path and into the indexer background worker, and would also considerably reduce the amount of work the client needs to do to render annotation threads.

But, but…

There are a number of questions about this approach for which I don’t have answers. Namely:

  1. Searching for threads that contain a given term (including in replies) is easy, but identifying which bit of the thread matched seems harder. I’m not yet sure how we’d do this.

    This seems like it’d be much easier if reply threads weren’t arbitrarily nested…

  2. While nowhere near as bad as the current hard limit on the number of replies returned per page of annotations, we would probably have to limit the size of the conversation thread returned with an annotation. How could we support “load more” in such situations? Or could we just restrict the size of a conversation for now?

  3. This approach only delivers substantial operational benefit if we can assume that replies have the same access control restrictions as their thread root. This seems like a perfectly reasonable requirement, but it’s not how our software works right now. How would we migrate existing replies that don’t match this requirement? (e.g. private replies on public annotations).

[1] More importantly, we don’t explicitly expose the fact that we haven’t loaded all replies, and we don’t provide any facility to “load the rest of this thread”.

[2] Yes, really.

[3] Or, for that matter, exactly N annotations matching any kind of query.