Livefyre recently released a new way of commenting and interacting with Content on your favorite websites: Sidenotes.
Sidenotes lets any user respond to specific sections of the Content, not just reply to the whole thing. The added context allows Sidenotes to display each annotation right next to the annotated image or paragraph (like this one).
But how does it work? This is how we built it.
Livefyre StreamHub provided most of the infrastructure needed for the Sidenotes Application. We didn’t have to start from scratch. We leveraged Livefyre’s existing world-class services to manage:
- Content - Comments, Tweets, Facebook posts, Instagram photos, and RSS items are all normalized and stored as Content in StreamHub.
- Collections - Every conversation powered by Livefyre is made up of a Collection of Content. StreamHub APIs provide high-availability indexing into these Collections billions of times per month.
- Spam Protection - Livefyre Engineering’s affectionately labeled “Ministry of Abuse and Classification” keeps out all the bad stuff.
- Identity - Whether through Livefyre.com profiles for Livefyre Community Comments, Livefyre Enterprise Profiles, or single sign-on integrations with custom Identity Providers, StreamHub provides a link to any authentication system necessary.
While we had a great foundation in StreamHub, there were still some challenges to solve.
Most importantly, we had to answer the question: “How can Sidenotes anchor user-generated StreamHub Content to subsections of another piece of external content?”
To accomplish this, we’ve added the notion of a Block to Livefyre. Each new sidenote is anchored to a Block in the corresponding article. That block is frequently a paragraph of text, but it can also be an image, video, or any other HTML Element. When you use Sidenotes, you see a small icon next to each Block showing the number of sidenotes on that block, and clicking allows you to add a new one.
The resulting problem for Livefyre Engineering was to determine the best algorithm for provisioning Blocks and allocating new sidenotes into the right Block.
The simplest imaginable solution would be to store the annotated text on each sidenote and then, to display each sidenote, search the article for that substring.
But we knew that to make the best product, we would need to do more than find an exact match. Content on the internet changes all the time, as it should. Editors add new information, correct errors, fix typos, or A/B test different ways of presenting their Content.
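To make the weakness of exact matching concrete, here is a minimal sketch. The sample article text and the helper name find_anchor are made up for illustration and are not part of Sidenotes.

```python
def find_anchor(article: str, quoted: str) -> int:
    """Return the offset of the quoted text in the article, or -1 if absent."""
    return article.find(quoted)


original = "The quick brown fox jumps over the lazy dog. It was a sunny day."
quoted = "The quick brown fox jumps over the lazy dog."

# An editor later replaces one word inside the annotated sentence.
edited = original.replace("lazy", "sleepy")

offset_before = find_anchor(original, quoted)  # the exact substring is found
offset_after = find_anchor(edited, quoted)     # the exact substring is gone

print(offset_before, offset_after)
```

A single-word edit is enough to make the lookup fail, so the annotation has nowhere to attach.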
How do you anchor annotations to an always-shifting, living document?
We first looked at existing products to reverse-engineer their methods for dealing with this. There are quite a few annotation experiences that only annotate Content that is managed by the same system. This is a comparatively trivial Engineering problem, and the solutions used by these sorts of closed systems wouldn’t work for our mission of bringing annotation to the rest of the web.
The similar products we did find usually used one of two naive solutions, neither of which is resilient to changes in the annotated Content.
Anchor by paragraph index
Some of the systems we found would attach annotations based on paragraph ordering. As soon as an editor added a paragraph, all annotations on later parts of the document would be lost or, perhaps worse, attached to the wrong section of the document. In the worst case, adding an introduction paragraph at the beginning would break annotations for the entire article.
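The failure mode is easy to reproduce. The following is a small sketch with invented paragraphs and a hypothetical annotated_text helper. It is not taken from any of the products we looked at.

```python
paragraphs = ["Intro to the topic.", "The key claim.", "The conclusion."]

# Annotations keyed by paragraph index (the naive anchoring scheme).
annotations = {1: "I disagree with this claim."}


def annotated_text(paragraphs, annotations):
    """Map each annotated paragraph's text to the note anchored at its index."""
    return {
        paragraphs[index]: note
        for index, note in annotations.items()
        if index < len(paragraphs)
    }


before = annotated_text(paragraphs, annotations)

# An editor prepends a new introduction, shifting every later index by one.
edited = ["A new opening paragraph."] + paragraphs
after = annotated_text(edited, annotations)

print(before)
print(after)
```

After the edit, the note that was written about the key claim is displayed next to the old introduction instead.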
Anchor based on hash of the text
The other common method we found is to hash the contents of the annotated text, and store that with the annotation. This has the benefit of being agnostic to the order of annotated sections. Sections can be added, removed, or rearranged, and annotations will stay anchored.
However, as soon as a paragraph changes, its hash also changes, and all previous annotations on that paragraph are lost.
This is because most hashing algorithms (e.g. md5) are designed to minimize collisions. That is, two different inputs should almost never produce the same output. Usually, even a slight change to the input results in a wildly different output hash.
md5('Hello, world') -> bc6e6f16b8a077ef5fbc8d59d0b931b9
md5('Hello, world!') -> 6cd3556deb0da54bca060b4c39479839
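The two digests above can be reproduced with Python's standard hashlib module. The helper names md5_hex and bit_difference below are our own and exist only for this sketch. The second helper counts how many of the 128 output bits differ between two digests.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded md5 digest of the UTF-8 bytes of the text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def bit_difference(hex_a: str, hex_b: str) -> int:
    """Number of bit positions in which two hex-encoded digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


a = md5_hex("Hello, world")
b = md5_hex("Hello, world!")

print(a)
print(b)
print(bit_difference(a, b), "of 128 bits differ")
```

For a hash with good diffusion, roughly half of the bits are expected to differ even though the inputs differ by a single character.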
What we wanted was a way of hashing slightly different inputs and getting either the same or only slightly different outputs.
This would let us reduce entire paragraphs of varying lengths to a fixed-size hash, and help us identify when a section had been edited so we could keep anchoring annotations across edited versions.
We started researching Locality-Sensitive Hashing (LSH), which refers to hashing methods that are designed not to minimize collisions, but specifically to maximize them for similar inputs.
Locality-Sensitive Hashing has many applications, including
- Spam detection - Once you identify something as spam, you also want to quickly determine that very similar things (e.g. with only one character added) are also spam. Nilsimsa Hashing is one example.
- DNA Analysis - Given two sequences of nucleotides, determine if they are likely to be from the same person.
- De-duplication - Search engines use LSH to determine when the same page contents are being returned from different URLs, so they can deduplicate documents in their indexes.
Compared to the md5 example above, an LSH would produce results something like:
LSH('Hello, world') -> bc7e6f16b8a087ef5fbc8d50d0b931c9
LSH('Hello, world!') -> bc6e6f16b8a077ef6fbc8d59d0b931b9
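For a concrete feel, here is a minimal sketch of one well-known LSH scheme, SimHash over character shingles. This is an illustrative stand-in and not necessarily the algorithm Sidenotes uses. The 64-bit width, the shingle size of 4, the use of md5 as the per-feature hash, and the sample texts are all assumptions made for the example.

```python
import hashlib

HASH_BITS = 64  # assumed width, chosen for illustration


def shingles(text: str, size: int = 4) -> set:
    """Overlapping character n-grams of the lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    if len(normalized) <= size:
        return {normalized}
    return {normalized[i:i + size] for i in range(len(normalized) - size + 1)}


def feature_hash(feature: str) -> int:
    """A conventional 64-bit hash of one shingle (first 8 bytes of its md5)."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(text: str) -> int:
    """Each shingle votes +1 or -1 on every bit; the sign of the total sets the bit."""
    votes = [0] * HASH_BITS
    for feature in shingles(text):
        h = feature_hash(feature)
        for bit in range(HASH_BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    result = 0
    for bit, total in enumerate(votes):
        if total > 0:
            result |= 1 << bit
    return result


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


original = ("Livefyre recently released a new way of commenting and interacting "
            "with content on your favorite websites, and readers can respond to "
            "specific paragraphs instead of the whole article.")
edited = original.replace("favorite", "favourite")  # a small spelling edit
unrelated = ("Editors add new information, fix typos, or test different headlines, "
             "so the text of an article keeps changing after it is published.")

print(hamming(simhash(original), simhash(edited)))
print(hamming(simhash(original), simhash(unrelated)))
```

Because most shingles survive a small edit, most bit votes are unchanged, so the edited paragraph lands only a few bits away from the original while the unrelated paragraph lands much farther away.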
This is all getting kind of theoretical, and we know that the best solutions frequently come only once one has developed a solid intuition about the problem space, so we set about trying to visualize these locality-sensitive hashes.
We built a dataset of four different paragraphs. For three of them, we created three different versions by changing punctuation, spelling, or adding sentences. In this example, there were ten total paragraphs.
Each LSH hash of N bits can be mapped to a point in <= N-dimensional space. Slightly different inputs that result in slightly different hashes will result in points in this space that are only a short distance apart. That is, their distance will be low.
You can see four clusters of points in this image. One of the clusters (on the top) only has one point, and the other three clusters (representing the paragraphs we made edits to) each have three points. Keep in mind that while we humans can spot the gestalt of these four clusters, so far the algorithm has no idea that they exist.
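The mapping from hash bits to points can be sketched in a few lines. The 8-bit values below are invented for readability, and the function names are our own. Treating each bit as one coordinate, the number of differing bits between two hashes equals the squared straight-line distance between their points.

```python
def hash_to_point(value: int, bits: int) -> list:
    """Treat each of the hash's bits as one 0/1 coordinate (lowest bit first)."""
    return [(value >> i) & 1 for i in range(bits)]


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def squared_euclidean(p: list, q: list) -> int:
    """Squared straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


a = 0b10110100
b = 0b10110110  # differs from a in a single bit
c = 0b01001011  # differs from a in every bit

near = squared_euclidean(hash_to_point(a, 8), hash_to_point(b, 8))
far = squared_euclidean(hash_to_point(a, 8), hash_to_point(c, 8))

print(hamming_distance(a, b), near)
print(hamming_distance(a, c), far)
```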
Allocating Sidenotes to Blocks
This research provided us a method to allocate Sidenotes into Blocks, even when the Sidenotes were annotating slightly different text. Each cluster in the above image should be one Block.
When you create a new sidenote on a paragraph, we allocate it to a Block by first applying an LSH algorithm to the text, then checking the following:
- Is the hash an exact match for an already-created Block? If so, add the annotation to that block.
- If not, perform a Nearest Neighbor Search to find existing Blocks that are a small distance from this hash. We use a relatively simple Hamming distance metric. If we find an existing block within a certain distance threshold of the new hash, we place the annotation in that Block. This is what happens when the contents of a paragraph have slightly changed. This proximity search is what lets the algorithm recognize the clusters in the above graph.
- If there are no Blocks with similar hashes, this must be the first time someone has annotated this section of the article. So a new Block is provisioned with the hash of what’s being annotated.
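The three checks above can be sketched as follows. This is a simplified, in-memory illustration rather than StreamHub's implementation. The class name BlockIndex, the linear scan used for the nearest neighbor search, the threshold value, and the 8-bit example hashes are all assumptions made for the example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class BlockIndex:
    """Toy in-memory index: block hash -> number of sidenotes in that Block."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold  # assumed maximum distance for "same Block"
        self.sidenote_counts = {}

    def allocate(self, text_hash: int) -> int:
        """Return the hash of the Block that a new sidenote is placed in."""
        if text_hash in self.sidenote_counts:
            # 1. Exact match with an already-created Block.
            block = text_hash
        else:
            # 2. Nearest neighbor search (a linear scan is enough for a sketch).
            nearest = min(
                self.sidenote_counts,
                key=lambda h: hamming(h, text_hash),
                default=None,
            )
            if nearest is not None and hamming(nearest, text_hash) <= self.threshold:
                block = nearest
            else:
                # 3. Nothing similar exists yet, so provision a new Block.
                block = text_hash
                self.sidenote_counts[block] = 0
        self.sidenote_counts[block] += 1
        return block


index = BlockIndex(threshold=2)
first = index.allocate(0b11110000)   # no Blocks yet: a new Block is provisioned
second = index.allocate(0b11110001)  # one bit away: joins the first Block
third = index.allocate(0b00001111)   # eight bits away: a second Block
fourth = index.allocate(0b11110000)  # exact match with the first Block

print(index.sidenote_counts)
```

The threshold controls the trade-off: too low and an edited paragraph spawns a new Block, too high and different paragraphs get merged into one.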
We have now successfully identified ‘clusters’ in the LSH points in this high-dimensional space. Each cluster represents the same text in a few different versions. And for each cluster, we’ve provisioned a Block.
After clustering the dataset visualized above, we gave each inferred Block a unique color so we could quickly check how our algorithm performed.
Now armed with a Block provisioning and allocation strategy, we felt comfortable starting work on Sidenotes and putting all the pieces together. Livefyre’s App Engineering, Product Design, and Product Management teams did a ton of brainstorming, user experience research, and straight up hacking and execution to make Sidenotes a reality.
Today, when Sidenotes loads on a webpage, it does the following:
1) Boot with configuration pointing to a StreamHub Collection and selectors indicating which elements of the page should be annotatable
2) Ask StreamHub for any known blocks in the Collection and, for each, the corresponding LSH hash and number of Sidenotes in the block
3) For each existing block and corresponding hash
- Find the annotatable element that is most similar to that hash
- Render the Sidenotes intent on that element, including the number of Sidenotes in that Block
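Step 3 can be sketched as a nearest-match lookup from Block hashes to element hashes. The function name, the element ids, the 8-bit hashes, and the max_distance cutoff (which keeps a Block from attaching to an unrelated element when its paragraph has been removed) are all assumptions for illustration. The real client works against the DOM and StreamHub.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def match_blocks_to_elements(blocks: dict, element_hashes: dict, max_distance: int) -> dict:
    """Return element id -> number of sidenotes to show next to that element.

    blocks maps each Block hash to its sidenote count; element_hashes maps each
    annotatable element's id to the hash of its current content.
    """
    badges = {}
    if not element_hashes:
        return badges
    for block_hash, count in blocks.items():
        # Find the annotatable element most similar to this Block's hash.
        best = min(element_hashes, key=lambda e: hamming(element_hashes[e], block_hash))
        if hamming(element_hashes[best], block_hash) <= max_distance:
            badges[best] = badges.get(best, 0) + count
    return badges


blocks = {0b11110000: 3, 0b00001111: 1}
elements = {"p1": 0b11110001, "p2": 0b00001111, "p3": 0b10101010}

badges = match_blocks_to_elements(blocks, elements, max_distance=2)
print(badges)
```

Here the first Block attaches to p1 even though p1's text has drifted by one bit since the Block was created, and p3 gets no badge because no Block is close to it.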
When a user clicks on this intent, Sidenotes asks StreamHub for the top Sidenotes in that Block, then renders them.
When a user selects text and posts a new sidenote, we allocate it to a Block using the method described above, and post the Content to StreamHub with a parameter indicating the Block to attach it to.
We had a ton of fun envisioning and creating Sidenotes, and we sincerely hope that it makes it more fun to hang out on your favorite websites, interact with your favorite communities, and engage with and give feedback to your favorite authors.
Have questions? Post them in the comments below.
Just kidding. Sidenote it!