Core Concepts
Understanding the mental model and architecture behind OpenOT.
OpenOT is built on a few core concepts that work together to enable real-time collaboration. This guide explains how they are implemented in the library, diving into the technical details that developers need to know.
At its heart, OT is about consistency. When two users edit the same document at the same time, their operations might conflict. OT provides a way to "transform" these operations so that everyone ends up with the same result (Convergence).
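Consider two clients, Alice and Bob, starting with the same document state S. At the same time, Alice produces operation A and Bob produces operation B.
If the server applies A then B, the final state is S_AB. But Bob applied B locally first, so his document is in state S_B. When he receives A, he can't just apply it, because A was meant for state S, not S_B.
We need a transformation function T that produces A', which is the version of A adapted to apply after B.
The fundamental property of OT is:
apply(apply(S, A), B') = apply(apply(S, B), A')
where B' is B transformed against A, and A' is A transformed against B.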
This ensures that no matter which order operations arrive, the final document state is identical.
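Imagine a document with the text "ABC".
Alice (A) inserts "X" at index 0: Insert(0, "X")
Bob (B) deletes "C" at index 2: Delete(2, 1)
Scenario 1: Server receives A first.
1. The server applies A. The document becomes "XABC".
2. The server receives B, which was written against "ABC", not "XABC".
3. The server transforms B against A. The insert shifted "C" one position to the right, so B becomes Delete(3, 1).
4. The server applies the transformed B (deleting "C").
5. The document is "XAB".
Scenario 2: Server receives B first.
1. The server applies B. The document becomes "AB".
2. The server receives A, which was written against "ABC".
3. The server transforms A against B. The delete happened after index 0, so A stays Insert(0, "X").
4. The server applies A. The document is "XAB".
Both paths converge.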
In OpenOT, changes are not stored as diffs or full snapshots, but as Operations. An operation describes how to change the document from one state to the next.
Text Operations (TextType)
OpenOT implements a standard Retain/Insert/Delete format (compatible with ShareJS/ot.js types). An operation is an array of components that traverses the document.
// Change "Hello World" -> "Hello Alice World"
const op = [
{ r: 6 }, // Retain 6 chars ("Hello ")
{ i: "Alice " }, // Insert "Alice "
{ r: 5 }, // Retain 5 chars ("World")
];
Retain (r): Skip over existing characters. This is crucial for keeping indices aligned. If you don't retain, you are implicitly deleting.
Insert (i): Add new characters at the current position.
Delete (d): Remove characters at the current position.
Composition: Operations can be composed. If you have (A -> B) and (B -> C), you can merge them into (A -> C). This is vital for performance: clients can squash buffered keystrokes into a single packet.
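To make the component format concrete, here is a minimal sketch of how such an operation could be applied to a string. `TextComponent`, `TextOp`, and `applyTextOp` are hypothetical names for this illustration, not OpenOT's exported API, and the handling of trailing text follows the "not retained means deleted" rule stated above.

```typescript
// Hypothetical types for the r/i/d component format described above.
type TextComponent = { r: number } | { i: string } | { d: number };
type TextOp = TextComponent[];

// Walk the document left to right, building the new text as we go.
function applyTextOp(doc: string, op: TextOp): string {
  let cursor = 0; // position in the original document
  let result = "";
  for (const component of op) {
    if ("r" in component) {
      // Retain: copy existing characters and advance.
      result += doc.slice(cursor, cursor + component.r);
      cursor += component.r;
    } else if ("i" in component) {
      // Insert: add new text without consuming the original.
      result += component.i;
    } else {
      // Delete: skip over characters without copying them.
      cursor += component.d;
    }
    if (cursor > doc.length) {
      throw new RangeError("operation runs past the end of the document");
    }
  }
  // Assumption: text after the last component is not retained, so it is dropped.
  return result;
}

const greeting = applyTextOp("Hello World", [
  { r: 6 },
  { i: "Alice " },
  { r: 5 },
]);
```

Here `greeting` is "Hello Alice World". Omitting the final `{ r: 5 }` would drop "World", which is why the retain component matters.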
Every operation is applied to a specific version of the document. We track this with a monotonically increasing Revision Number.
Rev 0: "" (Empty)
Rev 1: "Hello" (Op: Insert "Hello")
Rev 2: "Hello World" (Op: Insert " World")
When a client sends an operation, it includes the revision it thinks is current (e.g., Rev 5).
Presence refers to ephemeral user state: cursors, selections, and names.
In OpenOT, presence is typically broadcast via the same transport (WebSocket) but bypasses the persistence layer.
However, presence does relate to OT. If my cursor is at index 10 and you insert five characters at index 0, my cursor visually stays in the same place, but its index must shift to 15.
OpenOT provides utilities to transform "Selection Ranges" against operations, ensuring that remote cursors don't jump around incorrectly when text is edited.
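The index shift can be sketched for the retain/insert/delete format as follows. `transformCursorIndex` and `CursorComponent` are hypothetical names for this illustration; the names and signatures of OpenOT's own selection utilities are not shown in this guide. The choice that an insert exactly at the cursor pushes the cursor right is an assumption.

```typescript
// Hypothetical component type, matching the r/i/d format used for text operations.
type CursorComponent = { r: number } | { i: string } | { d: number };

// Map a cursor index in the old document to its index after `op` is applied.
function transformCursorIndex(cursor: number, op: CursorComponent[]): number {
  let docPos = 0; // position in the document before the operation
  let shifted = cursor;
  for (const component of op) {
    if ("r" in component) {
      docPos += component.r;
    } else if ("i" in component) {
      // Assumption: an insert at or before the cursor pushes it right.
      if (docPos <= cursor) shifted += component.i.length;
    } else {
      // Only the deleted characters that sit before the cursor pull it left.
      const removedBefore = Math.min(component.d, Math.max(0, cursor - docPos));
      shifted -= removedBefore;
      docPos += component.d;
    }
  }
  return shifted;
}

// Cursor at index 10, five characters inserted at index 0.
const movedCursor = transformCursorIndex(10, [{ i: "12345" }, { r: 20 }]);
```

Here `movedCursor` is 15. A selection range can be handled the same way by transforming its start and end indices separately.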
OpenOT uses a "Log + Snapshot" model, similar to database write-ahead logs (WAL).
Every applied operation is appended to an ordered log.
[Op0, Op1, Op2, ... OpN]
This allows:
Replaying 100,000 operations to load a document is too slow. We periodically save the full document state.
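Loading then becomes "start from the latest snapshot and replay only the tail of the log". The sketch below assumes that a snapshot records the revision it was taken at and that log entry n takes the document from revision n to n + 1. `DocSnapshot` and `loadFromSnapshot` are hypothetical names, not part of OpenOT's API.

```typescript
// Hypothetical snapshot shape: the full content plus the revision it reflects.
interface DocSnapshot {
  revision: number;
  content: string;
}

// Rebuild the current document from an optional snapshot and the full log.
function loadFromSnapshot<Op>(
  snapshot: DocSnapshot | null,
  log: readonly Op[],
  applyOp: (content: string, op: Op) => string,
): DocSnapshot {
  const start = snapshot ? snapshot.revision : 0;
  let content = snapshot ? snapshot.content : "";
  // Only operations recorded after the snapshot need to be replayed.
  const tail = log.slice(start);
  for (const op of tail) {
    content = applyOp(content, op);
  }
  return { revision: start + tail.length, content };
}
```

With a snapshot at revision 99,900 and a log of 100,000 entries, this replays 100 operations instead of all of them.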
OpenOT defines an IBackendAdapter interface. You can swap storage without changing application logic.
OpenOT is network-agnostic. The core logic is pure data manipulation. The TransportAdapter interface abstracts the communication layer.
interface TransportAdapter {
send(message: any): void;
connect(onReceive: (message: any) => void): void;
}
WebSockets (Stateful): @open-ot/transport-websocket
HTTP / Server-Sent Events (Stateless)
Peer-to-Peer (WebRTC)
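Because the TransportAdapter interface is so small, a transport for tests can live entirely in memory. The sketch below repeats the interface from this guide so that it is self-contained. `InMemoryTransport` and its `pair` helper are hypothetical and are not shipped with OpenOT. Buffering messages that arrive before `connect` is a design choice made for this example.

```typescript
interface TransportAdapter {
  send(message: any): void;
  connect(onReceive: (message: any) => void): void;
}

// Hypothetical loopback transport: two instances deliver to each other synchronously.
class InMemoryTransport implements TransportAdapter {
  private peer: InMemoryTransport | null = null;
  private onReceive: ((message: any) => void) | null = null;
  private pending: any[] = []; // messages that arrived before connect()

  // Create two transports wired to each other.
  static pair(): [InMemoryTransport, InMemoryTransport] {
    const left = new InMemoryTransport();
    const right = new InMemoryTransport();
    left.peer = right;
    right.peer = left;
    return [left, right];
  }

  send(message: any): void {
    if (!this.peer) throw new Error("transport has no peer");
    this.peer.deliver(message);
  }

  connect(onReceive: (message: any) => void): void {
    this.onReceive = onReceive;
    // Flush anything that was sent before the handler was registered.
    for (const message of this.pending) onReceive(message);
    this.pending = [];
  }

  private deliver(message: any): void {
    if (this.onReceive) {
      this.onReceive(message);
    } else {
      this.pending.push(message);
    }
  }
}
```

A client and a server can each be handed one end of `InMemoryTransport.pair()`, which exercises the core logic without opening a socket.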