In complex applications such as low-code platforms and rich text editors, the design of data structures becomes crucial. In these scenarios, state management cannot rely on the usual solutions like `redux` or `mobx`, but instead requires a customized design tailored to the specific use case. Here, we explore an atomic, collaborative, and highly extensible application-level state management solution based on `Immer` and `OT-JSON`.
The idea of combining `Immer` with `OT-JSON` is derived from `slate`. Let's first take a look at the basic data structure of `slate`, where the example below describes a highlighted block. This data structure resembles a zero-code/low-code structure, as it contains various `children` along with descriptions of node decorations such as `bold`, `border`, `background`, etc.
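As a rough illustration (the field names here are illustrative, not slate's exact schema), such a node tree might look like the following. The tree is plain JSON, so it can be traversed or serialized directly:

```javascript
// A slate-like node tree: hypothetical shape for illustration only.
// A "highlight block" element whose children carry decoration marks.
const highlightBlock = {
  type: "highlight-block",
  border: "1px solid silver",
  background: "silver",
  children: [
    { text: "Hello, " },
    { text: "World", bold: true },
    { text: "!" },
  ],
};

// Because the structure is plain data, extracting content is trivial.
const plainText = highlightBlock.children.map((n) => n.text).join("");
console.log(plainText); // "Hello, World!"
```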
The design here is quite intriguing. As discussed earlier, low-code and rich text are fundamentally both DSL descriptions used to manipulate DOM structures. Rich text mainly operates on the DOM through keyboard input, while zero-code operates through drag-and-drop and similar interactions. The common design approach comes from `slate`'s state management.
The related demos of this implementation can be found at https://github.com/WindRunnerMax/webpack-simple-environment/tree/master/packages/immer-ot-json.
As mentioned earlier, tailoring the data structure to specific scenarios mainly refers to the flexibility of JSON structures. For instance, in the description of highlighted blocks, we can design the decorations as a separate object or flatten them into a `Map` of node decorations. Similarly, the text content above must be described using the `text` property.
Atomic design is crucial. In this context, we divide atomicity into structural and operational aspects. Structural atomicity allows free node composition, while operational atomicity enables state manipulation through descriptions, making it convenient for tasks like component rendering, state changes, managing history operations, etc.
Free node composition is applicable in various scenarios. For example, in form structures, each form item can be nested within other form items, with composite patterns setting rules to limit specific aspects. Operational atomicity facilitates handling state changes. In a form, for instance, expanding/collapsing nested form items requires state changes.
While atomic operations may not always be ideal on their own, composing `ops` into something akin to an `action` paradigm is common practice, handled through `compose`. Not all state needs persistence either: temporary state can easily be kept in `client-side-xxx` attributes, while computed values such as `AXY+Z` are more intricate to handle.
The foundation of collaborative algorithms also lies in atomic operations. An `action`-style operation, as in the `redux` paradigm, is convenient but fails to adequately address collaborative conflicts or historical operations. This limitation arises from its one-way, discrete operational model: each `action` only conveys an independent intent, lacking explicit maintenance of global state causality (operation `A` affecting operation `B`'s state).
`OT-JSON` extends atomic operations to complex collaborative editing scenarios by introducing operational transformation (`OT`) to resolve conflicts. In addition to integrating operational transformation on the front end, a backend collaborative framework such as `ShareDB` is required. Alternatively, `CRDT`-based collaborative algorithms are also a viable option, depending on application requirements.
Moreover, `OT-JSON` inherently supports maintaining operation history: each operation carries sufficient contextual information, enabling the system to trace the complete chain of state changes. This forms the basis for advanced features like undo/redo and version history. Causal relationships between operations are explicitly recorded, enforcing constraints such as "operation `A` must be applied before operation `B`".
The design for extensibility can be quite rich, with tree structures naturally suited to nested data interactions. For instance, the various modules of Feishu documents are implemented as `Blocks`. Interestingly, Feishu's data structure collaboration also uses `OT-JSON`, while text collaboration uses `EasySync` as a sub-type of `OT-JSON`, enhancing scalability.
Extensibility does not imply complete freedom for plugins, however. Data structures within plugins still need to align with `OT-JSON`'s scheduling, and special sub-types like text require dedicated scheduling as well. This systematic framework can unify heterogeneous content modules into a collaborative system, enabling consistent state management, collaborative editing, history tracking, and other functionality.
`Immer` simplifies operations on immutable data structures by introducing the concept of a draft state, allowing developers to write code in an intuitive, mutable style while entirely new immutable objects are generated under the hood. In traditional approaches, modifying deeply nested data requires meticulously unwrapping and re-wrapping each layer, which is error-prone and adds complexity to the code.
The `Immer` library allows developers to directly assign values, add or remove properties on a temporary draft object, and even use array methods like `push` and `pop`, just as with regular objects. Once all modifications are done, `Immer` generates a new object based on the changes made to the draft, while sharing the unchanged parts with the original data structure. This mechanism not only avoids the performance cost of deep copying but also preserves the immutability of the data.
An important feature of `Immer` is its lazy proxy mechanism: a `Proxy` object is created only when data is accessed during modification. This means proxies are generated on demand along the access path, avoiding generating all proxies up front when creating a draft. This significantly reduces unnecessary performance overhead, especially when dealing with large and complex objects.
For instance, when modifying a deeply nested property like `draft.a.b.c = 1`, `Immer` creates proxies layer by layer along the access path: `Proxy(a)`, `Proxy(a.b)`, `Proxy(a.b.c)`. Therefore, when using `Immer`, it is important to access only the parts of the draft that need to be modified, keeping other reads off the draft to avoid unnecessary proxy generation.
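The resulting behavior can be approximated without the library: a hand-rolled update that copies only the nodes along the modified path and reuses every untouched branch. This is a sketch of what Immer's proxies automate, not its actual implementation:

```javascript
// Minimal sketch of path-copying with structural sharing.
// Immer's proxies do this bookkeeping automatically; here it is explicit.
function setIn(obj, path, value) {
  if (path.length === 0) return value;
  const [key, ...rest] = path;
  const copy = Array.isArray(obj) ? obj.slice() : { ...obj };
  copy[key] = setIn(obj[key], rest, value);
  return copy;
}

const state = { a: { b: { c: 0 } }, other: { untouched: true } };
const next = setIn(state, ["a", "b", "c"], 1);

console.log(next.a.b.c);                 // 1
console.log(next.other === state.other); // true -- unchanged branch is reused
console.log(next.a === state.a);         // false -- nodes on the path are copied
```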
In `slate`, `9` atomic operations are implemented to describe changes, including text manipulation such as `insert_text`, node manipulation such as `insert_node`, and selection transformation such as `set_selection`. However, although `slate` provides operation transformation and inversion, these are not available as a standalone package, so many of its internal designs lack generality.
- `insert_node`: Insert a node.
- `insert_text`: Insert text.
- `merge_node`: Merge nodes.
- `move_node`: Move a node.
- `remove_node`: Remove a node.
- `remove_text`: Remove text.
- `set_node`: Set a node.
- `set_selection`: Set a selection.
- `split_node`: Split a node.

Similarly, `OT-JSON` implements `11` operations, and the structural design of `json0` has been extensively validated in production environments, aiming to ensure data consistency among different clients through structured data representation. Furthermore, rich text scenarios still call for extension via `SubType`, such as Feishu's extension of the `EasySync` type, which naturally requires more operations to describe changes.
- `{p:[path], na:x}`: Add the value `x` at the specified path `[path]`.
- `{p:[path,idx], li:obj}`: Insert the object `obj` before index `idx` in the list at `[path]`.
- `{p:[path,idx], ld:obj}`: Delete the object `obj` at index `idx` from the list at `[path]`.
- `{p:[path,idx], ld:before, li:after}`: Replace the object `before` at index `idx` in the list at `[path]` with the object `after`.
- `{p:[path,idx1], lm:idx2}`: Move the object at index `idx1` in the list at `[path]` to index `idx2`.
- `{p:[path,key], oi:obj}`: Add the key `key` with object `obj` to the object at `[path]`.
- `{p:[path,key], od:obj}`: Delete the key `key` and its value `obj` from the object at `[path]`.
- `{p:[path,key], od:before, oi:after}`: Replace the object `before` at key `key` in `[path]` with the object `after`.
- `{p:[path], t:subtype, o:subtypeOp}`: Apply the subtype operation `o` of type `t` to the object at `[path]`.
- `{p:[path,offset], si:s}`: Insert the string `s` at offset `offset` into the string at `[path]` (implemented internally via subtypes).
- `{p:[path,offset], sd:s}`: Delete the string `s` at offset `offset` from the string at `[path]` (implemented internally via subtypes).

Apart from the atomic operations themselves, the core lies in the operational transformation algorithms, which form the foundation of collaboration. The atomic operations in JSON are not entirely independent and must be orchestrated through operational transformation so that the execution order follows causal dependencies. Moreover, the implementation of operation inversion is crucial, as it enables functionality like undo and redo.
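To make the list-operation format concrete, here is a minimal sketch of applying `li`/`ld` ops to a top-level array (a toy apply for illustration, not the `ot-json0` implementation, which handles nested paths and many more op kinds):

```javascript
// Toy apply for json0-style list insert (li) and list delete (ld).
// Paths here are a single [index] into a top-level array.
function apply(snapshot, op) {
  const next = snapshot.slice();
  const idx = op.p[op.p.length - 1];
  if ("li" in op) next.splice(idx, 0, op.li); // insert before idx
  else if ("ld" in op) next.splice(idx, 1);   // delete the value at idx
  return next;
}

let doc = ["a", "b", "c"];
doc = apply(doc, { p: [1], li: "x" }); // ["a", "x", "b", "c"]
doc = apply(doc, { p: [0], ld: "a" }); // ["x", "b", "c"]
console.log(doc);
```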
In editor applications such as low-code platforms, rich text editors, drawing boards, and form engines, using a JSON data structure solely to describe content is insufficient. By analogy with components (where a `div` describes the view, state is defined separately, and changes are driven by events), in the editor scenario JSON serves both as the view description and as the state to be manipulated.
Deleting an item follows a similar implementation. Here, `ld` carries the value to be deleted. Note that it is the specific value being deleted, not merely an index; this format makes the `invert` conversion convenient. Likewise, after the change we observe that only the modified portion of the `Immer` draft object is new, while the rest is reused by reference.
For updating an item in `OT-JSON`, we need to define both `oi` and `od`. This is effectively a combination of two atomic operations, insertion followed by deletion. Again, by carrying both values instead of just an index, we need no `snapshot` for assistance during `invert`, and `Immer`'s reuse efficiency remains unaffected.
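Because each op carries the old value as well as the new one, inversion becomes a purely local rewrite. A sketch of the idea (not json0's actual `invert`):

```javascript
// Invert a json0-style op by swapping the "insert" and "delete" payloads.
// Carrying both od and oi means no snapshot lookup is needed.
function invert(op) {
  const inverted = { p: op.p };
  if ("li" in op) inverted.ld = op.li; // list insert  -> list delete
  if ("ld" in op) inverted.li = op.ld; // list delete  -> list insert
  if ("oi" in op) inverted.od = op.oi; // key insert   -> key delete
  if ("od" in op) inverted.oi = op.od; // key delete   -> restore old value
  return inverted;
}

const inv = invert({ p: [0, "size"], od: "small", oi: "large" });
console.log(inv); // { p: [0, "size"], od: "large", oi: "small" }
```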
Operational transformation is mainly applied in collaborative editing, but it has wide application in non-collaborative situations as well. For example, when uploading an image, the uploading state should not be placed on the `undo` stack; whether it is treated as an irreversible operation or merged with operations already on the `undo` stack, operational transformation is required.
We can understand `b' = transform(a, b)` as follows: assuming both `a` and `b` are derived from the same `draft` branch, `b'` is the version of `b` that applies after `a` has already been applied. At this point, `b` must be transformed against `a` so that `b'` can be applied directly. In other words, `transform` resolves the impact of operation `a` on operation `b`, maintaining causality.
Here we again test the most basic operations of operational transformation: `insert`, `delete`, and `retain`. We can see that the offset in the causal relationship is quite important. For example, if the remote operation `b` and the pending operation `a` are both delete operations, then once `b` has been executed, the content that `a` needs to delete must have its index recalculated against the result of `b`.
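The index recalculation can be sketched as a position transform: given a remote operation that has already been applied, shift the pending operation's offset accordingly. This is a simplified model with `insert`/`delete` ops only, not a full OT implementation:

```javascript
// Transform the offset of a pending op against an already-applied remote op.
function transformPosition(pos, applied) {
  if (applied.type === "insert" && applied.pos <= pos) {
    return pos + applied.len; // remote insert before us shifts us right
  }
  if (applied.type === "delete" && applied.pos < pos) {
    // remote delete before us shifts us left, at most by the deleted length
    return pos - Math.min(applied.len, pos - applied.pos);
  }
  return pos; // remote change is after us: no effect
}

// Remote deleted 2 chars at offset 1; our delete at offset 4 must move to 2.
console.log(transformPosition(4, { type: "delete", pos: 1, len: 2 })); // 2
```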
Operation inversion, the `invert` method, is mainly used to implement functionality like `undo` and `redo`. As mentioned earlier, many operations carry their original values when `apply` is performed; these are not actually validated during execution, but they allow direct conversion during `invert` without needing a `snapshot` to help compute the values.
Furthermore, `invert` supports inverting batches of operations, as in the example below where the parameter received is `Op[]`. Consider carefully: during application the data operations run forward, while during inversion the execution order must be reversed. For instance, the three operations on `abc` should correspond, after inversion, to the reversed `op`s of `cba`.
Applying operations in batch is a very tricky problem. `OT-JSON` supports applying multiple `op`s at once, but during `apply` the data is operated on one op at a time. This scenario is quite common; for example, in a drawing board, holding `shift` and clicking graphic nodes to select multiple shapes and then deleting them is a batch operation based on the same `draft`, which in theory involves causality.
In the example below, assume there are `4` `op`s, including duplicate index values. The expected result should be that the values `1/2/3` are deleted, yielding `[0, 4, 5, 6]`. However, the actual result is `[0, 2, 4]`, showing that `apply` executes each op independently without handling the interdependencies between `op`s.
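The failure mode can be reproduced with plain absolute indices (illustrative numbers, not the article's exact four ops): deleting indices `1/2/3` sequentially without transforming the later indices skips elements, while shifting each later index by the deletions already applied gives the intended result.

```javascript
// Deleting absolute indices 1, 2, 3 from [0..6].
const base = [0, 1, 2, 3, 4, 5, 6];

// Naive sequential apply: each delete shifts the array, later indices go stale.
const naive = base.slice();
for (const idx of [1, 2, 3]) naive.splice(idx, 1);
console.log(naive); // [0, 2, 4, 6] -- not what was intended

// Transformed apply: shift each later index by the deletions applied before it.
const transformed = base.slice();
let removed = 0;
for (const idx of [1, 2, 3]) transformed.splice(idx - removed++, 1);
console.log(transformed); // [0, 4, 5, 6] -- values 1, 2, 3 are gone
```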
So, as mentioned earlier, `transform` handles the impact of operation `a` on operation `b`, maintaining causality. We can therefore address the interdependence of operations by applying `transform` directly.
However, the signature of `transform` is `transform(op1, op2, side)`, indicating a transformation between two sets of operations. Since our current `ops` consist of only one set of operations, we need to consider how to handle this. Transforming an empty set against `ops` would yield `[]`, which is incorrect, so we must process the ops individually.
Therefore, my initial idea was to trim away the already-applied `ops` and remove their effect directly through `transform`. It is crucial to consider whether the applied operations need to be reversed in order before the transformation; with that handled, deleting values and dealing with duplicate operations both work correctly.
You might consider encapsulating this process and calling a function directly to obtain the final result, rather than mixing the logic throughout the entire application process. It is worth contrasting this with the `Delta` OT implementation: the ops in a single `Delta` are processed using relative positions, while `OT-JSON` uses absolute positions, necessitating a conversion during batch processing.
The example above seems to perform fine, but in real-world scenarios we also need to test the order of execution. In the following example, even though we only adjusted the order of the `ops`, we end up with incorrect results.
Reflecting on this, how can we untangle this cause-and-effect issue? Consider that the result should be that of applying `a` first and then transforming `b` accordingly. In a scenario like `abcd`, it should start with `a` as the base, transforming `b/c/d`; then with `b` as the base, transforming `c/d`; and so on.
The essence of this issue is that when multiple `op`s are combined, each operation carries an independent absolute position rather than being expressed in relative positions. In `Delta`, for example, the `compose` operation computes relative positions. Naturally, we can encapsulate this as a `composeWith` method, which is very useful for merging `ops`, such as combining historical operations.
Lastly, consider the scenario of holding onto a path, similar to the `Ref` module in a rich text editor. For example, when uploading an image, if a user operation during the `loading` state changes the original path, then when the upload completes and the actual address is written into the node, we must obtain the latest `path`.
Here we use a simple list scenario as an example, implementing basic state management based on `Immer` and `OT-JSON`. The list scenario is quite common; here we implement functionality such as adding and deleting items, handling selection, and history operations, much of it inspired by `slate`'s state management.
When `OT-JSON` applies changes, it actually executes each operation one by one. Therefore, when managing state with `OT-JSON`, one might easily run into the situation where changing internal data does not cause a rerender: if the object reference of the `value` provided by the top-level `provider` remains unchanged, a render may not be triggered.
Why might it not trigger a render? If the directly referenced object does not change after a state change, `setState` won't trigger a render. However, if the component has multiple pieces of state, other state changes will still cause the entire component to refresh. For instance, the `Child` component below has no changed `props`, but a change in the `count` value will still make the function component execute.
Of course, setting aside other state changes, if the top-level object reference remains unchanged, then naturally the entire view won't refresh. Therefore, we must start from the changed nodes and update references upwards. In the example below, if `C` changes, then the references of `A` and `C` need to change, while the other objects keep their original values. `Immer` conveniently provides exactly this capability.
Certainly, as seen in the previous examples, even when `props` values remain unchanged, modifications at the top level still trigger a complete re-execution of the function component. In such cases, `React.memo` is needed to control whether the function component re-executes. By wrapping the `Child` component with `memo` as shown below, re-execution when the `count` value changes can be avoided.
Typically, when making changes, we need to obtain the target `path` to be processed, especially when components need to operate on themselves after rendering. For ordinary changes, we mostly rely on expressions over selection nodes to determine the target nodes. However, for more complex modules or interactions, such as asynchronous image uploads, this approach is not sufficient.
Using `React.memo` to control component rendering implicitly introduces a challenge. For instance, with a nested two-level list and a content node at `[1,2]`, if a new node is inserted at position `[1]`, the original value should theoretically become `[2,2]`; but since the function component is not re-executed, it will still hold the original `[1,2]`.
Regarding that stale `[1,2]`: if we pass the `path` through `props` during rendering and customize `memo`'s `equal` function to include the `path`, then changes at lower index levels cause a large number of node components to re-execute, degrading performance. But if the `path` is not passed through `props`, the component cannot access its rendered node's `path` internally.
When implementing plugins, the same plugin renders multiple components of the same type at different `path` locations. Obtaining the `path` of components rendered by a plugin would therefore require threading it through outer rendering state, making the `props`-passing approach unsuitable. Consequently, we use `WeakMap`s to facilitate `path` retrieval.
With two `WeakMap`s we can implement `findPath`. `NODE_TO_INDEX` stores the mapping from each node to its index, while `NODE_TO_PARENT` stores the mapping from each node to its parent node. These two `WeakMap`s enable `path` retrieval, and every node update keeps the mappings for the lower index values current.
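A minimal `findPath` along these lines might look as follows (a sketch modeled on slate's approach; in the real implementation the maps are maintained during each render pass):

```javascript
// Maps maintained during rendering: node -> index in parent, node -> parent.
const NODE_TO_INDEX = new WeakMap();
const NODE_TO_PARENT = new WeakMap();

// Record the mappings for one level of the tree (done on each render pass).
function trackChildren(parent, children) {
  children.forEach((child, index) => {
    NODE_TO_INDEX.set(child, index);
    NODE_TO_PARENT.set(child, parent);
  });
}

// Walk upward through parents, collecting indices -- O(depth), not O(tree).
function findPath(node) {
  const path = [];
  let current = node;
  while (NODE_TO_PARENT.has(current)) {
    path.unshift(NODE_TO_INDEX.get(current));
    current = NODE_TO_PARENT.get(current);
  }
  return path;
}

const root = { children: [{ children: [{ text: "target" }] }] };
trackChildren(root, root.children);
trackChildren(root.children[0], root.children[0].children);
console.log(findPath(root.children[0].children[0])); // [0, 0]
```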
During the actual `path` retrieval, starting from the target node, we repeatedly look up parent nodes through `NODE_TO_PARENT` until reaching the root. During this search we use `NODE_TO_INDEX` to obtain each index along the `path`, meaning that discovering the `path` only requires traversing the hierarchy levels rather than the entire state tree.
This raises a question worth thinking about. The `path` mappings are updated during the rendering process; in other words, to get the latest `path`, the lookup must happen after rendering has completed. We therefore need to control the scheduling timeline of the whole process carefully, otherwise we may not get the latest `path`. Hence, we usually also need to dispatch a render-completion event in `useEffect`.
Another consideration: since the actual editor engine relies on the lifecycle of `useEffect` itself, that is, the `effect` of the parent component fires only after all child components have rendered, the rendering of the whole node tree at the outermost `Context` level cannot be implemented with `React.lazy`. The actual plugin content rendering, of course, can still be lazy-loaded.
The selection-state module also relies on `React` for state management, used primarily as a `Provider`. Maintaining the selection expression itself depends on the `path`, so clicking a node can directly use the `findPath` above to write the selection state.
As with the path lookup above, we do not pass the node's own `path` as `props` to the node. Yet the node needs to know whether it is selected, so the design must consider two parts: first, the global selection state, where a `Context` provides the `value` directly; second, the state of the node itself, where each node requires its own independent `Context`.
Management of the global selection state is split into two parts. Global `hooks` provide the selection values for all child components, which can consume them directly via `useContext`. Additionally, the application's entry point needs to use the editor's events to maintain the selection-state value in the `Context`.
The selection-state design for individual components is quite interesting. Since the selection state has only two values, selected or unselected, each node gets a `Provider` placed around it to manage that state. For a deeply nested component, changing the selection state means changing the value of the deepest `Provider`.
So we rely on a change of the top-level `selection` state to trigger the change in the top-level `Provider`, and then each level's state change re-executes its function component to handle the selection change and render as needed. This means that when a deeply nested node is selected, all nodes on the path with lower indices are marked selected as well.
`React.memo` is still needed in this case. Since `selected` is passed as `props` to child components, the child components re-execute when the `selected` value changes. The transformation therefore starts from the top level: each selection change, whether from selected to unselected or the reverse, triggers a `rerender`.
The `History` module is closely integrated with `OT-JSON`'s data operations and makes deep use of `transform`, for both selections and data. In addition, the `invert` method is essential: operation inversion forms the basis of `undo` and `redo`.
First, consider when `undo` should be handled. Clearly, we only need to push stack data when performing an `apply`, and even then only for user-triggered content. If the operation originates from the `History` module itself or from remote collaborative data, it should not be pushed onto the stack.
Don't forget to record the selection. After triggering an undo, the selection should revert to its previous state. We therefore handle two moments: recording the current selection value at `will apply` time, and pushing the latest changes onto the stack during the actual `apply`.
Typically, we don't want to push onto the stack on every change, especially for high-frequency operations like text input or node dragging. We can therefore merge operations within a time slice, consolidating them into a single undo `ops`. In that case, we need a way to merge the `ops` at the top of the stack with the current `changes`, which is where our `composeWith` method comes into play.
The `undo` and `redo` methods are usually used together. When no user operation is in progress, the `changes` applied by the `history` module itself need to be transformed and pushed onto the other stack. That is, the `changes` executed by `undo` must be inverted before being pushed onto the `redo` stack, and vice versa.
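The stack discipline can be sketched as follows (a toy model: the `invert` here just swaps list insert/delete payloads, standing in for OT-JSON's full op inversion):

```javascript
// Toy undo/redo: applying an undo pushes the re-inverted ops onto the redo stack.
const undoStack = [];
const redoStack = [];

const invert = (op) =>
  "li" in op ? { p: op.p, ld: op.li } : { p: op.p, li: op.ld };

function record(ops) { // user-triggered change: push to undo, clear redo
  undoStack.push(ops);
  redoStack.length = 0;
}

function undo() {
  const ops = undoStack.pop();
  if (!ops) return null;
  const applied = ops.map(invert).reverse();     // reverse order, then invert
  redoStack.push(applied.map(invert).reverse()); // invert again for redo
  return applied; // these ops would now be applied to the document
}

record([{ p: [0], li: "a" }]);
const appliedOps = undo();
console.log(appliedOps); // [{ p: [0], ld: "a" }] -- the insert is undone
```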
Similarly, transforming the selection also relies on `transform`, requiring only changes to the `path` parameter. The reason for transforming the selection is that the previously stored `range` is based on the pre-change values; popping from the stack means those changes have already been executed, so a transformation is needed to obtain the latest selection. When restoring the selection, it should be restored as close as possible to the changed selection.
In reality, the transformation operations used in `History` go far beyond these. In collaborative scenarios, we must consider how to handle `remote` operations, since the principle is that we may only undo our own actions. Additionally, in scenarios such as image uploads, operations from a specific `undo` stack entry need to be merged; transformation is required there to address the side effects caused by moving `ops` around, for which we can consider an implementation based on `Delta`.
Here we have designed an application state management solution based on `Immer` and `OT-JSON`. By leveraging `Immer`'s draft mechanism to simplify immutable data operations, and combining it with `OT-JSON`'s atomic operations and collaborative algorithms, we achieve an atomic, collaborative, and highly scalable application-level state management solution, along with a view-performance optimization scheme based on on-demand rendering. Overall, this solution is best suited to dynamic composition and state management of nested data structures.
In actual applications, it is still important to choose the appropriate state management solution based on the specific scenarios. In application-level scenarios such as rich text editing, drawing boards, and low-code platforms, the top-level architectural design is crucial, and all state changes and node types should be extended from this architectural design. On the other hand, at our business logic level, the focus is more on the functional implementation of business logic, which is relatively more flexible. The majority of implementations are procedural logic-oriented and pay more attention to the organization of code.