In our previous discussions, we focused on the design of the view layer adapter, in particular the full render on initialization: lifecycle synchronization, state management, rendering modes, and DOM mapping state. Here, we turn to incremental updates when the content changes. This is a performance concern, and it requires immutable state objects so that both the Op operations and the resulting DOM changes are kept to a minimum.
Here, we will not yet introduce rendering issues at the view layer but will implement detailed handling at the Model level. Specifically, we will create immutable state objects where only the nodes that are updated will be recreated, while others will be reused directly. This approach complicates the module's implementation, as it handles state objects directly without relying on frameworks like immer, so we'll start by considering a simple update model.
Going back to the initial implementation of the State module: when the document content was updated, we rebuilt all LineState and LeafState objects directly, then listened for the OnContentChange event in the BlockModel of the React view layer to apply the update to BlockState.
This method is straightforward, and a full state update guarantees that the React state is refreshed. However, when the document is very large, the full computation rebuilds a large amount of state, these changes go through React's diff, and the whole document view is updated as a result. This overhead is generally unacceptable.
Typically, we need to determine state updates based on changes. First, we need to define the granularity of updates, such as reusing the existing LineState when there are no changes at the line level. This means attempting to reuse the Origin List as much as possible while generating a Target List. This approach can obviously avoid rebuilding parts of the state, maximizing the reuse of original objects.
The overall strategy involves first executing transformations to produce the latest list, then setting the row and col pointer values for both the old and new lists, recording the starting row during updates. Deletions and additions can be handled as usual, while updates would be treated as a delete followed by an add. Content processing will require a separate discussion for single-line and multi-line cases, with the content in between treated as a rebuild operation.
Finally, we can record the LineState objects that were added and deleted in Changes, which gives us the actual Ops for the additions and deletions. This improves performance because only the middle portion of the origin list and target list has to be rebuilt, while the other line states are reused directly. This data is not present in the delta passed to apply, so it can be regarded as supplementary information.
There are some important points that require our attention. We are now maintaining a state model, which means that updates are no longer direct compose operations but manipulations of the state objects we have implemented. In essence, we need to implement a line-level compose method, and its correctness is crucial: any mistake in how the data is handled will leave the state inconsistent.
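As a rough illustration of the head and tail reuse described above, the following TypeScript sketch rebuilds only the middle of the list and records the added and deleted lines. The `SketchLine` shape, `createLine`, and `updateLines` are hypothetical names, and lines are compared by text purely for brevity; the approach discussed in this article derives the boundaries from the changes rather than from a content comparison.

```typescript
// Hypothetical, simplified stand-in for LineState (the real one holds LeafState objects).
interface SketchLine {
  readonly key: number;
  readonly text: string;
}

interface SketchChanges {
  deleted: SketchLine[];
  inserted: SketchLine[];
}

let nextLineKey = 0;
const createLine = (text: string): SketchLine => ({ key: nextLineKey++, text });

// Build the target list from the origin list, reusing untouched head and tail lines.
function updateLines(
  origin: readonly SketchLine[],
  target: readonly string[],
): { lines: SketchLine[]; changes: SketchChanges } {
  // Walk forward while the lines are unchanged.
  let head = 0;
  const maxHead = Math.min(origin.length, target.length);
  while (head < maxHead && origin[head].text === target[head]) head++;
  // Walk backward from the end, never crossing the head pointer.
  let tail = 0;
  const maxTail = maxHead - head;
  while (
    tail < maxTail &&
    origin[origin.length - 1 - tail].text === target[target.length - 1 - tail]
  ) {
    tail++;
  }
  // Only the middle portion is rebuilt; everything else keeps its object reference.
  const deleted = origin.slice(head, origin.length - tail);
  const inserted = target.slice(head, target.length - tail).map(createLine);
  const lines = [
    ...origin.slice(0, head),
    ...inserted,
    ...origin.slice(origin.length - tail),
  ];
  return { lines, changes: { deleted, inserted } };
}
```

Because untouched lines keep their object references, a later reference comparison is enough to tell which lines actually changed.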
Furthermore, in this approach, our determination of whether a LineState needs to be newly created is based on all LeafState within the entire line. In other words, we must traverse all the ops again, and since we ultimately need to segment the Delta after compose into line-level content, we realistically need to traverse at least twice even after applying changes.
At this point, we need to consider optimization strategies. First, for the initial retain, we should directly and completely reuse the original LineState, including the remaining nodes after processing. For the intermediary nodes, we should design independent update strategies; theoretically, this portion needs to be completely handled as a new state object, which could reduce the traversal of certain Leaf Ops.
In this case, if a node is new, we simply construct a new LineState. Deleted nodes should not be included from the original LineState into the new list. As for updated nodes, we need to update the original LineState object since the line does undergo updates, and the key point is that we need to reuse the key value of the original LineState.
Here, we simply aim to describe the issue of reuse. A convenient implementation is to use \n as an identifier for targeting the State, which means we need to treat \n independently as a separate state. For instance, if we insert \n at the position indicated by 123|456, then 123 would represent a new LineState, and 456 would be the original LineState, thereby allowing for the reuse of the key.
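The `123|456` case can be sketched as follows. This is a minimal illustration under assumptions, not the editor's actual code: `KeyedLine`, `createLineKey`, and `splitLine` are hypothetical names, and a line is reduced to its text, which always ends with the `\n` that identifies it.

```typescript
// Hypothetical, simplified line: the text always ends with the "\n" that identifies it.
interface KeyedLine {
  readonly key: string;
  readonly text: string;
}

let lineKeyId = 0;
const createLineKey = (): string => `line-${lineKeyId++}`;

// Insert "\n" at `offset` inside a line and return the two resulting lines.
function splitLine(line: KeyedLine, offset: number): [KeyedLine, KeyedLine] {
  // The left part ends with the newly inserted "\n", so it becomes a new line state.
  const left: KeyedLine = {
    key: createLineKey(),
    text: line.text.slice(0, offset) + "\n",
  };
  // The right part still ends with the original "\n", so it reuses the original key.
  const right: KeyedLine = { key: line.key, text: line.text.slice(offset) };
  return [left, right];
}
```

For example, `splitLine({ key: "origin", text: "123456\n" }, 3)` yields a line `123\n` with a fresh key and a line `456\n` that still carries the key `origin`.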
It’s important to note that LineState does not have a corresponding Op in Delta, while the corresponding LeafState does have specific Ops. This means that when updating LineState, we cannot directly control it based on changes; we must find mappable states, and the simplest scheme is to map based on the \n nodes.
To summarize, initially, we considered updating first and then computing diff, but later we shifted to updating while also recording. The advantage of updating while recording is that it avoids the overhead of re-traversing all Leaf nodes while also mitigating the complexity of diff. However, there is a challenge: if multiple retain operations are performed internally, we cannot directly reuse LineState.
Using index as a key is typically feasible; however, in some uncontrolled scenarios, it may cause rendering issues due to in-place reuse. We will temporarily set aside the performance issues caused by the diff algorithm. For example, if each item in a list renders an uncontrolled input and we repeatedly remove the element at the top of the array, the values shown in the inputs make it look as if the element at the tail was deleted instead. This is the in-place reuse problem, and it is particularly pronounced in uncontrolled circumstances such as our ContentEditable component; thus, we need to revisit the approach to key values here.
Recall our earlier point that a change in text content must not change the key value, since that would force a rebuild. This means we cannot use the reference of the computed immutable object as the key value. Apart from insert, the only field left that can describe a single op is attributes.
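The in-place reuse problem can be reproduced without React. The sketch below is only a simulation under assumptions: `reconcile` is a hypothetical, heavily simplified stand-in for keyed reconciliation (a mounted node is reused whenever its key appears again), and `MountedInput` stands in for an uncontrolled input whose value lives only in the DOM.

```typescript
// Stand-in for an uncontrolled <input>: it owns its value, and rendering never overwrites it.
interface MountedInput {
  value: string;
}

// Simplified keyed reconciliation: reuse the mounted node when the key matches.
function reconcile(
  mounted: Map<string, MountedInput>,
  keys: string[],
): Map<string, MountedInput> {
  const next = new Map<string, MountedInput>();
  for (const key of keys) {
    // Reuse in place when the key is known, otherwise mount a fresh empty input.
    next.set(key, mounted.get(key) ?? { value: "" });
  }
  // Nodes whose key no longer appears are dropped (unmounted).
  return next;
}

// Render three items, type into each input, remove the first item, render again.
function run(useIndexAsKey: boolean): string[] {
  let items = [{ id: "a" }, { id: "b" }, { id: "c" }];
  const keyOf = (item: { id: string }, index: number): string =>
    useIndexAsKey ? String(index) : item.id;
  let mounted = reconcile(new Map<string, MountedInput>(), items.map(keyOf));
  // The user types into each input; the text is stored only on the mounted node.
  Array.from(mounted.values()).forEach((input, i) => {
    input.value = `typed-${items[i].id}`;
  });
  // Remove the element at the top of the array and render again.
  items = items.slice(1);
  mounted = reconcile(mounted, items.map(keyOf));
  return Array.from(mounted.values()).map(input => input.value);
}
```

With index keys, `run(true)` leaves the inputs holding `typed-a` and `typed-b`, so the tail appears to have been deleted. With stable keys, `run(false)` leaves `typed-b` and `typed-c`, which matches the data.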
However, if we base it on attributes, we would need to accurately control the merging of insert, requiring the use of the old object reference, and an op without attributes would be difficult to handle. Therefore, we might have to convert it to a string for processing, but this would similarly prevent maintaining complete stability of the key, as changes in the previous index would lead to subsequent values changing.
In Slate, I previously believed that the generated key had a one-to-one correspondence with the nodes, meaning that whenever a node A changed, the key representing it at that level would necessarily change. However, upon further examination, I found that after an update generates a new Node, Slate synchronously updates the Path and the PathRef corresponding to the Node, as well as its corresponding key value.
Upon later examining the range model implemented by Lexical, I discovered that it uniquely identifies each leaf node using the key value, and selections are described based on this key as well. The overall structure is somewhat similar to the selection structure in Slate, or one could say, resembles a DOM tree structure. Here we focus on Range selections, but Lexical actually has three additional types of selections.
What matters most here is how state is maintained when the key value changes, because the content of the editor needs to stay editable. If we aim for immutability, mapping the key directly to the state object reference would needlessly recreate the editor DOM. For instance, adjusting the level of a heading would rebuild the complete row, because the key of the entire row changes.
Thus, how to reuse key values as much as possible becomes a significant question. The key at our editor's row level is specially maintained, which gives us both immutability and key value reuse. Currently, the key of the leaf state depends on the index value. Therefore, if we investigate Lexical's implementation, we can apply something similar to maintaining our key values.
Through debugging in the playground, we can observe that, even if we cannot tell whether it is implemented as immutable, Lexical maintains key values in a left-biased manner. We can adopt a similar approach in our editor: when merging, the key of the left node is reused; when splitting, a piece that starts at offset 0 reuses the original key, while a piece that starts at a non-zero offset gets a new key.
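The left-biased rule can be sketched in a few lines. This is an illustration of the rule as stated above, not Lexical's or the editor's actual code; `KeyedLeaf`, `createLeafKey`, `mergeLeaves`, and `sliceLeaf` are hypothetical names.

```typescript
// Hypothetical, simplified leaf: only the key and the text are modeled.
interface KeyedLeaf {
  readonly key: string;
  readonly text: string;
}

let leafKeyId = 0;
const createLeafKey = (): string => `leaf-${leafKeyId++}`;

// Merge: the merged leaf follows the left value and keeps the left key.
function mergeLeaves(left: KeyedLeaf, right: KeyedLeaf): KeyedLeaf {
  return { key: left.key, text: left.text + right.text };
}

// Split: a piece starting at offset 0 reuses the key, any other piece gets a new key.
function sliceLeaf(leaf: KeyedLeaf, start: number, end: number): KeyedLeaf {
  const key = start === 0 ? leaf.key : createLeafKey();
  return { key, text: leaf.text.slice(start, end) };
}
```

Splitting `hello` at offset 2 therefore produces `he` with the original key and `llo` with a new one, so the DOM node on the left side of the split can stay mounted.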
There is also another small issue: when we create LeafState, we immediately obtain the corresponding key value and then consider reusing the original key. This can lead to many unused key values being created, resulting in a significant numerical difference between key values during updates. However, this does not affect overall functionality and performance; it just looks odd during debugging.
Therefore, we can tidy this up by not creating the key value immediately when the object is constructed, and instead setting the key value from the outside during initialization and updates. This is quite similar to how we handle index and offset: all related values are processed during the update, and strict checks are performed when rendering in development mode.
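A minimal sketch of this deferred assignment, assuming a hypothetical `LazyKeyLeaf` class with an `assignKey` method (neither name comes from the editor itself), could look like this. Reading the key before it has been assigned throws, which plays the role of the strict development-mode check.

```typescript
let lazyKeyId = 0;

// Hypothetical leaf whose key is assigned from outside instead of in the constructor.
class LazyKeyLeaf {
  public readonly text: string;
  private assignedKey: string | null = null;

  constructor(text: string) {
    // No key is generated here, so constructing a leaf never consumes an id.
    this.text = text;
  }

  // Called during initialization or update: reuse an old key or generate a new one.
  public assignKey(reused?: string): string {
    this.assignedKey = reused ?? `leaf-${lazyKeyId++}`;
    return this.assignedKey;
  }

  // Strict check: reading the key before assignment is treated as a bug.
  public get key(): string {
    if (this.assignedKey === null) {
      throw new Error("Leaf key was read before it was assigned");
    }
    return this.assignedKey;
  }
}
```

With this shape, a leaf that reuses an old key never advances the counter, so the generated key values stay contiguous.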
Furthermore, during unit testing, it was found that maintaining a separate key value on leaf means that the special node \n would also have its own independent key. In this case, the key value maintained at the line level might as well reuse the key value of the \n leaf. Of course, this is only a theoretical implementation and might lead to unexpected refresh issues.
In the initial design of the view module, our state management form was to perform a full Delta update, then use EachLine to iterate and rebuild all states. In fact, we maintained two data models: Delta and State, and establishing their mapping relationship incurs some overhead; the target state during rendering is Delta, not State.
This model inevitably costs performance, because every call to Apply updates the entire document and iterates over the line states again. The iteration itself is usually cheap; the problem is that it produces new objects each time, so the view has to be updated as well, and updating the view DOM is significantly more expensive than the computation.
In practice, reusing the key value only solves the problem of re-mounting the whole line view. Even with the key reused, React still goes through the re-render process when the entire State instance has been reconstructed. Therefore, what remains to be addressed is how to skip the view re-render when nothing has changed.
Due to our implementation of row-level immutable state management, we can directly compare the references of state objects in the view to determine whether a re-render is necessary. Therefore, we only need to enhance the ViewModel nodes with React.memo. In this scenario, there's no need to rewrite the comparison function; we can simply rely on our immutable state reuse to achieve the desired effect.
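To make the mechanism concrete without depending on React, the sketch below mimics the default behavior of `React.memo`, which skips rendering when a shallow comparison finds every prop identical. `shallowEqual` and `memoRender` are hypothetical helpers written for this illustration, not React APIs.

```typescript
type Props = Record<string, unknown>;

// Shallow comparison: same set of keys, and each value identical by reference.
function shallowEqual(a: Props, b: Props): boolean {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    key => Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key]),
  );
}

// Wrap a render function so that it only runs again when some prop reference changed.
function memoRender<P extends Props, R>(render: (props: P) => R): (props: P) => R {
  let cache: { props: P; result: R } | null = null;
  return props => {
    // Reused immutable state means identical references, so the cached output is returned.
    if (cache && shallowEqual(cache.props, props)) return cache.result;
    cache = { props, result: render(props) };
    return cache.result;
  };
}
```

Passing the same line state object twice runs the wrapped render function once, while passing a rebuilt object with equal content runs it again. This is why the immutable reuse at the core layer matters more than any custom comparison function.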
Similarly, LineView also needs to be wrapped in memo. In addition, the component itself may go through internal state changes, for example when tracking the Composing state of IME input, so we use useMemo to cache the computed child nodes and avoid repeated computations.
The view refresh still directly controls the reference of the lines state. The core content changes and view layer re-renders depend directly on event module communication. Since each access to the lines state generates a new reference, React interprets this as a state change, thereby triggering a re-render.
Even though rendering is triggered, the combination of key and memo allows the comparison to be made on the line state. The LineModel view only runs its update logic if the reference of the LineState object has changed; otherwise, it reuses the original view. We can observe this directly with a React DevTools profiler recording or with its update highlighting.
The incremental update of views is relatively straightforward, as the logic for implementing immutable objects and maintaining key values is handled at the core layer. The view layer primarily relies on this to calculate whether a re-render is necessary. A similar implementation can also be applied in low-code scenarios; after all, rich text essentially functions as a zero-code editor, where the elements assembled are not components but text.
Previously, we mainly discussed the design of the view layer's adapter, focusing on full view initialization rendering and the rules governing the state model's structural correspondence to the DOM. Here, we primarily consider performance optimizations during update processing, concentrating on minimizing DOM and Op operations, maintaining key values, and realizing incremental rendering in React.
Next, we need to consider how to prevent the specified DOM structure from being disrupted when inputting content, mainly involving dirty DOM checks, selection updates, and rendering Hooks. These topics were discussed in detail in the processing of input methods in sections #8 and #9, so we won't delve into them here again.
We should now discuss preset components for editing nodes, such as zero-width characters, Embed nodes, Void nodes, etc. The goal is to provide preset components for the editor's plugin extensions, which include some default behaviors and also predefine parts of the DOM structure, enabling editor operations within specified constraints.