A First Look at Serialization and Deserialization of Rich Text

In a rich text editor, serialization and deserialization underpin copying, pasting, importing, and exporting content. When users copy content within the editor, the rich text is converted into standard HTML and stored in the clipboard. On paste, the editor parses this HTML and converts it into its own JSON structure, which is what allows content to move between different editors.

Description

When using online document editors, you might wonder how formatting can be directly copied instead of just plain text, even allowing for copying content from a browser to Office Word while retaining formatting. It might seem like magic, but once we understand the basics of clipboard operations, the underlying implementation becomes clear.

When copying, we might assume that only plain text is copied, but plain text alone clearly cannot support the features described above. The clipboard can in fact store complex content. Taking Word as an example, copying text from Word writes several key values to the clipboard:

text/plain
text/html
text/rtf
image/png

The text/plain looks familiar, resembling the commonly seen Content-Type or MIME-Type. Hence, one might consider the clipboard as a type of Record<string, string>. However, let's not overlook the image/png type. Since files can be directly copied to the clipboard, the commonly used form for clipboard types is Record<string, string | File>. For example, when copying this text, the clipboard would contain the following content:

text/plain
For example, when copying this text, the clipboard would contain the following content

text/html
<meta charset="utf-8"><strong style="...">For example, when copying this text</strong><em style="...">the clipboard would contain the following content</em>

When performing a paste operation, simply reading the content from the clipboard is all that's required. For instance, when copying content from Yuque to Feishu, Yuque writes text/plain and text/html to the clipboard, which can then be checked for the presence of the text/html key. If found, it can be read and parsed into Feishu's proprietary format, allowing the content to be pasted with the correct formatting. If text/html is not present, the content from text/plain can be directly written into Feishu's private JSON data.

Additionally, a consideration to be made is that in the aforementioned example, during copying, the conversion from JSON to HTML strings is required, and during pasting, the conversion from HTML strings to JSON is needed. These operations involve serialization and deserialization, incurring performance costs and potential content loss. Perhaps these costs can be minimized. Typically, for pasting within the application, the clipboard data can be directly mapped to the current JSON data, avoiding the need for HTML parsing and maintaining content integrity. For instance, in Feishu, there are separate clipboard keys for docx/text and data-lark-record-data as distinct JSON data sources.
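
The lookup order described above can be sketched as a small pure function. This is only an illustration under stated assumptions: the clipboard is modeled as a plain Record<string, string>, and the names ClipboardRecord, PasteSource, resolvePasteSource, and the private key value are placeholders rather than any editor's real API.

```typescript
// Hypothetical model of the clipboard: MIME-like key -> string payload.
type ClipboardRecord = Record<string, string>;

// Placeholder private key; a real editor defines its own.
const PRIVATE_KEY = "application/x-doc-editor";

type PasteSource =
  | { kind: "fragment"; value: unknown }
  | { kind: "html"; value: string }
  | { kind: "plain"; value: string }
  | { kind: "empty" };

function resolvePasteSource(data: ClipboardRecord): PasteSource {
  const raw = data[PRIVATE_KEY];
  if (raw) {
    try {
      // Fast path: the editor's own JSON, no HTML parsing needed.
      return { kind: "fragment", value: JSON.parse(raw) };
    } catch {
      // Corrupted private payload: fall through to the generic formats.
    }
  }
  const html = data["text/html"];
  if (html) return { kind: "html", value: html };
  const plain = data["text/plain"];
  if (plain) return { kind: "plain", value: plain };
  return { kind: "empty" };
}
```

Trying the private key first keeps in-app pastes lossless, while the HTML and plain text branches remain as fallbacks for content coming from other applications.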

Having understood how the clipboard works, let's discuss serialization. When it comes to copying, many may think of clipboard.js, suitable for higher compatibility (e.g., IE), but for modern browsers, utilizing the HTML5 standard API directly is more advisable. In browsers, two commonly used APIs for copying are document.execCommand("copy") and navigator.clipboard.write/writeText.

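// Approach 1: execCommand copies the current selection
document.execCommand("selectAll");
const res = document.execCommand("copy");
console.log(res); // true

// Approach 2: the asynchronous Clipboard API writes typed data
const data: Record<string, string> = {
  "text/plain": "Editor",
  "text/html": "<span>Editor</span>",
};
const dataItems: Record<string, Blob> = {};
for (const [key, value] of Object.entries(data)) {
  const blob = new Blob([value], { type: key });
  dataItems[key] = blob;
}
navigator.clipboard.write([new ClipboardItem(dataItems)]);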

For deserialization, that is, pasting, we have document.execCommand("paste") and navigator.clipboard.read/readText at our disposal. Note, however, that execCommand("paste") consistently fails when called from page scripts, while clipboard.read requires user authorization. This issue came up before when researching trusted events in browser extensions: even with the clipboardRead permission declared in the manifest, the clipboard cannot be read directly, and the call has to be made from a Content Script or even through chrome.debugger.

document.addEventListener("paste", (e) => {
  const data = e.clipboardData;
  console.log(data);
});
const res = document.execCommand("paste");
console.log(res); // false
navigator.clipboard.read().then(res => {
  for (const item of res) {
    item.getType("text/html").then(console.log).catch(() => null)
  }
});

That, however, is not the focus here. What we care about is the serialization and deserialization of content, that is, the design of the copy-paste module of a rich text editor. The module has broader uses as well, such as exporting Word documents and generating Markdown. Its design should therefore address the following key issues:

  1. Pluginization. The modules in the editor are inherently designed to be pluginized, and as such, the design of the clipboard module for serialization/deserialization formats should also allow for flexible extension. Particularly when adapting to specific private formats of editors like Feishu, Yuque, etc., it should be possible to freely control related behaviors.
  2. Universality. Since rich text editing requires mapping between the DOM and selection MODEL, the generated DOM structure is typically complex. When copying content from a document to the clipboard, we aim for a more standardized structure to enable better parsing when pasting to other platforms such as Feishu, Word, etc.
  3. Integrity. It is essential that during serialization and deserialization, the content's integrity is maintained, meaning no content loss occurs due to these processes. This might involve compromising on performance to ensure content integrity. However, for the editor's own format, performance is the main concern. Since the registered modules are consistent, it should be possible to directly apply the data without traversing the entire parsing process.

Thus, this article will use slate as an example to handle the design of the clipboard module for nested structures and quill for flat structures. Moreover, using the content of Feishu documents as a case study, it will cover the serialization and deserialization design based on different types such as inline structures, paragraph structures, composite structures, embedded structures, and block-level structures.

Nested Structures

The basic data structure of slate is a tree-structured JSON type, and relevant implementations can be found at https://github.com/WindRunnerMax/DocEditor. Let's take headers and bold formatting as an example to describe their basic content structure:

[
  { children: [{ text: "Editor" }], heading: { type: "h1", id: "W5xjbuxy" } },
  { children: [{ text: "Bold" , bold: true}, { text: "Format" }] },
];

In fact, the data structure in slate closely resembles a nested DOM structure, to the point that the DOM and the data correspond one-to-one. Even the zero-width characters rendered inside Embed structures, for instance, exist in the data structure. Ideally, then, this JSON structure could be converted directly to and from the corresponding DOM structure during serialization and deserialization.

However, complete correspondence is an ideal scenario. The actual organization of content in rich text editors may vary. For example, when implementing a blockquote structure, the outer wrapping blockquote tag may either be present in the data structure itself or dynamically rendered based on line attributes during rendering. In such cases, directly serializing it into complete HTML from the data structure's perspective may not be feasible.

// Structure Rendered
[
  {
    blockquote: true,
    children:[
      { children: [{ text: "Quote Block Line 1" }] },
      { children: [{ text: "Quote Block Line 2" }] },
    ]
  }
];

// Dynamically Rendered
[
  { children: [{ text: "Quote Block Line 1" }], blockquote: true },
  { children: [{ text: "Quote Block Line 2" }], blockquote: true },
];
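
To make the difference between the two shapes concrete, here is a minimal sketch that regroups the dynamically rendered form into the structure-rendered form by merging adjacent quote lines. The type names and groupQuoteLines are hypothetical and exist only for this illustration; a real plugin may organize its data differently.

```typescript
type TextLeaf = { text: string };
type LineNode = { children: TextLeaf[]; blockquote?: boolean };
type QuoteNode = { blockquote: true; children: { children: TextLeaf[] }[] };

// Merge runs of adjacent `blockquote: true` lines into one wrapping node.
function groupQuoteLines(lines: LineNode[]): (LineNode | QuoteNode)[] {
  const result: (LineNode | QuoteNode)[] = [];
  let open: QuoteNode | null = null;
  for (const line of lines) {
    if (line.blockquote) {
      if (!open) {
        open = { blockquote: true, children: [] };
        result.push(open);
      }
      open.children.push({ children: line.children });
    } else {
      // A non-quote line closes the current run.
      open = null;
      result.push(line);
    }
  }
  return result;
}
```

A serializer that only sees line attributes has to perform this kind of grouping before it can emit a single blockquote tag, which is one reason the clipboard module cannot assume a fixed mapping from data to HTML.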

Additionally, our implemented editor will necessarily be pluginized, and in the clipboard module, we cannot accurately determine how plugins organize data structures. In the world of rich text editors, there are unwritten rules; the content we write into the clipboard needs to have a standardized structure as much as possible to facilitate pasting content across editors. Therefore, if we aim to ensure standardized data, the clipboard module should provide basic serialization and deserialization interfaces, while the actual implementation is left to the plugins themselves.

Based on this fundamental concept, let's first look at the serialization implementation, that is, the conversion from the JSON structure to HTML. As mentioned earlier, for the editor's own format the focus is on performance: because the registered modules are the same on both sides, the data can be applied directly without going through the whole parsing process. Hence, we also write an additional application/x-doc-editor key to the clipboard that stores the Fragment data directly.

{
  "text/plain": "Editor\nBold Format",
  "text/html": "<h1 id=\"W5xjbuxy\">Editor</h1><div data-line><strong>Bold</strong> Format</div>",
  "application/x-doc-editor": '[{"children":[{"text":"Editor"}],"heading":{"type":"h1","id":"W5xjbuxy"}},{"children":[{"text":"Bold","bold":true},{"text":" Format"}]}]',
}
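
As a rough illustration of how these three entries relate to one Fragment, the sketch below derives all of them from the same data. It hardcodes bold and heading handling and builds HTML by string concatenation, which is an assumption made for brevity; the editor described here delegates that work to plugins operating on DOM nodes. The names LineBlock, escapeHTML, and buildClipboardPayload are hypothetical.

```typescript
type TextLeaf = { text: string; bold?: boolean };
type LineBlock = { children: TextLeaf[]; heading?: { type: string; id: string } };

// Escape text so it is safe inside HTML content and attribute values.
const escapeHTML = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

function buildClipboardPayload(fragment: LineBlock[]): Record<string, string> {
  // text/plain: one line of text per block.
  const plain = fragment
    .map(line => line.children.map(leaf => leaf.text).join(""))
    .join("\n");
  // text/html: a standardized tag per line, with inline marks wrapped inside.
  const html = fragment
    .map(line => {
      const inner = line.children
        .map(leaf => {
          const text = escapeHTML(leaf.text);
          return leaf.bold ? `<strong>${text}</strong>` : text;
        })
        .join("");
      return line.heading
        ? `<${line.heading.type} id="${escapeHTML(line.heading.id)}">${inner}</${line.heading.type}>`
        : `<div data-line>${inner}</div>`;
    })
    .join("");
  return {
    "text/plain": plain,
    "text/html": html,
    // Private key: the raw Fragment, so in-app paste can skip HTML parsing.
    "application/x-doc-editor": JSON.stringify(fragment),
  };
}
```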

Next, let's think about how to write the content to the clipboard and the scenarios where it will be triggered. Besides using Ctrl+C to copy content, users might also want to trigger the copy action through a button. For example, in Feishu, users can copy entire lines/blocks via the toolbar. Therefore, we cannot directly write data using clipboardData in the OnCopy event; we need to actively trigger an additional Copy event.

As mentioned earlier, navigator.clipboard.write can also write to the clipboard, and calling this API does not require actually triggering the Copy event. However, writing data this way may throw exceptions. Additionally, the API is only exposed in secure contexts such as HTTPS pages; otherwise navigator.clipboard is not defined at all.

In the example below, the document must have focus, and there needs to be a click on the page within a certain delay. Otherwise, a DOMException will be thrown. Even when the focus is on the page, executing the code will still throw a DOMException, indicating that the application/x-doc-editor type is not supported.

(async () => {
  await new Promise((resolve) => setTimeout(resolve, 3000));
  const params = {
    "text/plain": "Editor",
    "text/html": "<span>Editor</span>",
    "application/x-doc-editor": '[{"children":[{"text":"Editor"}]}]',
  }
  const dataItems = {};
  for (const [key, value] of Object.entries(params)) {
    const blob = new Blob([value], { type: key });
    dataItems[key] = blob;
  }
  // DOMException: Type application/x-doc-editor not supported on write.
  navigator.clipboard.write([new ClipboardItem(dataItems)]);
})();
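
One way to decide ahead of time which write path to use is to split the payload by type. The sketch below assumes that text/plain, text/html, and image/png are the types the asynchronous API accepts in the target browsers; that list is an assumption for illustration and should be checked against the browsers you support. partitionClipboardTypes is a hypothetical helper.

```typescript
// Assumption: the only types accepted by navigator.clipboard.write here.
const ASYNC_CLIPBOARD_TYPES = new Set(["text/plain", "text/html", "image/png"]);

function partitionClipboardTypes(payload: Record<string, string>) {
  const supported: Record<string, string> = {};
  const custom: Record<string, string> = {};
  for (const [key, value] of Object.entries(payload)) {
    const target = ASYNC_CLIPBOARD_TYPES.has(key) ? supported : custom;
    target[key] = value;
  }
  return { supported, custom };
}
```

If custom comes back non-empty, the asynchronous API alone cannot carry the whole payload, and a Copy-event-based write is the safer choice.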

Since this API does not support writing custom types, we need to actively trigger a Copy event to write to the clipboard. Although we can embed this data as an HTML attribute value in text/html, we choose to handle it separately here. Hence, with the same data, we use document.execCommand to write to the clipboard by creating a new textarea element.

const data = {
  "text/plain": "Editor",
  "text/html": "<span>Editor</span>",
  "application/x-doc-editor": '[{"children":[{"text":"Editor"}]}]',
}
const textarea = document.createElement("textarea");
textarea.addEventListener("copy", event => {
  for (const [key, value] of Object.entries(data)) {
    event.clipboardData && event.clipboardData.setData(key, value);
  }
  event.stopPropagation();
  event.preventDefault();
});
textarea.style.position = "fixed";
textarea.style.left = "-999px";
textarea.style.top = "-999px";
textarea.value = data["text/plain"];
document.body.appendChild(textarea);
textarea.select();
document.execCommand("copy");
document.body.removeChild(textarea);

It is evident that due to textarea.select(), the focus of the original editor will be lost. Therefore, it is crucial to note that when performing the copy operation, the current selection value needs to be recorded. After writing to the clipboard, the focus should be set back to the editor, and the selection restored.

Next, let's delve into the definition of pluginization. Here, the Context is quite straightforward, simply requiring the recording of the current processing Node and the already processed html node. Within the plugin, we need to implement the serialize method to serialize the Node into HTML, while willSetToClipboard is a Hook definition that gets invoked when about to write to the clipboard.

// packages/core/src/clipboard/utils/types.ts
/** Fragment => HTML */
export type CopyContext = {
  /** Node base */
  node: BaseNode;
  /** HTML target */
  html: Node;
};

// packages/core/src/plugin/modules/declare.ts
abstract class BasePlugin {
  /** Serialize Fragment to HTML */
  public serialize?(context: CopyContext): void;
  /** Content about to be written to clipboard */
  public willSetToClipboard?(context: CopyContext): void;
}

Since our specific transformations are implemented within plugins, our main task is to schedule the execution of plugins. To facilitate data handling, we are not using the Immutable form here. Our Context object remains consistent throughout the scheduling process. This means that all methods within plugins handle processing in-place. Therefore, scheduling is directly done through the plugin component, fetching the html node from the context after calling.

// packages/core/src/plugin/modules/declare.ts
public call<T extends CallerType>(key: T, payload: CallerMap[T], type?: PluginType) {
  const plugins = this.current;
  for (const plugin of plugins) {
    try {
      // @ts-expect-error payload match
      plugin[key] && isFunction(plugin[key]) && plugin[key](payload);
    } catch (error) {
      this.editor.logger.warning(`Plugin Exec Error`, plugin, error);
    }
  }
  return payload;
}

const context: CopyContext = { node: child, html: textNode };
this.plugin.call(CALLER_TYPE.SERIALIZE, context);
value.appendChild(context.html);
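
The in-place contract can be shown in isolation with a reduced sketch. HTML is represented as a string instead of a DOM Node so that the example runs outside a browser, and CopyCtx, SerializePlugin, and callSerialize are simplified stand-ins rather than the project's actual types.

```typescript
type TextLeaf = { text: string; bold?: boolean; italic?: boolean };
type CopyCtx = { node: TextLeaf; html: string };
type SerializePlugin = { key: string; serialize?: (ctx: CopyCtx) => void };

// Every plugin sees the same context object and mutates it in place.
function callSerialize(plugins: SerializePlugin[], ctx: CopyCtx): CopyCtx {
  for (const plugin of plugins) {
    try {
      plugin.serialize?.(ctx);
    } catch (error) {
      // One failing plugin must not abort the whole copy operation.
      console.warn("Plugin Exec Error", plugin.key, error);
    }
  }
  return ctx;
}

const boldPlugin: SerializePlugin = {
  key: "bold",
  serialize(ctx) {
    if (ctx.node.bold) ctx.html = `<strong>${ctx.html}</strong>`;
  },
};
const brokenPlugin: SerializePlugin = {
  key: "broken",
  serialize() {
    throw new Error("boom");
  },
};
const italicPlugin: SerializePlugin = {
  key: "italic",
  serialize(ctx) {
    if (ctx.node.italic) ctx.html = `<em>${ctx.html}</em>`;
  },
};

const ctx = callSerialize([boldPlugin, brokenPlugin, italicPlugin], {
  node: { text: "Hi", bold: true, italic: true },
  html: "Hi",
});
// ctx.html is now "<em><strong>Hi</strong></em>"
```

Because each plugin wraps whatever the previous one left in context.html, plugin order determines nesting order, and the try/catch keeps a misbehaving plugin from affecting the others.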

The key part is the serialize scheduling method. The core idea is this: when processing a text line, we create an empty Fragment node to act as the line node. We then iterate over the children of the current line, create a text node from each Text value, build a context object from it, and dispatch the PLUGIN_TYPE.INLINE plugins, inserting the serialized HTML node into the line node.

// packages/core/src/clipboard/modules/copy.ts
if (this.reflex.isTextBlock(current)) {
  const lineFragment = document.createDocumentFragment();
  current.children.forEach(child => {
    const text = child.text || "";
    const textNode = document.createTextNode(text);
    const context: CopyContext = { node: child, html: textNode };
    this.plugin.call(CALLER_TYPE.SERIALIZE, context, PLUGIN_TYPE.INLINE);
    lineFragment.appendChild(context.html);
  });
}

Subsequently, for each line node, we likewise dispatch the PLUGIN_TYPE.BLOCK plugins, place the processed content into the root node, and return it. This completes the basic serialization of text lines. Adding extra identifiers to the DOM nodes also helps us handle deserialization idempotently later on.

After the basic line structure is processed, we also need to handle the outer Node nodes. The processing is similar to that of line nodes, but note that the structure is recursive. The JSON structure is traversed depth-first: text nodes and line nodes are processed first, then the outer block structures, working from the inside out until the whole DOM tree has been built.

// packages/core/src/clipboard/modules/copy.ts
if (this.reflex.isBlock(current)) {
  const blockFragment = document.createDocumentFragment();
  current.children.forEach(child => this.serialize(child, blockFragment));
  const context: CopyContext = { node: current, html: blockFragment };
  this.plugin.call(CALLER_TYPE.SERIALIZE, context, PLUGIN_TYPE.BLOCK);
  root.appendChild(context.html);
  return root as T;
}
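
The inside-out order can be seen end to end in a self-contained sketch. It is an approximation under several assumptions: HTML is built as strings rather than DOM fragments, text escaping is omitted, the two plugin arrays stand in for the PLUGIN_TYPE.INLINE and PLUGIN_TYPE.BLOCK groups, and the fallback line wrapper is an illustrative choice. None of the names below come from the project.

```typescript
type TextLeaf = { text: string; bold?: boolean };
type BlockNode = { children: ModelNode[]; blockquote?: boolean };
type ModelNode = TextLeaf | BlockNode;
type CopyCtx = { node: ModelNode; html: string };
type SerializeFn = (ctx: CopyCtx) => void;

const isLeaf = (node: ModelNode): node is TextLeaf => "text" in node;

// Stand-in for inline-level plugins.
const inlinePlugins: SerializeFn[] = [
  ctx => {
    const node = ctx.node;
    if (isLeaf(node) && node.bold) ctx.html = `<strong>${ctx.html}</strong>`;
  },
];
// Stand-in for block-level plugins.
const blockPlugins: SerializeFn[] = [
  ctx => {
    const node = ctx.node;
    if (!isLeaf(node) && node.blockquote) ctx.html = `<blockquote>${ctx.html}</blockquote>`;
  },
];

function serializeNode(node: ModelNode): string {
  if (isLeaf(node)) {
    const ctx: CopyCtx = { node, html: node.text };
    inlinePlugins.forEach(run => run(ctx));
    return ctx.html;
  }
  // Children first (depth-first), then the current block wraps their output.
  const inner = node.children.map(child => serializeNode(child)).join("");
  const ctx: CopyCtx = { node, html: inner };
  blockPlugins.forEach(run => run(ctx));
  // Fallback: a text line that no plugin wrapped still gets a line container.
  const isTextLine = node.children.every(isLeaf);
  if (isTextLine && ctx.html === inner) ctx.html = `<div data-line>${inner}</div>`;
  return ctx.html;
}
```

For a quote block containing two lines, the lines are serialized first and the blockquote wrapper is applied last, so the outer plugin never needs to know how its children were produced.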

Deserialization, on the other hand, is relatively simple. The Paste event cannot be triggered at will; it must come from a trusted user event, so we can only read the values in clipboardData through this event. Besides the keys written during copy, the files field also needs to be processed. Deserialization is likewise implemented in the plugins, which again modify the Context in place.

// packages/core/src/clipboard/utils/types.ts
/** HTML => Fragment */
export type PasteContext = {
  /** Target Node */
  nodes: BaseNode[];
  /** Base HTML */
  html: Node;
  /** Base FILE */
  files?: File[];
};

/** Clipboard => Context */
export type PasteNodesContext = {
  /** Base Node */
  nodes: BaseNode[];
};

// packages/core/src/plugin/modules/declare.ts
abstract class BasePlugin {
  /** Deserialize HTML into Fragment */
  public deserialize?(context: PasteContext): void;
  /** Pasted content is about to be applied to the editor */
  public willApplyPasteNodes?(context: PasteNodesContext): void;
}

The dispatching here is similar to serialization. If there is an application/x-doc-editor key in the clipboard, the value is read directly. If there are files to be processed, all plugins are scheduled to handle them. Otherwise, the value of text/html needs to be read. If it does not exist, then the content of text/plain is directly read, and the constructed JSON is applied to the editor.

// packages/core/src/clipboard/modules/paste.ts
const files = Array.from(transfer.files);
const textDoc = transfer.getData(TEXT_DOC);
const textHTML = transfer.getData(TEXT_HTML);
const textPlain = transfer.getData(TEXT_PLAIN);
if (textDoc) {
  // ...
}
if (files.length) {
  // ...
}
if (textHTML) {
  // ...
}
if (textPlain) {
  // ...
}

The key point here is the handling of text/html, which deserializes HTML nodes into Fragment nodes. As with serialization, the data has to be processed recursively. First, a DOMParser object parses the HTML; the nodes are then traversed depth-first, from the inside out, with plugins scheduled at each step just as in serialization.

// packages/core/src/clipboard/modules/paste.ts
const parser = new DOMParser();
const html = parser.parseFromString(textHTML, TEXT_HTML);

// ...
const root: BaseNode[] = [];
// NOTE: Termination condition. `Text`, `Image`, and other nodes will be processed here
if (current.childNodes.length === 0) {
  if (isDOMText(current)) {
    const text = current.textContent || "";
    root.push({ text });
  } else {
    const context: PasteContext = { nodes: root, html: current };
    this.plugin.call(CALLER_TYPE.DESERIALIZE, context);
    return context.nodes;
  }
  return root;
}
const children = Array.from(current.childNodes);
for (const child of children) {
  const nodes = this.deserialize(child);
  nodes.length && root.push(...nodes);
}
const context: PasteContext = { nodes: root, html: current };
this.plugin.call(CALLER_TYPE.DESERIALIZE, context);
return context.nodes;
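
The same traversal can be exercised without a browser by replacing the DOM with a tiny tree type. This is a sketch under assumptions: MiniNode is a hypothetical stand-in for parsed HTML, the three inline functions stand in for registered plugins, and the bold handler only marks direct text children to keep the example short.

```typescript
// Hypothetical stand-in for a parsed HTML node.
type MiniNode = { tag?: string; text?: string; children?: MiniNode[] };
type TextLeaf = { text: string; bold?: boolean };
type BlockNode = { children: ModelNode[]; blockquote?: boolean };
type ModelNode = TextLeaf | BlockNode;
type PasteCtx = { nodes: ModelNode[]; html: MiniNode };
type PasteFn = (ctx: PasteCtx) => void;

const isLeaf = (node: ModelNode): node is TextLeaf => "text" in node;

const pastePlugins: PasteFn[] = [
  // Bold: mark the text leaves collected under <strong> or <b>.
  ctx => {
    if (ctx.html.tag === "strong" || ctx.html.tag === "b") {
      ctx.nodes = ctx.nodes.map(n => (isLeaf(n) ? { ...n, bold: true } : n));
    }
  },
  // Line: wrap pure text children of a block tag into a line node.
  ctx => {
    if ((ctx.html.tag === "div" || ctx.html.tag === "p") && ctx.nodes.every(isLeaf)) {
      ctx.nodes = [{ children: ctx.nodes }];
    }
  },
  // Quote: nest the already-built lines under a blockquote node.
  ctx => {
    if (ctx.html.tag === "blockquote") {
      ctx.nodes = [{ blockquote: true, children: ctx.nodes }];
    }
  },
];

function deserializeNode(current: MiniNode): ModelNode[] {
  const children = current.children ?? [];
  // Termination condition: a text node becomes a leaf directly.
  if (children.length === 0 && current.text !== undefined) {
    return [{ text: current.text }];
  }
  const root: ModelNode[] = [];
  for (const child of children) root.push(...deserializeNode(child));
  const ctx: PasteCtx = { nodes: root, html: current };
  for (const run of pastePlugins) run(ctx);
  return ctx.nodes;
}
```

By the time a plugin sees a blockquote, its children have already been converted to model nodes, so the plugin only decides how to wrap or annotate them.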

Next, we will use slate as an example to handle the design of the nested clipboard module. Taking the content of Feishu documents as the source and target, we will process serialization and deserialization plugins based on inline structures, paragraph structures, composite structures, embedded structures, and block structures under the above basic pattern scheduling, categorized by type.

Inline Structures

Inline structures refer to bold, italics, underline, strikethrough, inline code blocks, and other inline structure styles. Let's take bold as an example to handle serialization and deserialization. For the serialization of inline structures, we simply need to wrap a strong node around it if it is a text node. Note that we need to handle this in place.

// packages/plugin/src/bold/index.tsx
export class BoldPlugin extends LeafPlugin {
  public serialize(context: CopyContext) {
    const { node, html } = context;
    if (node[BOLD_KEY]) {
      const strong = document.createElement("strong");
      // NOTE: Using the 'Wrap Base Node' plus in-place replacement approach
      strong.appendChild(html);
      context.html = strong;
    }
  }
}

For deserialization, we also need preprocessing as a prerequisite. We need to handle pure text content first, which is a common handling method, i.e., when all nodes are text nodes, we need to add a first-level line node. Also, we need to format the data. Ideally, we should filter all nodes with a Normalize step, but here we will simply handle empty node data.

// packages/plugin/src/clipboard/index.ts
export class ClipboardPlugin extends BlockPlugin {
  public deserialize(context: PasteContext): void {
    const { nodes, html } = context;
    if (nodes.every(isText) && isMatchBlockTag(html)) {
      context.nodes = [{ children: nodes }];
    }
  }

  public willApplyPasteNodes(context: PasteNodesContext): void {
    const nodes = context.nodes;
    const queue: BaseNode[] = [...nodes];
    while (queue.length) {
      const node = queue.shift();
      if (!node) continue;
      node.children && queue.push(...node.children);
      // FIX: Handle the scenario of nodes without text, for example <div><div></div></div>
      if (node.children && !node.children.length) {
        node.children.push({ text: "" });
      }
    }
  }
}

When processing the content, we check whether the HTML node carries bold formatting and, if so, apply the bold mark to all text nodes in the Node tree built so far. This again happens in place. The applyMarker helper encapsulates the formatting of text nodes. Notably, because our goal is to construct the whole JSON, we do not need to go through slate's Transforms module to operate on the Model.

// packages/plugin/src/bold/index.tsx
export class BoldPlugin extends LeafPlugin {
  public deserialize(context: PasteContext): void {
    const { nodes, html } = context;
    if (!isHTMLElement(html)) return void 0;
    if (isMatchTag(html, "strong") || isMatchTag(html, "b") || html.style.fontWeight === "bold") {
      // applyMarker packages/plugin/src/clipboard/utils/apply.ts
      context.nodes = applyMarker(nodes, { [BOLD_KEY]: true });
    }
  }
}
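
The source of applyMarker is not shown here, so the following is only a guess at its general shape: a recursive walk that merges the given marks into every text leaf. The name applyMarkerSketch, the Marks type, and the non-mutating style are assumptions made for the illustration.

```typescript
type TextLeaf = { text: string; bold?: boolean; italic?: boolean };
type BlockNode = { children: ModelNode[] };
type ModelNode = TextLeaf | BlockNode;
type Marks = Partial<Omit<TextLeaf, "text">>;

// Merge `marks` into every text leaf, however deeply it is nested.
function applyMarkerSketch(nodes: ModelNode[], marks: Marks): ModelNode[] {
  return nodes.map(node =>
    "text" in node
      ? { ...node, ...marks }
      : { ...node, children: applyMarkerSketch(node.children, marks) }
  );
}
```

Walking the whole subtree matters because a strong tag may wrap elements that have already been converted into nested nodes, not only bare text.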

Paragraph Structure

Paragraph structure refers to styles such as headings, line heights, and text alignment. Here we take headings as an example to handle serialization and deserialization. For serializing paragraph structure, when the Node is a heading node, we simply construct relevant HTML nodes, wrap the original nodes in place, and assign them to the context, using a nested node approach.

// packages/plugin/src/heading/index.tsx
export class HeadingPlugin extends BlockPlugin {
  public serialize(context: CopyContext): void {
    const element = context.node as BlockElement;
    const heading = element[HEADING_KEY];
    if (!heading) return void 0;
    const id = heading.id;
    const type = heading.type;
    const node = document.createElement(type);
    node.id = id;
    node.setAttribute("data-type", HEADING_KEY);
    node.appendChild(context.html);
    context.html = node;
  }
}

Deserialization is the opposite operation: we check whether the HTML node being processed is a heading node and, if so, convert it into a Node. In-place processing is required here as well. Unlike inline nodes, the heading format has to be applied to all line nodes, which is done with applyLineMarker.

// packages/plugin/src/heading/index.tsx
export class HeadingPlugin extends BlockPlugin {
  public deserialize(context: PasteContext): void {
    const { nodes, html } = context;
    if (!isHTMLElement(html)) return void 0;
    const tagName = html.tagName.toLocaleLowerCase();
    // Match h1-h6 only; a bare startsWith("h") check would also match <hr>
    if (/^h[1-6]$/.test(tagName)) {
      let level = Number(tagName.replace("h", ""));
      if (level > 3) level = 3;
      // applyLineMarker packages/plugin/src/clipboard/utils/apply.ts
      context.nodes = applyLineMarker(this.editor, nodes, {
        [HEADING_KEY]: { type: `h` + level, id: getId() },
      });
    }
  }
}
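
For contrast with inline marks, a line-level helper has to attach attributes to line nodes rather than to text leaves. The real applyLineMarker takes the editor instance and is not reproduced in this article, so the version below is a simplified guess: it treats any block whose children are all text leaves as a line, and wraps bare text into a new line first. All names are hypothetical.

```typescript
type TextLeaf = { text: string };
type BlockNode = { children: ModelNode[]; heading?: { type: string; id: string } };
type ModelNode = TextLeaf | BlockNode;
type LineAttrs = Pick<BlockNode, "heading">;

const isLeaf = (node: ModelNode): node is TextLeaf => "text" in node;

function applyLineMarkerSketch(nodes: ModelNode[], attrs: LineAttrs): ModelNode[] {
  // Bare text directly under the tag: wrap it into one line carrying the attrs.
  if (nodes.length > 0 && nodes.every(isLeaf)) {
    return [{ children: nodes, ...attrs }];
  }
  return nodes.map(node => {
    if (isLeaf(node)) return node;
    // A block whose children are all text leaves is treated as a line.
    if (node.children.every(isLeaf)) return { ...node, ...attrs };
    // Otherwise keep descending until lines are found.
    return { ...node, children: applyLineMarkerSketch(node.children, attrs) };
  });
}
```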

Composite Structure

Composite structure here refers to styled structures like block quotes, ordered lists, unordered lists, etc. Let's take block quotes as an example to handle serialization and deserialization. When serializing composite structures, we also need to wrap the related HTML nodes when the Node is a block quote node.

// packages/plugin/src/quote-block/index.tsx
export class QuoteBlockPlugin extends BlockPlugin {
  public serialize(context: CopyContext): void {
    const element = context.node as BlockElement;
    const quote = element[QUOTE_BLOCK_KEY];
    if (!quote) return void 0;
    const node = document.createElement("blockquote");
    node.setAttribute("data-type", QUOTE_BLOCK_KEY);
    node.appendChild(context.html);
    context.html = node;
  }
}

Deserialization involves checking if the node is a block quote node and constructing the corresponding Node. The difference from the heading module is that while headings apply formatting to relevant line nodes, block quotes nest a layer of structure within the original node.

// packages/plugin/src/quote-block/index.tsx
export class QuoteBlockPlugin extends BlockPlugin {
  public deserialize(context: PasteContext): void {
    const { nodes, html } = context;
    if (!isHTMLElement(html)) return void 0;
    if (isMatchTag(html, "blockquote")) {
      const current = applyLineMarker(this.editor, nodes, {
        [QUOTE_BLOCK_ITEM_KEY]: true,
      });
      context.nodes = [{ children: current, [QUOTE_BLOCK_KEY]: true }];
    }
  }
}

Embedded Structure

Embedded structure here refers to images, videos, flowcharts, and similar structures. Let's take images as an example to handle serialization and deserialization of embedded structures. When serializing an embedded structure, we build the corresponding HTML node when the Node is an image node. Unlike the previous cases, there is no need to nest DOM nodes here; we simply replace the standalone node in place.

// packages/plugin/src/image/index.tsx
export class ImagePlugin extends BlockPlugin {
  public serialize(context: CopyContext): void {
    const element = context.node as BlockElement;
    const img = element[IMAGE_KEY];
    if (!img) return void 0;
    const node = document.createElement("img");
    node.src = img.src;
    node.setAttribute("data-type", IMAGE_KEY);
    // <img> is a void element: replace the node in place instead of nesting context.html
    context.html = node;
  }
}

For deserialization, we check whether the HTML node being processed is an image node and, if so, convert it into a Node. Unlike the previous conversions, no nested structure is needed; children only needs a single empty text node (rendered as a zero-width character) as a placeholder. In practice, pasted images usually also require transferring the original src to our own service. For example, images in Feishu use temporary links, so in production the resources need to be re-uploaded.

// packages/plugin/src/image/index.tsx
export class ImagePlugin extends BlockPlugin {
  public deserialize(context: PasteContext): void {
    const { html } = context;
    if (!isHTMLElement(html)) return void 0;
    if (isMatchTag(html, "img")) {
      const src = html.getAttribute("src") || "";
      const width = html.getAttribute("data-width") || 100;
      const height = html.getAttribute("data-height") || 100;
      context.nodes = [
        {
          [IMAGE_KEY]: {
            src: src,
            status: IMAGE_STATUS.SUCCESS,
            width: Number(width),
            height: Number(height),
          },
          uuid: getId(),
          children: [{ text: "" }],
        },
      ];
    }
  }
}

Block Structure

Block structure refers to highlight blocks, code blocks, tables, and similar structures. Here we use the highlight block as the example for serialization and deserialization. The highlight block is a custom structure in Feishu, essentially a nested Editable structure, and the two layers of callout nesting here exist for compatibility with Feishu's structure. Serializing block structures in slate is similar to handling quote blocks: we simply nest the composite structure in an outer layer.

// packages/plugin/src/highlight-block/index.tsx
export class HighlightBlockPlugin extends BlockPlugin {
  public serialize(context: CopyContext): void {
    const { node, html } = context;
    if (this.reflex.isBlock(node) && node[HIGHLIGHT_BLOCK_KEY]) {
      const colors = node[HIGHLIGHT_BLOCK_KEY]!;
      // Extract specific color values
      const border = colors.border || "";
      const background = colors.background || "";
      const regexp = /rgb\((.+)\)/;
      const borderVar = RegExec.exec(regexp, border);
      const backgroundVar = RegExec.exec(regexp, background);
      const style = window.getComputedStyle(document.body);
      const borderValue = style.getPropertyValue(borderVar);
      const backgroundValue = style.getPropertyValue(backgroundVar);
      // Build HTML container node
      const container = document.createElement("div");
      container.setAttribute(HL_DOM_TAG, "true");
      container.classList.add("callout-container");
      container.style.border = `1px solid rgb(` + borderValue + `)`;
      container.style.background = `rgb(` + backgroundValue + `)`;
      container.setAttribute("data-emoji-id", "balloon");
      const block = document.createElement("div");
      block.classList.add("callout-block");
      container.appendChild(block);
      block.appendChild(html);
      context.html = container;
    }
  }
}

Deserialization determines whether the HTML node being processed is a highlight block node and, if so, converts it into a Node. The handling is similar to that of quote blocks, with one more layer of nesting on the outside.

// packages/plugin/src/highlight-block/index.tsx
export class HighlightBlockPlugin extends BlockPlugin {
  public deserialize(context: PasteContext): void {
    const { nodes, html: node } = context;
    if (isHTMLElement(node) && node.classList.contains("callout-block")) {
      const border = node.style.borderColor;
      const background = node.style.backgroundColor;
      const regexp = /rgb\((.+)\)/;
      const borderColor = border && RegExec.exec(regexp, border);
      const backgroundColor = background && RegExec.exec(regexp, background);
      if (!borderColor || !backgroundColor) return void 0;
      context.nodes = [
        {
          [HIGHLIGHT_BLOCK_KEY]: {
            border: borderColor,
            background: backgroundColor,
          },
          children: nodes,
        },
      ];
    }
  }
}

Flat Structure

The fundamental data structure of quill is a flat structure in JSON format, and the related DEMO implementations can be found at https://github.com/WindRunnerMax/BlockKit. Let's take headers and bold formatting as an example to describe the basic content structure:

[
  { insert: "Editor" },
  { attributes: { heading: "h1" }, insert: "\n" },
  { attributes: { bold: "true" }, insert: "Bold" },
  { insert: "Format" },
  { insert: "\n" },
];
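To make the flat layout more concrete, the following sketch groups such a list of ops into lines by treating every "\n" op as the end of a line. The Op and Line types and the splitLines helper are illustrative assumptions modeled on the example above, not BlockKit's actual API.

```typescript
// Hypothetical minimal op shape, modeled on the example data above.
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}
interface Line {
  leaves: Op[]; // inline ops that belong to the line
  attrs: Attrs; // attributes carried by the "\n" op that closes the line
}

// Walk the flat list once; every "\n" op closes the current line.
function splitLines(ops: Op[]): Line[] {
  const result: Line[] = [];
  let leaves: Op[] = [];
  for (const op of ops) {
    if (op.insert === "\n") {
      result.push({ leaves, attrs: op.attributes ?? {} });
      leaves = [];
    } else {
      leaves.push(op);
    }
  }
  // Trailing ops without a closing EOL still form a line.
  if (leaves.length) result.push({ leaves, attrs: {} });
  return result;
}

const lines = splitLines([
  { insert: "Editor" },
  { attributes: { heading: "h1" }, insert: "\n" },
  { attributes: { bold: "true" }, insert: "Bold" },
  { insert: "Format" },
  { insert: "\n" },
]);
console.log(lines.map((line) => line.attrs.heading ?? "paragraph")); // ["h1", "paragraph"]
console.log(lines.map((line) => line.leaves.length)); // [1, 2]
```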

The serialization scheme is similar to slate's: the clipboard module provides the basic serialization and deserialization interfaces, while the concrete implementations belong to the plugins themselves. Serialization proceeds line by line, first handling the text ops of the Delta and then the line-level format. Because the delta structure is flat, it cannot be processed recursively. Instead, we loop over the ops, and each time an EOL op is reached we finish the current line node and start a new one.

// packages/core/src/clipboard/modules/copy.ts
const root = rootNode || document.createDocumentFragment();
let lineFragment = document.createDocumentFragment();
const ops = normalizeEOL(delta.ops);
for (const op of ops) {
  if (isEOLOp(op)) {
    const context: SerializeContext = { op, html: lineFragment };
    this.editor.plugin.call(CALLER_TYPE.SERIALIZE, context);
    let lineNode = context.html as HTMLElement;
    if (!isMatchBlockTag(lineNode)) {
      lineNode = document.createElement("div");
      lineNode.setAttribute(LINE_TAG, "true");
      lineNode.appendChild(context.html);
    }
    root.appendChild(lineNode);
    lineFragment = document.createDocumentFragment();
    continue;
  }
  const text = op.insert || "";
  const textNode = document.createTextNode(text);
  const context: SerializeContext = { op, html: textNode };
  this.editor.plugin.call(CALLER_TYPE.SERIALIZE, context);
  lineFragment.appendChild(context.html);
}
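The same two-phase loop can be sketched without a DOM by building HTML strings. This is only a simplified model under stated assumptions: the SerializeHook type, serializeOps, and the two example hooks are hypothetical, and the fallback wrapper is chosen by checking whether any hook changed the line rather than by a block-tag test such as isMatchBlockTag.

```typescript
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}
// Hypothetical hook: receives the op and the HTML built so far, returns the new HTML.
type SerializeHook = (op: Op, html: string) => string;

const escapeHTML = (text: string): string =>
  text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

function serializeOps(ops: Op[], hooks: SerializeHook[]): string {
  const run = (op: Op, html: string): string =>
    hooks.reduce((acc, hook) => hook(op, acc), html);
  // Make sure the last line is closed, similar in spirit to normalizeEOL.
  const last = ops[ops.length - 1];
  const normalized = last !== undefined && last.insert === "\n" ? ops : [...ops, { insert: "\n" }];
  let output = "";
  let line = "";
  for (const op of normalized) {
    if (op.insert === "\n") {
      // EOL: let hooks wrap the accumulated line; fall back to a plain <div> if none did.
      const wrapped = run(op, line);
      output += wrapped === line ? `<div data-node="true">${line}</div>` : wrapped;
      line = "";
      continue;
    }
    // Leaf op: escape the text, then let inline hooks wrap it.
    line += run(op, escapeHTML(op.insert));
  }
  return output;
}

const boldHook: SerializeHook = (op, html) =>
  op.insert !== "\n" && op.attributes?.bold ? `<strong>${html}</strong>` : html;
const headingHook: SerializeHook = (op, html) => {
  const tag = op.attributes?.heading;
  return op.insert === "\n" && tag ? `<${tag}>${html}</${tag}>` : html;
};

const serialized = serializeOps(
  [
    { insert: "Editor" },
    { attributes: { heading: "h1" }, insert: "\n" },
    { attributes: { bold: "true" }, insert: "Bold" },
    { insert: "Format" },
    { insert: "\n" },
  ],
  [boldHook, headingHook]
);
console.log(serialized);
// <h1>Editor</h1><div data-node="true"><strong>Bold</strong>Format</div>
```

Keeping a single pass with a line buffer is what lets a flat structure be serialized without recursion: leaf hooks only ever see one op, and line hooks only ever see one completed line.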

The overall deserialization flow is closer to slate's, since in both cases the input is HTML. We traverse the tree recursively and depth-first, processing leaf nodes first and then letting each outer node process the delta produced by its children. The final output is flat, so no special attention to Normalization is required.

// packages/core/src/clipboard/modules/paste.ts
public deserialize(current: Node): Delta {
  const delta = new Delta();
  // Termination conditions for handling Text, Image, and other nodes
  if (!current.childNodes.length) {
    if (isDOMText(current)) {
      const text = current.textContent || "";
      delta.insert(text);
    } else {
      const context: DeserializeContext = { delta, html: current };
      this.editor.plugin.call(CALLER_TYPE.DESERIALIZE, context);
      return context.delta;
    }
    return delta;
  }
  const children = Array.from(current.childNodes);
  for (const child of children) {
    const newDelta = this.deserialize(child);
    delta.ops.push(...newDelta.ops);
  }
  const context: DeserializeContext = { delta, html: current };
  this.editor.plugin.call(CALLER_TYPE.DESERIALIZE, context);
  return context.delta;
}
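To illustrate the traversal order, the sketch below runs the same children-first recursion over a plain object tree instead of real DOM nodes. MiniNode, DeserializeHook, and deserializeTree are hypothetical names introduced only for this example, and the trace hook exists purely to record the order in which elements are handed to hooks.

```typescript
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}
// Hypothetical stand-in for a parsed DOM node: text nodes carry `text`, elements carry `tag`.
interface MiniNode {
  tag?: string;
  text?: string;
  children?: MiniNode[];
}
type DeserializeHook = (node: MiniNode, ops: Op[]) => Op[];

function deserializeTree(node: MiniNode, hooks: DeserializeHook[]): Op[] {
  // Text leaf: emit a plain insert and stop.
  if (node.text !== undefined) return [{ insert: node.text }];
  // Children first, so hooks see the ops already produced below this node.
  const ops: Op[] = [];
  for (const child of node.children ?? []) {
    ops.push(...deserializeTree(child, hooks));
  }
  return hooks.reduce((acc, hook) => hook(node, acc), ops);
}

const visited: string[] = [];
const traceHook: DeserializeHook = (node, ops) => {
  if (node.tag) visited.push(node.tag);
  return ops;
};
const strongHook: DeserializeHook = (node, ops) =>
  node.tag === "strong"
    ? ops.map((op) => ({ ...op, attributes: { ...op.attributes, bold: "true" } }))
    : ops;

const tree: MiniNode = {
  tag: "div",
  children: [{ tag: "strong", children: [{ text: "Hello" }] }, { text: "World" }],
};
const flat = deserializeTree(tree, [traceHook, strongHook]);
console.log(visited); // ["strong", "div"]
console.log(flat); // [{ insert: "Hello", attributes: { bold: "true" } }, { insert: "World" }]
```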

Additionally, handling block-level nested structures is more complex, and this part of the current implementation is still in the design phase. The serialization process resembles the workflow below. Unlike the previous structures, a block structure calls the clipboard's serialization module directly and embeds the result.

                              | -- bold ··· <strong> -- |
                 | -- line -- |                          | -- <div> ---|
                 |            | -- text ··· <span> ---- |              |
                 |                                                     |
root -- lines -- | -- line -- leaves ··· <elements> --------- <div> ---| -- normalize -- html
                 |                                                     |
                 | -- codeblock -- ref(id) ··· <code> ------- <div> ---|
                 |                                                     |
                 | -- table -- ref(id) ··· <table> ---------- <div> ---|

Deserialization is comparatively more complex because the reference relationships of nested structures must be maintained. Although the HTML parsed by DOMParser is itself nested, the baseline parsing method targets a flat Delta structure. Structures such as block and table, however, need nested references, and the relationship to each id has to be established by convention.

| -- <b> -- text ··· text|r -- bold|r -- |
                                         | -- head|r -- align|r -- |
| -- <a> -- text ··· text|r -- link|r -- |
                                                                   | -- deltas
| -- <u> -- text ··· text|r -- unl|r --- |
                                         | -- block|id -- ref|r -- |
| -- <i> -- text ··· text|r -- em|r ---- |

Next, we take the delta data structure as an example to design the clipboard module for a flat structure. As before, plugins for inline, paragraph, composite, embed, and block structures implement serialization and deserialization for their own types, scheduled by the basic flows above.

Inline Structure

Inline structures are inline styles such as bold, italic, underline, strikethrough, and inline code. Here we take bold as an example to handle serialization and deserialization. Serializing an inline structure is basically the same as in slate. From here on, the examples are run as unit tests.

// packages/core/test/clipboard/bold.test.ts
it("serialize", () => {
  const plugin = getMockedPlugin({
    serialize(context) {
      if (context.op.attributes?.bold) {
        const strong = document.createElement("strong");
        strong.appendChild(context.html);
        context.html = strong;
      }
    },
  });
  editor.plugin.register(plugin);
  const delta = new Delta().insert("Hello", { bold: "true" }).insert("World");
  const root = editor.clipboard.copyModule.serialize(delta);
  const plainText = getFragmentText(root);
  const htmlText = serializeHTML(root);
  expect(plainText).toBe("HelloWorld");
  expect(htmlText).toBe(`<div data-node="true"><strong>Hello</strong>World</div>`);
});

Deserialization checks whether the HTML node being processed is a bold node and, if so, applies the bold attribute to the ops already collected in the delta.

// packages/core/test/clipboard/bold.test.ts
it("deserialize", () => {
  const plugin = getMockedPlugin({
    deserialize(context) {
      const { delta, html } = context;
      if (!isHTMLElement(html)) return void 0;
      if (
        isMatchHTMLTag(html, "strong") ||
        isMatchHTMLTag(html, "b") ||
        html.style.fontWeight === "bold"
      ) {
        // applyMarker packages/core/src/clipboard/utils/deserialize.ts
        applyMarker(delta, { bold: "true" });
      }
    },
  });
  editor.plugin.register(plugin);
  const parser = new DOMParser();
  const transferHTMLText = `<div><strong>Hello</strong>World</div>`;
  const html = parser.parseFromString(transferHTMLText, "text/html");
  const rootDelta = editor.clipboard.pasteModule.deserialize(html.body);
  const delta = new Delta().insert("Hello", { bold: "true" }).insert("World");
  expect(rootDelta).toEqual(delta);
});
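The test above only matches a font-weight of exactly "bold". HTML pasted from other sources often uses numeric weights instead, so a slightly more tolerant check may be worth considering. The helper below is a hypothetical sketch and not part of the project; treating 600 and above as bold is a judgment call rather than a rule.

```typescript
// Hypothetical helper: decide whether a CSS font-weight value should count as bold.
function isBoldWeight(value: string): boolean {
  const weight = value.trim().toLowerCase();
  if (weight === "bold" || weight === "bolder") return true;
  const numeric = Number(weight);
  // CSS defines the keyword "bold" as 700; the 600 threshold here is an assumption.
  return weight !== "" && Number.isFinite(numeric) && numeric >= 600;
}

console.log(isBoldWeight("bold")); // true
console.log(isBoldWeight("700")); // true
console.log(isBoldWeight("400")); // false
console.log(isBoldWeight("normal")); // false
```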

Paragraph Structure

Paragraph structures are line-level styles such as headings, line height, and text alignment. Here we take headings as an example to handle serialization and deserialization. To serialize, when the EOL op carries a heading attribute we construct the corresponding HTML element, wrap the existing line content in it, and assign it back to the context, again using nesting.

// packages/core/test/clipboard/heading.test.ts
it("serialize", () => {
  const plugin = getMockedPlugin({
    serialize(context) {
      const { op, html } = context;
      if (isEOLOp(op) && op.attributes?.heading) {
        const element = document.createElement(op.attributes.heading);
        element.appendChild(html);
        context.html = element;
      }
    },
  });
  editor.plugin.register(plugin);
  const delta = new MutateDelta().insert("Hello").insert("\n", { heading: "h1" });
  const root = editor.clipboard.copyModule.serialize(delta);
  const plainText = getFragmentText(root);
  const htmlText = serializeHTML(root);
  expect(plainText).toBe("Hello");
  expect(htmlText).toBe(`<h1>Hello</h1>`);
});

Deserialization, on the other hand, checks whether the HTML node being processed is a heading node and, if so, applies the heading format. This is also done in place on the delta. Unlike inline nodes, the heading format must be applied to every line node, which is what applyLineMarker does.

// packages/core/test/clipboard/heading.test.ts
it("deserialize", () => {
  const plugin = getMockedPlugin({
    deserialize(context) {
      const { delta, html } = context;
      if (!isHTMLElement(html)) return void 0;
      if (["h1", "h2"].indexOf(html.tagName.toLowerCase()) > -1) {
        applyLineMarker(delta, { heading: html.tagName.toLowerCase() });
      }
    },
  });
  editor.plugin.register(plugin);
  const parser = new DOMParser();
  const transferHTMLText = `<div><h1>Hello</h1><h2>World</h2></div>`;
  const html = parser.parseFromString(transferHTMLText, TEXT_HTML);
  const rootDelta = editor.clipboard.pasteModule.deserialize(html.body);
  const delta = new Delta()
    .insert("Hello")
    .insert("\n", { heading: "h1" })
    .insert("World")
    .insert("\n", { heading: "h2" });
  expect(rootDelta).toEqual(MutateDelta.from(delta));
});

Composite Structure

In this context, composite structures are block quotes, ordered lists, unordered lists, and similar styles. Here we take block quotes as an example to handle serialization and deserialization. To serialize, we again construct the corresponding HTML element and wrap the line when the EOL op carries the quote attribute. In a flat structure, composite structures are typically assembled during rendering, so serialization is similar to that of headings.

// packages/core/test/clipboard/quote.test.ts
it("serialize", () => {
  const plugin = getMockedPlugin({
    serialize(context) {
      const { op, html } = context;
      if (isEOLOp(op) && op.attributes?.quote) {
        const element = document.createElement("blockquote");
        element.appendChild(html);
        context.html = element;
      }
    },
  });
  editor.plugin.register(plugin);
  const delta = new MutateDelta().insert("Hello").insert("\n", { quote: "true" });
  const root = editor.clipboard.copyModule.serialize(delta);
  const plainText = getFragmentText(root);
  const htmlText = serializeHTML(root);
  expect(plainText).toBe("Hello");
  expect(htmlText).toBe(`<blockquote>Hello</blockquote>`);
});

Deserialization checks whether the node is a block quote and, if so, applies the quote format. In a nested data structure a block quote wraps the original nodes in an extra layer, but in the flat structure the handling is similar to headings: although the HTML is nested, the quote format is simply applied to every line node inside the blockquote.

// packages/core/test/clipboard/quote.test.ts
it("deserialize", () => {
  const plugin = getMockedPlugin({
    deserialize(context) {
      const { delta, html } = context;
      if (!isHTMLElement(html)) return void 0;
      if (isMatchHTMLTag(html, "p")) {
        applyLineMarker(delta, {});
      }
      if (isMatchHTMLTag(html, "blockquote")) {
        applyLineMarker(delta, { quote: "true" });
      }
    },
  });
  editor.plugin.register(plugin);
  const parser = new DOMParser();
  const transferHTMLText = `<div><blockquote><p>Hello</p><p>World</p></blockquote></div>`;
  const html = parser.parseFromString(transferHTMLText, TEXT_HTML);
  const rootDelta = editor.clipboard.pasteModule.deserialize(html.body);
  const delta = new Delta()
    .insert("Hello")
    .insert("\n", { quote: "true" })
    .insert("World")
    .insert("\n", { quote: "true" });
  expect(rootDelta).toEqual(MutateDelta.from(delta));
});
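The heading and block quote tests both depend on a line marker helper whose implementation is not shown here. A minimal sketch of what such a helper might do is given below, assuming it merges attributes into every EOL op and appends one when the ops do not already end with a line break. The name markLines and the plain Op arrays are assumptions for illustration, and the real applyLineMarker may behave differently.

```typescript
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}

// Hypothetical line marker: close the last line if needed, then merge attrs into every EOL op.
function markLines(ops: Op[], attrs: Attrs): Op[] {
  const last = ops[ops.length - 1];
  const normalized = last !== undefined && last.insert === "\n" ? ops : [...ops, { insert: "\n" }];
  return normalized.map((op) =>
    op.insert === "\n" ? { ...op, attributes: { ...op.attributes, ...attrs } } : op
  );
}

// <h1>Hello</h1>
console.log(markLines([{ insert: "Hello" }], { heading: "h1" }));
// [{ insert: "Hello" }, { insert: "\n", attributes: { heading: "h1" } }]

// <blockquote><p>Hello</p><p>World</p></blockquote>: each <p> closes its line first,
// then the blockquote marks both existing EOL ops without adding a new one.
const paragraphs = [
  ...markLines([{ insert: "Hello" }], {}),
  ...markLines([{ insert: "World" }], {}),
];
console.log(markLines(paragraphs, { quote: "true" }));
// [{ insert: "Hello" }, { insert: "\n", attributes: { quote: "true" } },
//  { insert: "World" }, { insert: "\n", attributes: { quote: "true" } }]
```

This also shows why the quote test registers a handler for p with an empty attribute set: the inner paragraphs must close their lines before the outer blockquote can mark them.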

Embed Structure

Embed structures are content such as images, videos, and flowcharts. Here we take images as an example to handle serialization and deserialization. To serialize, when the op is an image we only need to construct the corresponding HTML element. Unlike the previous structures, no DOM nesting is needed; the standalone node simply replaces the original in place.

// packages/core/test/clipboard/image.test.ts
it("serialize", () => {
  const plugin = getMockedPlugin({
    serialize(context) {
      const { op } = context;
      if (op.attributes?.image && op.attributes.src) {
        const element = document.createElement("img");
        element.src = op.attributes.src;
        context.html = element;
      }
    },
  });
  editor.plugin.register(plugin);
  const delta = new Delta().insert(" ", {
    image: "true",
    src: "https://example.com/image.png",
  });
  const root = editor.clipboard.copyModule.serialize(delta);
  const plainText = getFragmentText(root);
  const htmlText = serializeHTML(root);
  expect(plainText).toBe("");
  expect(htmlText).toBe(`<div data-node="true"><img src="https://example.com/image.png"></div>`);
});

Deserialization checks whether the HTML node being processed is an image node and, if so, converts it into an image op. A common extra step when pasting images is migrating the original src to our own service; for example, image links in Lark are temporary, so in production the resources need to be migrated.

// packages/core/test/clipboard/image.test.ts
it("deserialize", () => {
  const plugin = getMockedPlugin({
    deserialize(context) {
      const { html } = context;
      if (!isHTMLElement(html)) return void 0;
      if (isMatchHTMLTag(html, "img")) {
        const src = html.getAttribute("src") || "";
        const delta = new Delta();
        delta.insert(" ", { image: "true", src: src });
        context.delta = delta;
      }
    },
  });
  editor.plugin.register(plugin);
  const parser = new DOMParser();
  const transferHTMLText = `<img src="https://example.com/image.png"></img>`;
  const html = parser.parseFromString(transferHTMLText, TEXT_HTML);
  const rootDelta = editor.clipboard.pasteModule.deserialize(html.body);
  const delta = new Delta().insert(" ", { image: "true", src: "https://example.com/image.png" });
  expect(rootDelta).toEqual(delta);
});
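The src migration mentioned above could be split into two synchronous steps around the actual upload: collect the sources from the pasted ops, then swap in the new URLs once the upload has produced a mapping. The sketch below assumes plain Op arrays, and the helper names and the target URL are made up for illustration. The upload itself is deliberately left out.

```typescript
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}

// Collect the distinct src values of image ops so they can be uploaded in one batch.
function collectImageSources(ops: Op[]): string[] {
  const found = new Set<string>();
  for (const op of ops) {
    const src = op.attributes?.src;
    if (op.attributes?.image && src) found.add(src);
  }
  return Array.from(found);
}

// Swap each migrated src for its new URL; sources missing from the mapping are left untouched.
function replaceImageSources(ops: Op[], mapping: Record<string, string>): Op[] {
  return ops.map((op) => {
    const src = op.attributes?.src;
    const next = src ? mapping[src] : undefined;
    if (!op.attributes?.image || !next) return op;
    return { ...op, attributes: { ...op.attributes, src: next } };
  });
}

const pasted: Op[] = [
  { insert: "Logo: " },
  { insert: " ", attributes: { image: "true", src: "https://example.com/image.png" } },
];
console.log(collectImageSources(pasted)); // ["https://example.com/image.png"]
// An upload step (not shown) would produce this mapping; the target URL is invented.
const migrated = replaceImageSources(pasted, {
  "https://example.com/image.png": "https://cdn.example.org/abc.png",
});
console.log(migrated.map((op) => op.attributes?.src ?? ""));
// ["", "https://cdn.example.org/abc.png"]
```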

Block Structure

Block structures are highlight blocks, code blocks, tables, and similar styles. Here we take a generic block as an example to handle serialization and deserialization. Nested structures are not yet implemented, so only test cases for the deltas diagrams above are provided. The main idea is to proactively call the serialization method whenever a reference relationship exists and write the result into the HTML.

it("serialize", () => {
  const block = new Delta().insert("inside");
  const inside = editor.clipboard.copyModule.serialize(block);
  const plugin = getMockedPlugin({
    serialize(context) {
      const { op } = context;
      if (op.attributes?._ref) {
        const element = document.createElement("div");
        element.setAttribute("data-block", op.attributes._ref);
        element.appendChild(inside);
        context.html = element;
      }
    },
  });
  editor.plugin.register(plugin);
  const delta = new Delta().insert(" ", { _ref: "id" });
  const root = editor.clipboard.copyModule.serialize(delta);
  const plainText = getFragmentText(root);
  const htmlText = serializeHTML(root);
  expect(plainText).toBe("inside\n");
  expect(htmlText).toBe(
    `<div data-node="true"><div data-block="id"><div data-node="true">inside</div></div></div>`
  );
});

Deserialization checks whether the HTML node being processed is a block-level node and, if so, converts it into a reference. During the depth-first traversal, when a block node is encountered, an id is determined for it (the test below reads it from the data-block attribute), its content is placed in deltas under that id, and the ROOT structure then references the block by that id.

it("deserialize", () => {
  const deltas: Record<string, Delta> = {};
  const plugin = getMockedPlugin({
    deserialize(context) {
      const { html } = context;
      if (!isHTMLElement(html)) return void 0;
      if (isMatchHTMLTag(html, "div") && html.hasAttribute("data-block")) {
        const id = html.getAttribute("data-block")!;
        deltas[id] = context.delta;
        context.delta = new Delta().insert(" ", { _ref: id });
      }
    },
  });
  editor.plugin.register(plugin);
  const parser = new DOMParser();
  const transferHTMLText = `<div data-node="true"><div data-block="id"><div data-node="true">inside</div></div></div>`;
  const html = parser.parseFromString(transferHTMLText, TEXT_HTML);
  const rootDelta = editor.clipboard.pasteModule.deserialize(html.body);
  deltas[ROOT_BLOCK] = rootDelta;
  expect(deltas).toEqual({
    [ROOT_BLOCK]: new Delta().insert(" ", { _ref: "id" }),
    id: new Delta().insert("inside"),
  });
});
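Once the content is stored as a record of flat deltas keyed by id, any consumer has to follow the _ref attributes to reach nested content. The sketch below resolves such a record back into plain text. The Blocks type, the resolveText helper, and the literal "ROOT" key are assumptions for illustration (the actual value of ROOT_BLOCK is not shown in this article).

```typescript
type Attrs = Record<string, string>;
interface Op {
  insert: string;
  attributes?: Attrs;
}
type Blocks = Record<string, Op[]>;

// Follow `_ref` attributes depth-first and concatenate the text of every block reached.
function resolveText(blocks: Blocks, id: string, path: Set<string> = new Set()): string {
  const ops = blocks[id];
  // Missing blocks and reference cycles resolve to an empty string instead of throwing.
  if (!ops || path.has(id)) return "";
  path.add(id);
  let text = "";
  for (const op of ops) {
    const ref = op.attributes?._ref;
    text += ref ? resolveText(blocks, ref, path) : op.insert;
  }
  // Only ids on the current path count as a cycle, so a block may be referenced twice.
  path.delete(id);
  return text;
}

const blocks: Blocks = {
  ROOT: [{ insert: "before " }, { insert: " ", attributes: { _ref: "id" } }, { insert: " after" }],
  id: [{ insert: "inside" }],
};
console.log(resolveText(blocks, "ROOT")); // "before inside after"
```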

Daily Challenge

References