
📌 Requirement
We have built a PCF component (RichTextDiffPCF) that is integrated into a Canvas App in PowerApps. The goal is to visually compare two HTML inputs — originalHtml and modifiedHtml — and highlight the differences directly in the app. This component takes both HTML strings as input and renders a side-by-side or inline diff highlighting:
Insertions in bold
Deletions with strikethrough
Unchanged text displayed normally, preserving formatting and styles
To achieve this, we use the diff-match-patch library to compare text block-by-block, attempting to maintain the HTML tag and inline style structure.
✅ What is Working
The component renders diffs correctly in most common scenarios, such as minor edits in a paragraph or changes to heading text.
It correctly maintains the original HTML tag of each block.
Errors (e.g., malformed HTML) are caught and displayed as fallback messages in the control.
❌ Problem Faced
In certain cases — especially when:
The HTML block structure changes (e.g., the number or order of block elements differs), or
The content has extra spaces, line breaks, or encoding issues,
…the component ends up:
Striking through the entire original block, and
Reinserting the entire modified block in bold — even when only a word or two has changed.
This behavior makes the diff unreadable and misleading. Instead of showing just the changes, it suggests the whole block was deleted and reinserted.
🔍 Expected Behavior
We expect the diff to:
Compare and highlight only the changed text within similar blocks,
Avoid full block-level replacements unless the content is truly unrelated,
Show a clean, semantic diff that accurately reflects the edit intent.
🤔 Questions for the Community
Has anyone else experienced similar issues using diff-match-patch in PCF components inside Canvas Apps?
Are there better techniques for HTML-aware diffing that work well in PCF?
Is there a recommended approach from Microsoft for comparing rich text in Canvas Apps using PCF?
Can we improve alignment of blocks (e.g., paragraph-to-paragraph) to reduce false positives during diff?
📎 Context
Platform: PowerApps Canvas App
PCF Environment: TypeScript-based control
Comparison Library: diff-match-patch
Input: originalHtml and modifiedHtml passed via component properties
Rendering: innerHTML of PCF container
Index.ts code -
import { IInputs, IOutputs } from "./generated/ManifestTypes";
import { diff_match_patch, DIFF_DELETE, DIFF_INSERT } from "diff-match-patch";
export class RichTextDiffPCF implements ComponentFramework.StandardControl<IInputs, IOutputs> {
private container: HTMLDivElement;
private diffHtml = "";
private notifyOutputChanged: () => void;
/**
* Empty constructor.
*/
constructor() {
// Empty
}
public init(
context: ComponentFramework.Context<IInputs>,
notifyOutputChanged: () => void,
state: ComponentFramework.Dictionary,
container: HTMLDivElement
): void {
this.container = container;
this.notifyOutputChanged = notifyOutputChanged;
}
public updateView(context: ComponentFramework.Context<IInputs>): void {
try {
const originalHtml = context.parameters.originalHtml.raw || "";
const modifiedHtml = context.parameters.modifiedHtml.raw || "";
// console.log(originalHtml);
// console.log(modifiedHtml);
const originalBlocks = this.extractBlockElements(originalHtml);
//console.log(originalBlocks);
const modifiedBlocks = this.extractBlockElements(modifiedHtml);
//const originalBlocks = originalHtml;
//const modifiedBlocks = modifiedHtml;
const dmp = new diff_match_patch();
const diffHtmlBlocks: string[] = [];
const maxLength = Math.max(originalBlocks.length, modifiedBlocks.length);
console.log(maxLength);
for (let i = 0; i < maxLength; i++) {
const original = originalBlocks[i]?.content || "";
const modified = modifiedBlocks[i]?.content || "";
console.log(original)
console.log(modified)
const tag = modifiedBlocks[i]?.tag || originalBlocks[i]?.tag || "p";
const style = modifiedBlocks[i]?.style || originalBlocks[i]?.style || "";
const diffs = dmp.diff_main(original, modified);
//console.log(diffs);
dmp.diff_cleanupSemantic(diffs);
const blockHtml = this.convertDiffsToHtml(diffs);
//console.log(blockHtml);
const styledTag = style ? `<${tag} style="${style}">` : `<${tag}>`;
diffHtmlBlocks.push(`${styledTag}${blockHtml}</${tag}>`);
}
this.diffHtml = diffHtmlBlocks.join("");
this.container.innerHTML = this.diffHtml;
//this.notifyOutputChanged();
} catch (error) {
console.error("🔒 Error during updateView():", error);
this.diffHtml = "<p>[Error rendering diff]</p>"; // fallback message shown in the control (placeholder wording)
this.container.innerHTML = this.diffHtml;
this.notifyOutputChanged();
}
}
/**
* Splits an HTML string into its block-level elements.
* @returns one entry per non-empty block: tag name, whitespace-normalized text, inline style, and inner HTML
*/
private extractBlockElements(html: string): { tag: string; content: string; style: string; rawHtml: string }[] {
const blocks: { tag: string; content: string; style: string; rawHtml: string }[] = [];
try {
const tempDiv = document.createElement("div");
tempDiv.innerHTML = html;
const blockTags = new Set([
"address", "article", "aside", "blockquote", "canvas", "dd", "div", "dl", "dt",
"fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5",
"h6", "header", "hr", "li", "main", "nav", "noscript", "ol", "p", "pre", "section",
"table", "tfoot", "ul", "video"
]);
const walker = document.createTreeWalker(tempDiv, NodeFilter.SHOW_ELEMENT, null);
let node: Element | null;
while ((node = walker.nextNode() as Element | null)) {
const tag = node.tagName.toLowerCase();
if (blockTags.has(tag)) {
const textContent = node.textContent?.replace(/\s+/g, " ").trim() || "";
const style = node.getAttribute("style") || "";
const rawHtml = node.innerHTML.trim();
if (textContent) {
blocks.push({ tag, content: textContent, style, rawHtml });
}
walker.currentNode = node;
}
}
} catch (error) {
console.error("🔒 Error in extractBlockElements():", error);
}
return blocks;
}
private convertDiffsToHtml(diffs: [number, string][]): string {
try {
const result: string[] = [];
for (const [op, data] of diffs) {
const escaped = data
.replace(/&/g, "&amp;")
.replace(/</g, "&lt;")
.replace(/>/g, "&gt;")
.replace(/\n/g, "<br>");
if (op === DIFF_INSERT) {
result.push(`<b>${escaped}</b>`); // insertion: bold
} else if (op === DIFF_DELETE) {
result.push(`<s>${escaped}</s>`); // deletion: strikethrough
} else {
result.push(escaped);
}
}
return result.join("");
} catch (error) {
console.error("🔒 Error in convertDiffsToHtml():", error);
return `[Error rendering diff]`;
}
}
public getOutputs(): IOutputs {
return {
diffHtml: this.diffHtml
};
}
public destroy(): void {
// Optional cleanup logic
}
}
It’s clear you’ve built a solid foundation with your RichTextDiffPCF component. The issue you're facing is a common challenge when using diff-match-patch with HTML: it’s text-based, not HTML-aware, so it doesn't understand tag structure or semantics, which leads to block-level mismatches and excessive insert/delete diffs.
You are already using diff_cleanupSemantic() to improve diff quality.
The core issue is misalignment of blocks between originalHtml and modifiedHtml. When the number or order of blocks changes, your loop compares mismatched blocks (e.g., <p> vs <div>), which leads to whole blocks being struck through and reinserted even when only a word or two has changed.
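To make the misalignment concrete, here is a minimal sketch of what index-based pairing does when one paragraph is inserted at the top. `pairByIndex` is a hypothetical helper that mirrors the loop in updateView(), and the sample strings are invented for illustration.

```typescript
// Hypothetical helper mirroring the index-based loop in updateView().
function pairByIndex(original: string[], modified: string[]): [string, string][] {
  const pairs: [string, string][] = [];
  const maxLength = Math.max(original.length, modified.length);
  for (let i = 0; i < maxLength; i++) {
    // A missing block on either side becomes an empty string, as in the control.
    pairs.push([original[i] ?? "", modified[i] ?? ""]);
  }
  return pairs;
}

// Invented sample data: one new paragraph is inserted before two unchanged ones.
const originalTexts = ["Intro text", "Closing text"];
const modifiedTexts = ["New heading", "Intro text", "Closing text"];

// Every pair after the insertion point is shifted by one:
//   "Intro text"   is compared with "New heading"
//   "Closing text" is compared with "Intro text"
//   ""             is compared with "Closing text"
console.log(pairByIndex(originalTexts, modifiedTexts));
```

None of the unchanged paragraphs is compared with itself, so each one is reported as a near-total rewrite.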
Instead of comparing blocks by index, try aligning them by content similarity or tag+style+text hash. For example:
function alignBlocks(
original: Block[],
modified: Block[]
): [Block | null, Block | null][] {
const aligned: [Block | null, Block | null][] = [];
const used = new Set<number>();
for (const orig of original) {
let bestMatch: Block | null = null;
let bestIndex = -1;
let bestScore = 0;
for (let i = 0; i < modified.length; i++) {
if (used.has(i)) continue;
const mod = modified[i];
const score = similarity(orig.content, mod.content); // e.g., Jaccard or Levenshtein
if (score > bestScore) {
bestScore = score;
bestMatch = mod;
bestIndex = i;
}
}
if (bestMatch && bestScore > 0.5) {
aligned.push([orig, bestMatch]);
used.add(bestIndex);
} else {
aligned.push([orig, null]);
}
}
// Add remaining unmatched modified blocks
modified.forEach((mod, i) => {
if (!used.has(i)) aligned.push([null, mod]);
});
return aligned;
}
This will reduce false positives and improve diff granularity.
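alignBlocks calls a `similarity` helper that is not defined in the snippet. One possible version, offered as a sketch and not as the only choice, is word-level Jaccard similarity. The tokenization (lowercase, split on whitespace) is an assumption and may need tuning for your content.

```typescript
// Word-level Jaccard similarity: |A ∩ B| / |A ∪ B| over lowercase word sets.
// Returns a value in [0, 1]; two empty strings are treated as identical.
function similarity(a: string, b: string): number {
  const tokenize = (text: string): Set<string> =>
    new Set(text.toLowerCase().split(/\s+/).filter((word) => word.length > 0));

  const wordsA = tokenize(a);
  const wordsB = tokenize(b);
  if (wordsA.size === 0 && wordsB.size === 0) return 1;

  // Count the words that appear in both sets.
  let shared = 0;
  wordsA.forEach((word) => {
    if (wordsB.has(word)) shared++;
  });

  const unionSize = wordsA.size + wordsB.size - shared;
  return shared / unionSize;
}

// Four words each, three shared, five distinct overall: 3 / 5 = 0.6,
// which clears the 0.5 threshold used in alignBlocks.
console.log(similarity("The quick brown fox", "The quick red fox"));
```

Jaccard ignores word order and repetition, which keeps it cheap. For short blocks such as headings, a character-level measure may discriminate better.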
diff-match-patch is not designed for HTML. You can still use it for inline text diffs within aligned blocks, but use a DOM-aware diff for block-level alignment.
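One caveat with the greedy alignBlocks function: it appends unmatched modified blocks at the end, so a paragraph inserted at the top would be rendered at the bottom. An order-preserving alternative is a longest-common-subsequence (LCS) pass over the blocks. The sketch below is an assumption about how that could look; `alignInOrder` and `isMatch` are hypothetical names.

```typescript
type Pair<T> = [T | null, T | null];

// Order-preserving alignment: an LCS where two items "match" when isMatch is true.
function alignInOrder<T>(
  original: T[],
  modified: T[],
  isMatch: (a: T, b: T) => boolean
): Pair<T>[] {
  const n = original.length;
  const m = modified.length;

  // lcs[i][j] = most matches achievable between original[i..] and modified[j..].
  const lcs: number[][] = [];
  for (let i = 0; i <= n; i++) {
    const row: number[] = [];
    for (let j = 0; j <= m; j++) row.push(0);
    lcs.push(row);
  }
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = isMatch(original[i], modified[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting pairs in document order.
  const aligned: Pair<T>[] = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (isMatch(original[i], modified[j])) {
      aligned.push([original[i], modified[j]]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      aligned.push([original[i], null]); // block only in the original
      i++;
    } else {
      aligned.push([null, modified[j]]); // block only in the modified version
      j++;
    }
  }
  while (i < n) aligned.push([original[i++], null]);
  while (j < m) aligned.push([null, modified[j++]]);
  return aligned;
}
```

With Block objects, the match predicate could be something like `(a, b) => similarity(a.content, b.content) > 0.5`, reusing the threshold from alignBlocks. The table costs O(n × m) comparisons, which should be acceptable for documents with a modest number of blocks.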
Whitespace, line breaks, and encoding issues can throw off diffs. Normalize both inputs:
function normalizeHtml(html: string): string {
return html
.replace(/&nbsp;/g, " ")
.replace(/\s+/g, " ")
.trim();
}
Apply this before extracting blocks.
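For the encoding side, a similar pass can be applied to each block's extracted text. This is a sketch under assumptions: `normalizeText` is a hypothetical helper, and whether your rich text editor actually emits zero-width characters or decomposed Unicode is something to verify against real input.

```typescript
// Hypothetical helper: normalizes a block's text content before diffing.
function normalizeText(text: string): string {
  return text
    .normalize("NFC")                // unify composed and decomposed Unicode forms
    .replace(/[\u200B\uFEFF]/g, "")  // drop zero-width spaces and byte-order marks
    .replace(/\s+/g, " ")            // in JavaScript, \s also matches U+00A0 (non-breaking space)
    .trim();
}

// "e" followed by a combining acute accent, a literal non-breaking space,
// and a trailing zero-width space all normalize away.
console.log(normalizeText("Cafe\u0301\u00A0 menu\u200B") === "Caf\u00E9 menu");
```

Running both inputs through the same function ensures that characters which look identical on screen also compare as identical.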
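If a block has no match, you can render it as a whole-block deletion (present only in the original) or a whole-block insertion (present only in the modified version).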
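A sketch of rendering one aligned pair, assuming the [orig, null] and [null, mod] convention from alignBlocks. `renderPair` and `escapeHtml` are hypothetical helpers, the `<b>` and `<s>` tags are an assumption about the markup you use, and `inlineDiff` stands in for your diff-match-patch call plus convertDiffsToHtml.

```typescript
interface Block {
  tag: string;
  content: string;
  style: string;
}

// Escapes text so it can be placed inside HTML safely.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderPair(
  pair: [Block | null, Block | null],
  inlineDiff: (original: string, modified: string) => string
): string {
  const [orig, mod] = pair;

  let body: string;
  if (orig && mod) {
    body = inlineDiff(orig.content, mod.content);   // matched: word-level diff
  } else if (mod) {
    body = `<b>${escapeHtml(mod.content)}</b>`;     // unmatched: whole block inserted
  } else if (orig) {
    body = `<s>${escapeHtml(orig.content)}</s>`;    // unmatched: whole block deleted
  } else {
    return "";
  }

  // Prefer the modified block's tag and style, as the existing loop does.
  const source = (mod ?? orig) as Block;
  const open = source.style
    ? `<${source.tag} style="${source.style}">`
    : `<${source.tag}>`;
  return `${open}${body}</${source.tag}>`;
}
```

Keeping the unmatched cases separate means a full strikethrough or full bold block only appears when alignment found no counterpart.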
There’s no official Microsoft-recommended HTML diffing strategy for PCF, but your approach is valid.
Would you like help implementing any of these?