From ccd6ac4d44b5c91aa2d1c2534c7be57963b616d5 Mon Sep 17 00:00:00 2001
From: Wilfred Hughes
Date: Sun, 21 Mar 2021 13:34:01 -0700
Subject: Add notes on LCS weaknesses

---
 text_diff_notes.md | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 text_diff_notes.md

diff --git a/text_diff_notes.md b/text_diff_notes.md
new file mode 100644
index 000000000..c4a183dea
--- /dev/null
+++ b/text_diff_notes.md
@@ -0,0 +1,48 @@
+Consider changing:
+
+```
+foo();
+bar();
+```
+
+To:
+
+```
+if (true) {
+  foo();
+}
+```
+
+What we want:
+
+```
++ if (true) {
+    foo();
+- bar();
++ }
+```
+
+A longest-common-subsequence algorithm is wrong here. The longest
+subsequence is five tokens:
+
+```
+( ) ( ) ;
+```
+
+which leads to:
+
+```
++if+ (+true+) +{+
+  +foo+();
+  -bar-();
++}+
+```
+
+so we claim `foo` is added. We want the following *four* tokens to be
+preserved:
+
+```
+foo ( ) ;
+```
+
+Proposed solution: advance on both sides, keep first match.
-- 
cgit v1.2.3
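
The five-token claim in the notes can be checked with a textbook dynamic-programming LCS over the two token lists. This is a minimal sketch in Python, not code from this repository: the `lcs` helper, the hand-written token lists, and the tokenization (each punctuation character as its own token) are assumptions made for illustration.

```python
def lcs(old, new):
    """Return one longest common subsequence of two token lists."""
    rows, cols = len(old), len(new)
    # table[i][j] holds the LCS length of old[i:] and new[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the front to recover the matched tokens.
    result = []
    i = j = 0
    while i < rows and j < cols:
        if old[i] == new[j]:
            result.append(old[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical tokenization of the two snippets from the notes.
old_tokens = ["foo", "(", ")", ";", "bar", "(", ")", ";"]
new_tokens = ["if", "(", "true", ")", "{", "foo", "(", ")", ";", "}"]

common = lcs(old_tokens, new_tokens)
print(" ".join(common))  # prints "( ) ( ) ;"
```

For these inputs the only common subsequence of length five is `( ) ( ) ;`, and any common subsequence that keeps `foo` has at most four tokens, so a pure LCS never preserves `foo`.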