# HG changeset patch
# User "Wallace, Eric S" <eric.s.wallace@intel.com>
# Date 1123190759 28800
# Node ID a61728b58dc0ef6efdd2ff80f9bc73707b890cf7
# Parent  16700cdd90556c2f2af01b90ac5a62787f4fa26e
Fix array overflow bug in bdiff

I ran into a bug while importing a large repository into Mercurial.
The diff algorithm does not allocate a big enough array of hunks
for some inputs. This results in memory corruption and possibly,
as in my case, a segfault.

You should be able to reproduce this problem with any case of more
than a few lines that follows this pattern:

a  b
=  =
1  1
   2
2  3
   4
3  5
   .
4  .
   .
5
.
.
.

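I.e., every other line of "a" is a blank line, and those blank lines
have been removed in "b". In this case, the number of matching hunks is
equal to the number of lines in "b", because each line of "b" matches a
separate one-line run in "a". With an roughly 2*bn, this is more than
((an + bn)/4 + 2) once the files are more than a few lines long. I'm not
sure what motivates this formula, but when I changed it to the smaller
of an or bn (+ 1), it worked.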
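
The arithmetic can be checked with a small standalone sketch. It is not
code from bdiff.c: the helper names old_capacity, new_capacity and
count_matching_hunks are hypothetical, and it assumes "a" has exactly
2*bn - 1 lines (the bn lines of "b" with one blank line between each
pair). It only counts runs of matching lines for this one pattern; it
does not reproduce the real matching algorithm.

```c
#include <assert.h>
#include <stddef.h>

/* Hunk capacity before the patch: (an + bn) / 4 + 2. */
static size_t old_capacity(size_t an, size_t bn)
{
	return (an + bn) / 4 + 2;
}

/* Hunk capacity after the patch: min(an, bn) + 1. */
static size_t new_capacity(size_t an, size_t bn)
{
	return (an < bn ? an : bn) + 1;
}

/*
 * Count runs of matching lines for the pattern in the message.
 * Assumption: "b" holds lines 1..bn, and "a" holds the same lines
 * with a blank line (encoded as 0) between each pair, so an = 2*bn - 1.
 */
static size_t count_matching_hunks(size_t bn)
{
	size_t an = bn ? 2 * bn - 1 : 0;
	size_t i, j = 0, hunks = 0;
	int in_run = 0;

	for (i = 0; i < an && j < bn; i++) {
		size_t a_line = (i % 2 == 0) ? i / 2 + 1 : 0;
		size_t b_line = j + 1;

		if (a_line == b_line) {
			if (!in_run)
				hunks++; /* a new run of matches starts here */
			in_run = 1;
			j++;
		} else {
			in_run = 0; /* the blank line ends the current run */
		}
	}
	assert(j == bn); /* every line of "b" found its match in "a" */
	return hunks;
}
```

Under these assumptions, bn = 100 (so an = 199) gives 100 matching
hunks, while the old formula reserves room for (199 + 100)/4 + 2 = 76
and the new one for 101.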

[comment added by mpm]


diff -r 16700cdd9055 -r a61728b58dc0 mercurial/bdiff.c
--- a/mercurial/bdiff.c	Thu Aug 04 13:22:36 2005 -0800
+++ b/mercurial/bdiff.c	Thu Aug 04 13:25:59 2005 -0800
@@ -229,7 +229,8 @@
 	/* allocate and fill arrays */
 	t = equatelines(a, an, b, bn);
 	pos = calloc(bn, sizeof(struct pos));
-	l.head = l.base = malloc(sizeof(struct hunk) * ((an + bn) / 4 + 2));
+	/* we can't have more matches than lines in the shorter file */
+	l.head = l.base = malloc(sizeof(struct hunk) * ((an<bn ? an:bn) + 1));
 
 	if (pos && l.base && t) {
 		/* generate the matching block list */