Chapter 13: Red-Black Trees
Contents
13.1 Properties of red-black trees
A red-black tree is a binary search tree that satisfies the following red-black properties:
- Every node is either red or black.
给 BST 的结点添加颜色属性,用以维持平衡 - The root is black.
这条性质可不可以删除呢?(Exercise 13.1-3)
那将变成 relaxed red-black tree ,也是可以平衡的
令根为黑色只是为了方便(比如后面的插入操作) - Every leaf (NIL) is black.
否则根据性质4,最下一层非空结点将不能为红色 - If a node is red, then both its children are black.
保证了黑色结点在纵向上(每条垂直路径)至少占一半 - For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
简称,黑高一致,保证了黑色结点在横向上分布均匀
又因为黑色结点在纵向上至少占一半,所以整棵树至少有一半结点在横向上大致平衡
Balance
Red-black trees ensure that no such path is more than twice as long as any other, so that the tree is approximately balanced. (Exercise 13.1-5)
We call the number of black nodes on any simple path from, but not including, a node x down to a leaf the black-height of the node, denoted bh(x).
由性质4易知 \(bh(x) \ge h(x)/2\)
由性质5易知 \(size(x) \ge 2^{bh(x)}-1\)
由引理易知 \(n \ge 2^{h/2\ -\ 1}-1\)
因此 \(h = O(lg\ n)\)
NULL
之前我们往往用
NULL 表示空结点,现在我们用一个特殊的结点
T.nil表示(为什么呢?其实仅仅是为了方便)
为了节省空间,整棵树的空节点合并为了一个:
现在整棵树的叶子都是那个结点,但是当我们一般说某个结点有多少个孩子时,是不包括
T.nil的(只计算internal nodes)
因此可以这样说:如果某个结点只有一个孩子,那么它一定为黑色,它的孩子一定为红色(为什么呢?这是个挺常用的性质)
13.2 Rotations
由 Exercise 13.2-4 可知,仅仅利用旋转,就可以把 BST 调整成任何想要的形状(足够快,而且不破坏 BST 的性质)
因此大多数平衡树都有用旋转这个操作
1 2 3 4 5 6 7 8 9 10 11 12 13 |
LEFT-ROTATE(T, x) y = x.right x.right = y.left if y.left != T.nil y.left.p = x y.p = x.p if x.p == T.nil T.root = y elseif x == x.p.left x.p.left = y else x.p.right = y y.left = x x.p = y |
1 2 3 4 5 6 |
void rotate(int &x, int p) { int y = tree[x].son[!p]; tree[x].son[!p] = tree[y].son[p]; tree[y].son[p] = x; x = y; } |
13.3 Insertion
首先按照 BST 的套路插入:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
RB-INSERT(T, z) y = T.nil x = T.root while x != T.nil y = x if z.key < x.key x = x.left else x = x.right if y == T.nil T.root = z elseif z.key < y.key y.left = z else y.right = z z.p = y z.left = T.nil z.right = T.nil z.color = RED RB-INSERT-FIXUP(T, z) |
Fixup
我们令新插入的结点为红色(为什么呢?因为性质5一旦被破坏,是很难调整回来的)
那么现在有哪些性质可能被破坏了呢?
- 如果 z 为根节点,则性质2被破坏(小问题)
- 如果 z 的父亲为红色, 则性质4被破坏(主要矛盾)
接下来的工作便是修复(以此维持红黑树的平衡性):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
RB-INSERT-FIXUP(T, z) while z.p.color == RED if z.p == z.p.p.left y = z.p.p.right if y.color == RED z.p.color = BLACK //case 1 y.color = BLACK //case 1 z.p.p.color = RED //case 1 z = z.p.p //case 1 else if z == z.p.right z = z.p //case 2 LEFT-ROTATE(T, z) //case 2 z.p.color = BLACK //case 3 z.p.p.color = RED //case 3 RIGHT-ROTATE(T, z.p.p) //case 3 else (same as then clause with "right" and "left" exchanged) T.root.color = BLACK |
CLRS中用了一大段循环不变量来证明程序的正确性:
The while loop above maintains the following three-part invariant at the start of each iteration of the loop:
a. Node z if red.
b. If z.p if the root, then z.p is the black.
c. If the tree violates any of the red-black properties, then it violates at most one of them, and the violation is of either property 2 or property 4. If the tree violates property 2, it is because z is the root and is red. If the tree violates property 4, it is because both z and z.p are red.
一言以蔽之,就是保证当通过 while 循环的 test 条件之后,
z 与
z.p 不为
T.root 且为红色
所以循环要解决的问题就只是双红,而且循环里的
z.p.p不为
T.nil
下面分别说说那三个case:
- z’s uncle y is red
把锅(红色)甩给爷爷,进入下一重循环
然后爷爷或者进入case 2/3,终止循环
或者再次进入case 1,再次甩锅
最坏情况下会一直甩到根节点,所以时间复杂度为O(lg n) - z’s uncle y is black and z is a right child
直接通过旋转转化成case 3 - z’s uncle y is black and z is a left child
这里用了一种常用的操作:
父亲把家产(黑色)传给儿子(原本为红色),自己就没钱了(变成红色),然后让儿子继承自己的位置(通过旋转),以此维持社会和谐(黑高一致)
更一般地,可以写个子函数叫“换色旋转”,在下面的删除里也会用上
13.4 Deletion
首先是一个子函数(类似于第十二章里的 TRANSPLANT())
1 2 3 4 5 6 7 |
RB-TRANSPLANT(T, u, v) if u.p == T.nil T.root = v elseif u == u.p.left u.p.left = v else u.p.right = v v.p = u.p |
The procedure RB-T RANSPLANT differs from T RANSPLANT in two ways.
First, line 1 references the sentinel T.nil instead of NIL.
Second, the assignment to v.p in line 6 occurs unconditionally:
we can assign to v.p even if v points to the sentinel.
In fact, we shall exploit the ability to assign to v.p when v = T.nil.
这里提到了用
T.nil代替
NULL的好处,其实不止这一个(还有哪些?)
然后是删除
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
RB-DELETE(T, z) y = z y-original-color = y.color if z.left == T.nil x = z.right RB-TRANSPLANT(T, z, z.right) elseif z.right == T.nil x = z.left RB-TRANSPLANT(T, z, z.left) else y = TREE-MINMUM(z.right) y-original-color = y.color x = y.right if y.p == z x.p = y else RB-TRANSPLANT(T, y, y.right) y.right = z.right y.right.p = y RB-TRANSPLANT(T, z, y) y.left = z.left y.left.p = y y.color = z.color if y-original-color == BLACK RB-DELETE-FIXUP(T, x) |
(一个小细节:第12、13行看似没用(
x本来就是
y的孩子,为什么还要
x.p = y?),实则有用(为什么呢?因为
x有可能为
T.nil,那么
x.p就不一定是
y))
基本上和普通的 BST 的删除一样
关键是分析删除之后会怎样破坏性质
首先要对
x、
y有直观理解:
-
y是对于颜色来说被删除的点
怎么理解这句话呢?
在 case 1/2 里, y等于 z
在 case 3 里, y等于 z的后继
首先是 z被它的后继 y覆盖,但 y要继承 z的颜色,所以对于颜色来说, z并没有被删除
然后是把 y从原来的位置删除掉,即转化成了 case 1/2(因为后继最多只有一个孩子)所以对于颜色来说,真正被删除的是 y - x是代替了 y的点( x有可能为 T.nil,但是无所谓(再次体现了 T.nil的优势))
现在再来分析可能会被破坏的性质:
- 如果 y为根, x为红,则破坏性质2(小问题)
- 如果 y-original-color为黑,则破坏性质5
- 如果
x和
x.p都为红,则破坏性质4
首先注意到,此时 y-original-color一定为黑,同时也破坏了性质5
所以再用插入那里的套路处理双红是不妥当的
然后注意到,只要让 x继承 y的黑色,就可以了
所以这也是小问题
下面来分析
y-original-color为黑的情况(主要矛盾):
前面提到过,性质5是很难修复的,我们的原则是不破坏它,所以这里我们要强行不破坏它:
统一让
x继承
y的黑色,如果
x原本为红色,就会变成“红黑”,如果
x原本为黑色,就会变成“双黑”
于是强行变成了性质1被破坏
Fixup
然后我们的任务就是在保持黑高一致的前提下,修复性质1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
RB-DELETE-FIXUP(T, x) while x != T.root and x.color == BLACK if x == x.p.left w = x.p.right if w.color == RED w.color = BLACK //case 1 x.p.color = RED //case 1 LEFT-ROTATE(T, x.p) //case 1 w = x.p.right //case 1 if w.left.color == BLACK and w.right.color == BLACK w.color = RED //case 2 x = x.p //case 2 else if w.right.color == BLACK w.left.color = BLACK //case 3 w.color = RED //case 3 RIGHT-ROTATE(T, w) //case 3 w = x.p.right //case 3 w.color = x.p.color //case 4 x.p.color = BLACK //case 4 w.right.color = BLACK //case 4 LEFT-ROTATE(T, x.p) //case 4 x = T.root //case 4 else (same as then clause with "right" and "left" exchanged) x.color = BLACK |
(The extra black on a node is reflected in x’s pointing to the node rather than in the color attribute)
The key idea is that in each case, the transformation applied preserves the number of black nodes (including x’s extra black) from (and including) the root of the subtree shown to each of the subtrees \(\alpha ,\beta ,\dots ,\zeta\).
下面分别说说那四个case:
- x’s sibling w is red
通过上面提到换色旋转(B、D 之间)转化成 case 2/3/4 - x’s sibling w is black, and w’s right children is black, and w’s left children is black
因为 w的两个孩子都是黑色,所以 w可以变成红色,所以可以甩锅给父亲(类似于插入的套路) - x’s sibling w is black, and w’s right children is black, and w’s left children is red
通过上面提到的换色旋转(C、D 之间)转化成 case 4 - x’s sibling w is black, and w’s right children is red
首先是换色旋转(B、D 之间),然后把 E 染黑
Code
完整的代码稍后上传。。。