The representation method outlined in 5.3.1 solves the problem of
implementing list structure, provided that we have an infinite amount of
memory. With a real computer we will eventually run out of free space in which
to construct new pairs. However,
most of the pairs generated in a typical computation are used only to hold
intermediate results. After these results are accessed, the pairs are no
longer needed—they are
garbage. For instance, the computation
5.3.1 节所述的表示方法解决了实现表结构的问题——前提是我们拥有无限量的内存。在真实的计算机上,我们最终会耗尽用于构造新序对的空闲空间。然而,在典型的计算过程中,生成的大多数序对只是用来保存中间结果。这些结果被访问之后,相应的序对就不再需要了——它们成了垃圾 (garbage)。例如,以下计算
constructs two lists: the enumeration and the result of filtering the
enumeration. When the accumulation is complete, these lists are no longer
needed, and the allocated memory can be reclaimed. If we can arrange to
collect all the garbage periodically, and if this turns out to recycle memory
at about the same rate at which we construct new pairs, we will have preserved
the illusion that there is an infinite amount of memory.
会构造两个表:枚举结果和过滤后的结果。积累完成后,这两个表就不再需要了,所占用的内存可以被回收。如果我们能定期收集所有垃圾,并且这一过程回收内存的速率大致等于我们构造新序对的速率,就能维持内存无限充足的假象。
In order to recycle pairs, we must have a way to determine which allocated
pairs are not needed (in the sense that their contents can no longer influence
the future of the computation). The method we shall examine for accomplishing
this is known as
garbage collection. Garbage collection is based on
the observation that, at any moment in a Lisp interpretation, the only objects
that can affect the future of the computation are those that can be reached by
some succession of car and cdr operations starting from the
pointers that are currently in the machine registers.
Any memory cell that is not so accessible may be recycled.
为了回收序对,我们必须有办法判定哪些已分配的序对不再需要(即其内容不再能影响计算的未来走向)。我们将考察的实现这一目的的方法称为垃圾收集 (garbage collection)。垃圾收集基于这样一个观察:在 Lisp 解释执行的任意时刻,能够影响计算未来的对象,只有那些可以从当前机器寄存器中的指针出发,经过若干次 `car` 和 `cdr` 操作所能到达的对象。任何以这种方式无法访问的内存单元都可以被回收。
There are many ways to perform garbage collection. The method we shall examine
here is called
stop-and-copy. The basic idea is to divide memory
into two halves: “working memory” and “free memory.” When cons
constructs pairs, it allocates these in working memory. When working memory is
full, we perform garbage collection by locating all the useful pairs in working
memory and copying these into consecutive locations in free memory. (The
useful pairs are located by tracing all the car and cdr pointers,
starting with the machine registers.) Since we do not copy the garbage, there
will presumably be additional free memory that we can use to allocate new
pairs. In addition, nothing in the working memory is needed, since all the
useful pairs in it have been copied. Thus, if we interchange the roles of
working memory and free memory, we can continue processing; new pairs will be
allocated in the new working memory (which was the old free memory). When this
is full, we can copy the useful pairs into the new free memory (which was the
old working memory).
Subheading: Implementation of a stop-and-copy garbage collector
执行垃圾收集有很多种方法。我们在此考察的方法称为停止并复制 (stop-and-copy)。其基本思想是将内存划分为两半:“工作内存”和“空闲内存”。`cons` 构造序对时,在工作内存中分配空间。当工作内存满时,我们通过定位工作内存中所有有用的序对并将其复制到空闲内存的连续位置来执行垃圾收集(有用序对的定位方法是从机器寄存器出发,追踪所有的 `car` 和 `cdr` 指针)。由于我们不复制垃圾,空闲内存中将有额外的空间可用于分配新序对。此外,工作内存中的内容不再需要,因为其中所有有用的序对都已被复制。因此,若我们交换工作内存和空闲内存的角色,就可以继续处理:新序对将在新的工作内存(即原来的空闲内存)中分配。当新的工作内存满时,再将有用序对复制到新的空闲内存(即原来的工作内存)中。子标题:停止并复制垃圾收集器的实现
We now use our register-machine language to describe the stop-and-copy
algorithm in more detail. We will assume that there is a register called
root that contains a pointer to a structure that eventually points at
all accessible data. This can be arranged by storing the contents of all the
machine registers in a pre-allocated list pointed at by root just before
starting garbage collection. We also assume that, in addition to the current
working memory, there is free memory available into which we can copy the
useful data. The current working memory consists of vectors whose base
addresses are in registers called the-cars and the-cdrs, and the
free memory is in registers called new-cars and new-cdrs.
现在我们用寄存器机器语言更详细地描述停止并复制算法。我们假设有一个名为 `root` 的寄存器,它包含一个指向某结构的指针,该结构最终指向所有可访问的数据。可以这样安排:在垃圾收集开始之前,将所有机器寄存器的内容存入一个由 `root` 指向的预先分配的表中。我们还假设,除了当前工作内存之外,还有可用的空闲内存供我们复制有用数据。当前工作内存由若干向量构成,其基地址分别存放在名为 `the-cars` 和 `the-cdrs` 的寄存器中,空闲内存则存放在名为 `new-cars` 和 `new-cdrs` 的寄存器中。
Garbage collection is triggered when we exhaust the free cells in the current
working memory, that is, when a cons operation attempts to increment the
free pointer beyond the end of the memory vector. When the
garbage-collection process is complete, the root pointer will point into
the new memory, all objects accessible from the root will have been
moved to the new memory, and the free pointer will indicate the next
place in the new memory where a new pair can be allocated. In addition, the
roles of working memory and new memory will have been interchanged—new pairs
will be constructed in the new memory, beginning at the place indicated by
free, and the (previous) working memory will be available as the new
memory for the next garbage collection. Figure 5.15 shows the
arrangement of memory just before and just after garbage collection.
Figure 5.15: Reconfiguration of memory by the garbage-collection process.
当我们耗尽当前工作内存中的空闲单元时,垃圾收集便被触发——即当一次 `cons` 操作试图将 `free` 指针递增到内存向量末尾之外时。垃圾收集过程完成后,`root` 指针将指向新内存,所有从 `root` 可访问的对象都已移至新内存,`free` 指针将指示新内存中下一个可分配新序对的位置。此外,工作内存和新内存的角色将互换——新序对将在新内存中从 `free` 所指示的位置开始构造,而(原来的)工作内存将作为下一次垃圾收集的新内存备用。图 5.15 展示了垃圾收集前后内存的排列情况。图 5.15:垃圾收集过程对内存的重新配置。
The state of the garbage-collection process is controlled by maintaining two
pointers: free and scan. These are initialized to point to the
beginning of the new memory. The algorithm begins by relocating the pair
pointed at by root to the beginning of the new memory. The pair is
copied, the root pointer is adjusted to point to the new location, and
the free pointer is incremented. In addition, the old location of the
pair is marked to show that its contents have been moved. This marking is done
as follows: In the car position, we place a special tag that signals
that this is an already-moved object. (Such an object is traditionally called
a
broken heart.) In the cdr position
we place a
forwarding address that points at the location to which
the object has been moved.
垃圾收集过程的状态由两个指针 `free` 和 `scan` 来控制。它们被初始化为指向新内存的起始位置。算法从将 `root` 所指向的序对迁移到新内存的起始位置开始:将该序对复制过去,调整 `root` 指针指向新位置,并递增 `free` 指针。此外,还要在序对的旧位置做标记,表明其内容已被移走。标记方式如下:在 `car` 位置放置一个特殊标签,表示这是一个已移走的对象(这样的对象传统上称为残心 (broken heart));在 `cdr` 位置放置一个转发地址 (forwarding address),指向该对象已被移往的新位置。
After relocating the root, the garbage collector enters its basic cycle. At
each step in the algorithm, the scan pointer (initially pointing at the
relocated root) points at a pair that has been moved to the new memory but
whose car and cdr pointers still refer to objects in the old
memory. These objects are each relocated, and the scan pointer is
incremented. To relocate an object (for example, the object indicated by the
car pointer of the pair we are scanning) we check to see if the object
has already been moved (as indicated by the presence of a broken-heart tag in
the car position of the object). If the object has not already been
moved, we copy it to the place indicated by free, update free,
set up a broken heart at the object’s old location, and update the pointer to
the object (in this example, the car pointer of the pair we are
scanning) to point to the new location. If the object has already been moved,
its forwarding address (found in the cdr position of the broken heart)
is substituted for the pointer in the pair being scanned. Eventually, all
accessible objects will have been moved and scanned, at which point the
scan pointer will overtake the free pointer and the process will
terminate.
迁移根节点之后,垃圾收集器进入其基本循环。算法的每一步中,`scan` 指针(初始指向已迁移的根节点)指向一个已被移至新内存、但其 `car` 和 `cdr` 指针仍指向旧内存中对象的序对。这些对象被逐一迁移,`scan` 指针随之递增。迁移一个对象(例如,当前扫描序对的 `car` 指针所指向的对象)时,我们首先检查该对象是否已被迁移(即对象的 `car` 位置是否存在残心标签)。若尚未迁移,则将其复制到 `free` 所指示的位置,更新 `free`,在对象的旧位置建立残心,并将指向该对象的指针(本例中即正在扫描的序对的 `car` 指针)更新为指向新位置。若对象已被迁移,则用残心的 `cdr` 位置中存放的转发地址替换正在扫描的序对中的相应指针。最终,所有可访问的对象都将被迁移和扫描,此时 `scan` 指针将追上 `free` 指针,过程终止。
We can specify the stop-and-copy algorithm as a sequence of instructions for a
register machine. The basic step of relocating an object is accomplished by a
subroutine called relocate-old-result-in-new. This subroutine gets its
argument, a pointer to the object to be relocated, from a register named
old. It relocates the designated object (incrementing free in
the process), puts a pointer to the relocated object into a register called
new, and returns by branching to the entry point stored in the register
relocate-continue. To begin garbage collection, we invoke this
subroutine to relocate the root pointer, after initializing free
and scan. When the relocation of root has been accomplished, we
install the new pointer as the new root and enter the main loop of the
garbage collector.
我们可以将停止并复制算法描述为一个寄存器机器的指令序列。迁移一个对象的基本步骤由一个名为 `relocate-old-result-in-new` 的子程序完成。该子程序从名为 `old` 的寄存器中获取其参数——即待迁移对象的指针;它迁移指定对象(过程中递增 `free`),将指向已迁移对象的指针放入名为 `new` 的寄存器,并通过跳转到 `relocate-continue` 寄存器中存储的入口点来返回。为启动垃圾收集,我们在初始化 `free` 和 `scan` 之后,调用此子程序来迁移 `root` 指针。当 `root` 的迁移完成后,我们将新指针作为新的 `root` 安装,并进入垃圾收集器的主循环。
In the main loop of the garbage collector we must determine whether there are
any more objects to be scanned. We do this by testing whether the scan
pointer is coincident with the free pointer. If the pointers are equal,
then all accessible objects have been relocated, and we branch to
gc-flip, which cleans things up so that we can continue the interrupted
computation. If there are still pairs to be scanned, we call the relocate
subroutine to relocate the car of the next pair (by placing the
car pointer in old). The relocate-continue register is
set up so that the subroutine will return to update the car pointer.
在垃圾收集器的主循环中,我们需要判断是否还有更多对象需要扫描。方法是检验 `scan` 指针是否与 `free` 指针重合。若两者相等,则所有可访问的对象都已迁移,我们跳转到 `gc-flip`,在那里做好收尾工作以便继续被中断的计算。若仍有序对需要扫描,则调用 `relocate` 子程序迁移下一个序对的 `car`(将 `car` 指针放入 `old`)。`relocate-continue` 寄存器被设置为使子程序返回后执行更新 `car` 指针的操作。
At update-car, we modify the car pointer of the pair being
scanned, then proceed to relocate the cdr of the pair. We return to
update-cdr when that relocation has been accomplished. After relocating
and updating the cdr, we are finished scanning that pair, so we continue
with the main loop.
在 `update-car` 处,我们修改正在扫描的序对的 `car` 指针,然后继续迁移该序对的 `cdr`。迁移完成后返回到 `update-cdr`。在迁移并更新 `cdr` 之后,该序对的扫描即告完成,继续执行主循环。
The subroutine relocate-old-result-in-new relocates objects as follows:
If the object to be relocated (pointed at by old) is not a pair, then we
return the same pointer to the object unchanged (in new). (For example,
we may be scanning a pair whose car is the number 4. If we represent
the car by n4, as described in 5.3.1, then we want
the “relocated” car pointer to still be n4.) Otherwise, we
must perform the relocation. If the car position of the pair to be
relocated contains a broken-heart tag, then the pair has in fact already been
moved, so we retrieve the forwarding address (from the cdr position of
the broken heart) and return this in new. If the pointer in old
points at a yet-unmoved pair, then we move the pair to the first free cell in
new memory (pointed at by free) and set up the broken heart by storing a
broken-heart tag and forwarding address at the old location.
Relocate-old-result-in-new uses a register oldcr to hold the
car or the cdr of the object pointed at by
old.
子程序 `relocate-old-result-in-new` 按如下方式迁移对象:若待迁移对象(由 `old` 指向)不是序对,则将指向该对象的指针原样返回到 `new` 中(例如,我们可能正在扫描一个 `car` 为数 4 的序对;若如 5.3.1 节所述用 `n4` 来表示该 `car`,则我们希望“迁移后”的 `car` 指针仍为 `n4`)。否则,必须执行迁移。若待迁移序对的 `car` 位置包含残心标签,则该序对实际上已被移走,因此从残心的 `cdr` 位置取出转发地址并在 `new` 中返回。若 `old` 中的指针指向一个尚未移走的序对,则将其移至新内存中第一个空闲单元(由 `free` 指向),并在旧位置存入残心标签和转发地址以建立残心。`Relocate-old-result-in-new` 使用一个名为 `oldcr` 的寄存器来保存由 `old` 所指对象的 `car` 或 `cdr`。
At the very end of the garbage-collection process, we interchange the role of
old and new memories by interchanging pointers: interchanging the-cars
with new-cars, and the-cdrs with new-cdrs. We will then
be ready to perform another garbage collection the next time memory runs out.
在垃圾收集过程的最后,我们通过交换指针来互换新旧内存的角色:将 `the-cars` 与 `new-cars` 互换,将 `the-cdrs` 与 `new-cdrs` 互换。此后,当内存再次耗尽时,我们便可以执行下一次垃圾收集了。
(accumulate
+
0
(filter odd? (enumerate-interval 0 n))) begin-garbage-collection
(assign free (const 0))
(assign scan (const 0))
(assign old (reg root))
(assign relocate-continue
(label reassign-root))
(goto (label relocate-old-result-in-new))
reassign-root
(assign root (reg new))
(goto (label gc-loop)) gc-loop
(test (op =) (reg scan) (reg free))
(branch (label gc-flip))
(assign old
(op vector-ref)
(reg new-cars)
(reg scan))
(assign relocate-continue
(label update-car))
(goto (label relocate-old-result-in-new)) update-car
(perform (op vector-set!)
(reg new-cars)
(reg scan)
(reg new))
(assign old
(op vector-ref)
(reg new-cdrs)
(reg scan))
(assign relocate-continue
(label update-cdr))
(goto (label relocate-old-result-in-new))
update-cdr
(perform (op vector-set!)
(reg new-cdrs)
(reg scan)
(reg new))
(assign scan (op +) (reg scan) (const 1))
(goto (label gc-loop)) relocate-old-result-in-new
(test (op pointer-to-pair?) (reg old))
(branch (label pair))
(assign new (reg old))
(goto (reg relocate-continue))
pair
(assign oldcr
(op vector-ref)
(reg the-cars)
(reg old))
(test (op broken-heart?) (reg oldcr))
(branch (label already-moved))
(assign new (reg free)) ; new location for pair
;; Update free pointer.
(assign free (op +) (reg free) (const 1))
;; Copy the car and cdr to new memory.
(perform (op vector-set!)
(reg new-cars)
(reg new)
(reg oldcr))
(assign oldcr
(op vector-ref)
(reg the-cdrs)
(reg old))
(perform (op vector-set!)
(reg new-cdrs)
(reg new)
(reg oldcr))
;; Construct the broken heart.
(perform (op vector-set!)
(reg the-cars)
(reg old)
(const broken-heart))
(perform (op vector-set!)
(reg the-cdrs)
(reg old)
(reg new))
(goto (reg relocate-continue))
already-moved
(assign new
(op vector-ref)
(reg the-cdrs)
(reg old))
(goto (reg relocate-continue)) gc-flip
(assign temp (reg the-cdrs))
(assign the-cdrs (reg new-cdrs))
(assign new-cdrs (reg temp))
(assign temp (reg the-cars))
(assign the-cars (reg new-cars))
(assign new-cars (reg temp))