灯下 登录
计算机科学 / SICP / subsection 中英对照

5.3.2 Maintaining the Illusion of Infinite Memory

The representation method outlined in 5.3.1 solves the problem of

implementing list structure, provided that we have an infinite amount of

memory. With a real computer we will eventually run out of free space in which

to construct new pairs. However,

most of the pairs generated in a typical computation are used only to hold

intermediate results. After these results are accessed, the pairs are no

longer needed—they are

garbage. For instance, the computation

5.3.1 节所述的表示方法解决了实现表结构的问题——前提是我们拥有无限量的内存。在真实的计算机上,我们最终会耗尽用于构造新序对的空闲空间。然而,在典型的计算过程中,生成的大多数序对只是用来保存中间结果。这些结果被访问之后,相应的序对就不再需要了——它们成了垃圾 (garbage)。例如,以下计算

constructs two lists: the enumeration and the result of filtering the

enumeration. When the accumulation is complete, these lists are no longer

needed, and the allocated memory can be reclaimed. If we can arrange to

collect all the garbage periodically, and if this turns out to recycle memory

at about the same rate at which we construct new pairs, we will have preserved

the illusion that there is an infinite amount of memory.

会构造两个表:枚举结果和过滤后的结果。积累完成后,这两个表就不再需要了,所占用的内存可以被回收。如果我们能定期收集所有垃圾,并且这一过程回收内存的速率大致等于我们构造新序对的速率,就能维持内存无限充足的假象。

In order to recycle pairs, we must have a way to determine which allocated

pairs are not needed (in the sense that their contents can no longer influence

the future of the computation). The method we shall examine for accomplishing

this is known as

garbage collection. Garbage collection is based on

the observation that, at any moment in a Lisp interpretation, the only objects

that can affect the future of the computation are those that can be reached by

some succession of car and cdr operations starting from the

pointers that are currently in the machine registers.

Any memory cell that is not so accessible may be recycled.

为了回收序对,我们必须有办法判定哪些已分配的序对不再需要(即其内容不再能影响计算的未来走向)。我们将考察的实现这一目的的方法称为垃圾收集 (garbage collection)。垃圾收集基于这样一个观察:在 Lisp 解释执行的任意时刻,能够影响计算未来的对象,只有那些可以从当前机器寄存器中的指针出发,经过若干次 `car` 和 `cdr` 操作所能到达的对象。任何以这种方式无法访问的内存单元都可以被回收。

There are many ways to perform garbage collection. The method we shall examine

here is called

stop-and-copy. The basic idea is to divide memory

into two halves: “working memory” and “free memory.” When cons

constructs pairs, it allocates these in working memory. When working memory is

full, we perform garbage collection by locating all the useful pairs in working

memory and copying these into consecutive locations in free memory. (The

useful pairs are located by tracing all the car and cdr pointers,

starting with the machine registers.) Since we do not copy the garbage, there

will presumably be additional free memory that we can use to allocate new

pairs. In addition, nothing in the working memory is needed, since all the

useful pairs in it have been copied. Thus, if we interchange the roles of

working memory and free memory, we can continue processing; new pairs will be

allocated in the new working memory (which was the old free memory). When this

is full, we can copy the useful pairs into the new free memory (which was the

old working memory).

Subheading: Implementation of a stop-and-copy garbage collector

执行垃圾收集有很多种方法。我们在此考察的方法称为停止并复制 (stop-and-copy)。其基本思想是将内存划分为两半:“工作内存”和“空闲内存”。`cons` 构造序对时,在工作内存中分配空间。当工作内存满时,我们通过定位工作内存中所有有用的序对并将其复制到空闲内存的连续位置来执行垃圾收集(有用序对的定位方法是从机器寄存器出发,追踪所有的 `car` 和 `cdr` 指针)。由于我们不复制垃圾,空闲内存中将有额外的空间可用于分配新序对。此外,工作内存中的内容不再需要,因为其中所有有用的序对都已被复制。因此,若我们交换工作内存和空闲内存的角色,就可以继续处理:新序对将在新的工作内存(即原来的空闲内存)中分配。当新的工作内存满时,再将有用序对复制到新的空闲内存(即原来的工作内存)中。子标题:停止并复制垃圾收集器的实现

We now use our register-machine language to describe the stop-and-copy

algorithm in more detail. We will assume that there is a register called

root that contains a pointer to a structure that eventually points at

all accessible data. This can be arranged by storing the contents of all the

machine registers in a pre-allocated list pointed at by root just before

starting garbage collection. We also assume that, in addition to the current

working memory, there is free memory available into which we can copy the

useful data. The current working memory consists of vectors whose base

addresses are in registers called the-cars and the-cdrs, and the

free memory is in registers called new-cars and new-cdrs.

现在我们用寄存器机器语言更详细地描述停止并复制算法。我们假设有一个名为 `root` 的寄存器,它包含一个指向某结构的指针,该结构最终指向所有可访问的数据。可以这样安排:在垃圾收集开始之前,将所有机器寄存器的内容存入一个由 `root` 指向的预先分配的表中。我们还假设,除了当前工作内存之外,还有可用的空闲内存供我们复制有用数据。当前工作内存由若干向量构成,其基地址分别存放在名为 `the-cars` 和 `the-cdrs` 的寄存器中,空闲内存则存放在名为 `new-cars` 和 `new-cdrs` 的寄存器中。

Garbage collection is triggered when we exhaust the free cells in the current

working memory, that is, when a cons operation attempts to increment the

free pointer beyond the end of the memory vector. When the

garbage-collection process is complete, the root pointer will point into

the new memory, all objects accessible from the root will have been

moved to the new memory, and the free pointer will indicate the next

place in the new memory where a new pair can be allocated. In addition, the

roles of working memory and new memory will have been interchanged—new pairs

will be constructed in the new memory, beginning at the place indicated by

free, and the (previous) working memory will be available as the new

memory for the next garbage collection. Figure 5.15 shows the

arrangement of memory just before and just after garbage collection.

Figure 5.15: Reconfiguration of memory by the garbage-collection process.

当我们耗尽当前工作内存中的空闲单元时,垃圾收集便被触发——即当一次 `cons` 操作试图将 `free` 指针递增到内存向量末尾之外时。垃圾收集过程完成后,`root` 指针将指向新内存,所有从 `root` 可访问的对象都已移至新内存,`free` 指针将指示新内存中下一个可分配新序对的位置。此外,工作内存和新内存的角色将互换——新序对将在新内存中从 `free` 所指示的位置开始构造,而(原来的)工作内存将作为下一次垃圾收集的新内存备用。图 5.15 展示了垃圾收集前后内存的排列情况。图 5.15:垃圾收集过程对内存的重新配置。

The state of the garbage-collection process is controlled by maintaining two

pointers: free and scan. These are initialized to point to the

beginning of the new memory. The algorithm begins by relocating the pair

pointed at by root to the beginning of the new memory. The pair is

copied, the root pointer is adjusted to point to the new location, and

the free pointer is incremented. In addition, the old location of the

pair is marked to show that its contents have been moved. This marking is done

as follows: In the car position, we place a special tag that signals

that this is an already-moved object. (Such an object is traditionally called

a

broken heart.) In the cdr position

we place a

forwarding address that points at the location to which

the object has been moved.

垃圾收集过程的状态由两个指针 `free` 和 `scan` 来控制。它们被初始化为指向新内存的起始位置。算法从将 `root` 所指向的序对迁移到新内存的起始位置开始:将该序对复制过去,调整 `root` 指针指向新位置,并递增 `free` 指针。此外,还要在序对的旧位置做标记,表明其内容已被移走。标记方式如下:在 `car` 位置放置一个特殊标签,表示这是一个已移走的对象(这样的对象传统上称为残心 (broken heart));在 `cdr` 位置放置一个转发地址 (forwarding address),指向该对象已被移往的新位置。

After relocating the root, the garbage collector enters its basic cycle. At

each step in the algorithm, the scan pointer (initially pointing at the

relocated root) points at a pair that has been moved to the new memory but

whose car and cdr pointers still refer to objects in the old

memory. These objects are each relocated, and the scan pointer is

incremented. To relocate an object (for example, the object indicated by the

car pointer of the pair we are scanning) we check to see if the object

has already been moved (as indicated by the presence of a broken-heart tag in

the car position of the object). If the object has not already been

moved, we copy it to the place indicated by free, update free,

set up a broken heart at the object’s old location, and update the pointer to

the object (in this example, the car pointer of the pair we are

scanning) to point to the new location. If the object has already been moved,

its forwarding address (found in the cdr position of the broken heart)

is substituted for the pointer in the pair being scanned. Eventually, all

accessible objects will have been moved and scanned, at which point the

scan pointer will overtake the free pointer and the process will

terminate.

迁移根节点之后,垃圾收集器进入其基本循环。算法的每一步中,`scan` 指针(初始指向已迁移的根节点)指向一个已被移至新内存、但其 `car` 和 `cdr` 指针仍指向旧内存中对象的序对。这些对象被逐一迁移,`scan` 指针随之递增。迁移一个对象(例如,当前扫描序对的 `car` 指针所指向的对象)时,我们首先检查该对象是否已被迁移(即对象的 `car` 位置是否存在残心标签)。若尚未迁移,则将其复制到 `free` 所指示的位置,更新 `free`,在对象的旧位置建立残心,并将指向该对象的指针(本例中即正在扫描的序对的 `car` 指针)更新为指向新位置。若对象已被迁移,则用残心的 `cdr` 位置中存放的转发地址替换正在扫描的序对中的相应指针。最终,所有可访问的对象都将被迁移和扫描,此时 `scan` 指针将追上 `free` 指针,过程终止。

We can specify the stop-and-copy algorithm as a sequence of instructions for a

register machine. The basic step of relocating an object is accomplished by a

subroutine called relocate-old-result-in-new. This subroutine gets its

argument, a pointer to the object to be relocated, from a register named

old. It relocates the designated object (incrementing free in

the process), puts a pointer to the relocated object into a register called

new, and returns by branching to the entry point stored in the register

relocate-continue. To begin garbage collection, we invoke this

subroutine to relocate the root pointer, after initializing free

and scan. When the relocation of root has been accomplished, we

install the new pointer as the new root and enter the main loop of the

garbage collector.

我们可以将停止并复制算法描述为一个寄存器机器的指令序列。迁移一个对象的基本步骤由一个名为 `relocate-old-result-in-new` 的子程序完成。该子程序从名为 `old` 的寄存器中获取其参数——即待迁移对象的指针;它迁移指定对象(过程中递增 `free`),将指向已迁移对象的指针放入名为 `new` 的寄存器,并通过跳转到 `relocate-continue` 寄存器中存储的入口点来返回。为启动垃圾收集,我们在初始化 `free` 和 `scan` 之后,调用此子程序来迁移 `root` 指针。当 `root` 的迁移完成后,我们将新指针作为新的 `root` 安装,并进入垃圾收集器的主循环。

In the main loop of the garbage collector we must determine whether there are

any more objects to be scanned. We do this by testing whether the scan

pointer is coincident with the free pointer. If the pointers are equal,

then all accessible objects have been relocated, and we branch to

gc-flip, which cleans things up so that we can continue the interrupted

computation. If there are still pairs to be scanned, we call the relocate

subroutine to relocate the car of the next pair (by placing the

car pointer in old). The relocate-continue register is

set up so that the subroutine will return to update the car pointer.

在垃圾收集器的主循环中,我们需要判断是否还有更多对象需要扫描。方法是检验 `scan` 指针是否与 `free` 指针重合。若两者相等,则所有可访问的对象都已迁移,我们跳转到 `gc-flip`,在那里做好收尾工作以便继续被中断的计算。若仍有序对需要扫描,则调用 `relocate` 子程序迁移下一个序对的 `car`(将 `car` 指针放入 `old`)。`relocate-continue` 寄存器被设置为使子程序返回后执行更新 `car` 指针的操作。

At update-car, we modify the car pointer of the pair being

scanned, then proceed to relocate the cdr of the pair. We return to

update-cdr when that relocation has been accomplished. After relocating

and updating the cdr, we are finished scanning that pair, so we continue

with the main loop.

在 `update-car` 处,我们修改正在扫描的序对的 `car` 指针,然后继续迁移该序对的 `cdr`。迁移完成后返回到 `update-cdr`。在迁移并更新 `cdr` 之后,该序对的扫描即告完成,继续执行主循环。

The subroutine relocate-old-result-in-new relocates objects as follows:

If the object to be relocated (pointed at by old) is not a pair, then we

return the same pointer to the object unchanged (in new). (For example,

we may be scanning a pair whose car is the number 4. If we represent

the car by n4, as described in 5.3.1, then we want

the “relocated” car pointer to still be n4.) Otherwise, we

must perform the relocation. If the car position of the pair to be

relocated contains a broken-heart tag, then the pair has in fact already been

moved, so we retrieve the forwarding address (from the cdr position of

the broken heart) and return this in new. If the pointer in old

points at a yet-unmoved pair, then we move the pair to the first free cell in

new memory (pointed at by free) and set up the broken heart by storing a

broken-heart tag and forwarding address at the old location.

Relocate-old-result-in-new uses a register oldcr to hold the

car or the cdr of the object pointed at by

old.

子程序 `relocate-old-result-in-new` 按如下方式迁移对象:若待迁移对象(由 `old` 指向)不是序对,则将指向该对象的指针原样返回到 `new` 中(例如,我们可能正在扫描一个 `car` 为数 4 的序对;若如 5.3.1 节所述用 `n4` 来表示该 `car`,则我们希望“迁移后”的 `car` 指针仍为 `n4`)。否则,必须执行迁移。若待迁移序对的 `car` 位置包含残心标签,则该序对实际上已被移走,因此从残心的 `cdr` 位置取出转发地址并在 `new` 中返回。若 `old` 中的指针指向一个尚未移走的序对,则将其移至新内存中第一个空闲单元(由 `free` 指向),并在旧位置存入残心标签和转发地址以建立残心。`Relocate-old-result-in-new` 使用一个名为 `oldcr` 的寄存器来保存由 `old` 所指对象的 `car` 或 `cdr`。

At the very end of the garbage-collection process, we interchange the role of

old and new memories by interchanging pointers: interchanging the-cars

with new-cars, and the-cdrs with new-cdrs. We will then

be ready to perform another garbage collection the next time memory runs out.

在垃圾收集过程的最后,我们通过交换指针来互换新旧内存的角色:将 `the-cars` 与 `new-cars` 互换,将 `the-cdrs` 与 `new-cdrs` 互换。此后,当内存再次耗尽时,我们便可以执行下一次垃圾收集了。

Racket #lang sicp
(accumulate
 +
 0
 (filter odd? (enumerate-interval 0 n)))
Racket #lang sicp
begin-garbage-collection
 (assign free (const 0))
 (assign scan (const 0))
 (assign old (reg root))
 (assign relocate-continue
 (label reassign-root))
 (goto (label relocate-old-result-in-new))
reassign-root
 (assign root (reg new))
 (goto (label gc-loop))
Racket #lang sicp
gc-loop
 (test (op =) (reg scan) (reg free))
 (branch (label gc-flip))
 (assign old
 (op vector-ref)
 (reg new-cars)
 (reg scan))
 (assign relocate-continue
 (label update-car))
 (goto (label relocate-old-result-in-new))
Racket #lang sicp
update-car
 (perform (op vector-set!)
 (reg new-cars)
 (reg scan)
 (reg new))
 (assign old
 (op vector-ref)
 (reg new-cdrs)
 (reg scan))
 (assign relocate-continue
 (label update-cdr))
 (goto (label relocate-old-result-in-new))
update-cdr
 (perform (op vector-set!)
 (reg new-cdrs)
 (reg scan)
 (reg new))
 (assign scan (op +) (reg scan) (const 1))
 (goto (label gc-loop))
Racket #lang sicp
relocate-old-result-in-new
 (test (op pointer-to-pair?) (reg old))
 (branch (label pair))
 (assign new (reg old))
 (goto (reg relocate-continue))
pair
 (assign oldcr
 (op vector-ref)
 (reg the-cars)
 (reg old))
 (test (op broken-heart?) (reg oldcr))
 (branch (label already-moved))
 (assign new (reg free)) ; new location for pair
 ;; Update free pointer.
 (assign free (op +) (reg free) (const 1))
 ;; Copy the car and cdr to new memory.
 (perform (op vector-set!)
 (reg new-cars)
 (reg new)
 (reg oldcr))
 (assign oldcr
 (op vector-ref)
 (reg the-cdrs)
 (reg old))
 (perform (op vector-set!)
 (reg new-cdrs)
 (reg new)
 (reg oldcr))
 ;; Construct the broken heart.
 (perform (op vector-set!)
 (reg the-cars)
 (reg old)
 (const broken-heart))
 (perform (op vector-set!)
 (reg the-cdrs)
 (reg old)
 (reg new))
 (goto (reg relocate-continue))
already-moved
 (assign new
 (op vector-ref)
 (reg the-cdrs)
 (reg old))
 (goto (reg relocate-continue))
Racket #lang sicp
gc-flip
 (assign temp (reg the-cdrs))
 (assign the-cdrs (reg new-cdrs))
 (assign new-cdrs (reg temp))
 (assign temp (reg the-cars))
 (assign the-cars (reg new-cars))
 (assign new-cars (reg temp))