灯下 登录
计算机科学 / SICP / subsection 中英对照

2.3.3 Example: Representing Sets

In the previous examples we built representations for two kinds of compound

data objects: rational numbers and algebraic expressions. In one of these

examples we had the choice of simplifying (reducing) the expressions at either

construction time or selection time, but other than that the choice of a

representation for these structures in terms of lists was straightforward. When

we turn to the representation of sets, the choice of a representation is not so

obvious. Indeed, there are a number of possible representations, and they

differ significantly from one another in several ways.

在前面的例子中,我们分别为两种复合数据对象建立了表示:有理数与代数表达式。在其中一个例子里,我们可以选择在构造时或选择时对表达式进行化简,但除此之外,用表来表示这些结构的方式是直截了当的。当我们转向集合的表示时,情况便不那么明朗了。事实上,集合有多种可能的表示方式,它们在若干方面存在显著差异。

Informally, a set is simply a collection of distinct objects. To give a more

precise definition we can employ the method of data abstraction. That is, we

define “set” by specifying the operations that are to be used on sets. These

are union-set, intersection-set, element-of-set?, and

adjoin-set. Element-of-set? is a predicate that determines

whether a given element is a member of a set. Adjoin-set takes an

object and a set as arguments and returns a set that contains the elements of

the original set and also the adjoined element. Union-set computes the

union of two sets, which is the set containing each element that appears in

either argument. Intersection-set computes the intersection of two

sets, which is the set containing only elements that appear in both arguments.

From the viewpoint of data abstraction, we are free to design any

representation that implements these operations in a way consistent with the

interpretations given above.

Subheading: Sets as unordered lists

非正式地说,集合不过是一组互不相同的对象的汇集。为了给出更精确的定义,我们可以借助数据抽象的方法——即通过规定作用于集合之上的操作来定义“集合”。这些操作是 union-set、intersection-set、element-of-set? 和 adjoin-set。element-of-set? 是一个谓词,用于判断给定元素是否属于某个集合。adjoin-set 接受一个对象和一个集合作为参数,返回包含原集合所有元素以及新加入元素的集合。union-set 计算两个集合的并集,即包含出现在任一参数中的每个元素的集合。intersection-set 计算两个集合的交集,即只包含同时出现在两个参数中的元素的集合。从数据抽象的角度来看,我们可以自由选择任何一种表示方式,只要它能以符合上述解释的方式实现这些操作即可。\n小节标题:用无序表表示集合

One way to represent a set is as a list of its elements in which no element

appears more than once. The empty set is represented by the empty list. In

this representation, element-of-set? is similar to the procedure

memq of 2.3.1. It uses equal? instead of

eq? so that the set elements need not be symbols:

表示集合的一种方式,是用一个每个元素至多出现一次的表来存放集合的元素。空集用空表表示。在这种表示下,element-of-set? 类似于 2.3.1 节中的过程 memq,但它使用 equal? 而非 eq?,从而使集合元素不必局限于符号:

Using this, we can write adjoin-set. If the object to be adjoined is

already in the set, we just return the set. Otherwise, we use cons to

add the object to the list that represents the set:

有了这个,就可以写出 adjoin-set。若待加入的对象已在集合中,直接返回该集合;否则,用 cons 将该对象添加到表示集合的表的头部:

For intersection-set we can use a recursive strategy. If we know how to

form the intersection of set2 and the cdr of set1, we only

need to decide whether to include the car of set1 in this. But

this depends on whether (car set1) is also in set2. Here is the

resulting procedure:

对于 intersection-set,可以采用递归策略。若已知如何求 set2 与 set1 的 cdr 的交集,只需再决定是否将 set1 的 car 纳入其中。而这取决于 (car set1) 是否也属于 set2。下面是所得的过程:

In designing a representation, one of the issues we should be concerned with is

efficiency. Consider the number of steps required by our set operations.

Since they all use element-of-set?, the speed of this operation has a

major impact on the efficiency of the set implementation as a whole. Now, in

order to check whether an object is a member of a set, element-of-set?

may have to scan the entire set. (In the worst case, the object turns out not

to be in the set.) Hence, if the set has n elements,

element-of-set? might take up to n steps. Thus, the number of

steps required grows as Θ

(

n

). The number of steps required by

adjoin-set, which uses this operation, also grows as Θ

(

n

).

For intersection-set, which does an element-of-set? check for

each element of set1, the number of steps required grows as the product

of the sizes of the sets involved, or Θ

(

在设计一种表示时,效率是我们需要关注的问题之一。考察集合操作所需的步骤数。由于所有操作都要用到 element-of-set?,这一操作的速度对整个集合实现的效率有着举足轻重的影响。为判断某个对象是否属于一个集合,element-of-set? 可能需要扫描整个集合(最坏情况下,该对象并不在集合中)。因此,若集合有 n 个元素,element-of-set? 最多需要 n 步。也就是说,所需步骤数的增长为 Θ

(

n

)。adjoin-set 调用了这一操作,其所需步骤数同样以 Θ

(

n

) 增长。对于 intersection-set 而言,它对 set1 中的每个元素都要做一次 element-of-set? 检验,所需步骤数以两个集合大小之积的量级增长,即 Θ

(

n
2

) for two sets of size

n. The same will be true of union-set.

) 对于规模均为 n 的两个集合而言。union-set 亦同此理。

Racket #lang sicp
(define (element-of-set? x set)
 (cond ((null? set) false)
 ((equal? x (car set)) true)
 (else (element-of-set? x (cdr set)))))
Racket #lang sicp
(define (adjoin-set x set)
 (if (element-of-set? x set)
 set
 (cons x set)))
Racket #lang sicp
(define (intersection-set set1 set2)
 (cond ((or (null? set1) (null? set2))
 '())
 ((element-of-set? (car set1) set2)
 (cons (car set1)
 (intersection-set (cdr set1)
 set2)))
 (else (intersection-set (cdr set1)
 set2))))