As we saw in 2.2.3, sequences can serve as standard interfaces
for combining program modules. We formulated powerful abstractions for
manipulating sequences, such as map, filter, and
accumulate, that capture a wide variety of operations in a manner that
is both succinct and elegant.
如我们在 2.2.3 节中所见,序列可以作为组合程序模块的标准接口。我们已建立了操纵序列的强大抽象,如 map、filter 和 accumulate,它们以简洁而优雅的方式捕获了各种各样的操作。
Unfortunately, if we represent sequences as lists, this elegance is bought at
the price of severe inefficiency with respect to both the time and space
required by our computations. When we represent manipulations on sequences as
transformations of lists, our programs must construct and copy data structures
(which may be huge) at every step of a process.
遗憾的是,如果我们用表来表示序列,这种优雅是以计算所需时间和空间上严重的低效为代价的。当我们将对序列的操作表示为对表的变换时,程序必须在计算过程的每一步都构造并复制数据结构(这些结构可能非常庞大)。
To see why this is true, let us compare two programs for computing the sum of
all the prime numbers in an interval. The first program is written in standard
iterative style:
为了说明这一点,让我们比较两个计算某区间内所有素数之和的程序。第一个程序以标准迭代风格编写:
The second program performs the same computation using the sequence operations
of 2.2.3:
第二个程序使用 2.2.3 节的序列操作完成同样的计算:
In carrying out the computation, the first program needs to store only the sum
being accumulated. In contrast, the filter in the second program cannot do any
testing until enumerate-interval has constructed a complete list of the
numbers in the interval. The filter generates another list, which in turn is
passed to accumulate before being collapsed to form a sum. Such large
intermediate storage is not needed by the first program, which we can think of
as enumerating the interval incrementally, adding each prime to the sum as it
is generated.
在执行计算的过程中,第一个程序只需存储正在积累的那个和。相比之下,第二个程序中的 filter 在 enumerate-interval 构造完区间内所有数的完整表之前无法进行任何检验。filter 再生成另一个表,该表又被传递给 accumulate,然后折叠成一个和。第一个程序不需要这样庞大的中间存储——我们可以把它看作逐步枚举区间,每找到一个素数就将其加入和中。
The inefficiency in using lists becomes painfully apparent if we use the
sequence paradigm to compute the second prime in the interval from 10,000 to
1,000,000 by evaluating the expression
如果用序列范式来计算 10,000 到 1,000,000 区间内的第二个素数,通过求值以下表达式,用表带来的低效就会显得格外触目惊心:
This expression does find the second prime, but the computational overhead is
outrageous. We construct a list of almost a million integers, filter this list
by testing each element for primality, and then ignore almost all of the
result. In a more traditional programming style, we would interleave the
enumeration and the filtering, and stop when we reached the second prime.
这个表达式确实能找到第二个素数,但其计算开销之大令人咋舌。我们构造了一个近百万个整数的表,用素性检验过滤该表的每个元素,然后几乎丢弃所有结果。在更传统的编程风格中,我们会将枚举与过滤交错进行,一旦找到第二个素数便停止。
Streams are a clever idea that allows one to use sequence manipulations without
incurring the costs of manipulating sequences as lists. With streams we can
achieve the best of both worlds: We can formulate programs elegantly as
sequence manipulations, while attaining the efficiency of incremental
computation. The basic idea is to arrange to construct a stream only
partially, and to pass the partial construction to the program that consumes
the stream. If the consumer attempts to access a part of the stream that has
not yet been constructed, the stream will automatically construct just enough
more of itself to produce the required part, thus preserving the illusion that
the entire stream exists. In other words, although we will write programs as
if we were processing complete sequences, we design our stream implementation
to automatically and transparently interleave the construction of the stream
with its use.
流 (streams) 是一种巧妙的构想,它允许人们使用序列操作而无需承担将序列作为表来操纵的代价。借助流,我们可以兼得两全:既能以序列操作的形式优雅地表达程序,又能实现增量式计算的效率。其基本思想是:只部分地构造一个流,并将部分构造的结果传递给消费该流的程序。如果消费者试图访问流中尚未构造的部分,流将自动构造恰好足够多的自身内容,以产生所需部分,从而维持整个流都已存在的假象。换言之,尽管我们将像处理完整序列那样编写程序,但我们设计的流实现会自动且透明地将流的构造与其使用交错进行。
On the surface, streams are just lists with different names for the procedures
that manipulate them. There is a constructor, cons-stream, and two
selectors, stream-car and stream-cdr, which satisfy the
constraints
从表面上看,流不过是换了操纵过程名称的表。它有一个构造函数 cons-stream,以及两个选择函数 stream-car 和 stream-cdr,满足如下约束:
There is a distinguishable object, the-empty-stream, which cannot be the
result of any cons-stream operation, and which can be identified with
the predicate stream-null?. Thus we can
make and use streams, in just the same way as we can make and use lists, to
represent aggregate data arranged in a sequence. In particular, we can build
stream analogs of the list operations from Chapter 2, such as
list-ref, map, and for-each:
存在一个特殊的对象 the-empty-stream,它不可能是任何 cons-stream 操作的结果,并可由谓词 stream-null? 来识别。因此,我们可以像使用表那样创建和使用流,以表示按序列排列的聚合数据。特别地,我们可以构造第 2 章表操作的流版本,例如 list-ref、map 和 for-each:
Stream-for-each is useful for viewing streams:
stream-for-each 对于查看流的内容很有用:
To make the stream implementation automatically and transparently interleave
the construction of a stream with its use, we will arrange for the cdr
of a stream to be evaluated when it is accessed by the stream-cdr
procedure rather than when the stream is constructed by cons-stream.
This implementation choice is reminiscent of our discussion of rational numbers
in 2.1.2, where we saw that we can choose to implement rational
numbers so that the reduction of numerator and denominator to lowest terms is
performed either at construction time or at selection time. The two
rational-number implementations produce the same data abstraction, but the
choice has an effect on efficiency. There is a similar relationship between
streams and ordinary lists. As a data abstraction, streams are the same as
lists. The difference is the time at which the elements are evaluated. With
ordinary lists, both the car and the cdr are evaluated at
construction time. With streams, the cdr is evaluated at selection
time.
为使流的实现自动且透明地将流的构造与其使用交错进行,我们将安排在通过 stream-cdr 过程访问流的 cdr 时才对其求值,而不是在由 cons-stream 构造流时就求值。这一实现选择令人联想到我们在 2.1.2 节对有理数的讨论——我们看到,可以选择在构造时或选取时对分子和分母进行约分,两种实现产生相同的数据抽象,但选择会影响效率。流与普通表之间存在类似的关系:作为数据抽象,流与表是相同的;区别在于元素求值的时机。对于普通表,car 和 cdr 在构造时均被求值;对于流,cdr 在选取时才被求值。
Our implementation of streams will be based on a special form called
delay. Evaluating (delay ⟨exp⟩) does not evaluate the
expression ⟨exp⟩, but rather returns a so-called
我们对流的实现将基于一个称为 delay 的特殊形式。对 (delay ⟨exp⟩) 求值时,并不对表达式 ⟨exp⟩ 进行求值,而是返回一个所谓的
delayed object, which we can think of as a “promise” to evaluate
⟨exp⟩ at some
future time. As a companion to delay, there is a procedure called
force that takes a delayed object as argument and performs the
evaluation—in effect, forcing the delay to fulfill its promise. We
will see below how delay and force can be implemented, but first
let us use these to construct streams.
Cons-stream is a special form defined so that
延迟对象 (delayed object),我们可以将其理解为在将来某个时刻对 ⟨exp⟩ 求值的“承诺”。与 delay 配套,有一个名为 force 的过程,它以延迟对象为参数并执行求值——实际上是强制 delay 履行其“承诺”。我们将在下文看到 delay 和 force 的实现,但首先让我们用它们来构造流。cons-stream 是一个特殊形式,其定义使得:
is equivalent to
等价于
What this means is that we will construct streams using pairs. However, rather
than placing the value of the rest of the stream into the cdr of the
pair we will put there a promise to compute the rest if it is ever requested.
Stream-car and stream-cdr can now be defined as procedures:
这意味着我们将用序对来构造流。然而,我们不会把流其余部分的值放入序对的 cdr,而是在那里放一个承诺 (promise):若将来有需要,才去计算流的其余部分。stream-car 和 stream-cdr 现在可以定义为过程:
Stream-car selects the car of the pair; stream-cdr selects
the cdr of the pair and evaluates the delayed expression found there to
obtain the rest of the stream.
Subheading: The stream implementation in action
stream-car 取序对的 car;stream-cdr 取序对的 cdr,并对那里找到的延迟表达式求值,从而得到流的其余部分。
小节标题:流实现的运作过程
To see how this implementation behaves, let us analyze the “outrageous” prime
computation we saw above, reformulated in terms of streams:
为了了解这一实现的运作方式,让我们分析前面那个"骇人听闻"的素数计算,将其用流的形式重新表述:
We will see that it does indeed work efficiently.
我们将会看到,它确实能高效地运行。
We begin by calling stream-enumerate-interval with the arguments 10,000
and 1,000,000. Stream-enumerate-interval is the stream analog of
enumerate-interval (2.2.3):
我们先以 10,000 和 1,000,000 为参数调用 stream-enumerate-interval。stream-enumerate-interval 是 enumerate-interval(2.2.3 节)的流版本:
and thus the result returned by stream-enumerate-interval, formed by the
cons-stream, is
于是,stream-enumerate-interval 由 cons-stream 构造返回的结果是
That is, stream-enumerate-interval returns a stream represented as a
pair whose car is 10,000 and whose cdr is a promise to enumerate
more of the interval if so requested. This stream is now filtered for primes,
using the stream analog of the filter procedure (2.2.3):
也就是说,stream-enumerate-interval 返回一个流,该流表示为一个序对,其 car 为 10,000,其 cdr 是一个承诺:若有需要则枚举区间中的更多元素。这个流随后被用素数进行过滤,使用的是 filter 过程(2.2.3 节)的流版本:
Stream-filter tests the stream-car of the stream (the car
of the pair, which is 10,000). Since this is not prime, stream-filter
examines the stream-cdr of its input stream. The call to
stream-cdr forces evaluation of the delayed
stream-enumerate-interval, which now returns
stream-filter 检验该流的 stream-car(即序对的 car,其值为 10,000)。由于 10,000 不是素数,stream-filter 继续检查其输入流的 stream-cdr。对 stream-cdr 的调用强制对延迟的 stream-enumerate-interval 求值,后者此时返回
Stream-filter now looks at the stream-car of this stream, 10,001,
sees that this is not prime either, forces another stream-cdr, and so
on, until stream-enumerate-interval yields the prime 10,007, whereupon
stream-filter, according to its definition, returns
stream-filter 接着查看这个流的 stream-car,即 10,001,发现它也不是素数,便再次强制求 stream-cdr,如此继续,直到 stream-enumerate-interval 产出素数 10,007,此时 stream-filter 按其定义返回
which in this case is
在这种情况下,结果是
This result is now passed to stream-cdr in our original expression.
This forces the delayed stream-filter, which in turn keeps forcing the
delayed stream-enumerate-interval until it finds the next prime, which
is 10,009. Finally, the result passed to stream-car in our original
expression is
这个结果随后被传给原始表达式中的 stream-cdr。这将强制对延迟的 stream-filter 求值,后者反过来持续强制延迟的 stream-enumerate-interval,直到找到下一个素数 10,009。最终,传给原始表达式中 stream-car 的结果是
Stream-car returns 10,009, and the computation is complete. Only as
many integers were tested for primality as were necessary to find the second
prime, and the interval was enumerated only as far as was necessary to feed the
prime filter.
stream-car 返回 10,009,计算完成。在整个过程中,被检验素性的整数恰好是为找到第二个素数所必需的那些,而区间的枚举也只进行到向素数过滤器提供所需数据为止。
In general, we can think of delayed evaluation as “demand-driven”
programming, whereby each stage in the stream process is activated only enough
to satisfy the next stage. What we have done is to decouple the actual order
of events in the computation from the apparent structure of our procedures. We
write procedures as if the streams existed “all at once” when, in reality,
the computation is performed incrementally, as in traditional programming
styles.
Subheading: Implementing delay and force
一般而言,我们可以把延迟求值视为"按需驱动"的程序设计方式——流计算过程的每个阶段只被激活到足以满足下一阶段的程度。我们所做的,是将计算中事件的实际发生顺序与过程的表观结构解耦。我们编写过程时仿佛流"同时全部存在",而实际上计算是像传统程序设计风格那样增量进行的。
小节标题:delay 和 force 的实现
Although delay and force may seem like mysterious operations,
their implementation is really quite straightforward. Delay must
package an expression so that it can be evaluated later on demand, and we can
accomplish this simply by treating the expression as the body of a procedure.
Delay can be a special form such that
尽管 delay 和 force 看起来像是神秘的操作,它们的实现其实相当直接。delay 必须将一个表达式打包,使其能够按需在日后求值,而我们可以简单地把该表达式视为一个过程的体来实现这一点。delay 可以是一种特殊形式,使得
is syntactic sugar for
是以下形式的语法糖
Force simply calls the procedure (of no arguments) produced by
delay, so we can implement force as a procedure:
force 只是调用由 delay 产生的那个(无参数的)过程,因此可以将 force 实现为一个过程:
This implementation suffices for delay and force to work as
advertised, but there is an important optimization that we can include. In
many applications, we end up forcing the same delayed object many times. This
can lead to serious inefficiency in recursive programs involving streams. (See
Exercise 3.57.) The solution is to build delayed objects so that the
first time they are forced, they store the value that is computed. Subsequent
forcings will simply return the stored value without repeating the computation.
In other words, we implement delay as a special-purpose memoized
procedure similar to the one described in Exercise 3.27. One way to
accomplish this is to use the following procedure, which takes as argument a
procedure (of no arguments) and returns a memoized version of the procedure.
The first time the memoized procedure is run, it saves the computed result. On
subsequent evaluations, it simply returns the result.
这一实现已足以使 delay 和 force 按预期工作,但我们还可以加入一个重要的优化。在许多应用中,我们会多次强制求同一个延迟对象的值。在涉及流的递归程序中,这可能导致严重的低效。(见练习 3.57。)解决办法是将延迟对象构造成这样:第一次被强制求值时,它存储计算得到的值;之后的每次强制求值只需返回已存储的值,而无需重复计算。换言之,我们将 delay 实现为一种专用的记忆化过程,类似于练习 3.27 中描述的那种。实现这一点的方法之一,是使用下面这个过程——它以一个(无参数的)过程作为参数,返回该过程的记忆化版本。记忆化过程第一次运行时,它保存计算结果;此后求值时,直接返回该结果。
Delay is then defined so that (delay ⟨exp⟩) is equivalent
to
然后将 delay 定义为使 `(delay ⟨exp⟩)` 等价于
and force is as defined previously.
而 force 则与前面的定义相同。
(define (sum-primes a b)
(define (iter count accum)
(cond ((> count b) accum)
((prime? count)
(iter (+ count 1)
(+ count accum)))
(else (iter (+ count 1) accum))))
(iter a 0)) (define (sum-primes a b)
(accumulate
+
0
(filter prime? (enumerate-interval a b)))) (car (cdr
(filter
prime?
(enumerate-interval 10000 1000000)))) (stream-car (cons-stream x y)) = x
(stream-cdr (cons-stream x y)) = y (define (stream-ref s n)
(if (= n 0)
(stream-car s)
(stream-ref (stream-cdr s) (- n 1))))
(define (stream-map proc s)
(if (stream-null? s)
the-empty-stream
(cons-stream
(proc (stream-car s))
(stream-map proc (stream-cdr s)))))
(define (stream-for-each proc s)
(if (stream-null? s)
'done
(begin
(proc (stream-car s))
(stream-for-each proc
(stream-cdr s))))) (define (display-stream s)
(stream-for-each display-line s))
(define (display-line x)
(newline)
(display x)) (cons-stream ⟨a⟩ ⟨b⟩) (cons ⟨a⟩ (delay ⟨b⟩)) (define (stream-car stream)
(car stream))
(define (stream-cdr stream)
(force (cdr stream))) (stream-car
(stream-cdr
(stream-filter
prime? (stream-enumerate-interval
10000 1000000)))) (define (stream-enumerate-interval low high)
(if (> low high)
the-empty-stream
(cons-stream
low
(stream-enumerate-interval (+ low 1)
high)))) (cons 10000
(delay
(stream-enumerate-interval
10001
1000000))) (define (stream-filter pred stream)
(cond ((stream-null? stream)
the-empty-stream)
((pred (stream-car stream))
(cons-stream
(stream-car stream)
(stream-filter
pred
(stream-cdr stream))))
(else (stream-filter
pred
(stream-cdr stream))))) (cons 10001
(delay
(stream-enumerate-interval
10002
1000000))) (cons-stream
(stream-car stream)
(stream-filter pred (stream-cdr stream))) (cons 10007
(delay
(stream-filter
prime?
(cons 10008
(delay
(stream-enumerate-interval
10009 1000000)))))) (cons 10009
(delay
(stream-filter
prime?
(cons 10010
(delay
(stream-enumerate-interval
10011 1000000)))))) (delay ⟨exp⟩) (lambda () ⟨exp⟩) (define (force delayed-object)
(delayed-object)) (define (memo-proc proc)
(let ((already-run? false) (result false))
(lambda ()
(if (not already-run?)
(begin (set! result (proc))
(set! already-run? true)
result)
result)))) (memo-proc (lambda () ⟨exp⟩))