The above examples demonstrate how the ability to pass procedures as arguments
significantly enhances the expressive power of our programming language. We
can achieve even more expressive power by creating procedures whose returned
values are themselves procedures.
上面的例子展示了将过程作为参数传递的能力如何显著增强我们编程语言的表达能力。通过构造以过程为返回值的过程,我们能够获得更强大的表达力。
We can illustrate this idea by looking again at the fixed-point example
described at the end of 1.3.3. We formulated a new version of
the square-root procedure as a fixed-point search, starting with the
observation that x is a fixed-point of the function y
↦
x
我们可以通过重新审视 1.3.3 末尾描述的不动点例子来阐明这一思路。我们将求平方根的过程重新表述为一个不动点搜索,出发点是观察到 x 是函数 y ↦ x
/
y. Then we used average damping to make the approximations converge.
Average damping is a useful general technique in itself. Namely, given a
function f, we consider the function whose value at x is equal to the
average of x and f
(
x
).
We can express the idea of average damping by means of the following procedure:
y 的不动点。随后我们利用平均阻尼使近似值收敛。平均阻尼 (average damping) 本身就是一种有用的通用技术。具体而言,给定函数 f,我们考虑这样一个函数:它在 x 处的值等于 x 与 f(x) 的平均值。我们可以用如下过程来表达平均阻尼的思路:
Average-damp is a procedure that takes as its argument a procedure
f and returns as its value a procedure (produced by the lambda)
that, when applied to a number x, produces the average of x and
(f x). For example, applying average-damp to the square
procedure produces a procedure whose value at some number x is the average
of x and x
2. Applying this resulting procedure to 10 returns the
average of 10 and 100, or 55:
`average-damp` 是一个过程,它以过程 f 作为参数,返回一个由 lambda 产生的过程;该过程在应用于数 x 时,返回 x 与 (f x) 的平均值。例如,将 `average-damp` 应用于 `square` 过程,得到的过程在某个数 x 处的值是 x 与 x² 的平均值。将这一结果过程应用于 10,返回 10 与 100 的平均值,即 55:
Using average-damp, we can reformulate the square-root procedure as
follows:
利用 `average-damp`,我们可以将求平方根的过程重新表述如下:
Notice how this formulation makes explicit the three ideas in the method:
fixed-point search, average damping, and the function y
↦
x
请注意,这一表述明确体现了方法中的三个要素:不动点搜索、平均阻尼,以及函数 y ↦ x
/
y.
It is instructive to compare this formulation of the square-root method with
the original version given in 1.1.7. Bear in mind that these
procedures express the same process, and notice how much clearer the idea
becomes when we express the process in terms of these abstractions. In
general, there are many ways to formulate a process as a procedure.
Experienced programmers know how to choose procedural formulations that are
particularly perspicuous, and where useful elements of the process are exposed
as separate entities that can be reused in other applications. As a simple
example of reuse, notice that the cube root of x is a fixed point of the
function y
↦
x
y。将这一表述与 1.1.7 给出的原始版本加以对比,颇有教益。请记住这两个过程表达的是同一个计算过程,并留意当我们用这些抽象来表达计算过程时,思路变得清晰了多少。一般而言,将一个计算过程表述为过程有多种方式。有经验的程序员懂得如何选择特别清晰的过程表述形式,在这些形式中,计算过程中有用的要素以独立实体的形式暴露出来,可在其他应用中复用。作为复用的一个简单示例,注意 x 的立方根是函数 y ↦ x
/
y
2, so we can immediately generalize our
square-root procedure to one that extracts cube roots:
y² 的不动点,因此我们可以立即将求平方根的过程推广为提取立方根的过程:
Subheading: Newton’s method
小节标题:牛顿法
When we first introduced the square-root procedure, in 1.1.7, we
mentioned that this was a special case of
Newton’s method.
If x
↦
g
(
x
) is a differentiable function, then a solution of the equation
g
(
x
)
=
0 is a fixed point of the function x
↦
f
(
x
) where
在 1.1.7 中,我们初次引入求平方根的过程时,曾提到这是牛顿法 (Newton's method) 的一个特例。若 x ↦ g(x) 是可微函数,则方程 g(x) = 0 的解是函数 x ↦ f(x) 的不动点,其中
f
(
x
)
f(x)
=
x
−
x −
g
(
x
)
g(x)
D
g
(
x
)
Dg(x)
and D
g
(
x
) is the derivative of g evaluated at x. Newton’s
method is the use of the fixed-point method we saw above to approximate a
solution of the equation by finding a fixed point of the function
f.
而 Dg(x) 是 g 在 x 处的导数值。牛顿法就是利用上文所见的不动点方法,通过寻找函数 f 的不动点来近似求解方程。
For many functions g and for sufficiently good initial guesses for x,
Newton’s method converges very rapidly to a solution of g
(
x
)
=
0.
对于许多函数 g,以及对 x 足够好的初始猜测,牛顿法能非常迅速地收敛到 g(x) = 0 的解。
In order to implement Newton’s method as a procedure, we must first express the
idea of derivative. Note that “derivative,” like average damping, is
something that transforms a function into another function. For instance, the
derivative of the function x
↦
为了将牛顿法实现为一个过程,我们首先必须表达导数 (derivative) 的概念。注意,“导数”与平均阻尼类似,也是将一个函数变换为另一个函数的东西。例如,函数 x ↦
x
3 is the function x
↦
3
x³ 的导数是函数 x ↦ 3
x
2.
In general, if g is a function and d
x is a small number,
then the derivative D
g of g is the function whose value at any
number x is given (in the limit of small d
x) by
x²。一般而言,若 g 是一个函数,dx 是一个很小的数,则 g 的导数 Dg 是这样一个函数:它在任意数 x 处的值(在 dx 趋于零的极限下)由下式给出:
D
g
(
x
)
Dg(x)
=
g
(
x
+
d
x
)
−
g
(
x
)
g(x + dx) − g(x)
d
x
dx
.
。
Thus, we can express the idea of derivative (taking d
x to be, say,
0.00001) as the procedure
因此,我们可以将导数的思路(取 dx 为例如 0.00001)表述为如下过程:
along with the definition
以及下面的定义:
Like average-damp, deriv is a procedure that takes a procedure as
argument and returns a procedure as value. For example, to approximate the
derivative of x
↦
与 `average-damp` 类似,`deriv` 也是一个以过程为参数、以过程为返回值的过程。例如,要近似计算 x ↦
x
3 at 5 (whose exact value is 75) we can evaluate
x³ 在 5 处的导数(精确值为 75),我们可以求值:
With the aid of deriv, we can express Newton’s method as a fixed-point
process:
借助 `deriv`,我们可以将牛顿法表述为一个不动点计算过程:
The newton-transform procedure expresses the formula at the beginning of
this section, and newtons-method is readily defined in terms of this.
It takes as arguments a procedure that computes the function for which we want
to find a zero, together with an initial guess. For instance, to find the
square root of x, we can use Newton’s method to find a zero of the function
y
↦
`newton-transform` 过程表达了本节开头的公式,`newtons-method` 则可方便地在此基础上定义。它以一个计算目标函数的过程和一个初始猜测为参数。例如,要求 x 的平方根,我们可以用牛顿法寻找函数 y ↦
y
2
y²
−
x starting with an initial guess of 1.
This provides yet another form of the square-root procedure:
− x 的零点,以 1 作为初始猜测。这给出了求平方根过程的又一种形式:
Subheading: Abstractions and first-class procedures
小节标题:抽象与第一类过程
We’ve seen two ways to express the square-root computation as an instance of a
more general method, once as a fixed-point search and once using Newton’s
method. Since Newton’s method was itself expressed as a fixed-point process,
we actually saw two ways to compute square roots as fixed points. Each method
begins with a function and finds a fixed point of some transformation of the
function. We can express this general idea itself as a procedure:
我们已经看到了将平方根计算表达为某种更一般方法之实例的两种方式:一次作为不动点搜索,一次使用牛顿法。由于牛顿法本身也被表述为一个不动点计算过程,我们实际上看到了将平方根计算为不动点的两种途径。每种方法都从一个函数出发,求该函数经某种变换后的不动点。我们可以将这一通用思路本身表述为一个过程:
This very general procedure takes as its arguments a procedure g that
computes some function, a procedure that transforms g, and an initial
guess. The returned result is a fixed point of the transformed function.
这个高度通用的过程以一个计算某函数的过程 g、一个对 g 进行变换的过程以及一个初始猜测为参数,返回值是被变换后的函数的不动点。
Using this abstraction, we can recast the first square-root computation from
this section (where we look for a fixed point of the average-damped version of
y
↦
x
利用这一抽象,我们可以将本节第一个平方根计算重新表述(其中我们寻找 y ↦ x
/
y) as an instance of this general method:
y) 作为这一一般方法的实例:
Similarly, we can express the second square-root computation from this section
(an instance of Newton’s method that finds a fixed point of the Newton
transform of y
↦
类似地,我们也可以将本节中第二个求平方根的计算(即牛顿法的一个实例,它求 y ↦
y
2
−
x) as
−
x 的牛顿变换的不动点)表达为:
We began section 1.3 with the observation that compound procedures are a
crucial abstraction mechanism, because they permit us to express general
methods of computing as explicit elements in our programming language. Now
we’ve seen how higher-order procedures permit us to manipulate these general
methods to create further abstractions.
我们在 1.3 节开头指出,复合过程是一种至关重要的抽象机制,因为它允许我们将一般性的计算方法表达为程序设计语言中的显式元素。现在我们已经看到,高阶过程使我们能够对这些一般方法加以操控,从而构造出更高层次的抽象。
As programmers, we should be alert to opportunities to identify the underlying
abstractions in our programs and to build upon them and generalize them to
create more powerful abstractions. This is not to say that one should always
write programs in the most abstract way possible; expert programmers know how
to choose the level of abstraction appropriate to their task. But it is
important to be able to think in terms of these abstractions, so that we can be
ready to apply them in new contexts. The significance of higher-order
procedures is that they enable us to represent these abstractions explicitly as
elements in our programming language, so that they can be handled just like
other computational elements.
作为程序员,我们应当善于发现程序中潜藏的抽象,并在此基础上加以发展和推广,以构造出更强大的抽象。这并不是说程序应当总是以最抽象的方式来编写;有经验的程序员懂得如何针对具体任务选取恰当的抽象层次。但能够在这些抽象层次上进行思考至关重要,因为这样我们才能随时准备好将它们应用于新的场合。高阶过程的重要意义正在于:它使我们能够将这些抽象显式地表达为程序设计语言中的元素,从而像处理其他计算元素一样对它们加以操控。
In general, programming languages impose restrictions on the ways in which
computational elements can be manipulated. Elements with the fewest
restrictions are said to have
first-class status. Some of the
“rights and privileges” of first-class elements are:
一般而言,程序设计语言会对计算元素的可操控方式施加限制。受限最少的元素被称为具有一等地位 (first-class status)。一等元素所享有的"权利与特权"包括:
They may be named by variables.
可以用变量命名。
They may be passed as arguments to procedures.
可以作为实参传递给过程。
They may be returned as the results of procedures.
可以作为过程的返回值。
They may be included in data structures.
可以包含在数据结构中。
Lisp, unlike other common programming languages, awards procedures full
first-class status. This poses challenges for efficient implementation, but
the resulting gain in expressive power is enormous.
Lisp 与其他常见程序设计语言不同,它赋予过程完整的一等地位。这给高效实现带来了挑战,但由此获得的表达能力的提升是巨大的。
(define (average-damp f)
(lambda (x)
(average x (f x)))) ((average-damp square) 10)
55 (define (sqrt x)
(fixed-point
(average-damp
(lambda (y) (/ x y)))
1.0)) (define (cube-root x)
(fixed-point
(average-damp
(lambda (y)
(/ x (square y))))
1.0)) (define (deriv g)
(lambda (x)
(/ (- (g (+ x dx)) (g x))
dx))) (define dx 0.00001) (define (cube x) (* x x x))
((deriv cube) 5)
75.00014999664018 (define (newton-transform g)
(lambda (x)
(- x (/ (g x)
((deriv g) x)))))
(define (newtons-method g guess)
(fixed-point (newton-transform g)
guess)) (define (sqrt x)
(newtons-method
(lambda (y)
(- (square y) x))
1.0)) (define (fixed-point-of-transform
g transform guess)
(fixed-point (transform g) guess)) (define (sqrt x)
(fixed-point-of-transform
(lambda (y) (/ x y))
average-damp
1.0)) (define (sqrt x)
(fixed-point-of-transform
(lambda (y) (- (square y) x))
newton-transform
1.0))