Chapter 2: Building Abstractions with Data
We now come to the decisive step of mathematical abstraction: we forget about what the symbols stand for… [The mathematician] need not be idle; there are many operations which he may carry out with these symbols, without ever having to look at the things they stand for.
——Hermann Weyl, The mathematical Way of Thinking
The general technique of isolating the parts of a program that deal with how data objects are represented from the parts of a program that deal with how data objects are used is a powerful design methodology called data abstraction.
We will discover how to form compound data using no special “data” operations at all, only procedures, which will further blur the distinction between “procedure” and “data”.
Contents
2.1 Introduction to Data Abstraction
2.1.1 Example: Arithmetic Operations for Rational Numbers
a example shows that how to use cons, car and cdr to represent a rational numbers.
2.1.2 Abstraction Barriers
In effect, procedures at each level are the interfaces that define the abstraction barriers and connect the different levels.
Constraining the dependence on the representation to a few interface procedures helps us design programs as well as modify them because it allow us to maintain the flexibility to consider alternative implementations.
2.1.3 What Is Meant by Data?
In general, we can think of data as defined by some collection of selectors and constructors, together with specified conditions that these procedures must fulfill in order to be a valid representation.
For example,
(car (cons x y)) == x and
(cdr (cons x y)) == y.
Any triple of procedures that satisfies the above condition can be used as the basis for implementing pairs.
Therefore, we could implement
cons,
car and
cdr without using any data structures at all but only using procedures.
(However, for efficiency reasons, Scheme and Lisp implement pairs directly.)
1 2 3 4 5 6 7 8 |
(define (cons x y) (define (dispatch m) (cond ((= m 0) x) (cond ((= m 1) y) (else (error "..." m)))) dispatch) (define (car z) (z 0)) (define (cdr z) (z 1)) |
1 2 3 4 |
(define (cons x y) (lambda (m) (m x y))) (define (car z) (z (lambda (p q) p))) |
Exercise 2.6
define natural number in terms of lambda:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
#lang sicp (print "Church numerals") (define zero (lambda (f) (lambda (x) x))) (define (add-1 n) (lambda (f) (lambda(x) (f ((n f) x))))) (define (church-to-int ch) ((ch (lambda (x) (+ x 1))) 0)) (define one (lambda (f) (lambda (x) (f x)))) (define two (lambda (f) (lambda (x) (f (f x))))) (church-to-int one) (church-to-int two) (define (add a b) (lambda (f) (lambda (x) ((a f) ((b f) x))))) (church-to-int (add one two)) |
2.1.4 Extended Exercise: Interval Arithmetic
2.2 Hierarchical Data and the Closure Property
The ability to create pairs whose elements are pairs is the closure property of
cons.
In general, an operation for combining data objects satisfies the closure property if the results of combining things with that operation can themselves be combined using the same operation.
With the closure property, we can construct hierarchical data in a consistent way.
(This word “closure” comes from abstract algebra, where a set of elements is said to be closed under an operation if applying the operation to elements in the set produces an element that is again an element of the set.
While the Lisp community also (unfortunately) uses the word “closure” to describe a unrelated concept: A closure is an implementation technique for representing procedures with free variables)
2.2.1 Representing Sequences
(list <a1> <a2> ... <an>) is equivalent to
1 2 3 4 5 |
(cons <a1> (cons <a2> (cons ... (cons <an> nil)...))) |
(cadr <arg>) is equivalent to
(car (cdr <arg>))
In a procedure definition, a parameter list that has a dot before the last parameter name indicates that, when the procedure is called, the final parameter’s value will be a list of any remaining arguments.
2.2.2 Hierarchical Structures
We can think of sequences whose elements are sequences as trees.
Recursion is a natural tool for dealing with tree structures.
1 2 3 4 5 6 |
(define (scale-tree tree factor) (map (lambda (sub-tree) (if (pair? sub-tree) (scale-tree sub-tree factor) (* sub-tree factor) tree)) |
2.2.3 Sequences as Conventional Interfaces
The key to organizing programs so as to more clearly reflect the signal flow structure is to concentrate on the “signals” that flow from one stage in the process to the next.
1 2 3 4 5 6 |
(define (sum-odd-squares tree) (accumulate + 0 (map (square (filter odd? (enumerate-tree tree))))) (define (even-fibs n) (accumulate cons nil (filter even? (map fib (enumerate-interval 0 n))))) |
We will generalize the sequence-processing paradigm to admit infinite sequences.
Exercise 2.37
We can extend the sequence paradigm by nested mappings to include many computations that are commonly expressed using nested loops.
1 2 3 4 5 6 7 8 9 10 |
(define (flatmap proc seq) (accumulate append nil (map proc seq))) (define (permutations s) (if (null? s) (list nil) (flatmap (lambda (x) (map (lambda (p) (cons x p)) (permutations (remove x s)))) s))) |
Exercise 2.42
2.2.4 Example: A Picture Language
The fundamental data abstractions, painters, are implemented using procedural representations, which enables the language to handle different basic drawing capabilities in a uniform way.
The means of combination satisfy the closure property, which permits us to easily build up complex designs.
Finally, all the tools for abstracting procedures are available to us for abstracting means of combination for painters.
Stratified design: a complex system should be structured as a sequence of levels that are described using a sequence of languages. The language used at each level of a stratified design has primitives, means of combination, and means of abstraction appropriate to that level of detail.
2.3 Symbolic Data
We extend the representational capability by introducing the ability to work with arbitrary symbols as data.
2.3.1 Quotation
The meaning of the single quote character is to quote the next object.
However, it violates the general rule that all compound expressions should be delimited by parentheses and look like lists.
We can recover this consistency by introducing a special form quote.
It maintains the principle that any expression seen by the interpreter can be manipulated as a data object.
For instance,
(car '(a b c)) is the same as
(car (quote (a b c))), by evaluating
(list 'car (list 'quote '(a b c))).
(We can obtain the empty list nil by evaluating
'())
2.3.2 Example: Symbolic Differentiation
Exercise 2.58
2.3.3 Example: Representing Sets
2.3.4 Example: Huffman Encoding Tree
Exercise 2.69