"Single Entry, Single Exit" was written when most programming was done in assembly language, FORTRAN, or COBOL. It has been widely misinterpreted, because modern languages do not support the practices Dijkstra was warning against.
"Single Entry" meant "do not create alternate entry points for functions". In assembly language, of course, it is possible to enter a function at any instruction. FORTRAN supported multiple entries to functions with the ENTRY statement:
      SUBROUTINE S(X, Y)
      R = SQRT(X*X + Y*Y)
C ALTERNATE ENTRY USED WHEN R IS ALREADY KNOWN
      ENTRY S2(R)
      ...
      RETURN
      END

C USAGE
      CALL S(3,4)
C ALTERNATE USAGE
      CALL S2(5)
"Single Exit" meant that a function should only return to one place: the statement immediately following the call. It did not mean that a function should only return from one place. When Structured Programming was written, it was common practice for a function to indicate an error by returning to an alternate location. FORTRAN supported this via "alternate return":
C SUBROUTINE WITH ALTERNATE RETURN. THE '*' IS A PLACE HOLDER FOR THE ERROR RETURN
      SUBROUTINE QSOLVE(A, B, C, X1, X2, *)
      DISCR = B*B - 4*A*C
C NO SOLUTIONS, RETURN TO ERROR HANDLING LOCATION
      IF (DISCR .LT. 0) RETURN 1
      SD = SQRT(DISCR)
      DENOM = 2*A
      X1 = (-B + SD) / DENOM
      X2 = (-B - SD) / DENOM
      RETURN
      END

C USE OF ALTERNATE RETURN
      CALL QSOLVE(1, 0, 1, X1, X2, *99)
C SOLUTION FOUND
      ...
C QSOLVE RETURNS HERE IF NO SOLUTIONS
   99 PRINT *, 'NO SOLUTIONS'
Both of these techniques were highly error-prone. Use of alternate entries often left some variable uninitialized. Use of alternate returns had all the problems of a GOTO statement, with the additional complication that the branch condition was not adjacent to the branch, but somewhere in the subroutine.
Thanks to Alexey Romanov for finding the original paper. See http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF, page 28 (printed page number is 24). The discussion there is not limited to functions.
1973
The term was in use at least as early as 1973, as seen in this advert from IPS Computer Marketing Corp. in Computerworld magazine (30 May 1973 - Vol. 7, No. 22):
360/651H or J System available for sale or lease Sept. 1973. Will supply with any number of selector channels. With 7074 Hypervisor.
1970
It appears in these two 1970 papers, with one quoting the other.
Operating systems architecture, H Katzan Jr - Proceedings of the May 5-7, 1970, spring joint computer conference:
... Hypervisors are particularly useful when it is necessary to run an emulator and an operating system at the same time. Similar to multiprogramming systems, a hypervisor is characterized by: (1) limited access; (2) batch utilization; (3) high throughput performance; (4) priority ...
Analysis of Major Computer Operating Systems, CS McIntosh, KP Choate, WC Mittwede - 1970 - DTIC Document (PDF):
As a result, this classification scheme should not be viewed as conflicting with other schemes which attempt to either describe different system environments or that are used for other purposes. For example, Harry Katzan, Jr., in a report presented at the 1970 Spring Joint Computer Conference entitled "Operating Systems Architecture," describes five operating system types: multiprogramming, hypervisor multiprogramming, time-sharing, virtual systems, and tri-level operating systems. This classification scheme was developed to encompass a number of experimental and research-oriented systems, including some of those cited above. Consequently, the classification structure does not purport to be an inclusive representation of commercially available software. Nevertheless, since several of these system types are not represented by any commercially available system, this categorization can only be superficially applied to the commercial environment.
1969?
It also appears in earlier snippets in Google Books, but care must be taken, as Google often has incorrect metadata. However, this 1969 description of the IBM 360/60 in Management Services (Volumes 6-7, American Institute of Certified Public Accountants) seems plausible, although the date still needs checking:
To operate in the multiprograming mode with both control systems simultaneously would require a minimum of 128K bytes of core memory, and thus a 360/40, since maximum core for a 360/30 is 65K bytes of memory. In addition, a hypervisor (a master control system requiring both hardware and software) would be required to partition memory between both control systems.
1966?
It may also appear in a paper by IBM, A Virtual Machine System for the 360/40 (1966) by R Adair, R Bayles, L Comeau, R Creasy, but Google Books only lists it as a search result and shows no text. If someone has access to this paper online, perhaps they can confirm it.
Best Answer
From Herb Sutter using the reference quoted:
And yes, there was discussion at the time.
The rest is history. And I hate seeing an unanswered question with such good links.