diff --git a/Makefile b/Makefile
index b914b5d..4ffc2f2 100644
--- a/Makefile
+++ b/Makefile
@@ -23,8 +23,8 @@ EPUB_FILES = mimetype \
 	OEBPS/figure2.png
 EPUB_ARCHIVE = out-of-the-tar-pit.epub
 
-#test : $(EPUB_ARCHIVE)
-#	java -jar lib/epubcheck-3.0.1.jar $(EPUB_ARCHIVE)
+# test : $(EPUB_ARCHIVE)
+#	java -jar lib/epubcheck-4.0.1/epubcheck.jar $(EPUB_ARCHIVE)
 
 $(EPUB_ARCHIVE) : $(EPUB_FILES)
 	zip -X $@ $^
diff --git a/OEBPS/abstract.html b/OEBPS/abstract.html
index c5e6848..5db20d8 100644
--- a/OEBPS/abstract.html
+++ b/OEBPS/abstract.html
@@ -10,12 +10,10 @@

Abstract

-
-

+

Complexity is the single major difficulty in the successful development of large-scale software systems. Following Brooks we distinguish accidental from essential difficulty, but disagree with his premise that most complexity remaining in contemporary systems is essential. We identify common causes of complexity and discuss general approaches which can be taken to eliminate them where they are accidental in nature. To make things more concrete we then give an outline for a potential complexity-minimizing approach based on functional programming and Codd’s relational model of data.

-
diff --git a/OEBPS/section-1.html b/OEBPS/section-1.html
index 6c8cbf9..155ee9a 100644
--- a/OEBPS/section-1.html
+++ b/OEBPS/section-1.html
@@ -10,7 +10,6 @@

1 Introduction

-

The “software crisis” was first identified in 1968 [NR69] and in the intervening decades has deepened rather than abated.
@@ -45,6 +44,5 @@

1 Introduction

Finally we contrast our approach with others in section 11 and then give conclusions in section 12.

-
diff --git a/OEBPS/section-10.html b/OEBPS/section-10.html
index a4014cd..cad1b76 100644
--- a/OEBPS/section-10.html
+++ b/OEBPS/section-10.html
@@ -12,7 +12,6 @@

10 Example of an FRP system

-

We now examine a simple example FRP system. @@ -36,11 +35,9 @@

10 Example of an FRP system

The example will use syntax from a hypothetical FRP infrastructure (which supports not only the relational algebra but also some of the common extensions from section 8.5) — typewriter font is used for this.

-

10.1 User-defined Types

-

The example system makes use of a small number of custom types (see section 9.3), some of which are just aliases for types provided by the infrastructure: @@ -57,11 +54,9 @@

10.1 User-defined Types

def enum speedBand : VERY_FAST | FAST | MEDIUM | SLOW | VERY_SLOW -
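For readers who want something executable, the following is a rough Haskell rendering of the speedBand enumeration together with one alias type. The def enum / alias syntax above belongs to the paper's hypothetical infrastructure, and the Address alias below is an assumed example of "just an alias for an infrastructure-provided type" rather than one quoted from the paper.

    -- Sketch only: a possible Haskell encoding of two of the example's user-defined types.
    type Address = String   -- hypothetical alias for an infrastructure-provided type

    data SpeedBand = VeryFast | Fast | Medium | Slow | VerySlow
      deriving (Eq, Ord, Show, Enum, Bounded)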

10.2 Essential State

-

The essential state (see section 9.1.1) consists of the definitions of the types of the base relvars (the types of the attributes are shown in italics). @@ -125,21 +120,17 @@

10.2 Essential State

The commission fees are assigned on the basis of sale prices divided into different priceBands, Property addresses categorized into areaCodes and ratings of the saleSpeed. (The decision has been made to represent commission rates as a base relation — rather than as a function — so that the commission fees can be queried and easily adjusted).
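As a sketch only (the real base relvar definitions use the paper's own syntax and attribute lists, which are not reproduced in this excerpt), the essential state could be modelled in Haskell as plain sets of tuples, one set per base relvar; the attribute choices below are assumptions made for illustration.

    import Data.Set (Set)

    type Address = String
    type Agent   = String
    type Price   = Int
    type Date    = String   -- stand-in; a real system would use a proper date type

    -- One assumed base relvar: the current offers on properties.
    type Offer  = (Address, Price, Date, Agent)
    type Offers = Set Offer

    -- The essential state is nothing more than the current values of the base relvars.
    data EssentialState = EssentialState
      { offers    :: Offers
      , decisions :: Set (Address, Agent, Date, Bool)   -- assumed shape: offer accepted?
      } deriving (Show)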

-

10.3 Essential Logic

-

This is the heart of the system (see section 9.1.2) and corresponds to the “business logic”.

-

10.3.1 Functions

-

We do not give the actual function definitions here; we just describe their operation informally.
@@ -155,11 +146,9 @@

10.3.1 Functions

Converts a pair of dates into a speedBand (reflecting the speed of sale after taking into account the time of year)
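A possible Haskell version of this date-classifying function is sketched below. The thresholds are invented and the seasonal ("time of year") adjustment is omitted, so treat this purely as a shape sketch rather than the paper's definition.

    import Data.Time.Calendar (Day, diffDays)

    data SpeedBand = VeryFast | Fast | Medium | Slow | VerySlow
      deriving (Eq, Ord, Show)

    -- Classify a sale by how long the property took to sell (thresholds invented).
    speedBandOf :: Day -> Day -> SpeedBand
    speedBandOf advertised sold
      | days <= 7   = VeryFast
      | days <= 30  = Fast
      | days <= 90  = Medium
      | days <= 180 = Slow
      | otherwise   = VerySlow
      where days = diffDays sold advertised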
-

10.3.2 Derived Relations

-

There are thirteen derived relations in the system. @@ -172,11 +161,9 @@

10.3.2 Derived Relations

In reality these types would be derived (or checked) by an infrastructure-provided type inference mechanism.

-
Internal
-

The ten internal derived relations exist mainly to help with the later definition of the three external ones. @@ -311,11 +298,9 @@

Internal
This gives the amount of commission due to each agent on each Property (represented by address).

-
External
-

Having now defined all the internal derived relations, we are in a position to define the external derived relations — these are the ones which will be of most direct interest to the users of the system.
@@ -360,11 +345,9 @@

External
Finally, the total commission due to each agent is calculated by simply summing up the commission attribute of the SalesCommissions relation on a per agent basis to give the totalCommission attribute.
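The per-agent summation could look roughly like this in Haskell; the relation shapes and the sample rows are assumptions for illustration, not the paper's definitions.

    import qualified Data.Map.Strict as Map

    type Agent      = String
    type Commission = Double

    -- Assumed shape of a SalesCommissions-like relation: (agent, address, commission).
    salesCommissions :: [(Agent, String, Commission)]
    salesCommissions =
      [ ("alice", "1 High St", 1200.0)
      , ("alice", "7 Low Rd",   800.0)
      , ("bob",   "3 Mill Ln",  950.0)
      ]

    -- CommissionDue: total commission per agent, obtained by grouping and summing.
    commissionDue :: [(Agent, Commission)]
    commissionDue =
      Map.toList (Map.fromListWith (+) [ (agent, c) | (agent, _addr, c) <- salesCommissions ])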

-

10.3.3 Integrity

-

Integrity constraints are given in the form of relational algebra or relational calculus expressions. @@ -454,11 +437,9 @@

10.3.3 Integrity

Once the system is deployed, the FRP infrastructure will reject any state modification attempts which would violate any of these integrity constraints.
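One way to picture this behaviour is as a set of pure predicates that every proposed new relation value must satisfy before the infrastructure accepts it. The constraints below (non-negative prices, address as a key) are invented stand-ins for the ones in the example, and the acceptance step is a guess at how an infrastructure might expose rejection.

    import qualified Data.Set as Set

    type Address = String
    type Price   = Int
    type Offer   = (Address, Price)   -- assumed shape

    -- Each constraint is just a predicate over a candidate relation value.
    constraints :: [[Offer] -> Bool]
    constraints =
      [ all (\(_, price) -> price >= 0)                                          -- no negative prices
      , \offerRows -> Set.size (Set.fromList (map fst offerRows)) == length offerRows  -- address is a key
      ]

    -- The infrastructure applies an assignment only if every constraint still holds.
    assignOffers :: [Offer] -> Either String [Offer]
    assignOffers proposed
      | all ($ proposed) constraints = Right proposed
      | otherwise                    = Left "state modification rejected"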

-

10.4 Accidental State and Control

-

The accidental state and control component of an FRP system consists solely of a set of declarations which represent performance hints for the infrastructure (see section 9.1.3). @@ -489,11 +470,9 @@

10.4 Accidental State and Control

Larger systems would probably also include accidental control specifications for performance reasons.
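Because the hints are purely declarative, they can be thought of as data handed to the infrastructure rather than as code that runs. A minimal Haskell sketch follows, with the hint names invented for illustration (the paper does not prescribe any particular hint vocabulary).

    -- Invented hint vocabulary: declarations the infrastructure may act on, or ignore.
    data Hint
      = StoreDerived String            -- materialise a derived relvar, e.g. "PropertyForWebSite"
      | DenormaliseInto String [String]
      | ParallelEvaluation Bool
      deriving (Show)

    hints :: [Hint]
    hints =
      [ StoreDerived "PropertyForWebSite"
      , ParallelEvaluation True
      ]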

-

10.5 Other

-

The feeders and observers for this system would be fairly simple — feeding user input into Decisions, Offers etc., and directly observing and displaying the various derived relations as output (e.g. OpenOffers, PropertyForWebSite and CommissionDue).
@@ -506,7 +485,6 @@

10.5 Other

One extension which might require a custom observer would be a requirement to connect CommissionDue into an external payroll system.

-
diff --git a/OEBPS/section-11.html b/OEBPS/section-11.html
index b5dc533..e9b48a1 100644
--- a/OEBPS/section-11.html
+++ b/OEBPS/section-11.html
@@ -12,7 +12,6 @@

11 Related Work

-

FRP draws some influence from the ideas of [DD00]. @@ -23,6 +22,5 @@

11 Related Work

There are also some similarities to Backus’ Applicative State Transition systems [Bac78], and to the Aldat project at McGill [Mer85] which investigated general purpose applications of relational algebra.

-
diff --git a/OEBPS/section-12.html b/OEBPS/section-12.html
index dd2aa51..a5fc042 100644
--- a/OEBPS/section-12.html
+++ b/OEBPS/section-12.html
@@ -10,7 +10,6 @@

12 Conclusions

-

We have argued that complexity causes more problems in large software systems than anything else. @@ -30,6 +29,5 @@

12 Conclusions

So, what is the way out of the tar pit? What is the silver bullet? … it may not be FRP, but we believe there can be no doubt that it is simplicity.

-
diff --git a/OEBPS/section-2.html b/OEBPS/section-2.html
index b1509c7..7f18cae 100644
--- a/OEBPS/section-2.html
+++ b/OEBPS/section-2.html
@@ -10,7 +10,6 @@

2 Complexity

-

In his classic paper — “No Silver Bullet” — Brooks [Bro86] identified four properties of software systems which make building software hard: Complexity, Conformity, Changeability and Invisibility. Of these we believe that Complexity is the only significant one — the others can either be classified as forms of complexity, or be seen as problematic solely because of the complexity in the system.
@@ -25,10 +24,10 @@

2 Complexity

The relevance of complexity is widely recognised. As Dijkstra said [Dij97, EWD1243] -

- “…we have to keep it crisp, disentangled, and simple if we refuse to be crushed by the complexities of our own making…” -

+
+

“…we have to keep it crisp, disentangled, and simple if we refuse to be crushed by the complexities of our own making…”

+

…and the Economist devoted a whole article to software complexity [Eco04] — noting that by some estimates software problems cost the American economy $59 billion annually. @@ -38,38 +37,38 @@

2 Complexity

Being able to think and reason about our systems (particularly the effects of changes to those systems) is of crucial importance. The dangers of complexity and the importance of simplicity in this regard have also been a popular topic in ACM Turing award lectures. In his 1990 lecture Corbato said [Cor91]: -
- “The general problem with ambitious systems is complexity.”, “…it is important to emphasize the value of simplicity and elegance, for complexity has a way of compounding difficulties” -

+
+

“The general problem with ambitious systems is complexity.”, “…it is important to emphasize the value of simplicity and elegance, for complexity has a way of compounding difficulties”

+

In 1977 Backus [Bac78] talked about the “complexities and weaknesses” of traditional languages and noted: -

“there is a desperate need for a powerful methodology to help us think about programs. … conventional languages create unnecessary confusion in the way we think about programs”

+
+

“there is a desperate need for a powerful methodology to help us think about programs. … conventional languages create unnecessary confusion in the way we think about programs”

+
+

Finally, in his Turing award speech in 1980 Hoare [Hoa81] observed: -

- “…there is one quality that cannot be purchased … — and that is reliability. - The price of reliability is the pursuit of the utmost simplicity” -

+
+

“…there is one quality that cannot be purchased … — and that is reliability. + The price of reliability is the pursuit of the utmost simplicity”

+
-

- and -

- “I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. - The first method is far more difficult.” -
-

+

and

+
+

“I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”

+

This is the unfortunate truth:

- Simplicity is Hard +

Simplicity is Hard

@@ -86,6 +85,5 @@

2 Complexity

We shall look at what we consider to be the major common causes of complexity (things which make understanding difficult) after first discussing exactly how we normally attempt to understand systems.

-
diff --git a/OEBPS/section-3.html b/OEBPS/section-3.html
index 4718359..acc9ead 100644
--- a/OEBPS/section-3.html
+++ b/OEBPS/section-3.html
@@ -10,7 +10,6 @@

3 Approaches to Understanding

-

We argued above that the danger of complexity came from its impact on our attempts to understand a system. Because of this, it is helpful to consider the mechanisms that are commonly used to try to understand systems. @@ -39,26 +38,26 @@

3 Approaches to Understanding

This is because — as we shall see below — there are inherent limits to what can be achieved by testing, and because informal reasoning (by virtue of being an inherent part of the development process) is always used. The other justification is that improvements in informal reasoning will lead to fewer errors being created whilst all that improvements in testing can do is to lead to more errors being detected. As Dijkstra said in his Turing award speech [Dij72, EWD340]: -
- “Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with.” -

+
+

“Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with.”

+

and as O’Keefe (who also stressed the importance of “understanding your problem” and that “Elegance is not optional”) said [O’K90]: -

- “Our response to mistakes should be to look for ways that we can avoid making them, not to blame the nature of things.” -

+
+

“Our response to mistakes should be to look for ways that we can avoid making them, not to blame the nature of things.”

+

The key problem with testing is that a test (of any kind) that uses one particular set of inputs tells you nothing at all about the behaviour of the system or component when it is given a different set of inputs. The huge number of different possible inputs usually rules out the possibility of testing them all, hence the unavoidable concern with testing will always be — have you performed the right tests?. The only certain answer you will ever get to this question is an answer in the negative — when the system breaks. Again, as Dijkstra observed [Dij71, EWD303]: -

- “testing is hopelessly inadequate.…(it) can be used very effectively to show the presence of bugs but never to show their absence.” -

+
+

“testing is hopelessly inadequate.…(it) can be used very effectively to show the presence of bugs but never to show their absence.”

+

We agree with Dijkstra. Rely on testing at your peril. @@ -75,6 +74,5 @@

3 Approaches to Understanding

When considered next to testing and reasoning, simplicity is more important than either. Given a stark choice between investment in testing and investment in simplicity, the latter may often be the better choice because it will facilitate all future attempts to understand the system — attempts of any kind.

-
diff --git a/OEBPS/section-4.html b/OEBPS/section-4.html
index 71aa549..8d1081c 100644
--- a/OEBPS/section-4.html
+++ b/OEBPS/section-4.html
@@ -10,16 +10,13 @@

4 Causes of Complexity

-

In any non-trivial system there is some complexity inherent in the problem that needs to be solved. In real large systems however, we regularly encounter complexity whose status as “inherent in the problem” is open to some doubt. We now consider some of these causes of complexity.

-

4.1 Complexity caused by State

-

Anyone who has ever telephoned a support desk for a software system and been told to “try it again”, or “reload the document”, or “restart the program”, or “reboot your computer” or “re-install the program” or even “re-install the operating system and then the program” has direct experience of the problems that state1 causes for writing reliable, understandable software.

@@ -29,26 +26,16 @@

4.1 Complexity caused by State

The reason that many of these errors exist is that the presence of state makes programs hard to understand. It makes them complex.

-

- When it comes to state, we are in broad agreement with Brooks’ sentiment when he says [Bro86] -

-

- “From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability” -

-
- — we agree with this, but believe that it is the presence of many possible states which gives rise to the complexity in the first place, and: -
-

- “computers… have very large numbers of states. - This makes conceiving, describing, and testing them hard. - Software systems have orders-of-magnitude more states than computers do.” -

-
-

-
+

When it comes to state, we are in broad agreement with Brooks’ sentiment when he says [Bro86]

+
+

“From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability”

+
+

— we agree with this, but believe that it is the presence of many possible states which gives rise to the complexity in the first place, and:

+
+

“computers… have very large numbers of states. This makes conceiving, describing, and testing them hard. Software systems have orders-of-magnitude more states than computers do.”

+

4.1.1 Impact of State on Testing

-

The severity of the impact of state on testing noted by Brooks is hard to over-emphasise. State affects all types of testing — from system-level testing (where the tester will be at the mercy of the same problems as the hapless user just mentioned) through to component-level or unit testing. @@ -74,10 +61,8 @@

4.1.1 Impact of State on Testing

These two similar problems — one intrinsic to testing, the other coming from state — combine together horribly. Each introduces a huge amount of uncertainty, and we are left with very little about which we can be certain if the system/component under scrutiny is of a stateful nature.

-

4.1.2 Impact of State on Informal Reasoning

-

In addition to causing problems for understanding a system from the outside, state also hinders the developer who must attempt to reason (most commonly on an informal basis) about the expected behaviour of the system “from the inside”.

@@ -100,10 +85,8 @@

4.1.2 Impact of State on Informal Reasoning

As a result of all the above reasons it is our belief that the single biggest remaining cause of complexity in most contemporary large systems is state, and the more we can do to limit and manage state, the better.

-

4.2 Complexity caused by Control

-

Control is basically about the order in which things happen.

@@ -154,10 +137,8 @@

4.2 Complexity caused by Control

Concurrency also affects testing, for in this case, we can no longer even be assured of result consistency when repeating tests on a system — even if we somehow ensure a consistent starting state. Running a test in the presence of concurrency with a known initial state and set of inputs tells you nothing at all about what will happen the next time you run that very same test with the very same inputs and the very same starting state…and things can’t really get any worse than that.

-

4.3 Complexity caused by Code Volume

-

The final cause of complexity that we want to examine in any detail is sheer code volume.

@@ -166,28 +147,25 @@

4.3 Complexity caused by Code Volume

Because of this we shall often not mention code volume explicitly. It is however worth brief independent attention for at least two reasons — firstly because it is the easiest form of complexity to measure, and secondly because it interacts badly with the other causes of complexity and this is important to consider.

-

- Brooks noted [Bro86] -

- “Many of the classic problems of developing software products derive from this essential complexity and its nonlinear increase with size” -
- We basically agree that in most current systems this is true (we disagree with the word “essential” as already noted) — i.e. in most systems complexity definitely does exhibit nonlinear increase with size (of the code). - This non-linearity in turn means that it’s vital to reduce the amount of code to an absolute minimum. -

-

- We also want to draw attention to one of Dijkstra’s [Dij72, EWD340] thoughts on this subject: -

- “It has been suggested that there is some kind of law of nature telling us that the amount of intellectual effort needed grows with the square of program length. +

Brooks noted [Bro86]

+ +
+

“Many of the classic problems of developing software products derive from this essential complexity and its nonlinear increase with size”

+
+ +

We basically agree that in most current systems this is true (we disagree with the word “essential” as already noted) — i.e. in most systems complexity definitely does exhibit nonlinear increase with size (of the code). This non-linearity in turn means that it’s vital to reduce the amount of code to an absolute minimum.

+ +

We also want to draw attention to one of Dijkstra’s [Dij72, EWD340] thoughts on this subject:

+ +
+

“It has been suggested that there is some kind of law of nature telling us that the amount of intellectual effort needed grows with the square of program length. But, thank goodness, no one has been able to prove this law. - And this is because it need not be true. … I tend to the assumption — up till now not disproved by experience — that by suitable application of our powers of abstraction, the intellectual effort needed to conceive or to understand a program need not grow more than proportional to program length.” -

- We agree with this — it is the reason for our “in most current systems” caveat above. - We believe that — with the effective management of the two major complexity causes which we have discussed, state and control — it becomes far less clear that complexity increases with code volume in a non-linear way. -

-
+ And this is because it need not be true. … I tend to the assumption — up till now not disproved by experience — that by suitable application of our powers of abstraction, the intellectual effort needed to conceive or to understand a program need not grow more than proportional to program length.”

+ +

We agree with this — it is the reason for our “in most current systems” caveat above. + We believe that — with the effective management of the two major complexity causes which we have discussed, state and control — it becomes far less clear that complexity increases with code volume in a non-linear way.

4.4 Other causes of complexity

-

Finally there are other causes, for example: duplicated code, code which is never actually used (“dead code”), unnecessary abstraction3, missed abstraction, poor modularity, poor documentation…

@@ -225,7 +203,6 @@

4.4 Other causes of complexity

Some of these causes are of a human-nature, others due to environmental issues, but all can — we believe — be greatly alleviated by focusing on effective management of the complexity causes discussed in sections 4.1–4.3.

-
diff --git a/OEBPS/section-5.html b/OEBPS/section-5.html
index 4cd8a52..34e844d 100644
--- a/OEBPS/section-5.html
+++ b/OEBPS/section-5.html
@@ -1,6 +1,4 @@
-
 Out of the Tar Pit Section 5: Classical approaches to managing complexity
@@ -13,24 +11,19 @@

5 Classical approaches to managing complexity

-

The different classical approaches to managing complexity can perhaps best be understood by looking at how programming languages of each of the three major styles (imperative, functional, logic) approach the issue. (We take object-oriented languages as a commonly used example of the imperative style).

-

5.1 Object-Orientation

-

Object-orientation — whilst being a very broadly applied term (encompassing everything from Java-style class-based to Self-style prototype-based languages, from single-dispatch to CLOS-style multiple dispatch languages, and from traditional passive objects to the active / actor styles) — is essentially an imperative approach to programming. It has evolved as the dominant method of general software development for traditional (Von Neumann) computers, and many of its characteristics spring from a desire to facilitate Von Neumann style (i.e. state-based) computation.

-

5.1.1 State

-

In most forms of object-oriented programming (OOP) an object is seen as consisting of some state together with a set of procedures for accessing and manipulating that state.

@@ -44,11 +37,9 @@

5.1.1 State

One problem with this is that, if several of the access procedures access or manipulate the same bit of state, then there may be several places where a given constraint must be enforced (these different access procedures may or may not be within the same file depending on the specific language and whether features, such as inheritance, are in use). Another major problem4 is that encapsulation-based integrity constraint enforcement is strongly biased toward single-object constraints and it is awkward to enforce more complicated constraints involving multiple objects with this approach (for one thing it becomes unclear where such multiple-object constraints should reside).

-
Identity and State
-

There is one other intrinsic aspect of OOP which is intimately bound up with the issue of state, and that is the concept of object identity.

@@ -57,13 +48,10 @@
Identity and State
In OOP, each object is seen as being a uniquely identifiable entity regardless of its attributes. This is known as intensional identity (in contrast with extensional identity in which things are considered the same if their attributes are the same). As Baker observed [Bak93]: -
-

- In a sense, object identity can be considered to be a rejection of the "relational algebra" view of the world in which two objects can only be distinguished through differing attributes. -

-

- +
+

In a sense, object identity can be considered to be a rejection of the "relational algebra" view of the world in which two objects can only be distinguished through differing attributes.

+

Object identity does make sense when objects are used to provide a (mutable) stateful abstraction — because two distinct stateful objects can be mutated to contain different state even if their attributes (the contained state) happen initially to be the same. @@ -79,47 +67,37 @@

Identity and State
This additional concept of identity adds complexity to the task of reasoning about systems developed in the OOP style (it is necessary to switch mentally between the two equivalence concepts — serious errors can result from confusion between the two).

-
State in OOP
-

The bottom line is that all forms of OOP rely on state (contained within objects) and in general all behaviour is affected by this state. As a result of this, OOP suffers directly from the problems associated with state described above, and as such we believe that it does not provide an adequate foundation for avoiding complexity.

-

5.1.2 Control

-

Most OOP languages offer standard sequential control flow, and many offer explicit classical “shared-state concurrency” mechanisms together with all the standard complexity problems that these can cause. One slight variation is that actor-style languages use the “message-passing” model of concurrency — they associate threads of control with individual objects and messages are passed between these. This can lead to easier informal reasoning in some cases, but the use of actor-style languages is not widespread.

-

5.1.3 Summary — OOP

-

Conventional imperative and object-oriented programs suffer greatly from both state-derived and control-derived complexity.

-

5.2 Functional Programming

-

Whilst OOP developed out of a desire to offer improved ways of managing and dealing with the classic stateful Von Neumann architecture, functional programming has its roots in the completely stateless lambda calculus of Church (we are ignoring the even simpler functional systems based on combinatory logic). The untyped lambda calculus is known to be equivalent in power to the standard stateful abstraction of computation — the Turing machine.

-

5.2.1 State

-

Modern functional programming languages are often classified as ‘pure’ — those such as Haskell [PJ+03] which shun state and side-effects completely, and ‘impure’ — those such as ML which, whilst advocating the avoidance of state and side-effects in general, do permit their use. Where not explicitly mentioned we shall generally be considering functional programming in its pure form. @@ -138,11 +116,9 @@

5.2.1 State

By avoiding state functional programming also avoids all of the other state-related weaknesses discussed above, so — for example — informal reasoning also becomes much more effective.

-

5.2.2 Control

-

Most functional languages specify implicit (left-to-right) sequencing (of calculation of function arguments) and hence they face many of the same issues mentioned above. Functional languages do derive one slight benefit when it comes to control because they encourage a more abstract use of control using functionals (such as fold / map) rather than explicit looping. @@ -151,11 +127,9 @@

5.2.2 Control

There are also concurrent versions of many functional languages, and the fact that state is generally avoided can give benefits in this area (for example in a pure functional language it will always be safe to evaluate all arguments to a function in parallel).
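To illustrate the fold / map point above, here is a small Haskell comparison of explicit recursion against the equivalent functionals; nothing here is specific to any particular functional language beyond the syntax.

    -- Explicit control: the traversal order is spelled out by hand.
    totalExplicit :: [Int] -> Int
    totalExplicit []       = 0
    totalExplicit (x : xs) = x + totalExplicit xs

    -- Abstract control: the functional captures the traversal once and for all.
    totalFold :: [Int] -> Int
    totalFold = foldr (+) 0

    doubled :: [Int] -> [Int]
    doubled = map (* 2)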

-

5.2.3 Kinds of State

-

In most of this paper when we refer to “state” what we really mean is mutable state.

@@ -202,11 +176,9 @@

5.2.3 Kinds of State

It is worth noting in passing that — even though it would be no substitute for a guarantee of referential transparency — there is no reason why the functional style of programming cannot be adopted in stateful languages (i.e. imperative as well as impure functional ones). More generally, we would argue that — whatever the language being used — there are large benefits to be had from avoiding hidden, implicit, mutable state.

-

5.2.4 State and Modularity

-

It is sometimes argued (e.g. [vRH04]) that state is important because it permits a particular kind of modularity. This is certainly true. @@ -244,20 +216,16 @@

5.2.4 State and Modularity

This does basically allow you to avoid the problem described above, but it can very easily be abused to create a stateful, side-effecting sub-language (and hence re-introduce all the problems we are seeking to avoid) inside Haskell — albeit one that is marked by its type. Again, despite their huge strengths, monads have as yet been insufficient to give rise to widespread adoption of functional techniques.
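As a small illustration of what "marked by its type" means in practice (using an IORef in IO rather than a monad transformer stack, purely for brevity), consider the following sketch: the impure version advertises its statefulness in its signature, while the pure version does not need to.

    import Data.IORef

    -- Mutable state is possible, but the IO in the type makes the side effects
    -- visible to every caller.
    sumWithMutation :: [Int] -> IO Int
    sumWithMutation xs = do
      acc <- newIORef 0
      mapM_ (\x -> modifyIORef' acc (+ x)) xs
      readIORef acc

    -- The pure equivalent carries no such marker and is referentially transparent.
    sumPure :: [Int] -> Int
    sumPure = foldr (+) 0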

-

5.2.5 Summary — Functional Programming

-

Functional programming goes a long way towards avoiding the problems of state-derived complexity. This has very significant benefits for testing (avoiding what is normally one of testing’s biggest weaknesses) as well as for reasoning.

-

5.3 Logic Programming

-

Together with functional programming, logic programming is considered to be a declarative style of programming because the emphasis is on specifying what needs to be done rather than exactly how to do it. Also as with functional programming — and in contrast with OOP — its principles and the way of thinking encouraged do not derive from the stateful Von Neumann architecture. @@ -283,11 +251,9 @@

5.3 Logic Programming

It is for this reason that Prolog falls short of the ideals of logic programming. Specifically it is necessary to be concerned with the operational interpretation of the program whilst writing the axioms.

-

5.3.1 State

-

Pure logic programming makes no use of mutable state, and for this reason profits from the same advantages in understandability that accrue to pure functional programming. Many languages based on the paradigm do however provide some stateful mechanisms. @@ -300,11 +266,9 @@

5.3.1 State

The one advantage that all these impure non-Von Neumann derived languages can claim is that — whilst state is permitted its use is generally discouraged (which is in stark contrast to the stateful Von Neumann world). Still, without purity there are no guarantees and all the same state-related problems can sometimes occur.

-

5.3.2 Control

-

In the case of pure Prolog the language specifies both an implicit ordering for the processing of sub-goals (left to right), and also an implicit ordering of clause application (top down) — these basically correspond to an operational commitment to process the program in the same order as it is read textually (in a depth first manner). This means that some particular ways of writing down the program can lead to non-termination, and — when combined with the fact that some extra-logical features of the language permit side-effects — leads inevitably to the standard difficulty for informal reasoning caused by control flow. (Note that these reasoning difficulties do not arise in ideal world of logic programming where there simply is no specified control — as distinct from in pure Prolog programming where there is). @@ -321,15 +285,12 @@

5.3.2 Control

One example would be Oz which offers the ability to program specific control strategies which can then be applied to different problems as desired. This is a very useful feature because it allows significant explicit control flexibility to be specified separately from the main program (i.e. without contaminating it through the addition of control complexity).

-

5.3.3 Summary — Logic Programming

-

One of the most interesting things about logic programming is that (despite the limitations of some actual logic-based languages) it offers the tantalising promise of the ability to escape from the complexity problems caused by control.

-
diff --git a/OEBPS/section-6.html b/OEBPS/section-6.html
index e559401..b6fb7b4 100644
--- a/OEBPS/section-6.html
+++ b/OEBPS/section-6.html
@@ -9,7 +9,6 @@

6 Accidents and Essence

-

Brooks defined difficulties of “essence” as those inherent in the nature of software and classified the rest as “accidents” @@ -88,6 +87,5 @@

6 Accidents and Essence

We now attempt to classify occurrences of complexity as either accidental or essential.

-
\ No newline at end of file
diff --git a/OEBPS/section-7.html b/OEBPS/section-7.html
index 05a0bf4..56a610e 100644
--- a/OEBPS/section-7.html
+++ b/OEBPS/section-7.html
@@ -12,7 +12,6 @@

7 Recommended General Approach

-

Given that our main recommendations revolve around trying to avoid as much accidental complexity as possible, we now need to look at which bits of the complexity must be considered accidental and which essential.

@@ -22,11 +21,9 @@

7 Recommended General Approach

We then follow this up with a look at just how realistic this ideal world really is before finally giving some recommendations.

-

7.1 Ideal World

-

In the ideal world we are not concerned with performance, and our language and infrastructure provide all the general support we desire. It is against this background that we are going to examine state and control. @@ -48,7 +45,7 @@

7.1 Ideal World

- Informal requirements → Formal requirements +

Informal requirements → Formal requirements

@@ -78,11 +75,9 @@

7.1 Ideal World

approach for the causes of complexity discussed above.

-

7.1.1 State in the ideal world

-

Our main aim for state in the ideal world is to get rid of it — i.e. we are hoping that most state will turn out to be accidental state. @@ -105,11 +100,9 @@

7.1.1 State in the ideal world

In cases where this is possible the data corresponds to accidental state.

-
Input Data
-

Data which is provided directly (input) will have to have been included in the informal requirements and as such is deemed essential. There are basically two cases:

@@ -128,22 +121,18 @@
Input Data
In the second case (which will most often happen when the input is designed simply to cause some side-effect) the data need not be maintained at all.

-
Essential Derived Data — Immutable
-

Data of this kind can always be re-derived (from the input data — i.e. from the essential state) whenever required. As a result we do not need to store it in the ideal world (we just re-derive it when it is required) and it is clearly accidental state.

-
Essential Derived Data — Mutable
-

As with immutable essential derived data, this can be excluded (and the data re-derived on demand) and hence corresponds to accidental state.

@@ -154,11 +143,9 @@
Essential Derived Data — Mutable
In this situation modifications to the data can simply be treated identically to the corresponding modifications to the existing essential state.

-
Accidental Derived Data
-

State which is derived but not in the users’ requirements is also accidental state. @@ -190,11 +177,9 @@

Accidental Derived Data
Hence, it is quite clear that we can eliminate it completely from our ideal world, and that hence it is accidental.

-
Summary — State in the ideal world
-

For our ideal approach to state, we largely follow the example of functional programming which shows how mutable state can be avoided. We need to remember though that:

@@ -253,11 +238,9 @@
Summary — State in the ideal world
One effect of this is that all the state in the system is visible to the user of (or person testing) the system (because inputs can reasonably be expected to be visible in ways which internal cached state normally is not).

-

7.1.2 Control in the ideal world

-

Whereas we have seen that some state is essential, control generally can be completely omitted from the ideal world and as such is considered entirely accidental. It typically won’t be mentioned in the informal requirements and hence should not appear in the formal requirements (because these are derived with no view to execution). @@ -279,22 +262,18 @@

7.1.2 Control in the ideal world

computations take zero time8 and as such it is immaterial to a user whether they happen in sequence or in parallel.

-

7.1.3 Summary

-

In the ideal world we have been able to avoid large amounts of complexity — both state and control. As a result, it is clear that a lot of complexity is accidental. This gives us hope that it may be possible to significantly reduce the complexity of real large systems. The question is — how close is it possible to get to the ideal world in the real one?

-

7.2 Theoretical and Practical Limitations

-

The real world is not of course ideal. In this section we examine a few of the assumptions made in the section @@ -317,11 +296,9 @@

7.2 Theoretical and Practical Limitations

These observations give some indication of where we can expect to encounter difficulties.

-

7.2.1 Formal Specification Languages

-

First of all, we want to consider two problems (one of a theoretical kind, the other practical) that arise in connection with the ideal-world formal requirements.

@@ -391,11 +368,9 @@

7.2.1 Formal Specification Languages

As a result, this means that we will require some accidental components as we shall see in
section 7.2.3.

-

7.2.2 Ease of Expression

-

There is one final practical problem that we want to consider — even though we believe it is fairly rare in most application domains. In section 7.1.1 we argued that immutable, derived data would correspond to accidental state and could be omitted (because the logic of the system could always be used to derive the data on-demand). @@ -414,11 +389,9 @@

7.2.2 Ease of Expression

An example of this would be the derived data representing the position state of a computer-controlled opponent in an interactive game — it is at all times derivable by a function of both all prior user movements and the initial starting positions,11 but this is not the way it is most naturally expressed.

-

7.2.3 Required Accidental Complexity

-

We have seen two possible reasons why in practice — even with optimal language and infrastructure — we may require complexity which strictly is accidental. These reasons are: @@ -446,11 +419,9 @@

7.2.3 Required Accidental Complexity

This is a very serious concern, and is one that we address in our recommendations below.

-

7.3 Recommendations

-

We believe that — despite the existence of required accidental complexity — it is possible to retain most of the simplicity of the ideal world (section 7.1) in the real one. We now look at how this might be achievable. @@ -483,28 +454,24 @@

7.3 Recommendations

The title of his paper was the equation:

- “Algorithm = Logic + Control” +

“Algorithm = Logic + Control”

… and this separation that he advocated is close to the heart of what we’re recommending.

-

7.3.1 Required Accidental Complexity

-

In section 7.2.3 we noted two possible reasons for requiring accidental complexity (even in the presence of optimal language and infrastructure). We now consider the most appropriate way of handling each.

-
Performance
-

We have seen that there are many serious risks which arise from accidental complexity — particularly when introduced in an undisciplined manner. To mitigate these risks we take two defensive measures. @@ -527,11 +494,9 @@

Performance
We examine separation after first looking at the other possible reason for requiring accidental complexity.

-
Ease of Expression
-

This problem (see section 7.2.2) fundamentally arises when derived (i.e. accidental) state offers the most natural way to express parts of the logic of the system.

@@ -548,11 +513,9 @@
Ease of Expression
9.1.4).

-

7.3.2 Separation and the relationship between the components

-

In the above we deliberately glossed over exactly what we meant by our second recommendation: “Separate”. This is because it actually encompasses two things. @@ -613,15 +576,12 @@

7.3.2 Separation and the relationship between the compone It is additionally advocating a split between the state and control components of the “Useful” Accidental Complexity — but this distinction is less important than the others.

-

- One implication of this overall structure is that the system (essential + accidental but useful) should still function completely correctly if the “accidental but useful” bits are removed (leaving only the two essential components) — albeit possibly unacceptably slowly. - As Kowalski (who — writing in a Prolog-context — was not really considering any essential state) says: -

-

- “The logic component determines the meaning … whereas the control component only affects its efficiency”. -

-
-

+

One implication of this overall structure is that the system (essential + accidental but useful) should still function completely correctly if the “accidental but useful” bits are removed (leaving only the two essential components) — albeit possibly unacceptably slowly. + As Kowalski (who — writing in a Prolog-context — was not really considering any essential state) says:

+ +
+

“The logic component determines the meaning … whereas the control component only affects its efficiency”.

+

A consequence of separation is that the separately specified components will each be of a very different nature, and as a result it may be ideal to use different languages for each. @@ -687,11 +647,9 @@

7.3.2 Separation and the relationship between the compone Together the goals of avoid and separate give us reason to hope that we may well be able to retain much of the simplicity of the ideal world in the real one.

-

7.4 Summary

-

This first part of the paper has done two main things. It has given arguments for the overriding danger of complexity, and it has given some hope that much of the complexity may be avoided or controlled. @@ -723,7 +681,6 @@

7.4 Summary

In the second half of this paper we shall consider a possible approach based on these recommendations.

-
diff --git a/OEBPS/section-8.html b/OEBPS/section-8.html
index dc754a5..6632c4c 100644
--- a/OEBPS/section-8.html
+++ b/OEBPS/section-8.html
@@ -12,7 +12,6 @@

8 The Relational Model

-

The relational model [Cod70] has — despite its origins — nothing intrinsically to do with databases. Rather it is an elegant approach to structuring data, a means for manipulating such data, and a mechanism for maintaining integrity and consistency of state. @@ -48,13 +47,11 @@

8 The Relational Model

As a final comment, it is widely recognised that SQL (of any version) — despite its widespread use — is not an accurate reflection of the relational model [Cod90, p371, Serious flaws in SQL], [Dat04, p xxiv] so the reader is warned against equating the two.

-

8.1 Structure

8.1.1 Relations

-

As mentioned above, relations provide the sole means for structuring data in the relational model. @@ -84,11 +81,9 @@

8.1.1 Relations

Date calls these variables relation variables or relvars, leading to the terms base relvar and derived relvar, and we shall use this terminology later. (Note however that our definition of relation is slightly different from his in that — following standard static typing practice — we do not consider the type to be part of the value).

-

8.1.2 Structuring benefits of Relations — Access path independence

-

The idea of structuring data using relations is appealing because no subjective, up-front decisions need to be made about the access paths that will later be used to query and process the data. @@ -147,11 +142,9 @@

8.1.2 Structuring benefits of Relations — Access path i A final advantage of using relations for the structure — in contrast with approaches such as Chen’s ER-modelling [Che76] — is that no distinction is made between entities and relationships. (Using such a distinction can be problematic because whether something is an entity or a relationship can be a very subjective question).

-

8.2 Manipulation

-

Codd introduced two different mechanisms for expressing the manipulation aspects of the relational model — the relational calculus and the relational algebra. @@ -184,11 +177,9 @@

8.2 Manipulation

One significant benefit of this manipulation language (aside from its simplicity) is that it has the property of closure — that all operands and results are of the same kind (relations) — hence the operations can be nested in arbitrary ways (indeed this property is inherent in any single-sorted algebra).
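A toy illustration of closure, with relations encoded as sets of rows in Haskell, is given below; the row type and the operators are simplified assumptions rather than a faithful relational algebra, but they show how relation-in, relation-out operators nest freely.

    import Data.Set (Set)
    import qualified Data.Set as Set

    type Row      = (String, Int)   -- (owner, askingPrice) — illustrative attributes only
    type Relation = Set Row

    -- Restriction: relation in, relation out.
    restrict :: (Row -> Bool) -> Relation -> Relation
    restrict = Set.filter

    listedA, listedB :: Relation
    listedA = Set.fromList [("ann", 250000), ("bob", 180000)]
    listedB = Set.fromList [("cat", 320000), ("bob", 180000)]

    -- Closure: because restrict and union both return Relations, they compose arbitrarily.
    expensiveListings :: Relation
    expensiveListings =
      restrict (\(_, price) -> price > 200000) (listedA `Set.union` listedB)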

-

8.3 Integrity

-

Integrity in the relational model is maintained simply by specifying — in a purely declarative way — a set of constraints which must hold at all times. @@ -207,11 +198,9 @@

8.3 Integrity

Finally, many commercially available DBMSs provide imperative mechanisms such as triggers for maintaining integrity — such mechanisms suffer from control-flow concerns (see section 4.2) and are not considered to be part of the relational model.

-

8.4 Data Independence

-

Data independence is the principle of separating the logical model from the physical storage representation, and was one of the original motivations for the relational model. @@ -222,11 +211,9 @@

8.4 Data Independence

This is one of several reasons that motivate the adoption of the relational model in Functional Relational Programming (see section 9 below).

-

8.5 Extensions

-

The relational algebra — whilst flexible — is a restrictive language in computational terms (it is not Turing-complete) and is normally augmented in various ways when used in practice. @@ -244,6 +231,5 @@

8.5 Extensions

the ability to generate derived relations by changing attribute names
-
diff --git a/OEBPS/section-9.html b/OEBPS/section-9.html
index fe8391f..fa7420f 100644
--- a/OEBPS/section-9.html
+++ b/OEBPS/section-9.html
@@ -10,7 +10,6 @@

9 Functional Relational Programming

-

The approach of functional relational programming (FRP16) derives its name from the fact that the essential components of the system (the logic and the essential state) are based upon functional programming and the relational model (see Figure 2). @@ -34,12 +33,8 @@

9 Functional Relational Programming

Figure 2: The components of an FRP system (infrastructure not shown, arrows show dynamic data flow)

-
-

9.1 Architecture

-
-

We describe the architecture of an FRP system by first looking at what must be specified for each of the components when constructing a system in this manner. Then we look at what infrastructure needs to be available in order to be able to construct systems in this fashion. @@ -68,11 +63,9 @@

9.1 Architecture

In contrast with the object-oriented approach, FRP emphasises a clear separation of state and behaviour19.

-

9.1.1 Essential State (“State”)

-

This component consists solely of a specification of the essential state for the system in terms of base relvars20 (in FRP all state is stored solely in terms of relations — there are no exceptions to this). @@ -84,11 +77,9 @@

9.1.1 Essential State (“State”)

In accordance with section 7.1.1, FRP strongly encourages that data be treated as essential state only21 when it has been input directly by a user22.

-

9.1.2 Essential Logic (“Behaviour”)

-

The essential logic comprises both functional and algebraic (relational) parts. @@ -118,11 +109,9 @@

9.1.2 Essential Logic (“Behaviour”)

This is absolutely not the case (it is only the accidental state and control component — see below — which is concerned with efficiency of storage structures).

-

9.1.3 Accidental State and Control (“Performance”)

-

This component fundamentally consists of a series of isolated (in the sense that they cannot refer to each other in any way) performance “hints”. @@ -155,11 +144,9 @@

9.1.3 Accidental State and Control (“Performance”) -

9.1.4 Other (Interfacing)

-

The primary consideration not addressed by the above is that of interfacing with the outside world. @@ -179,11 +166,9 @@

9.1.4 Other (Interfacing)

The expectation is that all of these components will be of a minimal nature — performing only the necessary translations to and from relations.

-
Feeders
-

Feeders are components which convert input into relational assignments — i.e. cause changes to the essential state. @@ -197,11 +182,9 @@

Feeders
The infrastructure which eventually runs the FRP system will ensure that the command respects the integrity constraints23 — either by rejecting non-conformant commands, or possibly in some cases by modifying them to ensure conformance.
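Conceptually, then, a feeder is just a translator from raw input to a proposed relational assignment, with acceptance or rejection left to the infrastructure. The sketch below makes up a command type and a checking step to show that division of responsibility; none of these names come from the paper.

    import Data.Set (Set)
    import qualified Data.Set as Set

    type Address = String
    type Price   = Int
    type Offers  = Set (Address, Price)

    -- A relational assignment command: the complete proposed new value of a relvar.
    data Command = AssignOffers Offers

    -- A feeder translates raw user input into such a command; it does no checking itself.
    offerFeeder :: Offers -> (Address, Price) -> Command
    offerFeeder current newOffer = AssignOffers (Set.insert newOffer current)

    -- The infrastructure decides whether the command respects the integrity constraints.
    apply :: (Offers -> Bool) -> Command -> Either String Offers
    apply constraintsHold (AssignOffers proposed)
      | constraintsHold proposed = Right proposed
      | otherwise                = Left "rejected: integrity constraint violated"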

-
Observers
-

Observers are components which generate output in response to changes which they observe in the values of the (derived) relvars. At a minimum, observers will only need to specify the name of the relvar which they wish to observe. @@ -215,11 +198,9 @@

Observers
The only (occasional) exceptions to this should be of the ease of expression kind discussed in sections 7.2.2 and 7.3.1.

-
Summary
-

The most complicated scenario when interfacing the core relational system with the outside world is likely to come when the interfacing requires highly structured input or output (this is most likely to occur when interfacing with other systems rather than with people). @@ -229,11 +210,9 @@

Summary
In this situation, the feeders or observers are forced to convert between structured data and flat relations24.

-

9.1.5 Infrastructure

-

In several places above we have referred to the “infrastructure which runs the FRP system”. @@ -244,11 +223,9 @@

9.1.5 Infrastructure

The different components of an FRP system lead to different requirements on the infrastructure which is going to support them.

-
Infrastructure for Essential State
-
  1. some means of storing and retrieving data in the form of relations assigned to named relvars
  2. @@ -257,11 +234,9 @@
    Infrastructure for Essential State
  3. a base set of generally useful types (typically integer, boolean, string, date etc)
-
Infrastructure for Essential Logic
-
  1. a means to evaluate relational expressions
  2. @@ -271,11 +246,9 @@
    Infrastructure for Essential Logic
  3. a means to express and enforce integrity constraints
-
Infrastructure for Accidental State and Control
-
  1. a means to specify which derived relvars should actually be stored, along with the ability to store such relvars and ensure that the stored values are accurately up-to-date at all times
  2. @@ -286,11 +259,9 @@
    Infrastructure for Accidental State and Control
-
Infrastructure for Feeders and Observers
-

The minimum requirement on the infrastructure (specifically on the state manipulation language) from feeders is for it to be able to process relational assignment commands (containing complete new relation values) and reject them if necessary. @@ -310,11 +281,9 @@

Infrastructure for Feeders and Observers
Finally, the ability to access arbitrary historical relvar values would obviously be a useful extension in some scenarios.

-
Summary
-

If a system is to be based upon the FRP architecture it will be necessary either to obtain an FRP infrastructure from a third party, or to develop one with existing tools and techniques. @@ -330,22 +299,18 @@

Summary
Finally, it is of course perfectly possible to develop an FRP infrastructure in any general purpose language — be it object-oriented, functional or logic.

-

9.2 Benefits of this approach

-

FRP follows the guidelines of avoid and separate as recommended in section 7 and hence gains all the benefits which derive from that. We now examine how FRP helps to avoid complexity from the common causes.

-

9.2.1 Benefits for State

-

The architecture is explicitly designed to avoid useless accidental state, and to avoid even the possibility of an FRP system ever getting into a “bad state”. @@ -374,25 +339,20 @@

9.2.1 Benefits for State

Finally, integrity constraints provide big benefits for maintaining consistency of state in a declarative manner: -

-

- The fact that we can impose the integrity constraints of our system in a purely declarative manner (without requiring triggers or worse, methods / procedures) is one of the key benefits of the FRP approach. - It means that the addition of new constraints increases the complexity of the system only linearly because the constraints do not — indeed cannot — interact in any way at all. (Constraints can make use of user-defined functions — but they have no way of referring to other constraints). - This is in stark contrast with more imperative approaches such as object-oriented programming where interaction between methods causes the complexity to grow at a far greater rate. -

-

+
+

The fact that we can impose the integrity constraints of our system in a purely declarative manner (without requiring triggers or worse, methods / procedures) is one of the key benefits of the FRP approach. It means that the addition of new constraints increases the complexity of the system only linearly because the constraints do not — indeed cannot — interact in any way at all. (Constraints can make use of user-defined functions — but they have no way of referring to other constraints). This is in stark contrast with more imperative approaches such as object-oriented programming where interaction between methods causes the complexity to grow at a far greater rate.

+
+

Furthermore, the declarative nature of the integrity constraints opens the door to the possibility of a suitably sophisticated infrastructure making use of them for performance reasons (to give a trivial example, there is no need to compute the relational intersection of two relvars at all if it can be established that their integrity constraints are mutually exclusive — because then the result is guaranteed to be empty). This type of optimisation is just not possible if the integrity is maintained in an imperative way.

-

9.2.2 Benefits for Control

-

Control is avoided completely in the relational component which constitutes the top level of the essential logic. @@ -411,11 +371,9 @@

9.2.2 Benefits for Control

A final advantage (which isn’t particularly related to control) is that the uniform nature of the representation of data as relations makes it much easier to create distributed implementations of an FRP infrastructure should that be required (e.g. there are no pointers or other access paths to maintain).

-

9.2.3 Benefits for Code Volume

-

FRP addresses this in two ways. @@ -428,11 +386,9 @@

9.2.3 Benefits for Code Volume

-

9.2.4 Benefits for Data Abstraction

-

Data Abstraction is something which we have only mentioned in passing (in section 4.4) so far. @@ -467,22 +423,18 @@

9.2.4 Benefits for Data Abstraction

Specifically, FRP offers no support for nested relations or for creating product types (as we shall see in section 9.3).

-

9.2.5 Other Benefits

-

The previous sections considered the benefits offered by FRP for minimizing complexity. Other potential benefits include performance (as mentioned briefly under section 9.2.1) and the possibility that development teams themselves could be organised around the different components — for example one team could focus on the accidental aspects of the system, one on the essential aspects, one on the interfacing, and another on providing the infrastructure.

-

9.3 Types

-

A final comment is that — in addition to a fairly typical set of standard types — FRP provides a limited ability to define new user types for use in the essential state and essential logic components. @@ -498,7 +450,6 @@

9.3 Types

Interesting work in this area has been carried out in the Machiavelli system [OB88].

-
diff --git a/OEBPS/stylesheet.css b/OEBPS/stylesheet.css
index dc9f18e..1413e48 100644
--- a/OEBPS/stylesheet.css
+++ b/OEBPS/stylesheet.css
@@ -28,11 +28,7 @@ blockquote.centred {
 dt {
   font-weight: bold;
 }
-
-article > p:not(:first-child) {
-  text-indent: 1em;
-}
-article.abstract > p {
+p.abstract {
   margin: 0 15%;
   text-indent: 1em;
 }