I have been a software developer since 1976, and I have long been impressed with the flow-based approach to application design. My first exposure to such ideas came from Tom DeMarco's book on structured analysis, in which he develops the notion of dataflow diagrams. At that time - the early 80s - I was already heavily involved in the development of distributed applications, so it was very natural to think in terms of asynchronous processes connected by various types of communications links.
The surprise was that I soon found it easier, and much more reliable, to design more traditional (i.e. single-threaded, imperative) applications by creating a simple multi-threaded scheduler as the basis for the application, and then breaking my design down into asynchronous modules that would run in some non-deterministic order under control of the scheduler. When object-oriented languages came along this approach became even more natural since each process could be defined as a well encapsulated class, and the scheduler was similarly a special class that did almost nothing but invoke the "run" methods of the various process objects.
This approach also made it much easier to communicate accurately and naturally with clients who otherwise would not have understood the convoluted transformations that are necessary to turn their wishes into conventional procedural code. Unfortunately, the advent of object-oriented software made the earlier "structured" approach seem passé, including the use of dataflow diagrams. Consequently, I often feel that I am practically the only designer who still uses such diagrams to communicate with clients, and I daresay the flood of object-oriented thinking is also why your book is currently out of print.
This is all very unfortunate in my view, because the flow-based approach is not only easier for lay people like clients to understand, but it is easier for programmers also. Whenever I say that I write my own schedulers the average programmer absolutely cringes at the very idea, because they do not understand that a round-robin scheduler is often only a few lines of code. Nor do I give up much in the ability to prioritize real time event handling, because putting the priority scheme in the scheduler is usually the wrong thing to do anyway. The processes themselves know very well if they need to run or not, and they do not have to suck their input queues dry simply because there is more than one item waiting. Rather, every process gets a chance to run on a regular basis, and anything that turns out to be a bottleneck tends to get more processor time simply because it is starving the downstream processes for work. Thus, congested nodes in the flow network get the focus they need without any special logic to handle it.
Also, because I have done a lot of work with formal communication protocols, I have a deep feeling for the power and convenience of finite state machines, so the bulk of my asynchronous processes are designed as formal state machines, which means that my designs have almost nothing in the way of built-in side effects that can turn into pathological connections within a process. In addition, a state machine can certainly be thought of as a set of independent threads of logic that each run in some formally defined circumstance based upon the current state and the current input type. That is to say, a state machine can very well be thought of as a scheduler that selects the correct bit of code to run at any given moment.
Finally, if those bits of code are defined as strict functions (i.e. a mapping between domain and range sets) then one has very tight control of the entire design.
One starts by defining the formal membership rules for the sets of data handled by the system, then one can define each processing function as a strict mapping between those sets, then those functions can be assembled into state machines that implement processes, and finally the processes can be organized in a system-wide dataflow network. At every level there is a formal, objective definition of every component from the simplest data item to the system as a whole.
This means two things in particular.
First, there is a wealth of opportunities to put correctness assertions into the finished code. Inputs and outputs to individual functions can be tested to make sure they belong to the expected sets. State machines can make sure that all states encountered are properly defined, and that all inputs to the process are occurring in a permitted order (of which there may be very many), and the overall flow of data can move over links that must connect objectively defined outputs from one process to similarly defined inputs to another process. At all of these points the software can cheaply and easily detect errors, and then report very precise debugging information to the users and designers.
Second, if the system is doing something the customer says is wrong, it is relatively easy to change either the connectivity of processes, or the detailed definition of the processes themselves, because the code will have almost no implicitly defined connections internally. Changing the connectivity of processes is usually very easy, and the internal details of a process are so modular that changes of only a few types are possible. Perhaps a data set definition is slightly off, perhaps a function is not quite the correct mapping between inputs and outputs, or perhaps a state machine does not have quite the correct state space, or quite the correct transitions between states. In short, all such errors tend to be very localized, and thus easy to correct without fouling up some other part of the design as a result of inadvertent messing with implicit connections between poorly encapsulated modules.
The net result for clients is that the specification process is transparent to them from the outset, and any error messages they need to take back to the designers will quickly pinpoint the exact problem. The diagnosed problems can in turn be corrected extremely quickly, and the system can go back online equally quickly, often a matter of only a few hours from first seeing the error report to getting running again.
In point of fact, using these methods I almost never deliver a detectable bug to customers in the first place. For years I used to give my clients a warranty that said I would fix any bug for free in the first 180 days of operation. That sounds pretty daring, but in fact I have never had such a call at all. In several cases I called the clients every couple of weeks to make sure they were not about to surprise me, but they never had a problem. In one case my work ran for two solid years on many machines with no reported errors at all.
In that last case, even my own alpha testing turned up only two bugs, both easily fixed. One was a typo in an array declaration in which I had put "16" instead of the correct "18", and in the other I inadvertently did a "big-endian to little-endian" swap in two different places in the code, thus cancelling the swap. That correction was simply a matter of pulling out the second line of code that did the swap.
The point of the foregoing couple of paragraphs is not to brag about my own genius, but rather to say that these design methods work extremely well, provided only that one is reasonably disciplined about using them. You cannot take shortcuts and expect to get good results. So, while I would claim serious self-discipline, it is otherwise the methods that keep the error rate so low, and anyone can learn both the discipline and the methods. Indeed, the self-discipline becomes rather easy to enforce, because it is heavily reinforced by real success. After all, who wants to be a whiny loser, when it is relatively easy to be a star in the client's view?
Unfortunately, nearly the entire industry is hopelessly stuck on the Von Neumann architecture, and the relatively shallow techniques that make it moderately usable. In general I am reminded of a scene in Bonnie and Clyde in which they are on the run and holed up in a grungy, rural motel. Bonnie asks Clyde if he would do things differently if he had known at the outset how things would turn out. Clyde misses the point of the question totally, and says yes, he would have one state where they did no heists, so they would always have one jurisdiction where they were not wanted criminals. The software industry's addiction to the intellectual equivalent of a life of crime is almost that pathetic also. In most cases it is not an actual criminal offense to deliver buggy systems, but I've met plenty of clients who think it ought to be.
My own response to this mess is to write a short little book of my own that I can give to potential clients. I don't think the industry will change much on its own, but spreading the word among clients might do the trick.
In particular I was struck by your take on Structured Analysis, and I think the main difference between your approach and mine consists of two items.
First, your processes seem to be pretty conventional code, while I did not want to accept that as necessary.
Eventually, of course, one must write some conventional code if the application is going to run on a conventional Von Neumann platform. However, it still bothers me to use procedural languages for abstract design, even when we are talking about rather small procedures. Instead, as I said yesterday, I prefer formal finite state machines, which I usually organize as both state tables and as a state transition diagram. The state table in particular is a "closed form" in the sense that one must account for every combination of state and input type. This does a lot to ensure that the designer is not accidentally overlooking obscure cases. In every instance the designer is forced to consider what it would mean to receive a given type of input in a given state. In many cases it will be an error of major proportions that indicates some part of the system is badly out of whack, or it may mean that the state machine needs to add one or more new states (or transitions) to properly cover all the cases that can actually occur. Similarly, previously undefined types of input may force a redesign effort. The overall point is that such discrepancies at run time are detected as soon as they occur, and a suitable, pre-defined message can be generated to warn the users of trouble immediately.
Conversely, using ordinary procedural code leaves open the possibility that unanticipated inputs can occur, with no guarantee that they will be recognized in advance by the designer, much less that notification will be given to the users. This is similar to the problem you pointed out in which data is discarded implicitly when new data is written into the same storage space. Your method in that case allows stray packets to be recognized by the system, and I am similarly concerned about recognizing that otherwise legitimate data is not acceptable because it has arrived when the process is in the wrong state to handle it correctly.
So, this is why, for example, I never bothered with Tom DeMarco's "structured English mini-specs". Not only were they still procedural, but they did not even have the precision of a real programming language.
The second large difference is that I have never worried much about tools.
Custom round-robin schedulers are easy, and all the more so in an object oriented language, so I have never tried to come up with a fully canonical way to handle process scheduling or interprocess communication.
In similar fashion I have a couple of fairly standardized ways to turn state machine definitions into procedural code, and having a standard, one-size-fits-all tool for doing that always seemed more bother than it was worth, particularly because my career has involved a very wide variety of operating environments, and porting such a tool to a new environment would hardly have paid back the effort in many cases. Instead, I have usually done the detailed coding by hand, and then spent a few afternoons writing up a nice twenty-page document to explain the architecture to future maintenance programmers. The key feature of the documentation is usually the dataflow diagrams, which can be tossed off with any number of graphical programs, even MS Paint in some cases. Saying that each bubble on the diagram is implemented with one process object in the code tells readers much of what they need to know right off the bat, and then I am careful to explain how scheduling and communication work. A few comments in the code to tie the software objects back to the diagrams, and it becomes very hard to get lost.
Of course, the non-programming managers who have commissioned the work focus almost entirely on the dataflow diagrams and the state transition diagrams. The latter may seem a bit arcane, but after seeing a few examples most people get what they are about, and in any case, they always understand the questions I ask in terms of their ordinary business procedures. That is to say, I go through the state tables systematically and make sure that we are talking about the same business activity in all cases. Thus, we get a meeting of the minds on all points, even if the client does not understand the FSM notation directly.
Actually, in one case my client was the VP of engineering for a company doing electronic navigation receivers. As a graduate EE, he certainly did understand state machines, and one afternoon we went through 106 pages of state diagrams together. He then said it was the first time he really understood how the system worked. On previous versions his chief designer would not let him see the code at all, and it was wildly convoluted assembler in any case. In any event, it was probably the most gratifying design review that either one of us has ever had, either before or since.
Normally I don't expect my customers to read state tables directly like that, but most do wind up impressed (or sometimes horrendously bored) with the number of obscure cases I manage to ask about. Again, as I said yesterday, I'm no genius. I just have a mental tool that forces me to ask all the questions in a systematic fashion.
I mentioned yesterday that one can also think of a state machine as a type of router or scheduler that sends input to the appropriate function for processing. Normally I have one function for each state, and then each of those will route the input to a subordinate handler based on the type of input. So topologically the dataflow network is a tree with two levels of branching. Data enters at the bottom of the trunk and is routed to the appropriate primary branch based on current state. The selected primary branch then looks at the input type and routes the data to the appropriate secondary branch. I have not often shown this particular scheme directly to clients, but it has all the intellectual advantages of the dataflow approach, in that people can easily compare the routing in the state machine to similar routing in a human bureaucracy. That is, the mail room routes to the right department, and the department head routes to the correct clerk, for example.
In addition, the work to be done by the actual processing function is often quite trivial, and has little decision logic because the major decisions were handled in the routing process. Indeed, you made a similar point somewhere in the first couple of chapters of your book, and I think we are talking about something quite similar, if I recall correctly. In any event, like you, I feel that it is important to distinguish between several different types of decision making, and to represent some types of decision as a routing problem. It is much easier for non-technical clients to deal with that way.
I mentioned yesterday that I was working on a book to make some of these ideas more accessible to potential clients, but I also want to tie that discussion in with both discussion and simple implementation examples aimed at technical folks. Or perhaps just working examples for programmers that implement the abstract examples I discuss in a form palatable for non-technical folks.
By the way, you say some things in the book about performance issues, and I have good news there. In C and C++ it is often possible to use an integer state variable as an index into an array of function pointers for state handlers, so selecting the right function to call boils down to a trivial bit of pointer arithmetic in those cases. The round-robin scheduler can do something similar, but using a loop counter as an index. Thus, dispatching to the right process, and then the right state handler, is blindingly fast. Occasionally I have timed scheduling overhead by calling a null process, say, a hundred thousand times or so. Even on machines from the early nineties the overhead was often only a few dozen microseconds per process. It is not really much of a load on the system, and the payback for the overhead is the sort of design clarity that many programmers only dream of. I no longer take any crap whatsoever on the issue of performance. It's a no brainer.
Frankly, I have always been suspicious of the concept of rapid prototyping, partly because many key details get left out of such implementations, and partly because it gives both management and the clients a false sense of security when they see something running that looks much like what they had in mind. It often seems to provoke unrealistic expectations about how long the real implementation will take. However, that notwithstanding, I am taken with your comments about simulating a system, and I can see the possibilities for that. Two examples come to mind.
First, it is sometimes possible to use a real but lightweight component as an effective substitute for the industrial grade component that will be used in production. On one project, for example, we used the Access database in place of the MS SQL Server that we intended to use in production. Access could be used by means of the same ODBC interfaces that would be used in production, so our application code did not have to be changed in order to interface to Access instead of SQL Server; thus, we were able to realistically test our code without making a big and premature investment in the real server. When we eventually cut over to the real server we reran our suite of tests in the new environment, but as expected everything passed with flying colors. The overall point here is that Access is an effective RDBMS for simulating production, but it lacks all sorts of administrative features that would make it suitable for heavy commercial use.
Second, I am beginning to suspect that I should give more thought to some sort of interpreted simulator for distributed applications. The emphasis would be on highly graphical and canonical representations of the topology and logic of the system, but without the need to set up a real network of any size. Once the developers and clients had satisfied themselves that the functionality was correct in the simulation, the various parts of the system could be ported to their production platforms, which, of course, might differ considerably from one part of the system to another. This is hardly a new thought in the industry, and I myself have been thinking such things rather idly ever since I read DeMarco's book on structured analysis back in 1980. However, it is only lately that I have become convinced that the combination of flow-based analysis, state machines, and strict functions might truly be sufficient for modeling the vast majority of practical applications.
That last requires a bit of explanation.
Even though I've been using the combination of flow-based architecture, state machines and strict functions for quite a while, it has often been the case that the actual implementation in a particular language on a particular platform required some careful transformations from the formal, abstract model to the executable code. In many cases I have chosen to take advantage of quirks of the production environment to minimize code size, run time, and other performance issues. In a word, I tend to "go native" a bit in a particular environment, and the necessary judgment about implementation details is therefore not something easily automated.
For example, in C or C++, as I mentioned yesterday, I often use arrays of function pointers to dispatch the correct handler for a given state. That works well in languages that have pointers as first-class programming objects, but it is impossible in many other languages, so it cannot be treated as a general-purpose method that applies everywhere. A more general, but slower, method is to use a multi-way decision such as the CASE or SWITCH statement supported in most modern languages. This also dispatches fairly quickly, and has the advantage that it is very readable for conventional programmers.
Anyway, the point is that such implementation considerations have tended to obscure for me the potential utility of simulating an entire system in complete detail before porting the logic to the various production platforms. At first glance it does not seem to get anyone very much for the effort, but perhaps that is a mistaken estimate. A great deal of the usual implementation effort is determining how to take a clean idea and implement it in some quirky environment, but the effort would perhaps be a lot simpler if realistic simulation had first allowed developers and their clients to work the kinks out of the design abstractions before ever worrying about platform issues.
I also believe that reducing our tool repertoire is the basic thing we have to do in order to get better reliability. Structured programming got rid of promiscuous goto statements, object-oriented programming got rid of most promiscuous memory references with strict encapsulation, flow-based programming frees clients and developers from the idiocy of the Von Neumann architecture, and so on. As I have said already, in effect, I would do the vast majority of logic in the form of state machines and strict functions, thus getting rid of both random side effects and program states that are only implicitly defined. Moreover, since both strict functions and state machines (which are constructed of strict functions) can be modeled very naturally with the flow-based approach, there is no conflict in using functions and state machines in a flow-based model. Indeed, they probably fit better than the rather conventional code you currently use to implement your processing nodes.
That last is not meant as a criticism of your methods, but rather as a positive argument for adding state machines and strict functions to the flow-based toolkit. Indeed, since I cannot be sure that some processing will not be more cleanly expressed in a procedural language, it seems useful to be able to use either type of expression. In fact, were I building a simulation tool as expeditiously as possible, I would treat procedural code as the fundamental way to implement processing nodes (since it obviously is closer to the actual Von Neumann machine language), and then I would simply specify a canonical way to turn state machine definitions into the procedural language used by the simulator. Strict functions could easily be defined directly in the procedural language, but one would simply be careful to avoid non-functional usages such as side effects. Indeed, there is no problem writing strictly functional code in languages like C, PL/I, or even COBOL. It is just a matter of obeying the rules that limit the valid usages. True functional languages like Lisp and Scheme are different primarily because they enforce such rules more strictly, much as C++ has stricter scoping rules than C.
I just gave a quick scan of your chapter on recursion, and it went quickly for me because we are on exactly the same wavelength, for all practical purposes. The difference is that you are trying to shake programmers out of "J = J + 1", while I am trying to help non-technical folks avoid that stuff in the first place.
I would add that, for me, side-effects have all the appeal of bedwetting. And I mean that almost literally, since leaving crap in persistent local storage (i.e. within the scope of some subroutine) is no more "hygienic" than crapping on one's bedding. Getting into bed the second time just is not the same thing as it was the first time. Unfortunately, many programmers are like birds - impossible to housebreak. For birds in flight, crap just disappears from sight, and programmers using conventional memory tend to feel the same about used data. I'm not using it anymore, so why worry about it? All I can say is: Yuck! Guess I won't sleep at their house (or use their code in my application).