Matthew Crewe

Image Rotation Application using FBP

Paul,

I want to step back a little and use an example for us to talk through. This I hope will reveal the direction/domain I'm coming from.

I will point out this is a contrived example which embodies most of the issues I have at present:

I think your experience in FBP will allow you and me to see different ways to construct this "application".

Ok, the specification for this application is as follows:

Version 0.1!

"The application will load from disk a standard Windows bitmap "image". This image will be displayed to the user in the application's window; its center will correspond to the display window's center. The user may rotate the image using the mouse: upward vertical movement rotates the image clockwise, downward movement anti-clockwise. All rotation is around the image center. The edges of the image will be highlighted by drawing a red border around it."

Non-functional Requirements

The application must rotate in "real-time". But because the function involved in transforming the source image pixels to their rotated counterparts will be implemented in software, it can take a significant time to compute a full resolution rotated image. Therefore the application should display smaller sub-sampled images during rotation; when the user stops rotating, a full resolution image should be drawn.

For this exercise, even the display of sub-sampled images will not be enough for the real-time drawing performance that is required. Therefore the system should aim to compute only when it must. This implies it is not possible to render every change triggered by a mouse event, and therefore some form of lazy evaluation or ability to abort drawing a frame will be required. E.g. lazy: an event arrives, invalidate the display, then wait for a system message to "pull" the computation; or abort-able: an event arrives, start drawing, the next event arrives before drawing is complete, then abort if the total distance changed is too small, or use some similar algorithm based on previous draw times and rate of change.
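A minimal sketch of the "lazy" option, assuming a hypothetical LazyRotationView class (the class and method names are invented for illustration and do not come from any FBP implementation). Mouse events only overwrite the pending angle and invalidate the display; rendering is deferred until the system asks for a repaint, so several events can collapse into one frame:

```cpp
#include <cassert>

// Hypothetical "lazy" redraw state. Mouse events only record the newest
// angle and mark the display invalid; nothing is rendered until the
// windowing system later "pulls" a frame.
class LazyRotationView {
public:
    LazyRotationView() : dirty_(false), pendingAngle_(0.0) {}

    // Called for every mouse event. Cheap: it never renders.
    void onMouseMove(double angleDegrees) {
        assert(angleDegrees == angleDegrees);  // reject NaN
        pendingAngle_ = angleDegrees;          // older pending angles are overwritten
        dirty_ = true;
    }

    // Called when the system asks for a repaint. Returns true and fills
    // `angleDegrees` only if something changed since the last repaint.
    bool takePendingAngle(double& angleDegrees) {
        if (!dirty_) return false;
        angleDegrees = pendingAngle_;
        dirty_ = false;
        return true;
    }

private:
    bool dirty_;
    double pendingAngle_;
};
```

If two mouse events arrive before the next repaint, only the second angle is ever rendered; the abort-able variant would need extra state (draw times, rate of change) that is not shown here.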

Changing the dimensions of the application window should cause the image to be re-centered.

Because the current design has been produced without customer contact we would ideally like to provide a means to allow non-developers to experiment/prototype with as much of the system as possible. This way we can let experts tinker with the system to tune it to their needs (without, or with minimal developer involvement). To allow this it is envisaged that the system should permit modification through a high level design mode. To allow both customers and developers to work on the system it is expected that the design mode will actually produce "code" and structures that a developer would/could hand create through a text editor.

Notes

Question: Paul, how would you split this up using FBP?


The following discussion is a preamble to ideas discussed in InterLanguageCommunication. It turns out that my earlier experience with AMPS was leading me in an unproductive direction! (2009-04-22)


1. In terms of Erlang wrt Java/.NET (and languages using the .NET CLR) - Erlang looked at concurrency from the start, the differences are less obvious at present but will become more apparent: http://unlimitednovelty.com/2009/01/cutting-edge-of-vm-design.html and http://insidehpc.com/2008/08/27/hundreds-of-thousands-of-threads-yes-with-erlang/ (twenty million Erlang processes - not sure what the benchmark was on!)

Also, C#, Java, Ruby, Python etc. are all dropping or have dropped green-threads :-(

2. I think Erlang looks to be a good fit for FBP, because of (1) which led it to adopt an event/message based protocol from the outset.

I see FBP being more than a language, more a design pattern/practice, sorry if it looked like I was saying Erlang was better than FBP? The problem I have at the minute is which language to implement it in! This leads to...

3. At work we are switching to using C# at an application/framework level (a bad choice IMHO because the garbage collector and need for IDisposable wreck clean design - another debate!). We use C++ for low-level image analysis algorithms and rendering (all high performance). We develop 3D medical applications; these shift GBs of data about, process it, segment it, analyze and render it, ideally in real-time (on standard COTS PCs). Note, we operate in an asynchronous environment, in that we build server based "thin-client" frameworks.

The image analysis code is based on lazy evaluation and copying data when modified (copy on write). This is beneficial for threaded environments. It's akin to passing handles about but, when the data is modified, making a new handle and passing that on. My understanding is Erlang forces this pattern?
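A rough sketch of the handle idea just described, assuming a hypothetical ImageHandle class (not taken from the actual image analysis code). Copying a handle shares the pixel buffer; a "modification" copies the buffer and returns a new handle, leaving every existing handle untouched:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical copy-on-write handle around a pixel buffer.
class ImageHandle {
public:
    typedef std::vector<unsigned char> Pixels;

    explicit ImageHandle(Pixels pixels)
        : data_(std::make_shared<Pixels>(std::move(pixels))) {}

    // Reading goes through the shared, immutable buffer.
    const Pixels& pixels() const { return *data_; }

    // "Modifying" makes a private copy and returns a new handle;
    // the original handle and its buffer are left unchanged.
    ImageHandle withPixel(std::size_t index, unsigned char value) const {
        assert(index < data_->size());
        Pixels copy = *data_;  // the copy happens only on write
        copy[index] = value;
        return ImageHandle(std::move(copy));
    }

    // True if both handles refer to the same underlying buffer.
    bool sharesStorageWith(const ImageHandle& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const Pixels> data_;
};
```

Because the shared buffer is never mutated in place, handles of this kind can be passed between threads without locking the pixel data itself.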

The problem I have is what language to recommend in order to shift us to an FBP approach. If I pick C++ some will complain due to its lack of more modern features (say reflection). If I pick C# another set of issues arises. If I picked Erlang then its lack of higher level Windows support might become an issue too. I'm a little stuck!

I'm starting to think that the first step is to get the low-level libraries using a common communication protocol/API centered on the exchange of data. This begins to get a data-centered "wedge" into the system. I suspect this will seed FBP in the organization and allow it to grow - For instance at this point you could use say Python etc. to script and assemble.

After the lower layer has a decent API I want to move onto tackling the middle-layer in a language that could interact with the API and more importantly be better for writing "glue" components (tool and state based event handling etc.). These components would ideally permit being configured by a DSL. This would also allow non-developers, probably with a graphical front end, to create and play with the application.

The last layer is the user interface, which needs to be bound to the (data) output from the middle layer.

The worry I voiced, was that if I picked Erlang it wouldn't fit well for the middle layer. As I write I'm reversing this view-point!

Does this clarify?

I also see FBP as a design methodology and architecture - essentially language-agnostic. Borrowing the term used by Gelernter's Linda, I see FBP as a "coordination language", not a programming language as such.

Please excuse me if I go into a bit of history... The first implementation (called AMPS) was in HLASM, the Assembler language for the IBM 360, 370,...z90, series of machines. Machines in those days had one processor, and an application ran under a single operating system task (Thread today, I presume). Given this foundation, using AMPS, I successfully got Assembler, PL/I, Fortran, the Sort utility, REXX, and even Prolog talking to each other, often in the same application (i.e. under the same task). This sort of formed my assumptions about what an application development environment should be like... Generalizing this idea, I would say that the lower the level at which you implement FBP, the more different languages and even architectures you can fit into an FBP world... E.g. FPGAs - there is some good, weird, stuff in http://www.jpaulmorrison.com/cgi-bin/wiki.pl?TheConvergence ... Also, FBP concepts are now being used in music and art - you need to support that too! I have both of my FBP implementations playing tunes :-)

Now stir in multiple threads as well as languages - and I'm willing to stipulate only one language per thread - and we are beginning to have a very powerful combination. Unfortunately modern compiler-writers (lazy so-and-sos!) have introduced bytecodes (three that I know of: Java, C# and AS - and now possibly Erlang as well), which IMHO help the compiler writers, but not the customer. In fact, the customer is worse off, as s/he is vulnerable to VM problems, as well as bugs in their own code. So now, as you say, we are a bit stuck!

I took a look at UBF at your suggestion ... but I feel it assumes that we have already solved the above problem - what I really want to do is run multiple languages, communicating asynchronously via data packets, in one application. While it is important for programs written in different languages running on different machines to be able to communicate, and I quite agree XML is pretty awful, to me this misses most of the potential of FBP, which is to simplify development and maintenance of individual applications.

I would like to suggest a slightly different strategy from the one you suggest: find a way to build a multithreading engine in C++ (what systems support pthreads?), and then build interfaces allowing higher-level languages, utilities, etc. to run on top of that... Surely at least C# and C++ ought to be able to talk to each other - the extern modifier is supposed to facilitate that...

I agree with your comments about getting mixed languages to co-operate within one application.

Because I sense you see a way forward for me and my particular environment, I would like to understand precisely what you meant by:

I would like to suggest a slightly different strategy from the one you suggest: find a way to build a multithreading engine in C++ (what systems support pthreads?), and then build interfaces allowing higher-level languages, utilities, etc. to run on top of that... Surely at least C# and C++ ought to be able to talk to each other - the extern modifier is supposed to facilitate that...

Do you mean a C++ based "scheduler" which controls multiple threads which use thread safe data/packet passing - for data flow? I do not understand, especially the build interfaces which higher level languages "run?" on top of? How would this work?

In what follows I will use the terms "green threads" and "kernel threads" - maybe I should have called them "cooperative" and "preemptive"... I think, in the FBP environment, the real distinction is whether a component only yields at an API call, or can be interrupted at any time. Feel free to come up with better terminology!

Let me start with my mainframe experience, to give some background... Unfortunately I don't have documentation for the software that we used to build some successful, mixed language applications, but I do have documentation for the system that we built after that one, which was never actually marketed. Most of our experience, however, was based on the very first implementation, called AMPS, because that was imported into, and refined at, a large Canadian bank - and is still in use over 30 years later.

The infrastructure layer was written in mainframe Assembler, and handled packet creation/destruction, FBP port and connection management, (green) thread activation and deactivation, suspend/resume, etc. The second and third implementations also supported 2 higher-level languages for components: PL/I and COBOL, plus components could of course be written in Assembler as well. These implementations all used green threads, but the trouble with green threads is a) they can't take advantage of multiple processors on the machine, and b) that they can't multithread with software that was not written using the same infrastructure, e.g. standard database software. On the other hand, in those days, "real" threads were heavyweight, and the synchronization mechanisms were fairly primitive, so green threads were really the only option! Plus of course machines in those days only had one processor, so we didn't need to support multiple tasks, which the operating system needed for the multiprocessor machines that came later.

Unlike many reusable subroutines (e.g. trig functions), an FBP scheduler has to be able to provide an API that a Higher-Level Language (HLL) can call, and must be able to drive the HLL component. The latter function may be more complex, as the FBP scheduler may have to provide an environment for the HLL to run in - plus some HLLs were not designed to have multiple environments running concurrently - e.g. PL/I was, and it worked great..., until the last release, whose standard environment (LE/MVS) didn't provide that (although it kept the old facility for compatibility reasons!). COBOL didn't allow for that - at least then - so we had to do some pretty dirty stuff to make two COBOL programs multithread! I also successfully converted IBM's standard Sort utility to run under AMPS, but even IBM's standard Sort wasn't fully reentrant, so we had to interlock occurrences of this component so they didn't run concurrently!

Earlier I had built an application, described in http://jpaulmorrison.com/fbp/scrmgr.htm (look for PRORES), which combined Assembler, REXX, and IBM's standard Presentation Graphics Facility tool. REXX actually has a very powerful communication mechanism that lets it talk to other languages (called "subenvironments"), which essentially replaces calls with commands to a specialized command processor, so the FBP became a subenvironment under the FBP implementation.

Of course, Assembler is much more flexible than PL/I and COBOL, which are both pretty long in the tooth (COBOL even older!), so we had to provide solutions to the problems of e.g. creating an FBP packet using a PL/I structure, and then possibly getting that same packet decoded by a COBOL component. I don't say these solutions were the most elegant, but our philosophy was that reusable components don't necessarily have to be easy to code - they just have to be easy to use...

One more point: in the mainframe world, all languages generate what are called "load modules", and you can link any load module to any other to make a bigger load module. And load modules that have all cross-program links "resolved" are what you eventually execute. This nice and simple model has of course been abandoned by the languages of today - no doubt for reasons that make sense to the designers :-)

I believe also that FBP changes the requirements on HLLs: for instance, because of the separation of function in FBP, HLLs don't have to worry about say I/O. Also, if you have a way of getting different languages to communicate, no one language has to be able to do everything, so you get away from the kitchen-sink type of language, which tries to cover all possible functions - and there are language approaches which they can never handle. One such example might be PROLOG, which uses a totally different execution model - we successfully interfaced AMPS to PROLOG, which would have been quite useful for certain application types. This way, you can use PROLOG just for what it is good at - you don't have to do everything with it!

I visualize a bunch of Domain-Specific Languages (DSLs), and in fact, some of them may not even look like languages! For instance, the DSL I describe in http://jpaulmorrison.com/fbp/bdl.htm , which I actually prototyped, looked quite promising for a certain application area... Therefore... if you want to carry this idea to its logical conclusion, I might advocate not even bothering with Java or C# - just write your own compiler(s) or interpreter(s). It's really not that hard to do. Voilà - no more compatibility problems!

By the way, another factor which in my view suggests that Java and C# are probably the wrong vehicles for business logic is that, at least in the majority of business applications, most of the data is "complexly" typed - where the basic processing of this data is built on top of more general classes - so it makes sense to me to push the processing for these data types as far down as possible. For instance, currency has to have at least an amount and a currency code (CAD, USD, etc.), which increases the overhead if this is in turn built on top of BigDecimal and String classes.

Matthew: What I want to do is use FBP as a complete end-to-end approach, i.e. one where the whole design is centered on data. I think Java and C# are fine for this. Is the main point that they are too complicated in many respects, i.e. that they are centered around OOP and so orientate themselves on it? E.g. I see no need for inheritance or templates (use duck typing?), delegates, classes etc.; they all seem to be appendages related to the fact that people wrap data and functions together. I'm not sure how you replace the need for e.g. string processing though; in other respects I agree with your point.

I am not sure why you are so enthusiastic about thousands of threads :-) In discussions with Joe Armstrong, I gathered that Erlang has zillions of short-lived threads, corresponding I suppose to individual phone calls, so green threads make sense in that environment. In our FBP work, the maximum I ran into was 200 threads, and, as I say in http://www.jpaulmorrison.com/fbp/compos.htm - we found in that case that typically only about 1/3 of the processes actually got executed for a given transaction. This led directly to the idea of dynamic subnets, to cut down on the size of the module that has to be linked into a single executable. Given the fact that Java's JVM has no trouble with many times that number of threads, I am not sure why we are still bothering with green threads...?

Matthew: What I want is a scheduler that I control, not the operating system. This is to allow a sensible update strategy for components: if a component consumes too many CPU cycles before completion I want to stop it and maybe choose to schedule it differently. Without threads the current options are co-routines or a state machine design where snapshots of state in a memento are passed out and back into the component the next time I schedule it. Both options require the person implementing the component to scatter scheduling-related code through the implementation, which I dislike. Possibly we see components differently - for the most part I'm imagining them to be closer to functions than large boxes (or certainly some of these components are).
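To make the memento option concrete, here is a small sketch under assumed names (SumMemento and runSlice are hypothetical): a component that sums its input, but only processes a fixed budget of elements per scheduling slice and hands its state back to the scheduler in between. It also shows the cost complained about above, since the budget and resume bookkeeping sit inside the component's own logic:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical memento: everything the component needs in order to resume.
struct SumMemento {
    std::size_t next;  // index of the next input element to process
    long long total;   // partial result so far
    bool done;         // true once all input has been consumed
    SumMemento() : next(0), total(0), done(false) {}
};

// One scheduling slice: process at most `budget` elements, then return
// the updated state so the scheduler can decide when to run us again.
SumMemento runSlice(const std::vector<int>& input, SumMemento state,
                    std::size_t budget) {
    assert(state.next <= input.size());
    while (budget > 0 && state.next < input.size()) {
        state.total += input[state.next];
        ++state.next;
        --budget;
    }
    state.done = (state.next == input.size());
    return state;
}
```

A scheduler would call runSlice repeatedly, passing back the memento it received last time, until done is set.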

The number of threads in turn relates to the question of granularity - this is discussed in the context of performance vs. maintainability in http://www.jpaulmorrison.com/fbp/perform.htm . Typically, the greater the granularity, the more maintainable, but the poorer the performance. As I say in that chapter, there are some more variables involved, and it's pretty subjective, but I think you will develop a feel for what is a "comfortable" level of granularity, which in turn means an appropriate number of threads in your application...

Having talked about some of my experience, I will now move into research mode... So it's mostly a set of questions - I will try to get feedback from other gurus as to whether what follows is reasonable.

If we follow a similar strategy in this new world to what I have tried to describe above, we now have to worry about environments, machine architectures, as well as languages. To get Java and C# to talk to other languages, as I understand it, Java has native, and C# has extern, which expose us to different environments and machine architectures. I would go at it incrementally, starting with: either the most saleable, or the combinations you need most (if different).

I am guessing that C++ is fairly well spread across machine environments, plus its ancestor (C) was developed to do system programming, so it seems like it might be a good base for what we want to do...(?) We could go down to Assembler, for greater flexibility, but then we would have to have different versions of our infrastructure for different machine architectures. In the mainframe world, some brilliant IBM architects were able to devise a way of providing one instruction set for the complete size range of 360/370/../z90 machines, which in turn means the related Assembler language has been remarkably long-lived (over 40 years).

Let us start with C++ as a base... Now we have to support "kernel" threads...

I would propose running POSIX on top of that... POSIX also defines a standard threading library API which is supported by most modern operating systems. Linux is described as mostly POSIX-compliant. In the Windows environment, Interix is an optional, full-featured POSIX and Unix environment subsystem, an implementation of an environment subsystem running atop the Windows kernel. So that looks promising for the Windows environment...

The POSIX threading facility is called pthreads - see https://computing.llnl.gov/tutorials/pthreads/ - I see that it's supported by GNU C++, and it supports mutexes and condition variables, so we ought to be able to build an FBP implementation on top of it.
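As a sketch of how mutexes and condition variables could carry an FBP connection, here is a bounded buffer written directly against the pthreads calls. The BoundedConnection class and its method names are assumptions made for illustration; this is not the THREADS API or any existing FBP implementation:

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bounded connection between an upstream and a downstream process.
template <typename Packet>
class BoundedConnection {
public:
    explicit BoundedConnection(std::size_t capacity)
        : capacity_(capacity), closed_(false) {
        assert(capacity > 0);
        pthread_mutex_init(&mutex_, nullptr);
        pthread_cond_init(&notFull_, nullptr);
        pthread_cond_init(&notEmpty_, nullptr);
    }
    ~BoundedConnection() {
        pthread_cond_destroy(&notEmpty_);
        pthread_cond_destroy(&notFull_);
        pthread_mutex_destroy(&mutex_);
    }

    // Blocks while the connection is full. Returns false once it is closed.
    bool send(const Packet& packet) {
        pthread_mutex_lock(&mutex_);
        while (!closed_ && queue_.size() >= capacity_)
            pthread_cond_wait(&notFull_, &mutex_);
        bool accepted = !closed_;
        if (accepted) {
            queue_.push_back(packet);
            pthread_cond_signal(&notEmpty_);
        }
        pthread_mutex_unlock(&mutex_);
        return accepted;
    }

    // Blocks while the connection is empty and still open.
    // Returns false at end of stream (closed and drained).
    bool receive(Packet& out) {
        pthread_mutex_lock(&mutex_);
        while (!closed_ && queue_.empty())
            pthread_cond_wait(&notEmpty_, &mutex_);
        bool got = !queue_.empty();
        if (got) {
            out = queue_.front();
            queue_.pop_front();
            pthread_cond_signal(&notFull_);
        }
        pthread_mutex_unlock(&mutex_);
        return got;
    }

    // Marks end of stream and wakes any blocked sender or receiver.
    void close() {
        pthread_mutex_lock(&mutex_);
        closed_ = true;
        pthread_cond_broadcast(&notEmpty_);
        pthread_cond_broadcast(&notFull_);
        pthread_mutex_unlock(&mutex_);
    }

private:
    BoundedConnection(const BoundedConnection&);             // non-copyable
    BoundedConnection& operator=(const BoundedConnection&);  // non-assignable

    std::size_t capacity_;
    bool closed_;
    std::deque<Packet> queue_;
    pthread_mutex_t mutex_;
    pthread_cond_t notFull_;
    pthread_cond_t notEmpty_;
};
```

In a real network the sender and receiver would each run on their own thread (for example started with pthread_create); the fixed capacity means a fast upstream process is suspended until the downstream process catches up.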

So now we need a C++ FBP API on top of POSIX - we could use the one in the THREADS C++ implementation of FBP - see http://www.jpaulmorrison.com/fbp/threads.htm . The API doesn't care whether the infrastructure is preemptive or cooperative... Packets are basically malloc'd storage chunks, so it shouldn't be too difficult to have them communicate with other languages...

Based on what little I know, it seems that the next step is to provide native interfaces for Java (JNI), and extern interfaces for C# (DllImport?) to let them talk to our API. I haven't used either, so this is sheer speculation, but it seems plausible...
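One way this might look, as a sketch only: a packet API exposed with C linkage, so that the exported function names are not C++-mangled. All of the fbp_packet names below are hypothetical and are not taken from the THREADS implementation. On the C# side, functions like these would be declared as static extern methods with a DllImport attribute; Java's JNI expects its own naming and argument conventions, so it would need a thin wrapper layer around them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

extern "C" {

// Hypothetical packet: a malloc'd header that owns a malloc'd payload.
struct fbp_packet {
    std::size_t length;
    unsigned char* data;
};

// Copies `length` bytes into a new packet. Returns a null pointer on
// allocation failure.
fbp_packet* fbp_packet_create(const void* bytes, std::size_t length) {
    assert(bytes != nullptr || length == 0);
    fbp_packet* p = static_cast<fbp_packet*>(std::malloc(sizeof(fbp_packet)));
    if (p == nullptr) return nullptr;
    p->length = length;
    p->data = nullptr;
    if (length > 0) {
        p->data = static_cast<unsigned char*>(std::malloc(length));
        if (p->data == nullptr) {
            std::free(p);
            return nullptr;
        }
        std::memcpy(p->data, bytes, length);
    }
    return p;
}

// Number of payload bytes held by the packet.
std::size_t fbp_packet_length(const fbp_packet* p) { return p->length; }

// Read-only view of the payload (null for an empty packet).
const void* fbp_packet_data(const fbp_packet* p) { return p->data; }

// Releases the payload and the header. Accepts a null pointer.
void fbp_packet_drop(fbp_packet* p) {
    if (p == nullptr) return;
    std::free(p->data);
    std::free(p);
}

}  // extern "C"
```

Keeping the interface to plain pointers, sizes and functions is what lets the packet be treated as an opaque handle by other languages; how ownership passes between components is left open here.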

Apparently the equivalent of a .dll in Unix is a .so file - or maybe we don't have to worry about C# in the Unix world? What environment were you planning to run under?

That's as far as I can go without starting to build a prototype. I am always uncomfortable talking about things I haven't tried!

I would very much appreciate feedback on whether you feel this is a possible path for your group to follow, or whether you feel that this is not where you want to go...


Last edited April 23, 2009 4:15 pm by host86-138-84-200.range86-138.btcentralplus.com