Data Flow Model
The Data Flow model emerged in the early 1970s through pioneering work by Jack Dennis at MIT[1], paralleled by research by Robert Barton and Al Davis at the University of Utah[2]. The model arose as an alternative to the von Neumann architecture, providing a framework for expressing parallelism directly. Unlike traditional von Neumann architectures, which execute instructions sequentially, the data flow model represents computation as a directed graph of data dependencies. Nodes in this graph correspond to operations, and edges represent data flowing between them. Execution is driven by the availability of data, allowing operations to proceed independently and in parallel. The data flow model promised better parallel execution by eliminating the program counter and the global updating of state that are essential to the operation of the Stored Program Machine.
Barton and his colleagues produced the first operational data flow machine, called the DDM-1[3]. Other examples of data flow machines were the Manchester Dataflow Machine and MIT's Tagged Token Dataflow Architecture. These systems employed mechanisms like token matching and dynamic scheduling to achieve fine-grained parallelism, enabling independent operations to execute simultaneously. However, challenges such as high hardware complexity and difficulties in scaling limited their widespread adoption.
The Synchronous Data Flow Model
Building on the general data flow model, the Synchronous Data Flow (SDF) model was introduced in the 1980s by Edward A. Lee and David G. Messerschmitt[4]. Designed for specific applications such as signal processing and embedded systems, SDF introduced a deterministic structure to the data flow paradigm. Each node in an SDF graph has fixed rates of token production and consumption, enabling static scheduling and precise resource analysis.
In the general data flow model:
- Nodes (Actors): Represent computational tasks or operations.
- Edges: Represent communication channels through which data flows.
- Tokens: Units of data that move between nodes along the edges, triggering execution.
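To make the firing rule concrete, here is a minimal Python sketch (with made-up node and edge names, not any particular machine): a node fires as soon as every one of its input edges holds a token, so the add and subtract nodes below are free to execute in either order, or in parallel.

```python
from collections import deque

class Node:
    def __init__(self, name, func, inputs, outputs):
        self.name = name        # label for tracing
        self.func = func        # the operation this node performs
        self.inputs = inputs    # names of incoming edges
        self.outputs = outputs  # names of outgoing edges

def run(nodes, edges):
    """Fire any node whose input edges all hold a token, until nothing fires."""
    fired = True
    while fired:
        fired = False
        for node in nodes:
            if all(edges[e] for e in node.inputs):   # firing rule
                args = [edges[e].popleft() for e in node.inputs]
                result = node.func(*args)
                for e in node.outputs:
                    edges[e].append(result)
                fired = True

# Graph for (a + b) * (a - b); tokens for "a" and "b" are duplicated onto
# separate edges, standing in for an explicit fork/copy node.
edges = {e: deque() for e in ["a1", "a2", "b1", "b2", "sum", "diff", "out"]}
nodes = [
    Node("add", lambda x, y: x + y, ["a1", "b1"], ["sum"]),
    Node("sub", lambda x, y: x - y, ["a2", "b2"], ["diff"]),
    Node("mul", lambda x, y: x * y, ["sum", "diff"], ["out"]),
]
for edge, value in [("a1", 5), ("a2", 5), ("b1", 3), ("b2", 3)]:
    edges[edge].append(value)
run(nodes, edges)
print(edges["out"][0])  # 16, i.e. (5 + 3) * (5 - 3)
```

The simulation fires nodes one at a time, but nothing in the firing rule imposes that order; a parallel machine could fire the add and subtract nodes simultaneously.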
In SDF, these principles are refined by enforcing fixed token rates, enabling:
- Static Scheduling: Tasks can be scheduled deterministically before execution.
- Simplified Analysis: Buffer sizes and resource requirements can be calculated in advance.
- Deadlock Detection: Potential deadlocks can be identified before execution, supporting reliable operation in real-time systems.
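As an illustration of how fixed rates enable static analysis, the sketch below uses a hypothetical three-actor graph (not taken from the SDF paper) and solves the balance equations r[src] · produce = r[dst] · consume for the smallest integer repetition vector, the starting point for building a static schedule.

```python
from fractions import Fraction
from math import lcm

# Hypothetical SDF graph: (source, destination, tokens produced per source
# firing, tokens consumed per destination firing).
edges = [
    ("A", "B", 2, 3),
    ("B", "C", 1, 2),
]

def repetition_vector(edges):
    """Solve r[src] * produce == r[dst] * consume for the smallest integers.
    Assumes the graph is connected and consistent."""
    rates = {edges[0][0]: Fraction(1)}   # pick an arbitrary reference actor
    changed = True
    while changed:
        changed = False
        for src, dst, produce, consume in edges:
            if src in rates and dst not in rates:
                rates[dst] = rates[src] * produce / consume
                changed = True
            elif dst in rates and src not in rates:
                rates[src] = rates[dst] * consume / produce
                changed = True
    # Scale the fractional rates up to the smallest all-integer solution.
    scale = lcm(*(r.denominator for r in rates.values()))
    return {actor: int(r * scale) for actor, r in rates.items()}

print(repetition_vector(edges))  # {'A': 3, 'B': 2, 'C': 1}
```

For the rates shown, A must fire three times, B twice, and C once per period before every edge returns to its initial token count; an inconsistent graph would have no such integer solution.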
The Data Flow Machine and SDF Modifications
Data Flow Machines initially implemented the general data flow model, using dynamic scheduling and token matching to exploit parallelism. While effective at exposing parallelism, their hardware complexity and lack of determinism made them difficult to use in practical applications.
SDF-based systems modified this approach by introducing predictable behavior. By enforcing static token rates, SDF adaptations of Data Flow Machines achieved greater reliability and efficiency, making them particularly suited for specialized domains like digital signal processing. These modifications facilitated:
- Predictable Execution: Ideal for real-time applications.
- Resource Optimization: Reduced overhead and improved scalability in constrained environments.
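Continuing the hypothetical graph above, one way to see this predictability is to simulate a single periodic schedule offline: the repetition vector bounds how many times each actor fires, and the high-water mark on each edge gives a buffer size that will never be exceeded at run time. The sketch also flags a deadlock if firings remain but no actor can run.

```python
# Edges and repetition vector from the previous sketch, repeated so this
# example stands alone.
edges = [("A", "B", 2, 3), ("B", "C", 1, 2)]
reps = {"A": 3, "B": 2, "C": 1}

def simulate_period(edges, reps):
    """Run one period of a naive static schedule, tracking buffer high-water
    marks and raising if firings remain but no actor can fire (deadlock)."""
    tokens = {(s, d): 0 for s, d, _, _ in edges}
    high_water = dict(tokens)
    remaining = dict(reps)                   # firings left in one period
    while any(remaining.values()):
        progressed = False
        for actor, left in remaining.items():
            if left == 0:
                continue
            inputs = [(s, d, p, c) for s, d, p, c in edges if d == actor]
            if all(tokens[(s, d)] >= c for s, d, p, c in inputs):
                for s, d, p, c in inputs:    # consume input tokens
                    tokens[(s, d)] -= c
                for s, d, p, c in edges:     # produce output tokens
                    if s == actor:
                        tokens[(s, d)] += p
                        high_water[(s, d)] = max(high_water[(s, d)], tokens[(s, d)])
                remaining[actor] -= 1
                progressed = True
        if not progressed:
            raise RuntimeError("deadlock: no actor can fire")
    return high_water

print(simulate_period(edges, reps))  # {('A', 'B'): 4, ('B', 'C'): 2}
```

The bound depends on the particular firing order chosen; a different static schedule for the same repetition vector can trade buffer space against other goals, but whichever schedule is fixed at compile time, its buffer requirements are known before the system runs.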
The strength of the Data Flow Machine's resource contention management is that it can execute along the free schedule, that is, it can exploit the inherent parallelism of the algorithm. Any physical implementation, however, is constrained by the energy efficiency of the CAM (the content-addressable memory that performs token matching) and of the network that connects the CAM to the fabric of computational elements. As concurrency demands grow, the efficiency of both the CAM and the fabric decreases, making large data flow machines uncompetitive. Small data flow machines, however, do not have this problem and are able to deliver energy-efficient, low-latency resource management[5]. Today, all high-performance microprocessors have such a restricted data flow machine at their core.
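The following sketch is a loose analogy (not a model of any particular processor) for that restricted data flow core: a small window of instruction entries waits for operands, a broadcast of each result tag stands in for the CAM match, and any entry whose operands are all present may issue regardless of program order.

```python
class Entry:
    def __init__(self, dest, op, srcs):
        self.dest = dest                 # tag this entry will broadcast
        self.op = op                     # function to apply when issuing
        self.srcs = dict.fromkeys(srcs)  # tag -> value (None = still waiting)
        self.done = False

def run_window(entries, initial_values):
    results = dict(initial_values)       # values already available at dispatch
    for e in entries:                    # fill in operands known up front
        for tag in e.srcs:
            if tag in results:
                e.srcs[tag] = results[tag]
    issued = True
    while issued:
        issued = False
        for e in entries:
            # Wakeup/select: issue any entry whose operands are all present,
            # independent of the order the entries were dispatched in.
            if not e.done and all(v is not None for v in e.srcs.values()):
                value = e.op(*e.srcs.values())
                results[e.dest] = value
                e.done = True
                issued = True
                # "CAM" broadcast: waiting entries compare the result tag
                # against their missing source tags and capture the value.
                for other in entries:
                    if not other.done and e.dest in other.srcs:
                        other.srcs[e.dest] = value
    return results

# Instruction window for: t1 = a + b; t2 = a - b; t3 = t1 * t2
window = [
    Entry("t1", lambda x, y: x + y, ["a", "b"]),
    Entry("t2", lambda x, y: x - y, ["a", "b"]),
    Entry("t3", lambda x, y: x * y, ["t1", "t2"]),
]
print(run_window(window, {"a": 5, "b": 3})["t3"])  # 16
```

Because the window holds only a handful of entries, the tag-matching broadcast stays cheap, which is the sense in which a small, restricted data flow engine remains energy-efficient where a machine-wide one does not.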
Success and Impact
The data flow model laid the foundation for parallel computing, influencing the design of modern parallel architectures and programming paradigms. Although Data Flow Machines faced practical challenges, their principles endure in modern systems, such as functional programming languages and streaming frameworks.
The SDF model further extended this legacy, offering deterministic and analyzable execution that has profoundly impacted signal processing, multimedia applications, and control systems. Modern tools and frameworks, like TensorFlow, Cloud Dataflow, and Apache Beam, draw on these foundational ideas to support parallel and streaming computations.
The data flow model and its extensions, including the Synchronous Data Flow model, represent a critical evolution in computational theory. By enabling parallelism and predictability, these models have significantly influenced both theoretical research and practical applications.
Footnotes
[1] Jack B. Dennis, Data Flow Supercomputing, IEEE Computer, Volume 13, Issue 11, November 1980 paper
[2] Utah Computing History Project, [brochure](https://www.sci.utah.edu/~nathang/utah-history/utah-history-computing.pdf)
[3] A. L. Davis, Architecture and System Method of DDM1: A recursively structured data driven machine, ISCA 1978: Proceedings of the 5th annual symposium on Computer architecture, Pages 210-215 paper
[4] Edward A. Lee, David G. Messerschmitt, Synchronous data flow, Proceedings of the IEEE, Volume: 75, Issue: 9, September 1987, paper
[5] Wen-mei Hwu and Yale N. Patt, HPSm, a High Performance Restricted Data Flow Architecture Having Minimal Functionality, paper