12.1. Introduction

This chapter highlights and addresses architecture-level software development issues facing researchers and practitioners in the field of computer vision. A new framework, or architectural style, called Software Architecture for Immersipresence (SAI), is introduced. It provides a formalism for the design, implementation, and analysis of software systems that perform distributed parallel processing of generic data streams. The SAI style is illustrated with a number of computer-vision related examples. A code-level tutorial provides a hands-on introduction to the development of image stream manipulation applications using Modular Flow-Scheduling Middleware (MFSM), an open source architectural middleware implementing the SAI style.

12.1.1. Motivation

The emergence of comprehensive code libraries in a research field is a sign that researchers and practitioners are seriously considering and addressing software engineering and development issues. In this sense, the introduction of the Intel OpenCV library [2] certainly represents an important milestone for computer vision. The motivation behind building and maintaining code libraries is to address reusability and efficiency by providing a set of standard data structures and implementations of classic algorithms. In a field like computer vision, with its rich theoretical history, implementation issues are often regarded as secondary to the pure research contribution, outside of specialty subfields such as real-time computer vision (see, e.g., [27]). Nevertheless, beyond reducing development and debugging time, good code design and reusability are key to such fundamental principles of scientific research as experiment reproducibility and objective comparison with the existing state of the art.

For example, these issues are apparent in the subfield of video processing/analysis, which has recently become a major area of research, owing to the conjunction of technical advances in enabling hardware (including processors, cameras, and storage media) and high-priority application domains (e.g., visual surveillance [3]). One of the most spectacular side effects of this activity is the amount of test data generated, an important part of which is made public. The field has become so rich in analysis techniques that any new method must almost imperatively be compared to the state of the art in its class to be seriously considered by the community. Reference data sets, complete with ground-truth data, have been produced and compiled for this very purpose (see, e.g., [6, 7, 8, 9]). Similarly, a reliable, reusable, and consistent body of code for established—and "challenger"—algorithms could certainly facilitate the performance comparison task. Some efforts have been made toward developing open platforms for evaluation (see, e.g., [19]). Properties such as modularity contribute to code reuse and to fair and consistent testing and comparison. However, building a platform generic enough not only to accommodate current methods, but also to allow the incorporation of other relevant algorithms, some of them not yet developed, is a major challenge. It is well known in the software industry that introducing features that were not planned for at design time is at best an extremely hard problem, and generally a recipe for disaster.

Software engineering is the field of study devoted to addressing these and other issues of software development in industrial settings. The parameters and constraints of industrial software development are certainly very different from those encountered in a research environment, and software engineering techniques are thus often ill-suited to the latter.

In a research environment, software is developed to demonstrate the validity and evaluate the performance of an algorithm. The main performance aspect on which algorithms are evaluated is, of course, the accuracy of their output. Another aspect of performance is measured in terms of system throughput, or algorithm execution time. This aspect becomes all the more relevant as the amount of data to be processed increases, leaving less and less time for storage and offline processing. The metrics used for this type of performance assessment are often partial. In particular, theoretical complexity analysis is but a prediction tool, which cannot account for the many other factors involved in a particular system's performance. Many algorithms are claimed to be "real-time." Some run at a few frames per second, but "could be implemented to run in real-time," or will simply run faster on the next generation of machines (or the next, etc.). Others have been carefully designed and specialized to allow a high processing rate on constrained equipment. The general belief that increasing computing power can make any system (hardware and software) run faster relies on the hidden or implied assumption of system scalability, a property that cannot and should not be taken for granted. Indeed, the performance of any given algorithm implementation is highly dependent on the overall software system in which it operates (see, e.g., [20]). Video analysis applications commonly involve image input (from file or camera) and output (to file or display) code, whose performance can greatly affect the perceived performance of the overall application and, in some cases, that of the individual algorithms involved in the processing. Ideally, if an algorithm or technique is relevant for a given purpose, it should be used in its best available implementation, on the best available platform, with the opportunity of upgrading either when possible.

As computer vision matures as a field and finds new applications in a variety of domains, the issue of interoperability becomes central to its successful integration as an enabling technology in cross-disciplinary efforts. Technology transfer from research to industry could also be facilitated by the adoption of relevant methodologies in the development of research code.

Although these aspects are touched on in the design of software libraries, a consistent, generic approach requires a higher level of abstraction. This is the realm of software architecture, the field of study concerned with the design, analysis, and implementation of software systems. Shaw and Garlan give the following definition in [26]:

As the size and complexity of software systems increase, the design and specification of overall system structure become more significant issues than the choice of algorithms and data structures of computation. Structural issues include the organization of a system as a composition of components; global control structures; the protocols for communication, synchronization and data access; the assignment of functionality to design elements; the composition of design elements; physical distribution; scaling and performance; dimensions of evolution; and selection among design alternatives. This is the Software Architecture level of design.

They also provide the framework for architecture description and analysis used in the remainder of this chapter. A specific architecture can be described as a set of computational components and their interrelations, or connectors. An architectural style characterizes families of architectures that share some patterns of structural organization. Formally, an architectural style defines a vocabulary of component and connector types, and a set of constraints on how instances of these types can be combined to form a valid architecture.
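To make this definition concrete, the following minimal sketch expresses the classic Pipes-and-Filters style (revisited in Section 12.2) in exactly these terms: a component type (filter), a connector type (pipeline), and a composition constraint (filters exchange data only through the pipe, in linear order). All names here are illustrative only; they are not part of SAI or MFSM.

#include <cctype>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Component type of the style: a Filter transforms one data token.
struct Filter {
    virtual ~Filter() = default;
    virtual std::string process(const std::string& in) const = 0;
};

// Two concrete components drawn from the Filter vocabulary.
struct Uppercase : Filter {
    std::string process(const std::string& in) const override {
        std::string out = in;
        for (char& c : out)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return out;
    }
};
struct Exclaim : Filter {
    std::string process(const std::string& in) const override {
        return in + "!";
    }
};

// Connector type of the style: a Pipeline chains filters linearly.
// The style's constraint (components communicate only through the
// connector, in order) is enforced by construction: appending to the
// pipeline is the only way to compose filters.
struct Pipeline {
    std::vector<std::unique_ptr<Filter>> stages;
    std::string run(std::string token) const {
        for (const auto& f : stages) token = f->process(token);
        return token;
    }
};

int main() {
    Pipeline p;
    p.stages.push_back(std::make_unique<Uppercase>());
    p.stages.push_back(std::make_unique<Exclaim>());
    std::cout << p.run("hello") << '\n';  // prints "HELLO!"
}

Section 12.2 contrasts this style with SAI, which relaxes the linear, synchronous data flow assumed here.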

Although software architecture is a relatively young field of study, software architectures have been developed since the first software was designed. Classic styles have been identified and studied, informally and formally, and their strengths and shortcomings analyzed. A major challenge for the software architect is the choice of an appropriate style when designing a given system, as an architectural style may be ideal for some applications yet ill-suited for others. The goal here is to help answer this question by providing the computer vision community with a flexible and generic architectural model. The first step toward the choice—or the design—of an architectural style is the identification and formulation of the core requirements for the target system(s). An appropriate style should support the design and implementation of software systems capable of handling images, 2D and 3D geometric models, video streams, and various data structures, in a variety of algorithms and computational frameworks. These applications may be interactive and/or subject to real-time constraints. Going beyond pure computer vision systems, the processing of other data types, such as sound and haptics data, should also be supported in a consistent manner, in order to compose large-scale integrated systems such as immersive simulations. Note that interaction has a very particular status in this context, as data originating from the user can be both an input to an interactive computer vision system and the output of a vision-based perceptual interface subsystem.

This set of requirements can be captured under the general definition of cross-disciplinary dynamic systems, possibly involving real-time constraints, user immersion, and interaction. A fundamental computational invariant underlying such systems is the distributed parallel processing of generic data streams. As no existing architectural model could entirely and satisfactorily account for such systems, a new model was introduced.

12.1.2. Contribution

SAI (Software Architecture for Immersipresence) is a new software architecture model for designing, analyzing, and implementing applications performing distributed, asynchronous parallel processing of generic data streams. The goal of SAI is to provide a universal framework for the distributed implementation of algorithms and their easy integration into complex systems that exhibit desirable software engineering qualities such as efficiency, scalability, extensibility, reusability, and interoperability. SAI specifies a new architectural style (components, connectors, and constraints). The underlying extensible data model and hybrid (shared repository and message-passing) distributed, asynchronous parallel processing model allow natural and efficient manipulation of generic data streams, using existing libraries or native code alike. The modularity of the style facilitates distributed code development, testing, and reuse, as well as fast system design and integration, maintenance, and evolution. A graph-based notation for architectural designs allows intuitive system representation at the conceptual and logical levels, while at the same time mapping closely to processes.

MFSM (Modular Flow Scheduling Middleware) [12] is an architectural middleware implementing the SAI style. Its core element, the Flow Scheduling Framework (FSF) library, is a set of extensible classes that can be specialized to define new data structures and processes, or to encapsulate existing ones (e.g., from libraries). MFSM is an open source project, released under the GNU Lesser General Public License [1]. A number of software modules regroup specializations implementing specific algorithms or functionalities. They constitute a constantly growing base of open source, reusable code, maintained as part of the MFSM project. The project also includes extensive documentation, including a user guide, a reference guide, and tutorials.
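The specialization pattern just described can be sketched as follows. This is a deliberately simplified illustration under assumed names (Node, Cell, ImageNode, InvertCell are ours); the actual FSF class hierarchy, interfaces, and threading model differ and are covered in the Section 12.3 tutorial and the online MFSM documentation.

#include <iostream>
#include <vector>

// Hypothetical stand-in for an FSF-style extensible data class.
struct Node {
    virtual ~Node() = default;
};

// Specialized data structure: a grayscale image carried on a stream.
struct ImageNode : Node {
    int width = 0, height = 0;
    std::vector<unsigned char> pixels;  // row-major, one byte per pixel
};

// Hypothetical stand-in for an FSF-style extensible processing class.
struct Cell {
    virtual ~Cell() = default;
    virtual void process(Node& data) = 0;  // invoked once per stream sample
};

// Specialized process: invert a grayscale image in place. An existing
// library routine (e.g., from OpenCV) could be encapsulated here instead.
struct InvertCell : Cell {
    void process(Node& data) override {
        auto& img = static_cast<ImageNode&>(data);
        for (auto& p : img.pixels)
            p = static_cast<unsigned char>(255 - p);
    }
};

int main() {
    ImageNode img;
    img.width = 2; img.height = 1;
    img.pixels = {0, 255};
    InvertCell cell;
    cell.process(img);
    std::cout << int(img.pixels[0]) << ' ' << int(img.pixels[1]) << '\n';  // 255 0
}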

12.1.3. Outline

This chapter is a hands-on introduction to the design of computer vision applications using SAI and their implementation using MFSM.

Section 12.2 is an introduction to SAI. Architectural principles for distributed parallel processing of generic data streams are first introduced, in contrast to the classic Pipes-and-Filters model. These principles are then formulated into a new architectural style. A graph-based notation for architectural designs is introduced. Architectural patterns are illustrated through a number of demonstration projects ranging from single-stream, automatic, real-time video processing to fully integrated distributed interactive systems mixing live video, graphics, and sound. A review of key architectural properties of SAI concludes the section.

Section 12.3 is an example-based, code-level tutorial on writing image-stream processing applications with MFSM. It begins with a brief overview of the MFSM project and its implementation of SAI. The design and implementation of image-stream manipulation applications are then explained in detail, using simple examples based on those in the online MFSM user guide. The development of specialized SAI components is also addressed at the code level. In particular, the implementation of a generic image data structure (the object of an open source module) and its use in conjunction with the OpenCV library in specialized processes are described step by step.

In conclusion, Section 12.4 offers a summary of the chapter and some perspectives on future directions for SAI and MFSM.
