CASSANDRA Framework: A Service Oriented Distributed Multimedia Content Analysis Engine

Jan Nesvadba, Fons de Lange, Johan Lukkien, Alexander Sinitsyn, Andrei Korostelev




Abstract — Connected Consumer Electronics (CE) devices face an exponential growth in storage, processing capability and connectivity bandwidth. This increased complexity calls for new software architectures that naturally admit self-organization, resource management for efficient workload distribution, and transparent cooperation of connected CE terminals. The interconnected nature makes advanced applications feasible, such as the smart distribution of complex multimedia content analysis algorithms, resulting in the automatic creation of semantic content-awareness. In this paper we describe a modular, self-aware, self-organizing, real-time and distributed multimedia content analysis system. Based on a Service-Oriented Architecture, it is furthermore capable of efficiently and dynamically using the available resources in a grid-like manner.1
Index Terms — Multimedia Content Analysis, Service-Oriented Architecture.

I. Introduction


Petabytes of storage, a million MIPS (million instructions per second) at a 1000-Euro price point and several gigabits per second of Internet-Protocol-based (IP) connectivity are realistic technical capabilities of next decade's Consumer Electronics (CE) networks, as described in more detail in [1]. But next to the technology, consumers' behavior also evolves. Today's consumers, for example, gradually change from passive absorbers into active, often internet-community-attached, multimedia content producers. This results in a massive production of individual content-unaware multimedia assets, made easy by ubiquitous and pervasive content-creation CE devices such as camera phones, camcorders, video recorders and PC software. Hence, the enormous memory resources scattered across distributed but interconnected CE devices, combined with the massive acquisition of commercial content and the massive consumer-driven production of personal content, push consumers into the dilemma of multimedia retrieval and management, equivalent to searching for the needle in the haystack.

As a matter of fact, Petabytes of content distributed across a network of memory devices, similar to the incredible amount of memory scattered across the human brain, are only manageable under the condition of (semantic) content-awareness and, to some extent, context-awareness. Hence, content-awareness creation is of utmost importance. Fortunately, the distributed but connected processing and memory faculties of CE networks provide sufficient computational resources to perform the required multimedia content analysis and to memorize the generated content-describing metadata. It is therefore foreseeable that such networks will efficiently share functionality, resources (memory, processing) and content so as to support distributed analysis applications. Hence, connected CE devices will soon attract consumers by smartly interoperating with each other and by integrating with e.g. internet services, providing intuitive, state-of-the-art capabilities and applications.


The underlying domain-specific audio, music, speech, image or video content analysis (expert) systems, each exploiting one sensorial signal in isolation, have been developed during the past decade by various independent expert teams. The majority of these have been developed in isolation from the other expert systems and for one specific application domain only, mainly because of capability constraints. Unfortunately, human-communication-like content-awareness, enabling e.g. human-like search queries, has to be based on smart combinations of those orthogonal multimedia content analysis results, as done automatically by our human brains. In order to enable smart collaboration of those individual expert systems, spanning from music genre analysis to Automatic Speech Recognition (ASR) and from face localization / identification to emotion analysis, they have to be embedded in a flexible, extendable, scalable and upgradeable system. Flexibility, extendibility and scalability are guaranteed through a rigid modular approach, wherein each individual expert system (e.g. an individual analysis component) is encapsulated into a functional unit, further called Service Unit (SU), with clearly defined and standardized Input / Output (I/O) interfaces. The latter guarantee hassle-free upgradeability of individual SU expert systems through simple replacement of the involved analysis algorithms, provided that the metadata I/O formats do not change.
Despite its complex nature, such a system, i.e. a distributed service-oriented analysis engine hosting a multitude of disciplinary-orthogonal analysis algorithms, has to provide ease of use, robustness and interoperability.

In particular, the integration of know-how provided by various disciplinary-orthogonal expert teams and the management of such a system should be time- and effort-efficient. The system's service-oriented, modular nature realized through SUs (as described in section II) and the SUs' standardized I/O interfaces allow algorithm expert teams to efficiently and transparently integrate their algorithms into the system, to effortlessly upgrade (replace) them and, finally, to co-develop combined joint solutions.

Needless to say, the complex nature of such a distributed analysis engine necessitates smart system management solutions to guarantee robustness, especially if the system has to perform 24 hours a day, 7 days a week (24/7).

First of all, auto-discovery of services (Service Units) and auto-configuration of the system are required to establish the distributed analysis engine, using a Use Case Manager as described in section II. To this end, the latter has to be bandwidth-, processing- and memory-aware (resource-aware) at every time instance, and able to dynamically distribute services (Service Units) across the distributed device entities, also called dynamic auto-load balancing. This Resource Management allows resource- and cost-efficient usage of the distributed faculties, which in the exclusive case of shared processing is often called grid computing. Furthermore, dynamic resource awareness provides the system with the self-awareness required to guarantee Quality of Service (QoS).

Hence, self-organization, self-management and interoperability have been achieved by focusing on the network as the meeting point for these functional Service Units, exposing them as services. In the same way, resources and content are made available on the network encapsulated as services. In addition, by using explicit descriptions of these services and by making management handles available for external control, as described in section II, the required self-manageability has been realized.

Finally, to provide sufficient stability and robustness, auto-error detection and auto-error treatment have to be added to the self-management solution. Especially under multi-team, multi-programmer and multi-supplier conditions, the treatment of errors is of utmost importance for 24/7 system solutions. Auto-error detection, auto-error treatment (auto-recovery) and self-healing are described in detail in section III.

The final result is a system in which new analysis algorithms are effortlessly integrated into applications simply by examining their description after exposure on the network, as shown in section V.
In this paper we describe a service-oriented distributed multimedia content analysis prototyping framework, which, in addition, also serves as a demonstration of automatically cooperating devices. The paper is organized as follows. Section II elaborates the service concepts in more detail. In sections III and IV the specific cases of error handling and data management are elaborated. The results are presented in section V and the conclusions are summarized in section VI.

II. The Service-Oriented Architecture



Global Structure
Our aim is to develop a software architecture that facilitates the composition of applications from distributed components. The platform of our applications consists of a network of cooperating devices such as personal computers, consumer electronics equipment and mobile terminals such as telephones. Hence, our applications are naturally distributed over multiple devices and must be designed such that they can function in a changing environment. The internal software of devices comes from different manufacturers, and we must be able to easily integrate new pieces of software of unknown quality. Our software model supports this by describing applications as the coordination of passive services running on devices in the network. In order to explain this, we first look at more traditional ways of viewing distributed applications.

Figure 1 shows a simplified view of a running program, which we call a component. It has four dependencies: services obtained from the network (1), services provided to the network (2), the user interface (3) and the operating system platform (4).




Figure 1. Basic model of a network component.
Each component has dependency (4); a typical client application has dependencies (1) and (3); a typical server has dependency (2); and a stand-alone application has just (3). In our approach we mainly use two types of components. The first are components with dependencies (1) and (2), i.e., components that provide services to the network while possibly using services themselves. These services are made available on the network through interfaces comprising function calls, eventing and connection points (called pins) for streaming data. In our system, called the Cassandra Framework, we call these components Service Units (SUs). Each SU can be run on a particular device, called a Node, while nodes may run several components and component instances at the same time.

Given a set of available passive services, the question is where the control lies. This control pertains to establishing connections between services as well as the control flow within a set of connected services. In a regular client-server application both types of control usually lie at the client. The second type of component in the Cassandra Framework is therefore a coordinator engine. The task of the coordinator is to set up a collection of services according to a given description and to connect them accordingly. This is done, among others, through dynamic service discovery. In terms of Figure 1 it uses just (1), although the services it accesses are not for its own use. Instead, it instructs services to build connections between them, coupling data pins as well as control interfaces. The control flow is then entirely defined by this connectivity pattern and the behavior of the services. In many cases a simple data flow is used, but more advanced behavior is possible as well. Note that the coordinator is actually not part of the application; after setting up the network state it might as well disappear. In the framework we call the coordinator the use-case manager and a composition of services a use-case.
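The role of the coordinator can be pictured with a minimal sketch. The service names, pin identifiers and helper calls below are hypothetical illustrations, not the framework's actual API; they only show how a use-case manager might discover services, wire their data pins and then step out of the data path.

```python
# Minimal sketch of a use-case manager (coordinator); all names are illustrative.

USE_CASE = [
    # (producer service, output pin) -> (consumer service, input pin)
    (("TvCapture", "av_out"), ("ShotDetector", "av_in")),
    (("ShotDetector", "meta_out"), ("CommercialBlockDetector", "meta_in")),
]

class ServiceProxy:
    """Stand-in for a discovered service's control interface (e.g. a UPnP proxy)."""
    def __init__(self, name, host):
        self.name, self.host = name, host
    def open_output_pin(self, pin):
        # Ask the service to open a streaming endpoint; returns a pretend address.
        return (self.host, 9000 + hash(pin) % 1000)
    def connect_input_pin(self, pin, endpoint):
        print(f"{self.name}: {pin} <- {endpoint}")

def discover(name):
    """Pretend service discovery; a real system would use UPnP discovery here."""
    return ServiceProxy(name, host="192.168.0.10")

def instantiate(use_case):
    for (src, out_pin), (dst, in_pin) in use_case:
        producer, consumer = discover(src), discover(dst)
        endpoint = producer.open_output_pin(out_pin)
        consumer.connect_input_pin(in_pin, endpoint)
    # After wiring, the coordinator is no longer needed: data flows service-to-service.

if __name__ == "__main__":
    instantiate(USE_CASE)
```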


For the purpose of setting up a collection of services, each Node runs an instance of the so-called Component Repository service unit. The Component Repository is responsible for maintaining a cache of locally available components and keeping them synchronized with a Master Repository. This is done using UPnP networking, where the Component Repository of each Node functions as a UPnP device, announcing its presence on the network, and where the Master Component Repository Manager is a UPnP control point, capable of discovering all UPnP devices such as the Component Repository of each Node. A possible setup of services and service units is given in Figure 2 below.



Figure 2. Overall system architecture.
The Component Repository starts or stops a component upon a request of the use-case manager. After starting a component, the Component Repository service keeps track of its execution lifetime. To this end, the Master Repository maintains an up-to-date list of all nodes and their configuration, i.e. the available components on these nodes and the services that they implement (Figure 2). Note that the configuration may change dynamically when components are added to or removed from the system.
This service-oriented organization of the system introduces significant benefits.

First, by separating composition and configuration entirely from the services themselves, we postpone the decision of how to use the services. In fact, by improving our coordination software we improve the operation of the system without having to change the services. This concept is used to achieve robustness. The services have a separate interface supporting error monitoring and error reporting; a monitor in the network may use this interface to improve robustness. This is described in section III (“Error detection and treatment”). However, this is not required and the system will function without it. We will proceed to investigate different aspects along the same lines, for example resource management and security, as indicated in the future work section VI.


Second, the architecture makes it straightforward to take decisions based on resource information and performance indicators. In this way load-balancing decisions are made upon startup of a use case. In principle it is just as easy to do re-balancing when a new use case needs to be instantiated.
Third, the integration of new functionality is just as easy as deploying it as a service and exposing it to the network. Although in practice a standard middleware approach will be used to do this, it is not required. The system does not depend on any (OS-)platform or language. The only dependence is through the interfaces found on the network. Hence, the standardization is on the format and semantics of the exchanged messages. This leads to an approach of building new functionality by extension and recomposition rather than reworking existing software. The results section V gives some examples of systems created this way.
Fourth, the architecture supports the cooperation of devices without pre-installed knowledge of the cooperation context. This is an essential ingredient for the advance of Ambient Intelligence, where information is shared across devices without pre-installation and can be accessed and interpreted as if it were centralized and standardized. The applications in our system (the use cases) are in fact scripts that coordinate the services. As such they can be executed anywhere in the system. The data management section (section IV) elaborates somewhat further on this.
Communication styles and performance
Services can be used through interfaces. Two styles of interfaces are used: control-type interfaces comprising function calls and eventing, and streaming-type interfaces enabling only simple data pushing and pulling. Compared to the simple data push/pull interface, the control-type interface is more expressive in terms of actions and data types, but has more performance overhead. In the Cassandra Framework, the control-type communication between services is based on UPnP, while the simple data push/pull communication is based on TCP/IP. In both cases communication is based on ports and IP addresses, but UPnP communication involves more complex protocol handling and time-consuming message interpretation, which is not done in the case of basic TCP/IP streaming communication.
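To make the contrast concrete, the sketch below shows what a plain streaming pin could look like: the producer pushes raw, length-prefixed frames over a TCP socket with no per-message protocol handling, which is what keeps this path cheaper than a UPnP action call. The port number and framing are assumptions for illustration, not the framework's actual wire format.

```python
import socket
import threading
import time

def recv_exact(sock, n):
    """Read exactly n bytes from the socket (or b'' if the peer closed)."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return b""
        buf += chunk
    return buf

def output_pin(port):
    """Producer side: accept one consumer and push frames without interpretation."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # detect dead peers (cf. section III)
    for i in range(3):
        frame = f"frame-{i}".encode()
        conn.sendall(len(frame).to_bytes(4, "big") + frame)
    conn.close(); srv.close()

def input_pin(port):
    """Consumer side: pull frames as an opaque byte stream, no protocol stack in between."""
    sock = socket.create_connection(("127.0.0.1", port))
    while True:
        header = recv_exact(sock, 4)
        if not header:
            break
        print("received", recv_exact(sock, int.from_bytes(header, "big")).decode())
    sock.close()

if __name__ == "__main__":
    t = threading.Thread(target=output_pin, args=(9500,)); t.start()
    time.sleep(0.2)                 # give the producer time to start listening
    input_pin(9500); t.join()
```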

As an example, Figure 3 shows a commercial block detector consisting of several connected services. It explicitly shows the connection entities that take care of the network communication. In our system, these can be UPnP device protocol stacks, TCP/IP sockets or a combination of both. Note that the figure does not show the repository managers and nodes, since these are not relevant to the discussion. Components take care of setting up the UPnP device stack for control type communication and TCP/IP sockets for data push/pull communication.




Figure 3. Commercial block detection use-case and connection points for network communication.
The real-time nature of the stream processing poses an extra challenge for the software architecture. After a use case has been set up, the data that flows between connected services should not pass up and down a complicated software stack for interpretation. Instead, after setup the data should flow smoothly between the functional processing units making up the services. These can be hardware processing units as well (e.g. an encoder or decoder). One way to look at this is that the control part of the system is used to configure the hardware plane perpendicular to it. The simple data push/pull is the best choice for realizing such communication. In Figure 3 the AV streams are best communicated via sockets, while the channel-select and bit-rate-setting clients operate best via a (UPnP) control-type interface. The GUI may use a mix of both.

III. Error Detection and Error Treatment

One of the aims of the Cassandra Framework is to support prototyping of advanced MCA scenarios built from algorithms implemented as black boxes. These algorithms are often developed externally and, as a result, are not guaranteed to be properly tested. A typical Cassandra setup includes over 17 PCs running in total more than 30 Service Units. The knowledge of the overall system state, including SU states, connectivity and resource consumption, is not centrally maintained but is queried from SUs and nodes. The distributed nature of the Cassandra Framework, the decentralization of its management and the vulnerability of network communication lead to an increased probability of failure. These failures are of a partial nature, that is, an individual SU may fail without directly influencing other SUs.

The Service Unit is the smallest autonomous entity in the Cassandra Framework. Each SU provides applications with contractually specified functionality, an implementation of MCA algorithms, referred to as a service. In these terms, the major error an application might encounter is a delivered service failure, that is, when the service deviates from its contract and this misbehavior is observed by the application. Error detection and treatment in Cassandra target the handling of delivered service failures. Because of the strong decoupling in service-oriented systems, we separate a failure of the service from a failure of its delivery.
Fault model

The following breakdowns constitute the fault model of the Cassandra Framework; a minimal sketch of these fault types in code follows the list.



  • Service Unit failure, resulting from data connection breakdowns between SUs or from internal SU errors.

  • Failure of an executable used for deploying SUs, observed as a termination of the executable's process. We will refer to such an executable as a physical component. In Cassandra, physical components are represented by UPnP devices, providing a communication layer between SUs and applications.

  • Failure of a node containing the SU, observed as non-reachability of the node via the network.
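A minimal way to encode this fault model in code (types only; the names are illustrative, not the framework's identifiers):

```python
from enum import Enum, auto

class Fault(Enum):
    """Fault model of the Cassandra Framework, as listed above (names illustrative)."""
    SERVICE_UNIT_FAILURE = auto()        # data connection breakdown between SUs or internal SU error
    PHYSICAL_COMPONENT_FAILSTOP = auto() # hosting executable's process terminated unexpectedly
    NODE_UNREACHABLE = auto()            # node hosting the SU cannot be reached via the network
```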


Error detection architecture

Error detection in the Cassandra Framework is based on monitoring. The error-monitoring model consists of entities monitored for errors (monitored objects) and error-monitoring entities (monitors). With respect to the fault model, there are three types of monitored objects: SUs, physical components containing SUs, and nodes containing running physical components. Accordingly, we have three types of monitors: Service Unit Health Monitor (SHM), Component Health Monitor (CHM) and System Monitor, see Figure 4.




Figure 4. Error monitoring architecture.
Each monitor is attached to objects of the same type, and each monitored object implements an error-monitoring interface to communicate with a monitor. An SHM runs inside each physical component and is responsible for monitoring the SUs residing inside it. A CHM runs on each node and monitors the local physical components. A System Monitor exists as a system-wide singleton and is responsible for monitoring network nodes and collecting error information from Component Health Monitors.
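The three-level hierarchy can be sketched as follows. The class names mirror the text, but the methods and message formats are illustrative assumptions rather than the framework's actual interfaces.

```python
class SystemMonitor:
    """System-wide singleton; aggregates errors and exposes them to interested parties."""
    def __init__(self):
        self.errors = []
    def collect(self, node, error):
        self.errors.append((node, error))

class ComponentHealthMonitor:
    """Runs on each node; watches local physical components and relays SU errors upwards."""
    def __init__(self, system_monitor, node):
        self.system_monitor, self.node = system_monitor, node
    def forward(self, error):
        self.system_monitor.collect(self.node, error)
    def report_component_failstop(self, component):
        self.system_monitor.collect(self.node, ("physical component failure", component))

class ServiceUnitHealthMonitor:
    """Runs inside a physical component; watches the SUs hosted there."""
    def __init__(self, component_monitor):
        self.component_monitor = component_monitor
    def report_su_failure(self, su_name):
        self.component_monitor.forward(("SU failure", su_name))

if __name__ == "__main__":
    sysmon = SystemMonitor()
    chm = ComponentHealthMonitor(sysmon, node="node-3")
    shm = ServiceUnitHealthMonitor(chm)
    shm.report_su_failure("ShotDetector")
    chm.report_component_failstop("asr-engine.exe")
    print(sysmon.errors)
```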

Figure 4 shows the System Monitor as the topmost monitor of the framework. It gathers and analyses error information from the entire system and provides this information to interested parties. These parties might be applications or error treatment services. In the Cassandra Framework a service called Health Keeper is responsible for communicating with the System Monitor and performing subsequent error treatment actions.


The following quantitative parameters are used to control the quality of error detection: detection time, network overhead and detection accuracy.

The overall error detection process follows the way software errors are caught using exception mechanisms. The usage of exceptions as the basic mechanism for detecting errors is motivated by the service-oriented nature of the Cassandra Framework. Like exceptions thrown from functions, a service provided by an SU throws an error message whenever the service's contract is violated. The service orientation of Cassandra further dictates that all inter-service communication errors are to be detected and handled by the Cassandra Framework outside the SUs.

Whereas the overall process of error detection follows the exception mechanisms, individual interactions between monitors and monitored objects are constructed on top of the basic push-pull model considered in [4]. According to this model, in a push interaction a monitored object periodically issues heartbeat messages to its monitor, whereas in a pull interaction the monitor sends periodic liveness requests to the object. On top of this model, custom error-monitoring protocols were designed for error monitoring in Cassandra; for each protocol, its best domain of usage within the monitoring model was identified with respect to the above-mentioned parameters.
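A push-style interaction from this model can be sketched as a periodic heartbeat combined with a timeout on the monitor side; the period and timeout below are arbitrary illustrative values, not the tuned parameters of the framework.

```python
import threading
import time

HEARTBEAT_PERIOD = 1.0    # seconds between heartbeats (illustrative value)
DETECTION_TIMEOUT = 3.0   # monitor suspects a failure after this much silence (illustrative)

class PushMonitor:
    """Monitor side of a push interaction: expects periodic heartbeats."""
    def __init__(self):
        self.last_seen = time.monotonic()
    def heartbeat(self, sender):
        self.last_seen = time.monotonic()
    def suspected_failed(self):
        return time.monotonic() - self.last_seen > DETECTION_TIMEOUT

def monitored_object(monitor, stop):
    """Monitored object side: pushes heartbeats until stopped (simulating a crash)."""
    while not stop.is_set():
        monitor.heartbeat("SU-1")
        time.sleep(HEARTBEAT_PERIOD)

if __name__ == "__main__":
    monitor, stop = PushMonitor(), threading.Event()
    t = threading.Thread(target=monitored_object, args=(monitor, stop)); t.start()
    time.sleep(2); stop.set(); t.join()          # 'crash' the monitored object
    time.sleep(DETECTION_TIMEOUT + 0.5)
    print("failure suspected:", monitor.suspected_failed())   # expected: True
```

The pull variant inverts the roles: the monitor sends liveness requests and times out on missing replies; the trade-off between the two is exactly the detection time versus network overhead mentioned above.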
Error treatment architecture

Error treatment in the Cassandra Framework follows error detection. Backward recovery is used in Cassandra to restore the system state after the occurrence of a failure. Backward recovery was chosen rather than forward recovery because it does not make assumptions about the system architecture. To enable error treatment, a Health Keeper service was added. The Health Keeper communicates with the System Monitor to retrieve error information and performs appropriate recovery actions. The Health Keeper handles the errors listed in the fault model above.


As described in the previous section, in the Cassandra Framework each node runs an instance of the Component Repository service. The Component Repository is responsible for starting and stopping physical components upon a request of a client, typically the use-case manager. After starting a physical component, the Repository keeps track of its execution lifetime. Any running physical component can therefore be stopped in two ways: by issuing an appropriate stop request to the Repository, or externally, that is, bypassing the Component Repository service (e.g. using OS means such as kill(1)). The first type of exit is called intended, the latter unintended. Both types of exit are noticed and separated by the Component Repository service, thus making it possible to distinguish between intended and unintended exits of a physical component. The latter are treated in Cassandra as abnormal fail-stops. This separation of physical component exits is used by the local Component Health Monitors to reveal actual physical component failures and to allow subsequent treatment activities. The following section covers the treatment of the errors defined in the fault model.
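One way to realize this intended/unintended separation is for the repository to remember which components it was asked to stop; any exit it did not request then counts as a fail-stop. The sketch below uses plain OS processes and is an assumption about the mechanism, not the framework's actual implementation.

```python
import subprocess
import sys
import time

class ComponentRepository:
    """Tracks components it started; exits it did not request count as fail-stops."""
    def __init__(self):
        self.procs = {}               # component name -> Popen handle
        self.stop_requested = set()   # names for which an intended stop was issued

    def start(self, name, cmd):
        self.procs[name] = subprocess.Popen(cmd)

    def stop(self, name):
        # Intended exit, requested through the repository.
        self.stop_requested.add(name)
        self.procs[name].terminate()

    def poll(self, notify_failstop):
        for name, proc in list(self.procs.items()):
            if proc.poll() is not None:             # the process has exited
                if name not in self.stop_requested:
                    notify_failstop(name)           # unintended exit: report to the CHM
                del self.procs[name]
                self.stop_requested.discard(name)

if __name__ == "__main__":
    repo = ComponentRepository()
    repo.start("dummy", [sys.executable, "-c", "print('component exits immediately')"])
    time.sleep(1)
    repo.poll(lambda name: print("fail-stop detected:", name))
```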
Error treatment scenarios

Below we consider error handling, which covers all errors from the Cassandra fault model, that is, SU failures, unexpected physical component fail-stops and node unreachability. We start with failures considered irrespective of any data connections (i.e. standalone) that may exist on these SUs. Then we extend these techniques to the handling of errors in the presence of SUs connected across the network.


Detection of standalone errors. The first set of error-handling scenarios includes standalone errors, that is, errors regarded without considering their effect on SUs connected to the failed ones. Figure 5 shows an example of a standalone fail-stop of a physical component.


Figure 5. Standalone physical component fail-stop error.
To separate a normal exit of a physical component from an unintended fail-stop, each CHM communicates with the local Component Repository. When a physical component's fail-stop occurs, the Repository notices it as an unintended exit and notifies the Component Health Monitor. The CHM then notifies the System Monitor, which in turn informs the Health Keeper about the physical component's fail-stop.
Detection of SU failures is performed by the Service Unit Health Monitor attached to the failed SU, which sends a subsequent error message towards the System Monitor through the local CHM. Node failure is observed by the System Monitor as a broken connection with the corresponding Component Health Monitor.
Treatment of standalone errors. After the error message reaches the Health Keeper, the latter can either restart the failed SUs on the node where they actually failed, unless the entire node has failed, or search the system for available SUs that can be launched and have the same service types as the failed SUs. In the Cassandra Framework this search is done by the Health Keeper, which multicasts a UPnP discovery request to the Repositories. After restarting the failed physical components, the Health Keeper restores the SU configurations that existed before the fail-stop, thus recovering the system processing.
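This recovery path can be summarized in a few lines: try the original node first, otherwise fall back to any node that offers the same service type. The data types, function names and the way discovery results are passed in are illustrative assumptions, not the Health Keeper's actual interface.

```python
from collections import namedtuple

FailedSU = namedtuple("FailedSU", "service_type saved_config")
Repo = namedtuple("Repo", "node available_service_types")

def recover(failed_su, failed_node, node_reachable, repositories, restart):
    """Restart a failed SU on its original node if possible, else on another node
    that advertises the same service type ('repositories' stands in for the
    responses to the UPnP discovery request mentioned above)."""
    if node_reachable(failed_node):
        restart(failed_su.service_type, failed_node, failed_su.saved_config)
        return failed_node
    for repo in repositories:
        if failed_su.service_type in repo.available_service_types:
            restart(failed_su.service_type, repo.node, failed_su.saved_config)
            return repo.node
    raise RuntimeError("no node offers service type " + failed_su.service_type)

if __name__ == "__main__":
    su = FailedSU("ShotDetector", {"fps": 25})
    repos = [Repo("node-7", {"ShotDetector", "AsrEngine"})]
    chosen = recover(su, "node-3",
                     node_reachable=lambda n: False,      # pretend node-3 is down
                     repositories=repos,
                     restart=lambda t, n, cfg: print("restart", t, "on", n, "with", cfg))
    print("recovered on", chosen)
```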
Detection of errors in the presence of connected SUs. Another set of scenarios, shown in Figure 6, includes errors observed by at least one of the SUs connected across the network as a failure of a data connection.


Figure 6. Data connection breakdown error.
Because the Cassandra Framework uses TCP to deliver media content and metadata, a failure of a host will cause the connected SU endpoints to consider this a network failure until the host reboots and its TCP stack responds with a so-called RST segment. The same confusion exists between a failure of a remote SU and a failure of the network channel linking the local endpoint and that SU: a failure of a remote SU makes the remote TCP stack send a so-called FIN segment; the same result is observed by a local SU writing data to a broken network channel after its TCP stack gives up retransmitting the segment over the channel. For correct error treatment, it is essential to be able to differentiate between these two types of errors.
We address this separation at the System Monitor level. For simplicity we consider only two SUs connected through the network. Each time this connected pair of Service Units faces an error that occurs either between the SUs (data connection breakdown) or on an endpoint side (SU / physical component / node failure), the System Monitor receives two error messages from the connected endpoints. All possible error-reporting combinations are then: (‘data connection failure’, ‘data connection failure’), (‘SU failure’, ‘data connection failure’), (‘physical component failure’, ‘data connection failure’) and (‘node failure’, ‘data connection failure’). Correct error identification is now possible each time the System Monitor receives a pair of different error messages: it picks the non-data-connection-related failure. For example, when an SU breakdown occurs, the System Monitor receives an ‘SU failure’ message from the SHM attached to the failed SU and a ‘data connection failure’ from the SHM of the connected endpoint. The System Monitor then concludes an SU failure. In the same way, the separation between a physical component's fail-stop or a host breakdown and a network data connection failure is addressed: a physical component fail-stop is reported by the attached Component Health Monitor as described in the first scenario, and a failure of a host is observed as unreachability of the corresponding CHM. Finally, data connection breakdowns as such are revealed by the network subsystem of the SUs, and appropriate error reports are sent towards the System Monitor. Additionally, to speed up the detection of failures of data connections with long periods of inactivity, the connection endpoints (i.e. SUs) have TCP keep-alive enabled.
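The pairing rule itself is simple enough to capture in a few lines: given the two reports from the connected endpoints, the System Monitor keeps the non-data-connection report if one is present, and diagnoses a network failure only when both reports concern the data connection. The message strings are illustrative.

```python
DATA_CONN = "data connection failure"  # report from an endpoint that only sees a broken stream

def diagnose(report_a, report_b):
    """Combine the two endpoint reports for one broken connection into a single diagnosis.
    A non-data-connection report (SU / physical component / node failure) wins;
    two data-connection reports mean the network link itself broke."""
    others = [r for r in (report_a, report_b) if r != DATA_CONN]
    return others[0] if others else DATA_CONN

assert diagnose("SU failure", DATA_CONN) == "SU failure"
assert diagnose(DATA_CONN, "node failure") == "node failure"
assert diagnose(DATA_CONN, DATA_CONN) == DATA_CONN
```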
Treatment of errors in the presence of connected SUs. Handling of errors in the presence of connected SUs is essentially the same as in the standalone case: the Health Keeper looks up in the Component Repositories the available SUs of the failed types and performs the subsequent reconfiguration and restart. In addition to the standalone scenarios, the Health Keeper restores the connections broken as a result of SU failures. When several SUs of the same type are available for launching on different nodes, the choice of a particular node is based on the nodes' current processing loads and on the estimated resource consumption of this SU type.
Results and future improvements

The above scenarios cover all errors from the Cassandra fault model and several techniques for handling these errors. To handle errors transparently for an application, the Health Keeper needs knowledge of the current use case, which can either be reconstructed from the running use case or directly requested from the application. The Health Keeper follows the first option in order to stay decoupled from the application.


Different strategies can be dynamically applied to the Health Keeper to specify treatment activities for each type of error, such as repeating a failed action (time redundancy), using other nodes to restart failed Service Units (physical redundancy), or reducing the level of service provided by failed SUs (QoS). The choice of a particular treatment strategy is currently made manually; ongoing work involves automatically choosing the right strategy based on the success rate of previously applied strategies.
Further analysis of the errors that occur leads to fault prevention mechanisms, which will result in dynamic decisions for choosing a system configuration that is optimal in terms of reliability for certain use cases, as well as in static design recommendations.

IV. Data Management


Challenges

The Multimedia Content Analysis (MCA) services place several key demands on data management. Both acquired and generated metadata as well as multimedia content have to be stored in the system, requiring the ability to download and stream data from and to the system's storage. This enables delayed, sequential or off-line metadata processing. Moreover, it facilitates AV-management-related applications such as search and retrieval.


An alternative to a central-server approach is to keep the participating devices fully autonomous but cooperative. We call a collection of such devices a Device Society. Each device has its own responsibility to deliver certain functionality, and the collection as a whole delivers the environment to run high-level applications. In such a system, most data (i.e. content and metadata) is stored in a distributed fashion and only transmitted across the network and integrated at query time as needed, enabling so-called system-wide local data collection and storage. Such distributed storage is robust and scalable, but data management is no longer easy.
The user and applications should have a single view on the entire data collection and should be able to query the data collection using a high-level query language. Each query should be able to operate over data collected from multiple sources in the system. We can draw an analogy with Google, where a single search query encompasses millions of web pages. Beyond keyword search, our system should support rich queries, which can include arithmetic, aggregation and other database operators. Moreover, queries can be posed anywhere in the system without the need for a central query root.
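As an illustration of the kind of query evaluation this implies (not AmbientDB's actual query engine), the sketch below merges per-node partial results and applies an aggregate at the querying peer; the device names and genre counts are invented.

```python
# Illustrative only: a count-by-genre aggregate evaluated over per-node partial results,
# the way a distributed query might combine them at whichever peer posed the query.
from collections import Counter

node_results = {
    "living-room-recorder": Counter({"news": 12, "sports": 7}),
    "mobile-phone":         Counter({"sports": 3, "music": 9}),
    "home-server":          Counter({"news": 40, "music": 21}),
}

def genre_counts(partials):
    total = Counter()
    for partial in partials.values():
        total += partial          # merge the partial aggregates from each device
    return total

print(genre_counts(node_results))
```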
The pervasive nature of the Device Society raises significant data integrity and privacy concerns. In our MCA system we assume that the entire system is administrated by a single trusted authority. In a real-world deployment, different authorities may control portions of the infrastructure, and service authors might wish to compose services across authority boundaries. The MCA system should support defining and enforcing data integrity and privacy policies. The combination of these policies should determine how data is distributed, processed and shared in the system.

There is also the issue of interoperability: how to cope with connectivity via multiple communication protocols to various types of devices of different sizes from different vendors? How to cope with the exchange of metadata in different formats (e.g. DIDL, MPEG-7/21, MPV)? Furthermore, a per-application metadata set does not suffice; we require a comprehensive schema or ways to merge schemas. The translation of data in one metadata format into another can be easily accommodated by creating detailed mappings from one format to another. The automatic generation of mapping tables is also possible if enough data with the same descriptors in the different formats is provided, but given the limited set of widespread metadata formats it is questionable whether the creation of such an automatic mapper is worthwhile. A bigger problem is how to decide which metadata is required and how to ensure that we can extend these requirements over time in devices that are already deployed. Based on the combination of applications we envisage, we are working on a comprehensive schema describing all data currently used and anticipated, which is sufficient for now and should also be sufficient for the near future. In our current CASSANDRA prototype each service is a UPnP service providing a standardized API accessible by UPnP devices. In this way UPnP control points enable system management and monitoring. Although inside our system the services (SUs) exchange data in a predefined proprietary XML schema, the system can import and export data in the MPEG-7 format in order to be interoperable with the outside world.


AmbientDB

To meet the above challenges, AmbientDB, a peer-to-peer (P2P) data management middleware, is being developed [6]. The concept of AmbientDB is shown in Figure 7.




Figure 7. Representation of the AmbientDB concept. Some applications run on dedicated hardware, as in the case of mobile devices. Some applications run in a distributed fashion. All data sources, including sensors, are logically interlinked, presenting an integrated view on all data through the AmbientDB data management layer.
The goal of AmbientDB is to present unified XML database access across an ad-hoc P2P network, shielding applications from version and schema differences, synchronization, discovery, mobility and access control.
Data lookup. In order to provide efficient indexed lookup, AmbientDB makes use of a Distributed Hash Table (DHT) underneath: a scalable P2P data structure for sharing data among a potentially large collection of nodes, allowing nodes to join and leave without making the network unstable. It uniformly distributes data over all nodes using a hash function, enabling efficient O(log(N)) data lookup. The current AmbientDB prototype uses the Bamboo DHT [9]. To improve scalability in situations where some devices are resource-poor, AmbientDB keeps such devices out of the DHT to prevent overloading them with data they cannot store or with queries they cannot handle. Upon connection, low-resource nodes transfer their data to a resource-rich neighbor that handles queries on their behalf.
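The key idea, placing each item on a node determined by a hash of its key, can be illustrated with a deliberately simplified sketch. This is not the Bamboo API: a real DHT additionally routes lookups to the responsible node in O(log N) hops and handles nodes joining and leaving.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]

def responsible_node(key, nodes=NODES):
    """Map a key to a node via a uniform hash; the routing and churn handling
    of a real DHT such as Bamboo are omitted here."""
    digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

store = {}   # {node: {key: value}} -- stand-in for each node's local storage

def dht_put(key, value):
    store.setdefault(responsible_node(key), {})[key] = value

def dht_get(key):
    return store.get(responsible_node(key), {}).get(key)

dht_put("asset:holiday.mpg:shot-boundaries", [12, 840, 1310])
print(dht_get("asset:holiday.mpg:shot-boundaries"))
```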
Synchronization. The aim of traditional distributed database technology, to provide strict consistency, is not appropriate for P2P database systems. Algorithms such as two-phase locking are too expensive for a large and sparsely connected collection of nodes. Many applications do not need full transactional consistency, but just a notion of eventual convergence of updates. Also, applications often have effective conflict resolution strategies that exploit application-level semantics. Thus, the challenge for a P2P data management system is to provide a powerful formalism in which applications can formulate synchronization and conflict resolution strategies. Our first target is to support applications that use rule-based synchronization expressed as a prioritized set of database update queries [7].
Schema integration and evolution. As devices differ in functionality and make, their data differs in semantics and form. AmbientDB uses table-view-based schema integration techniques [8] to map local schemata to a global schema. AmbientDB itself does not address the automatic construction of such mappings, but aims to provide the basic functionality for applying, stacking, sharing, evolving and propagating such mappings. Providing support for schema evolution within one schema, e.g. such that old devices can cooperate with newer ones, is often forgotten. We foresee that a global certifying entity keeps track of changes in the various sub-schemas, maintaining bi-directional mappings between versions. Schema deltas are certified such that one peer may carry them to the next, without the need for direct communication with a centralized entity.

V. Execution Architecture & Results


The central concept in the framework is the service unit. It represents an independent unit of deployment and is mapped onto the notion of a process in the operating system. The services inside the service unit represent the actual work being done and are mapped to threads in the operating system. The main part of a service is the task, which makes up the functional behavior. The task implements the media processing function and may also come from a third party; it is cast into a service by mapping it in a standard way to UPnP networking. Services are represented on the network through three interfaces: control, data and health monitoring.

The special service unit repository represents the device. By exposing a list of potentially available service units and their resource requirements, as well as the resource properties of the device, it allows an external controller (the use-case manager) to make load-balancing decisions. Through communication with the repository, the use-case manager starts the required service units at the appropriate locations. These can subsequently be found again through regular UPnP discovery. The control interface of the services is then used to set up the connections. The health-monitoring interface is used independently by the local health monitor, as explained in section III.
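A load-balancing decision of this kind can be sketched as a simple filter-and-pick over the advertised figures. The two-dimensional CPU/memory resource model and the selection criterion are assumptions for illustration, not the use-case manager's actual policy.

```python
# Illustrative node selection for placing a service unit, based on the resource
# requirements and node properties exposed by the repositories (model assumed).

def pick_node(su_requirements, nodes):
    """Return the node with the most spare CPU among those that fit the SU, or None."""
    def fits(node):
        return (node["cpu_free"] >= su_requirements["cpu"] and
                node["mem_free"] >= su_requirements["mem"])
    candidates = [n for n in nodes if fits(n)]
    return max(candidates, key=lambda n: n["cpu_free"], default=None)

nodes = [
    {"name": "node-1", "cpu_free": 20, "mem_free": 256},
    {"name": "node-2", "cpu_free": 65, "mem_free": 512},
]
print(pick_node({"cpu": 30, "mem": 300}, nodes)["name"])   # expected: node-2
```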

Data connections must be flexible in that they must buffer some amount of data to deal with fluctuations, and it must be possible to realize new use cases or to repair failing ones without losing data. To that end we use a double-buffering scheme as indicated in Figure 8. A separate synchronization process works to make the buffers at the two sides identical.
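The double-buffering idea of Figure 8 can be sketched as two queues, one on each side of the network, with an independent synchronizer copying items across. The queue sizes and the in-process hand-off (standing in for the actual network transfer) are illustrative assumptions.

```python
import queue
import threading
import time

# Sketch of Figure 8: the producer writes only to its local buffer, the consumer reads
# only from its local buffer, and a separate synchronizer works to make them identical.
producer_buffer = queue.Queue(maxsize=64)   # buffer on the producing side
consumer_buffer = queue.Queue(maxsize=64)   # its duplicate on the consuming side

def synchronizer(stop):
    """Independent process moving data between the two buffer halves
    (a thread and a queue hand-off stand in for the network transfer)."""
    while not stop.is_set() or not producer_buffer.empty():
        try:
            item = producer_buffer.get(timeout=0.1)
        except queue.Empty:
            continue
        consumer_buffer.put(item)

if __name__ == "__main__":
    stop = threading.Event()
    t = threading.Thread(target=synchronizer, args=(stop,)); t.start()
    for i in range(5):
        producer_buffer.put(f"frame-{i}")    # the producing task keeps writing locally
    time.sleep(0.5); stop.set(); t.join()
    while not consumer_buffer.empty():
        print(consumer_buffer.get())         # the consuming task reads from its local copy
```

Because the producer never blocks on the network itself, a broken connection or a repaired use case can resynchronize from the surviving buffer without losing data.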

As mentioned in section III, it is important that after setup the data stream passes more or less uninterpreted between the tasks. Any interpretation between tasks adds to the latency and/or may result in insufficient performance. For applications that are largely control-based, current UPnP stacks offer limited performance. We have experimented with layered services in [5]. As it turns out, the penalty for communication and message interpretation is rather large. Particularly when executed on a regular OS, the overhead of frequent switching turns out to be a limitation. Designing a performance-tuned stack on a real-time platform will significantly improve this. In addition, the use of XML and its subsequent parsing is cumbersome, which may be resolved by using compressed XML or by negotiating more compact protocols.




Figure 8. Buffering in the Cassandra Framework. Tasks produce output to or read from a buffer that is duplicated at both sides of the network. A separate and independent process works towards synchronizing these two ‘halves’.
The system prototype has been realized using a mix of Windows-based and Linux-based PCs. Different UPnP kits have been used, as well as different programming languages. This clearly shows the success of the open design of the framework and the ability to integrate components regardless of operating system and programming language.

Up to 17 PCs have been used, interconnected by switched 100 Mb/s Ethernet. Scenarios of up to 30 service units have been run. Robustness and performance turn out to be excellent, yielding a 24/7 running system.


VI. Conclusions and Future Work


We have presented the architecture of a robust distributed media processing framework that supports cooperation by composition through a Service-Oriented approach. The framework has been tested and shown effective by integrating components on different operating systems and in different languages. Future work is directed at extending the service-oriented approach to other aspects of the system. For example, the current openness is in conflict with security and privacy concerns, which we want to address in a similar way as we did with robustness. The current system works well because of over-dimensioning, particularly of the network. We want to study an approach in which Quality of Service trade-offs must be made. On the one hand this may lead to re-balancing from the perspective of the use-case manager; on the other hand it leads to the need for resource guarantees from the platform on which the service units run.
As an example, Figure 9 shows the Graphical User Interface of the Cassandra Use-Case Manager for an example network of SUs, realizing a complex content analysis system. By means of this GUI, networks can be defined, instantiated, executed, controlled and monitored.

Figure 9. Use Case Manager GUI, showing the execution of a complex network of SUs.


VII. References


  1. J. Nesvadba, “The Vision on Content Analysis Engines for Content based Search Engines and Services”, Int. Conf. UITV, Athens, Greece, 2006, submitted for publication.

  2. C. Fetzer, M. Raynal, and F. Tronel, “An adaptive failure detection protocol”, Proc. of the 8th IEEE Pacific Rim Symp. on Dependable Computing, Seoul, Korea, 2001.

  3. I. Gupta, T. D. Chandra, and G. S. Goldszmidt, “On scalable and efficient distributed failure detectors”, Proc. of the 20th Annual ACM Symp. on Principles of Distributed Computing, Rhode Island, USA, 2001.

  4. N. Hayashibara, A. Cherif, and T. Katayama, “Failure Detectors for Large-Scale Distributed Systems”, Proc. of the 21st IEEE Symp. on Reliable Distributed Systems, Osaka, 2002.

  5. R.J.J. Beckers, “Evaluation of UPnP in the context of Service Oriented Architectures”, Master's Thesis, Eindhoven University of Technology, Computer Science, December 2005.

  6. W.F.J. Fontijn and P.A. Boncz, “AmbientDB: P2P Data Management Middleware for Ambient Intelligence”, Middleware Support for Pervasive Computing Workshop at the 2nd Conference on Pervasive Computing (PERWARE04), pp. 203-207, March 2004.

  7. A. Sinitsyn, “A Synchronization Framework for Personal Mobile Servers”, Middleware Support for Pervasive Computing Workshop at the 2nd Conference on Pervasive Computing (PERWARE04), pp. 208-212, March 2004.

  8. A. Halevy, “Answering queries using views: A survey”, VLDB Journal, 10(4), pp. 270-294, 2001.

  9. The Bamboo DHT, http://bamboo-dht.org/

  10. J. Nesvadba and Y. Sunit, “Feature Point based Parallel Shot Detector”, Proc. IEEE Int. Conf. on Multimedia and Expo, Toronto, Canada, 2006, submitted for publication.

Acknowledgment


Special thanks go to Jan Ypma and Willem Fontijn for their contribution to the framework.
Jan Nesvadba (M’02) studied electrical engineering and electro-biology (thesis: di-electrophoresis of biological cells) at Vienna University of Technology, Austria, and graduated in ’97 (cum laude). Currently, he is finalizing his Ph.D. thesis at LaBRI, University of Bordeaux, France. He joined Philips Research, The Netherlands, in ’98 and worked on Hybrid-Fiber-Coax networks, and since ’99 as a senior scientist on multimedia content analysis algorithms, related smart architectures, the development of related codec ICs and content management solutions for consumer storage devices. Furthermore, he is an active IEEE conference committee member and a reviewer for several related journals.


Fons de Lange has been an embedded system architect at Philips since 1991. He has worked and published on a variety of HW/SW architectures, IC design tools, applications and implementations, including multi-window TV, high-throughput digital video processors, software architecture for digital TV set-top boxes and prototyping of embedded system architectures with PC networks, and holds multiple US patent applications on these topics. Currently, Fons works as a senior research scientist in the Software Architecture group at Philips Research. His current research interests are in information technology for medical systems, with a focus on generic software architectures for multi-modality medical imaging. Dr. Fons de Lange holds an MSc in Electrical Engineering, in particular Computer-Aided IC Design, as well as a PhD in computer science on design methods for parallel processors, both from Delft University of Technology.



Johan Lukkien is an associate professor in Computer Science at Eindhoven University of Technology, the Netherlands. He obtained an MSc in mathematics and a PhD in Computer Science from Groningen University in 1991. He stayed at Caltech for part of that period and joined Eindhoven University of Technology in 1991. His research focus was parallel computations, in particular large-scale simulations. Currently, he chairs the System Architecture and Networking group. His current research area comprises networked, resource-constrained embedded systems with an emphasis on consumer electronics. Several research projects are currently running in collaboration with Philips Research, with extensive output. Johan Lukkien is chair of the networking track at ICCE-06.



Alexander Sinitsyn is a research scientist at Philips Research Eindhoven, The Netherlands, and a PhD candidate at the University of Amsterdam. His research interests include distributed systems, database systems and data management in peer-to-peer and sensor networks. He received his MSc in computer science from the Belarusian State University of Computer Science.


Andrei Korostelev received an MSc in Mathematics from the Belarusian State University in 1999 and a Professional Doctorate in Engineering in 2003 from the Eindhoven University of Technology (TU/e), The Netherlands. He worked for 3 years as a software engineer for Invention-Machine Corp. in Minsk, Belarus. He currently works as a research scientist at TU/e and Philips Research Eindhoven, the Netherlands. His research interests include distributed architectures (UPnP) with an emphasis on service orientation.



1 J. Nesvadba, F. de Lange and A. Sinitsyn are with Philips Research Eindhoven in The Netherlands (e-mail: {jan.nesvadba, fons.delange, alexander.sinitsyn} @philips.com). J. Lukkien and A. Korostelev are with the Eindhoven University of Technology in The Netherlands (e-mail: {j.j.lukkien, a.v.korostelev} @tue.nl).

