Multimedia Data

Data Mining Trends and Research Frontiers

Jiawei Han, ... Jian Pei, in Data Mining (Third Edition), 2012

Mining Multimedia Data

Multimedia data mining is the discovery of interesting patterns from multimedia databases that store and manage large collections of multimedia objects, including image data, video data, audio data, as well as sequence data and hypertext data containing text, text markups, and linkages. Multimedia data mining is an interdisciplinary field that integrates image processing and understanding, computer vision, data mining, and pattern recognition. Issues in multimedia data mining include content-based retrieval and similarity search, and generalization and multidimensional analysis. Multimedia data cubes contain additional dimensions and measures for multimedia information. Other topics in multimedia mining include classification and prediction analysis, mining associations, and video and audio data mining (Section 13.2.3).
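As a concrete illustration of content-based retrieval and similarity search, the sketch below compares images by quantized color histograms. The histogram-intersection measure and all function names are illustrative choices, not the book's specific method; images are assumed to be available as RGB NumPy arrays.

```python
# Minimal sketch of content-based similarity search over an image collection.
# Assumes images are RGB NumPy arrays; all names are illustrative.
import numpy as np

def color_histogram(image, bins=8):
    """Quantize each RGB channel into `bins` levels and return a normalized histogram."""
    quantized = (image // (256 // bins)).reshape(-1, 3)
    hist, _ = np.histogramdd(quantized, bins=(bins, bins, bins), range=((0, bins),) * 3)
    hist = hist.flatten()
    return hist / hist.sum()

def most_similar(query, collection, k=5):
    """Rank stored images by histogram intersection with the query image."""
    q = color_histogram(query)
    scores = [(name, np.minimum(q, color_histogram(img)).sum())
              for name, img in collection.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]
```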

URL: https://www.sciencedirect.com/science/article/pii/B9780123814791000137

Data Mining and Knowledge Discovery

Sally I. McClean, in Encyclopedia of Physical Science and Technology (Third Edition), 2003

III.F Multimedia Data Mining

Multimedia Data Mining involves processing of data from a variety of sources, principally text, images, sound, and video. Much effort has been devoted to the problems of indexing and retrieving data from such sources, since typically they are voluminous. A major activity in extracting knowledge from time-indexed multimedia data, e.g., sound and video, is the identification of episodes that represent particular types of activity; these may be identified in advance by the domain expert. Likewise domain knowledge in the form of metadata may be used to identify and extract relevant knowledge. Since multimedia contains data of different types, e.g., images along with sound, ways of combining such data must be developed. Such problems of Data Mining from multimedia data are, generally speaking, very difficult and, although some progress has been made, the area is still in its infancy.
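As a toy example of identifying episodes in time-indexed data, the following sketch segments an audio signal into candidate activity episodes by thresholding short-term energy. The window size and threshold are assumptions of the kind a domain expert would normally supply.

```python
# Sketch: segment a time-indexed audio signal into candidate "episodes" of activity
# by thresholding short-term energy. Window size and threshold are assumptions.
import numpy as np

def find_episodes(samples, rate, window_s=0.5, threshold=0.01):
    """samples: NumPy array of amplitudes; returns (start_time, end_time) pairs
    where the mean squared amplitude exceeds the threshold."""
    window = int(rate * window_s)
    episodes, start = [], None
    for i in range(0, len(samples) - window, window):
        active = np.mean(samples[i:i + window] ** 2) > threshold
        if active and start is None:
            start = i / rate
        elif not active and start is not None:
            episodes.append((start, i / rate))
            start = None
    if start is not None:
        episodes.append((start, len(samples) / rate))
    return episodes
```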

URL: https://www.sciencedirect.com/science/article/pii/B0122274105008450

Hyper-Media Databases

William I. Grosky, in Encyclopedia of Information Systems, 2003

II. The Characteristics of Multimedia Data

Multimedia data are quite different from standard alphanumeric data, both from a presentation and from a semantics point of view. From a presentation viewpoint, multimedia data are very large and have time-dependent characteristics that must be respected for coherent viewing. Whether a multimedia object is preexisting or constructed on the fly, its presentation and subsequent user interaction push the boundaries of standard database systems. From a semantics viewpoint, the metadata and information extracted from the contents of a multimedia object are quite complex and affect both the capabilities and the efficiency of a multimedia database system. How this is accomplished is still an active area of research.

Multimedia data consist of alphanumeric, graphics, image, animation, video, and audio objects. Alphanumeric, graphics, and image objects are time-independent, while animation, video, and audio objects are time-dependent. Video objects, being a structured combination of image and audio objects, also have an internal temporal structure that imposes various synchronization conditions. A single frame of NTSC-quality video requires (512 × 480) pixels × 8 bits/pixel ≈ 246 KB, while a single frame of HDTV-quality video requires (1024 × 2000) pixels × 24 bits/pixel ≈ 6.1 MB. Thus, at about 30 frames per second and a 100:1 compression ratio, an hour of HDTV-quality video would take roughly 6.6 GB of storage, not even considering the audio portion. Utilizing a database system for presentation of a video object is quite complex if the audio and image portions are to be synchronized and presented in a smooth fashion.
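These figures follow directly from the frame dimensions, bit depth, frame rate, and compression ratio; the small calculation below simply reproduces them, assuming 30 frames per second.

```python
# Reproduce the uncompressed frame sizes and compressed storage estimate quoted above.
BITS_PER_BYTE = 8

ntsc_frame_bytes = 512 * 480 * 8 / BITS_PER_BYTE       # NTSC frame, 8 bits/pixel
hdtv_frame_bytes = 1024 * 2000 * 24 / BITS_PER_BYTE    # HDTV frame, 24 bits/pixel

fps, seconds_per_hour, compression = 30, 3600, 100
hdtv_hour_bytes = hdtv_frame_bytes * fps * seconds_per_hour / compression

print(f"NTSC frame: {ntsc_frame_bytes / 1e3:.0f} KB")          # ~246 KB
print(f"HDTV frame: {hdtv_frame_bytes / 1e6:.1f} MB")          # ~6.1 MB
print(f"1 hour HDTV at 100:1: {hdtv_hour_bytes / 1e9:.1f} GB") # ~6.6 GB
```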

Besides their complex structure, multimedia data require complex processing in order to extract semantics from their content. Real-world objects that appear in images, video, animations, or graphics, or that are discussed in audio, participate in meaningful events whose nature is often the subject of queries. Using state-of-the-art approaches from the fields of image interpretation and speech recognition, it is often possible to extract information from multimedia objects that is less complex and voluminous than the multimedia objects themselves and that gives some clues as to the semantics of the events these objects represent. This information consists of objects called features, which are used to recognize similar real-world objects and events across multiple multimedia objects.

How the logical and physical representations of multimedia objects are defined and how they relate to each other, as well as which features are extracted from these objects and how this is accomplished, fall within the domain of multimedia data modeling.

URL: https://www.sciencedirect.com/science/article/pii/B0122272404000897

A Unified Data Model for Representing Multimedia, Timeline, and Simulation Data

John David N. Dionisio, Alfonso F. Cárdenas, in Readings in Multimedia Computing and Networking, 2002

Queries with Multimedia Predicates

Multimedia data can be used in MQuery not only as query results but also as participants in the actual predicates. The queries below show predicates that cannot be answered solely from alphanumeric information.

Query 6.

Obtain the sex, age, and doctor of all patients with tumors similar in shape to the tumor currently being viewed.

Query 7.

Locate other treated lesions in the database that are similar to the current case with respect to size, shape, intensity, and growth or shrinkage rate.

Query 8.

Does the lesion overlap any of the activated areas from the functional MRI study?

Query 6 and Query 7 can be expressed and answered by the current prototype, although better techniques for answering Query 7 are being investigated. Query 8 can be expressed but not answered by the current prototype.
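MQuery's own syntax is not reproduced here. Purely as an illustration, the sketch below shows one way a shape-similarity predicate such as the one in Query 6 might be evaluated, using OpenCV's Hu-moment-based shape distance; the patient schema, threshold, and function names are hypothetical.

```python
# Hypothetical evaluation of a shape-similarity predicate (cf. Query 6):
# find patients whose tumor mask has a shape similar to the currently viewed tumor.
# Uses OpenCV's Hu-moment shape distance; schema and threshold are assumptions.
import cv2

def similar_shape(mask_a, mask_b, max_distance=0.1):
    """True if two binary tumor masks are close under Hu-moment shape matching."""
    return cv2.matchShapes(mask_a, mask_b, cv2.CONTOURS_MATCH_I1, 0.0) < max_distance

def query_6(current_tumor_mask, patients):
    """patients: iterable of records with .tumor_mask, .sex, .age, .doctor (hypothetical)."""
    return [(p.sex, p.age, p.doctor) for p in patients
            if similar_shape(current_tumor_mask, p.tumor_mask)]
```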

URL: https://www.sciencedirect.com/science/article/pii/B9781558606517501212

Multimedia Systems: Content-Based Indexing and Retrieval

Faisal Bashir, ... Dan Schonfeld, in The Electrical Engineering Handbook, 2005

6.1 Introduction

Multimedia data, such as text, audio, images, and video, are rapidly evolving as the main avenues for the creation, exchange, and storage of information in the modern era. Primarily, this evolution is attributed to rapid advances in the three major technologies that determine the data's growth: VLSI technology that is producing greater processing power, broadband networks (e.g., ISDN, ATM) that are providing much higher bandwidth for many practical applications, and multimedia compression standards (e.g., JPEG, H.263, MPEG, MP3) that enable efficient storage and communication. The combination of these three advances is spurring the creation and processing of increasingly high-volume multimedia data, along with efficient compression and transmission over high-bandwidth networks. This current trend toward the removal of any conceivable bottleneck for those using multimedia data, from advanced research organizations to home users, has led to the explosive growth of visual information available in the form of digital libraries and online multimedia archives. According to a press release by Google Inc. in December 2001, the search engine offers access to over 3 billion Web documents and its Image search comprises more than 330 million images. Alta Vista Inc. has been serving around 25 million search queries per day in more than 25 languages, with its multimedia search featuring over 45 million images, videos, and audio clips.

This explosive growth of multimedia data accessible to users poses a whole new set of challenges relating to data storage and retrieval. The current technology of text-based indexing and retrieval implemented for relational databases does not provide practical solutions for the problem of managing huge multimedia repositories. Most commercially available multimedia indexing and search systems index the media based on keyword annotations and use standard text-based indexing and retrieval mechanisms to store and retrieve multimedia data. This method of keyword-based indexing and retrieval has many limitations, especially in the context of multimedia databases. First, it is often difficult to describe the content of a multimedia object in human language (e.g., an image having complicated texture patterns). Second, manual annotation of text phrases for a large database is prohibitively laborious in terms of time and effort. Third, since users may have different interests in the same multimedia object, it is difficult to describe it with a complete set of keywords. Finally, even if all relevant object characteristics are annotated, difficulty may still arise due to the use of different indexing languages or vocabularies by different users. As recently as the 1990s, these major drawbacks of searching visual media based on textual annotations were recognized as unavoidable, and this prompted a surge of interest in content-based solutions (Goodrum, 2000). In content-based retrieval, manual annotation of visual media is avoided, and indexing and retrieval are instead performed on the basis of the media content itself. There have been extensive studies on the design of automatic content-based indexing and retrieval (CBIR) systems. For visual media, these contents may include color, shape, texture, and motion. For audio/speech data, contents may include phonemes, pitch, rhythm, and cepstral coefficients. Studies of human visual perception indicate that there exists a gradient of sophistication in human perception, ranging from seemingly primitive inferences (e.g., shapes, textures, and colors), to complex notions of structure (e.g., chairs, buildings, and affordances), to cognitive processes (e.g., recognition of emotions and feelings). Given the multidisciplinary nature of the techniques for modeling, indexing, and retrieval of multimedia data, efforts from many different communities of engineering, computer science, and psychology have merged in the advancement of CBIR systems. But the field is still in its infancy and calls for more coherent efforts to make practical CBIR systems a reality. In particular, robust techniques are needed to develop semantically rich models to represent data; computationally efficient methods to compress, index, retrieve, and browse the information; and semantic visual interfaces that integrate these components into viable multimedia systems.
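The sketch below illustrates the overall shape of a CBIR system as described above: an offline phase extracts feature vectors (color, texture, shape, and so on) and builds an index, and an online phase answers similarity queries against it. The class and the feature extractor are placeholders, not any specific published system.

```python
# Sketch of a content-based indexing and retrieval (CBIR) pipeline: an offline phase
# extracts feature vectors and builds an index; an online phase answers similarity
# queries. The feature extractor is a placeholder supplied by the caller.
import numpy as np

class CBIRIndex:
    def __init__(self, extract_features):
        self.extract_features = extract_features   # e.g., a color/texture/shape descriptor
        self.ids, self.vectors = [], []

    def add(self, media_id, media_object):         # offline indexing
        self.ids.append(media_id)
        self.vectors.append(self.extract_features(media_object))

    def query(self, media_object, k=10):           # online retrieval
        q = self.extract_features(media_object)
        dists = np.linalg.norm(np.asarray(self.vectors) - q, axis=1)
        order = np.argsort(dists)[:k]
        return [(self.ids[i], float(dists[i])) for i in order]
```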

This chapter reviews the state-of-the-art research in the area of multimedia systems. Section 6.2 reviews storage and coding techniques for different media types. Section 6.3 studies fundamental issues related to the representation of multimedia data and discusses salient indexing and retrieval approaches introduced in the literature. For the sake of compactness and focus, this chapter reviews only CBIR techniques for visual data (i.e., images and videos); for a review of systems for audio data, readers are referred to Foote (1999).

URL: https://www.sciencedirect.com/science/article/pii/B9780121709600500323

Database Archiving Overview

Jack E. Olson, in Database Archiving, 2009

Multimedia and other complex documents

Multimedia data refers to pictures, videos, and images scanned into a computer. It is not unusual for an organization to have a requirement that these documents be kept for long periods of time. For example, a casualty insurance company could require that field agents photograph automobile accidents or storm damage to houses and keep the images in archives for many years. Most likely, the requirement would involve using digital cameras, with the resulting computer files typed according to their representation format. Many such formats for pictures, scanned images, and videos exist across the universe of possible input sources.

Other examples of complex documents are digitized X-rays or the output of magnetic resonance imaging (MRI) devices. Specialized presentation programs are needed to display the archived documents. This technology and the programs that manipulate the electronic representation of these documents are evolving as well, requiring that the archivist understand both the newest and the oldest processing programs needed and make sure that they are available.

These types of objects may also have special search algorithms that can locate elements of images by reading the content of documents. Such search algorithms can become unusable if the number of documents to be searched becomes too large. An archiving solution may consider pre-searching objects as they are placed in the archive and building external indexes of common characteristics.
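One way to realize this pre-searching idea is to extract a few common characteristics as each object enters the archive and keep them in a small external index that remains searchable without touching the archived media. The SQLite schema and field names below are purely illustrative.

```python
# Illustrative external index built as objects are archived: a small SQLite table of
# common characteristics (type, capture date, ingest-time labels) that can be searched
# without scanning the archive itself. Schema and fields are assumptions.
import sqlite3

conn = sqlite3.connect("archive_index.db")
conn.execute("""CREATE TABLE IF NOT EXISTS media_index (
    archive_path TEXT PRIMARY KEY,
    media_type   TEXT,
    captured_on  TEXT,
    labels       TEXT   -- comma-separated tags produced at ingest time
)""")

def index_on_ingest(archive_path, media_type, captured_on, labels):
    conn.execute("INSERT OR REPLACE INTO media_index VALUES (?, ?, ?, ?)",
                 (archive_path, media_type, captured_on, ",".join(labels)))
    conn.commit()

# Later, locate candidate documents without opening the archived media:
# conn.execute("SELECT archive_path FROM media_index WHERE labels LIKE ?", ("%storm damage%",))
```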

URL: https://www.sciencedirect.com/science/article/pii/B9780123747204000017

Querying Multimedia Presentations Based on Content

Taekyong Lee, ... Gultekin Özsoyoglu, in Readings in Multimedia Computing and Networking, 2002

1 INTRODUCTION

Multimedia data is a combination of video, audio, text, graphics, still images, and animation data. A multimedia presentation is a synchronized and, possibly, interactive delivery of multimedia data to users. Multimedia presentations are used extensively in many applications such as computer-aided training, computer-aided learning, World Wide Web sites, product demonstrations, document presentations, online books, and electronic encyclopedias.

Presently, multimedia presentations are created using commercial multimedia authoring tools and stored on persistent media such as CDs. Recently, commercial multimedia authoring tools have added database access or a database front end to let users access media files and clip libraries. However, to the best of our knowledge, the interaction between a multimedia authoring tool and a multimedia database is loose, and the database is used for only very basic purposes. We believe that multimedia presentations should be managed by multimedia databases and queried by an integrated presentation query language to allow users to store or select multimedia presentations with respect to their content. The content of a multimedia presentation consists of the contents of the individual multimedia streams in the presentation, as well as the playout order of streams, which describes how the information is presented. Playout order is an important part of content, especially for querying purposes.

In this paper, we assume that multimedia presentations are created and stored in the form of multimedia presentation graphs, which can be viewed as high level abstractions for multimedia presentations. Simply, a multimedia presentation graph specifies the playout order of various types of streams making up the multimedia presentation, i.e., it is a visual specification of a presentation plan. Using a graph model for presentations, this paper discusses languages for querying multimedia presentation graphs.

Each node of a presentation graph is a media stream, and a directed edge between two nodes specifies the playout-time precedence relationship between the corresponding streams. Fig. 1 gives an example of a simple presentation graph entitled "National Geography" consisting of video streams "Four Seasons of Yellowstone", "Yellowstone", "Wildlife", "Landscapes", "Forests", and "Next week in National Geography", and audio streams "Promo Song" and "Four Seasons". In this paper, we illustrate our contributions using only the video multimedia data type.

Fig. 1. A multimedia presentation graph.

A multimedia video stream consists of a sequence of video frames, each of which is associated with some content information, namely, a set of content objects and content relationships among its content objects. So, our object-oriented data model includes presentation graph, stream, frame and content-object classes whose objects represent, respectively, multimedia presentation graphs, multimedia streams, video frames, and content objects. Presentation node is also a class and inherits attributes of the stream class.
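A minimal rendering of this data model is sketched below: presentation graphs whose nodes are streams, streams made of frames, frames carrying content objects, and a presentation node class inheriting from the stream class. The attribute names are assumptions, not the paper's exact schema.

```python
# Minimal rendering of the data model described above. Attribute names are assumptions.
from dataclasses import dataclass, field

@dataclass
class ContentObject:
    name: str                                    # e.g., "bison", "geyser"

@dataclass
class Frame:
    objects: list = field(default_factory=list)  # content objects visible in this frame

@dataclass
class Stream:
    title: str
    media_type: str                              # "video" or "audio"
    frames: list = field(default_factory=list)

class PresentationNode(Stream):                  # presentation node inherits stream attributes
    pass

@dataclass
class PresentationGraph:
    nodes: dict = field(default_factory=dict)    # title -> PresentationNode
    edges: set = field(default_factory=set)      # (a, b): stream a plays before stream b

    def next_nodes(self, title):
        """Nodes that may play immediately after `title` (a simple Next relation)."""
        return [b for (a, b) in self.edges if a == title]
```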

To query multimedia presentation graphs, this paper discusses GVISUAL, which is an icon-based object-oriented graphical query language. For comparison, we also introduce GOQL, an OQL-like language with constructs to create and manipulate presentation graphs. We also present GCalculus, which forms a formal basis for GVISUAL.

All three presentation graph languages are developed for querying presentation graphs using the temporal operators Next, Connected, and Until, and path formulas. Paths of a multimedia presentation graph are specified using computation tree logic [21] (extended with different semantics for path quantification [34]) and temporal operators of propositional linear temporal logic [21], [55] (extended with node and path variables). Path formulas can specify not only paths constructed from the nodes of a presentation graph, but also content changes among frames of streams and hierarchical relationships between a stream and its contents. We use temporal logic formulas for path formulas and introduce node and path variables into temporal logic formulas in order to identify distinct paths and to refer specifically to nodes in multimedia presentation graphs.

GVISUAL is a graphical icon-based query language for the graph data model and extends VISUAL [11]. For querying multimedia presentation graphs (or graphs in general), new constructs of GVISUAL that are graphical representations of temporal operators allow users to express relationships between nodes, edges, and paths along presentation graphs.

The rest of the paper is organized as follows: The data model for multimedia presentation graphs is given in Section 2. VISUAL is briefly reviewed in Section 3. GVISUAL and GOQL are discussed in Section 4. Section 5 presents GCalculus. In Section 6, we briefly discuss the expressive power and user friendliness of GVISUAL. Section 7 summarizes the GVISUAL implementation effort and the ongoing work on query processing in GVISUAL. Section 8 provides a discussion on related work. Section 9 concludes.

URL: https://www.sciencedirect.com/science/article/pii/B9781558606517501224

Analyzing qualitative data

Jonathan Lazar, ... Harry Hochheiser, in Research Methods in Human Computer Interaction (Second Edition), 2017

11.5 Analyzing Multimedia Content

Multimedia data has become prevalent in our daily life thanks to the rapid advances in affordable portable electronic devices and storage technologies. Researchers can collect a large quantity of image, audio, and video data at fairly low cost. Multimedia information such as screen shots, cursor movement tracks, facial expressions, gestures, pictures, sound, and videos provide researchers an amazingly rich pool of data to study how users interact with computers or computer-related devices.

Multimedia information also presents substantial challenges for data analysis. In order to find interesting patterns in the interactions, the image, audio, and video data need to be coded for specific instances (i.e., a specific gesture, event, or sound). Without the support of automated tools, the researcher would have to manually go through hours of audio or video recordings to identify and code the instances of specific interest. This process can be extremely time-consuming, tedious, and in many cases, impractical.

The basic guidelines for analyzing text content also apply to multimedia content. Before you start analyzing the data, you need to study the literature and think about the scope, context, and objective of your study. You need to identify the key instances that you want to describe or annotate. After the analysis, you need to evaluate the reliability of the annotation. If a manual annotation approach is adopted, it may be a good idea to select a subset of the entire data set for analysis due to high labor cost. For example, Peltonen et al. (2008) picked eight days of data from a study that lasted for 1 month. They first automatically partitioned the video footage into small "sessions," then manually coded the information in which they were interested (the duration of interaction, the number of active users, and the number of passive bystanders).

Another application domain related to multimedia content analysis is the online search of media content. There is a huge number of images, videos, and audio clips on the web. Users frequently go online to search for images, videos, or audio materials. Currently, most multimedia search is carried out through text-based retrieval, which means that the multimedia materials have to be annotated or labeled with appropriate text. So far, annotation can be accomplished through three approaches: manual annotation, partially automated annotation, and completely automated annotation.

Considering the huge amount of information that needs to be annotated, the manual approach is extremely labor intensive. In addition, it can also be affected by the coder's subjective interpretation. The completely automated approach is less labor intensive. However, due to the substantial semantic gap between the low-level features that we can currently extract automatically and the high-level concepts that are of real interest to the user, existing automatic annotation applications are highly error prone (e.g., many images that have nothing to do with cats may be annotated with "cat"). A more recent development in this field is the partially automated approach. Human coders manually annotate a subset of the multimedia data. Then the manually coded data is used to train the application to establish the connection between the low-level features and the high-level concept. Once a concept detector is established, the detector can be used to automatically annotate the rest of the data (Rui and Qi, 2007). The same approach can be applied to images, video, and audio clips.
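The following sketch outlines this partially automated workflow: a concept detector is trained on the manually labeled subset of low-level feature vectors and then annotates the remaining data. scikit-learn is used purely for illustration, and the feature extraction step is assumed to have been done already.

```python
# Sketch of partially automated annotation: train a concept detector on a manually
# labeled subset of feature vectors, then annotate the rest. scikit-learn is used
# for illustration only; features are assumed to be precomputed.
from sklearn.linear_model import LogisticRegression

def train_concept_detector(labeled_features, labels):
    """labeled_features: (n, d) low-level features; labels: 1 if the concept is present."""
    detector = LogisticRegression(max_iter=1000)
    detector.fit(labeled_features, labels)
    return detector

def annotate_rest(detector, unlabeled_features, concept="cat", threshold=0.8):
    """Attach the concept label only where the detector is reasonably confident."""
    probs = detector.predict_proba(unlabeled_features)[:, 1]
    return [concept if p >= threshold else None for p in probs]
```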

The techniques for multimedia content analysis are built on top of multiple domains including image processing, computer vision, pattern recognition and graphics. One of the commonly adopted approaches used by all those fields is machine learning. The specific algorithms or techniques of multimedia content analysis are still seeing dramatic advances. For more detailed information on those topics, see publications in the related fields (Hanjalic et al., 2006; Sebe et al., 2007; Divakaran, 2009; Ohm, 2016). The specific applications that are particularly interesting to the HCI field include action recognition and motion tracking (Zhu et al., 2006; Vondrak et al., 2012), body tracking (Li et al., 2006), face recognition, facial expression analysis (Wu et al., 2006; Wolf et al., 2016), gesture recognition (Argyros and Lourakis, 2006), object classification and tracking (Dedeoğlu et al., 2006; Guo et al., 2015), and voice activity detection (Xue et al., 2006). A substantial number of studies have focused on automatic annotation and management of images.

In addition to the automatic annotation applications, a number of other tools have been developed to facilitate the process of multimedia content analysis. Dragicevic et al. (2008) developed a direct manipulation video player that allows a video analyst to directly drag and move the object of interest in the video to specific locations along their visual trajectory. Wilhelm et al. (2004) developed a mobile media metadata framework that enables image annotation on a mobile phone as soon as a picture is taken. The unique feature of this system is that it guesses the content of the picture for the purpose of reducing the amount of text entry needed during the annotation. Kandel et al. (2008) proposed the PhotoSpread system, which allows users to organize and analyze photos and images via an easy-to-use spreadsheet with direct manipulation functions. Applications that support content visualization for easy data sharing and analysis have also been developed (Cristani et al., 2008). The ChronoViz tool supports playback and review of multiple, synchronized streams of multimedia data (Fouse et al., 2011).

Techniques for automatic annotation still need substantial advances in order to achieve reliable coding. The applications that facilitate manual coding have shown promising results, but further work is needed to improve the usability and reliability of those systems.

URL: https://www.sciencedirect.com/science/article/pii/B978012805390400011X

Pervasive Sensing and Monitoring for Situational Awareness

Sharad Mehrotra, ... Carolyn Talcott, in Handbook on Securing Cyber-Physical Critical Infrastructure, 2012

20.1 Introduction

Advances in sensing and multimedia data capture technologies, coupled with mechanisms for low-power wireless networking, have made it possible to create deeply instrumented cyber-physical spaces. Embedded sensors and data capture devices in such environments make it possible to digitally capture the state of the evolving physical systems and processes, which can then be used to gain situational awareness of the activities in the instrumented space. Situational awareness, in a broad sense, refers to a continuum of knowledge that ranges from the current state of the observed physical environments to projected future states of those environments. Such awareness is created through processing of data from the sensed environment. Deeply instrumented physical spaces generate sensor data that is used to create digital representations of the physical world, which can then be used to implement new functionalities or improve existing ones, and to adapt the configuration of the system itself; we refer to such cyber-physical spaces as sentient spaces. Sentient spaces embody the reflective design principle of "observe-analyze-adapt," wherein a system continuously observes its own state and adapts its behavior accordingly. Such adaptations may be at the system level (e.g., adjustment of network parameters to enable more effective information collection) or at the application level, to achieve new functionalities or to optimize overall application goals (e.g., automated control of devices based on user behavior to conserve energy). Examples of sentient space applications in the infrastructure security domain include:

Surveillance systems for critical infrastructures such as ports and nuclear facilities or societal spaces such as malls, schools, and buildings and

Emergency response systems that provide incident situational awareness during unexpected disasters such as fires and floods.

Sentient spaces offer unprecedented opportunities to bring IT-driven adaptations and control to a variety of societal systems in application domains such as energy management, building design, transportation, avionics, agriculture, water management, and infrastructure lifelines.

The goal of this chapter is to identify fundamental challenges in building large-scale sentient spaces. Before we discuss challenges and describe emerging technological advances to address them, we briefly discuss existing work on data streaming systems and sensor networks.

20.1.1 Stream-Processing Engines and Sensor Networks

Over the past decade, various stream-processing engines (SPEs) such as TelegraphCQ [1], STREAMS [2], S3 [3], Cayuga [4], Aurora [5], Borealis [6], and MedSMan [7] have been proposed in the literature, and many related commercial products have been developed (e.g., S4 by Yahoo). Such systems provide on-the-fly techniques for resolving continuous queries and performing analyses on data streams before (or instead of) storing the streaming data in a database. Such approaches contrast with the traditional database approach, wherein streaming data would first be stored in a database and queried/analyzed later. With the exception of Aurora and Borealis, many stream-processing systems have focused on providing support for SQL-like queries. Examples include CQL [8], MF-CQL [6], TelegraphCQ [1], and TinyDB [9]. These languages extend SQL with window operators, relation-to-stream operators, syntax to specify the sampling period and the lifetime of the sensor network, and even syntax to generate output streams based on the query result. In contrast to the above SQL-style languages, Aurora and Borealis focus on a "Box-and-Arrow" programming model where one describes queries as a graph of operators with a series of parameters. Service-oriented middlewares (SOM) for pervasive spaces such as Gaia [10], Oxygen [11], PICO [12], Scooby [13], and Aura take an approach similar to Aurora and Borealis, where applications are described as graphs of services. Each device in the pervasive/ubiquitous space advertises its capabilities as services. The main challenges then include how to optimally perform QoS-based service discovery and composition [12, 14, 15], proactive and reactive failure resilience [14, 15], and dynamic swapping of services and service graphs.

SPEs usually execute queries on a centralized server and many mechanisms to scale data stream processing to high data rates given memory and CPU constraints have been devised. These include techniques for load shedding (to dynamically adjust stream rates to those manageable by the stream engine) [16], chain scheduling [5], dynamic tuple routing [1, 17], load balancing (to distribute stream processing across multiple processors) [18], and approximate computation (to reduce memory requirements and speed up stream-processing computation). Recently, Yahoo's S4 system has explored an actor-based framework to scale stream processing dynamically by exploiting cloud resources [18].
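As a toy illustration of two of these ideas, window operators and load shedding, the sketch below computes a sliding-window average over a sensor stream and probabilistically drops tuples when the estimated arrival rate exceeds the operator's capacity. The window length, capacity, and drop policy are assumptions, not any particular engine's algorithm.

```python
# Toy illustration of a window operator with load shedding: a sliding-window average
# over a sensor stream, with tuples dropped probabilistically when the estimated
# arrival rate exceeds the operator's capacity. All parameters are assumptions.
import random
from collections import deque

def windowed_average(stream, window_size=30, capacity_hz=100.0):
    """stream yields (timestamp_seconds, value); yields (timestamp, windowed average)."""
    window = deque(maxlen=window_size)
    last_t, rate = None, 0.0
    for t, value in stream:
        if last_t is not None and t > last_t:
            rate = 0.9 * rate + 0.1 / (t - last_t)       # smoothed arrival-rate estimate
        last_t = t
        if rate > capacity_hz and random.random() > capacity_hz / rate:
            continue                                      # shed load: drop this tuple
        window.append(value)
        yield t, sum(window) / len(window)
```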

Although the work on SPEs has focused on scaling stream processing to high data stream rates, research on wireless sensor networks (WSNs) has focused on in-network processing of sensor information, primarily with the goal of minimizing communication to maximize the battery life of sensor nodes. This work includes techniques for improved ad hoc programming of sensor networks via dynamic code upload to each node [19] or providing a database-like view of the sensor network and pushing the execution of relational operators into the WSN nodes [20].

20.1.2 Limitations of Existing Research

Since pervasive sensing and monitoring systems create awareness out of continuous data streams generated at the sensors, many of the techniques for stream processing and sensor networks discussed above are highly relevant to building sentient space applications. Although existing efforts provide effective data processing capabilities over continuous streams of data, they exhibit, in our view, significant limitations as a platform for building sentient spaces. We highlight these challenges below:

Semantic Foundations and Flexible Programming Environments

Pervasive applications deal with diverse sensor types that may generate different types of data at different levels of semantic abstraction. Such heterogeneities make programming pervasive applications very complex, especially if applications are required to explicitly deal with failures, disruptions, timeliness properties under diverse networking and system conditions, and missing or partial information. None of the previous approaches provides the level of abstraction desired for programming sentient spaces; in many cases, they require the application to specify exactly how to answer a query. For instance, SPEs require applications to specify which streams to connect; WSN applications often explicitly specify what information to sense and which sensors to use; and service-oriented approaches require applications to specify which services are needed. New programming abstractions are required for sentient space applications, abstractions that shield application programmers from the heterogeneity of sensors and the low-level details of specific sensor devices, and from having to write defensive code to overcome errors and failures. Such a programming environment will empower application writers to express their higher-level application goals, which are then translated into lower-level sensor-specific programs by the system. Such a framework will also enable effective reasoning about observations and actions to bring about effective adaptations of both the system behavior and the pervasive application.

(1) Scalability: To create situational awareness, pervasive spaces are instrumented with large numbers of heterogeneous multimodal sensors that generate voluminous data streams that must be processed in real time. Techniques to enable accurate and fast processing of relevant data in the presence of communication and computing constraints must be explored. Although techniques developed in the context of SPEs provide a starting point, a semantically enriched representation of sentient spaces provides new opportunities for optimization. We will illustrate one such optimization in the form of semantic scheduling of sensors under network constraints; a deliberately simplified sketch of this idea appears after this list.

(2) Robustness of sensing: The sensing process is inherently unreliable; in addition to sensor and communication errors, pervasive space deployments are unsupervised and often exposed. Such conditions influence the validity of the information being captured, and the resulting uncertainties can propagate to higher-level event processing tasks. Techniques to support robust/trusted situational awareness that will handle small physical perturbations to sensors (e.g., due to wind or tampering), large system failures, and network losses must be designed.

(3) Human-centric deployment issues: In pervasive spaces that monitor and observe human activities and interactions, additional challenges related to wide-scale deployment arise. One such concern is privacy. Although the issue of data privacy has received significant research attention in the context of Internet-based applications (wherein websites store individual-centric data) and in collecting and disseminating electronic medical records, pervasive systems that continuously capture and process information such as location, activity, and interactions using sensing technologies raise additional challenges by introducing new inference channels. This chapter will identify the privacy challenges that arise and summarize the progress that has been made in this context.
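Returning to the scalability challenge above, the sketch below gives a deliberately simplified reading of semantic scheduling of sensors under network constraints: each cycle, sensors are chosen greedily by expected value per unit of bandwidth until the budget is spent. The value and bandwidth figures are assumptions, not measurements from any deployed system.

```python
# Simplified sketch of semantic scheduling: each cycle, greedily choose the sensors whose
# expected value-per-bandwidth is highest until the network budget is spent.
def schedule_sensors(sensors, bandwidth_budget_kbps):
    """sensors: list of dicts with 'name', 'value' (expected utility), 'kbps' (stream cost)."""
    ranked = sorted(sensors, key=lambda s: s["value"] / s["kbps"], reverse=True)
    chosen, used = [], 0.0
    for s in ranked:
        if used + s["kbps"] <= bandwidth_budget_kbps:
            chosen.append(s["name"])
            used += s["kbps"]
    return chosen

# Example: prefer the camera covering the current incident over an idle lobby camera.
plan = schedule_sensors(
    [{"name": "cam_lobby", "value": 0.2, "kbps": 800},
     {"name": "cam_incident_area", "value": 0.9, "kbps": 800},
     {"name": "temp_sensor_12", "value": 0.4, "kbps": 2}],
    bandwidth_budget_kbps=1000)
```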

Although other chapters in this book will include details about specific mechanisms to address many of the above discussed challenges (e.g., techniques to detect events from lower-level sensor information, techniques to deal with uncertainty, data mining mechanisms, etc.), the thesis of this chapter is that much of the desired functionalities should be incorporated in an adaptive middleware environment. This chapter will discuss design principles in the creation of the sensing and monitoring middleware for pervasive spaces that can address the multifaceted challenges of scalability, robustness, and flexibility. It will also discuss the role of formal methods and reasoning in the realization of such a middleware framework. We will discuss technological advances in event-processing architectures that can help develop a wide range of situational awareness applications. We will finally discuss our ongoing efforts in developing such a middleware framework – SATWARE built on top of the Responsphere pervasive instrumented space at UC Irvine.

URL: https://www.sciencedirect.com/science/article/pii/B9780124158153000200

Introduction

Hongjiang Zhang, in Readings in Multimedia Computing and Networking, 2002

By processing, we refer to multimedia data enhancement and manipulation to improve or transform the representation of multimedia in different applications, such as immersive audio, multimedia compression, and content analysis. Processing is essential to both compression and retrieval of multimedia. Multimedia processing has been researched extensively in the signal processing community, and thousands of papers have been published. For this very reason, multimedia data processing technologies are not covered extensively in this book; rather, only a few papers have been selected to provide some background for immersive audio in Chapter 1 and for image and video compression and watermarking in Chapter 2.

URL: https://www.sciencedirect.com/science/article/pii/B9781558606517500863