Research Methods / Research Foundations in Computer

ACM SIGMM (Special Interest Group of ACM on Multimedia) Retreat Report on Future Directions in Multimedia Research

ACM MM
The ACM Multimedia Special Interest Group was created ten years ago. Since that time, researchers have solved a number of important problems related to media processing, multimedia databases, and distributed multimedia applications.

Multimedia Research is Multidisciplinary
Researchers often identify themselves as being in signal processing, computer systems, databases, user interfaces, graphics, vision, or computer networking. The content side of multimedia, whether artistic, entertainment, or educational, must also be considered part of the multimedia research community.

The first goal of the retreat was to identify the themes that unify the multimedia field.

Unifying Theme 1: DEFINITION
A multimedia system or application is composed of more than one medium, and the media are correlated. The media can be discrete (e.g., an image or text document) or time-based (e.g., weather samples collected by a sensor network or a video).

Unifying Theme 2: INTEGRATION AND ADAPTATION
A. Any distributed multimedia application with user interactions must deal with end-to-end performance and user perception. That is, it must deliver dynamic multimedia content, and the content must adapt to the user's environment. For example, content displayed on a PDA might look and behave differently than content displayed on a large projection screen in a classroom or theatre.
B. Ubiquitous interaction with multiple media. That is, a user should be able to enter a room and interact with various devices and sensors in that space. For example, the user's laptop computer or PDA should sense or query the environment to locate cameras, microphones, printers, and presentation projectors, along with the applications available to manage and use them.
C. Using multiple media and context to improve application performance. Early research on multimedia content analysis, summarization, and search focused on one media type (e.g., still image or music archive query) and limited context. Now researchers are exploring systems that use information derived from correlated media and context. For example, executing a query to find information about the election of a state governor might involve identifying segments in TV news programs in which a person shown in the video stream uses the words "election" and "governor" in the audio stream.

Unifying Theme 3: Multimedia applications are multimodal and interactive
The conventional interface to a desktop or laptop computer is being replaced with new interface modalities (e.g., pen, voice, gesture, touch, etc.), multiple devices (e.g., PDAs, tablet computers, projectors with embedded computers connected directly to the network, etc.), and smart spaces. The theme covers both human-computer interaction and communication among humans (e.g., Voice-over-IP, videoconferencing, immersive environments, etc.).

Note that in the past decade, multimedia researchers focused on developing infrastructure to support the capture, storage, transmission, and presentation of multimedia data. Researchers and product developers worked on I/O devices, scheduling algorithms, media representations, compression algorithms, media file servers, streaming and real-time network protocols, multimedia databases, and tools for authoring multimedia titles.
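
As a rough illustration of the correlated-media query described under Theme 2 (finding news segments in which a person is on screen while the words "election" and "governor" are spoken), the sketch below fuses two independently produced annotation streams. It is only a minimal sketch under stated assumptions: the `Interval` type, the `matching_segments` function, and the sample annotations are hypothetical and are not taken from the retreat report, and real systems would obtain these annotations from face detection and speech recognition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time span, in seconds, within one program (hypothetical type)."""
    start: float
    end: float


def matching_segments(face_intervals, spoken_words, required):
    """Return the on-screen intervals during which every required word is spoken.

    face_intervals: list of Interval where a person is visible (video stream).
    spoken_words:   list of (time_in_seconds, word) pairs (audio stream).
    required:       set of lowercase words that must all occur in the interval.
    """
    hits = []
    for seg in face_intervals:
        # Collect the words whose timestamps fall inside this video interval.
        heard = {w.lower() for t, w in spoken_words if seg.start <= t <= seg.end}
        # Keep the interval only if all required words were heard in it.
        if required <= heard:
            hits.append(seg)
    return hits


# Hypothetical annotations for one news program (assumed, not real data).
faces = [Interval(0.0, 12.0), Interval(30.0, 55.0), Interval(70.0, 80.0)]
words = [(3.0, "weather"), (34.5, "election"), (41.2, "governor"), (72.0, "election")]

result = matching_segments(faces, words, {"election", "governor"})
print(result)
```

Neither stream alone answers the query: the audio stream does not know who is on screen, and the video stream does not know what is said. Only their correlation in time identifies the segment.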
Multimedia researchers should now focus on applications that incorporate correlated media, fuse data from different sources, and use context to improve application performance, so that they solve important problems and produce high-quality user experiences.

The second goal of the retreat was to direct future multimedia research toward identifying and delivering applications that impact users in the real world. The retreat suggested that the community focus on solving three grand challenges:
1. To make authoring complex multimedia titles as easy as using a word processor or drawing program,
2. To make interactions with remote people and environments nearly the same as interactions with local people and environments, and
3. To make capturing, storing, finding, and using digital media an everyday occurrence in our computing environment.

Challenge 1: To make authoring complex multimedia titles as easy as using a word processor or drawing program

Justifications:
1. Content authoring is expensive and difficult.
2. Most groups that produce hypermedia content use teams of multiple experts supervised by producers and directors.
3. Specialized tools are used for different media (e.g., a word processor for text, a nonlinear editor for audio and video, an image editing tool for still images, a 3D modeling system for animations, etc.).
4. Producing the title, that is, coding the material and physically publishing it (e.g., pressing a DVD or uploading the material to one or more servers), is time-consuming and complex.
5. Different versions of the title are typically produced for different environments (e.g., TV set-top box, game console, desktop computer, PDA, etc.), which is itself a challenge.

Some excellent tools exist, either for particular media (e.g., Photoshop for images, Dreamweaver for websites, Premiere for audio/video, etc.) or for particular applications (e.g., PowerPoint for presentations, iMovie for home movies, FrameMaker and Word for documents, etc.). But these tools are not integrated, do not encourage content reuse, run on different platforms, and are targeted at different user communities. For example, Photoshop is an excellent tool for graphic design experts, and iMovie is an excellent tool for less sophisticated end users.
Secondly, expert-user tools require too much learning, and end-user tools are typically too restrictive.

Therefore, the multimedia research community should:
1. develop the algorithms, heuristics, and tools that will allow average users to produce compelling multimedia content;
2. develop the systems and tools required to support widespread authoring of multimedia content;
3. develop new user-interface paradigms, software abstractions, media processing algorithms, display presentations and operations for editing media, and media databases that will significantly reduce the effort required to produce high-quality multimedia titles.

Challenge 1: Ordinary people need simple-to-use tools

EXAMPLE 1: A teacher needs tools to prepare educational material that includes video demonstrations to show an object, and simulations and animations to illustrate dynamic behaviors. Good educational material allows students to explore the underlying principles and objects by modifying the input parameters to a simulation and examining related objects.

EXAMPLE 2: A travel agent needs tools to prepare material showing places potential customers might want to visit and experiences they might want to have. It might include live interactions with people at a remote location and links to material authored by someone who has taken the trip. This title might be composed of slide shows, videos, trip summaries, and links to detailed information about places visited, artifacts seen, places to stay, and methods of transportation.

Challenge 2: To make interactions with remote people and environments nearly the same as interactions with local people and environments

This grand challenge incorporates two problems: distributed collaboration, and interactive, immersive three-dimensional environments. Videophones and videoconferencing have been around for a long time, but the telephone is still the dominant medium for remote collaboration.
Why? Among the problems are:
1. The difficulty of setting up and operating the equipment.
2. Limited support for real-time collaborative activities, which makes the interaction feel unnatural.

Justification: The promise of multimodal interaction with remote people, places, and virtual environments can change the way we live, and the supporting technologies already exist. New sensors (e.g., touch, smell, taste, motion, etc.) and output devices (e.g., large immersive displays and personal displays integrated with eyeglasses) offer the opportunity for closer and more sensitive interaction with a remote environment. Continued development of semiconductor technology will bring real-time three-dimensional virtual environments to every computing and communication platform.

The grand challenge is to understand the opportunities these new hardware technologies offer and to develop user interfaces and interaction paradigms that allow seamless communication and interaction with remote and virtual environments.

Possible research problems therefore include exploring the use of multiple streams of data, whether images, sounds, or sensor readings, and developing interaction hardware and software that allow humans to use this data.
For example, users interacting with educational or entertainment programs, whether live sporting events, lecture webcasts, or broadcast programs, want a variety of services that allow them to: locate interesting or important events, view program summaries (e.g., a lecture outline or baseball game summary) with links to detailed segments of the program, skim through interesting stored programs rapidly, record material for viewing at a different time (i.e., time shifting), and view a program on a different platform (e.g., a TV

Challenge 3: To make capturing, storing, finding, and using digital media an everyday occurrence in our computing environment

Justification: The widespread adoption of digital cameras and the emergence of cellphones with built-in video cameras are adding to the information glut. Increases in storage capacity and reductions in cost make it possible to store massive amounts of this data. But how can this information be made useful?

The following scenarios are still problems today:

1. Search an archive of radio broadcasts to find an interview with a particular individual, and a picture archive to find a photo of the person visiting a particular city. Speech-to-text requires context to disambiguate the words being spoken, and identifying where a particular photo was taken might require extensive image analysis or additional context (e.g., the geographic location of the camera at the time the picture was captured). The problem is complicated by the fact that the data in the broadcast archive is not fused with the photo archive.

2. How to find lectures by a particular person published on the web? This problem might be solved by looking at the text associated with a streaming media file published on a web page. However, it may be difficult to identify the text associated with a video clip if the web page is generated dynamically.
Problems also arise because most commercial webcasting systems use proprietary media coding, storage representations, and network packet formats.

3. Who is that person across the room? The idea is to point your cell phone camera at the person and have it tell you the person's name. The obvious solution is to do face matching on the captured image, but this approach might return too many possible matches or take too much time. To restrict the candidate matches to people who might actually be at the event, the system should use the context of the situation (e.g., a holiday party for a company or a workshop at a conference). This scenario involves context, data fusion, and shared databases.

4. Make the billions of hours of home video currently stored in shoeboxes useful. There are no good tools to organize and store it in a form that lets a user say, "show me the shot in which Jay ordered Lexi to get the ball." The solution to this problem may require developing semi-automatic analysis tools coupled with powerful tagging and indexing to organize data so that it is easily accessible using unified indexes.

Past research has addressed multimedia database models, algorithms to analyze media data, and algorithms to search for relevant or interesting data (e.g., query a music archive by humming a tune or find pictures with similar color palettes). The grand challenge is to work on the fundamental algorithms (e.g., query planning, parallel search, media-specific search and restriction, combining partial results, unified indexing, and tagging multimedia data) so that the problem can be solved sufficiently well that a system could be built and deployed that people will use.

Final Challenge: Digital rights management
While this topic is not directly related to multimedia, it will have a dramatic impact on the development and use of content.
Of particular concern are fair-use and educational-use rights, the need to track the source of a media asset, and the need for an economic model to pay content owners and creators.

IEEE Multimedia Community
Research Challenges:
1. Mobile Multimedia
2. Internet/Web/Home Multimedia
3. Gigapixel Multimedia
4. Common Issues

Mobile Multimedia
  Resource-aware yet portable development, including context awareness
  Modeling of multimedia (from sound to graphics), particularly in games
  Common software platform for smart phones
  Others

Internet/Web/Home Multimedia
  Low effort, high reusability
  Next-generation metadata, based on latest-generation recognition, natural language processing, vision, etc.
  More intelligent content repurposing
  Sophisticated multimodal interfaces
  TV as a replacement for the PC at home
  Killer applications
  Others

Gigapixel Multimedia
  Taking cognitive limits into consideration
  Hardware configuration
  Interaction design
  Killer applications
  Others

Common Issues
  Adaptation to networks, devices, and users (with personalization added on top of context awareness)
  Management of content, including maintenance, distribution, and consumption
  Security (encryption, authentication, and control)
  Level of convenience, intelligence, and naturalness for the users

Useful Readings
How to read scientific papers: http://helios.hampshire.edu/~apmNS/design/RES
Steps in Scientific Method: http://www.ldolphin.org/SciMeth2.html
The Scientific Methods: http://www2.selu.edu/Academics/Education/EDF600/Mod3/index.htm

References
Rowe, L. A. and Jain, R. 2005. ACM SIGMM retreat report on future directions in multimedia research. ACM Trans. Multimedia Comput. Commun. Appl. 1, 1 (Feb. 2005), 3-13. DOI= http://doi.acm.org/10.1145/1047936.1047938
Hirakawa, M. 2006. Issues, challenges, and future directions in multimedia research. Proceedings of the 8th IEEE Int. Symposium on Multimedia.

Each student is required to produce a conceptual research paper and to prepare a written critical report on each of FOUR selected research articles. These articles will be the students' own choices. Each critique can count up to 5 points. Each student is required to do a literature review on any one of the given topics (see next slide).

Assessment includes the following elements:
a) Research paper: 25 pts.
b) Oral presentation: 5 pts.

Assignment (30%); due: 7 September 2007
Write a scientific research paper on any of the multimedia research topics or issues discussed earlier.
The form must follow the standard structure of a scientific paper (refer to the ACM proceedings template at http://www.acm.org/sigs/publications/proceedings-templates) and contain a thorough presentation of the subject and relevant findings. The "related work" section must contain an overview of the subject, and the reference list must include a number of central papers and/or books on the subject.

The paper must be written in English and submitted as a Word document. It should consist of: title, authors, contact author, email addresses, abstract, introduction (background, problem definition, summary of contributions, related work), method, result, summary and conclusions, and references. Papers should be about 2500-4500 words with normal text size.

Next Class
On 7 September 2007: research methodology in computer science as it relates to multimedia research.