Crossing Textual and Visual Content in Different Application Scenarios

Ah-Pine, J., et al. “Crossing Textual and Visual Content in Different Application Scenarios.” Multimedia Tools and Applications 42.1 (2009): 31-56.

This article is well outside the scope of my research and borders on irrelevant to it. The article discusses two approaches to text-image information processing in multimodal scenarios. In doing so, the paper is rather thick with the formulas and code behind these methods, by which multimodal documents can be automatically scanned and various types of information (text, image, video, audio, etc.) can be extracted and coded.

However, I draw on this article for a few points that the authors make about the current state of multimodality on the Web and about how we now think differently about the interaction of the visual (image and video) and the textual.

From the introduction:

Information, especially digital information, is no longer monomodal: web pages can have text, images, animations, sound and video; we have audiobooks, photoblogs and videocasts; the valuable content within a photo sharing site can be found in tags and comments as much as in the actual visual content. (31-2).

This situation has been building for quite some time, since the dawn of the Web, really (not to negate pre-Internet multimedia). From textual beginnings, there eventually came the inclusion of graphics, formatted text in which typography becomes a study of the visual, animated GIFs, audio, static video, and interactivity. The text itself is also increasingly interactive, with readers able to provide feedback within Web pages and even to contribute. Thus the reader becomes author. Even the content itself is changing; no longer is it just the thought of one author pushed to the reader. It is now an interactive document, and it is also filled with photos and much personal and social information about the author’s life, interests, relationships, friends, thoughts of the day, location, etc.

“This major shift in the way we access content and what type of content we access is largely due to the connected, easily accessible, global nature of the internet.” (32).

Essentially, we no longer expect to find just an author’s content and perhaps a couple of images. We now expect this multimodal mass of information so that we can know more about both the author and the topic. What is the author’s email address? Can I follow her on Twitter or Facebook? What happens if I Digg her name? What video sites does she use?

In this way, we are actively moving away from the anonymous, cold, distant feel of the Internet and are now thriving on making it as personal and emotionally engaging as possible through the majority of our interactions there. By nature, we seek out that socialization; we thrive on it and require it. It is also why there is so much benefit in video communication, even when other communication methods are available.
