Published online by Cambridge University Press: 13 July 2017
Social and behavioral signals carry invaluable information about how audiences perceive multimedia content. By assessing audience responses, we can generate tags, summaries, and other forms of metadata for multimedia representation and indexing. Tags are a form of metadata that enables a retrieval system to find and re-find content of interest (Larson et al., 2011). Unlike classic tagging schemes, which require users’ direct input, implicit human-centered tagging (IHCT) was proposed (Pantic & Vinciarelli, 2009) to generate tags without any specific input or effort from users. Translating behavioral responses into tags yields “implicit” tags: no direct input from users is needed, because reactions to multimedia are displayed spontaneously (Soleymani & Pantic, 2012).
User-generated explicit tags are not always assigned with the intention of describing the content and might instead be given to promote the users themselves (Pantic & Vinciarelli, 2009). Implicit tags have the advantage of being detected for a specific goal relevant to a given application. For example, an online radio station interested in the mood of its songs can assess listeners’ emotions, and a marketing company can assess the success of its video advertisements.
Implicit tags can also complement existing explicit tags, and they can be used to filter out tags that are not relevant to the content (Soleymani & Pantic, 2013; Soleymani, Kaltwang, & Pantic, 2013). A scheme of implicit tagging versus explicit tagging is shown in Figure 26.1. Industry interest in this topic has been growing recently (Klinghult, 2012; McDuff, El Kaliouby, & Picard, 2012; Fleureau, Guillotel, & Orlac, 2013; Silveira et al., 2013), which is a sign of its significance.
Analyzing spontaneous reactions to multimedia content can assist multimedia indexing in the following scenarios: (i) direct translation to tags – users’ spontaneous reactions are translated into emotions or preferences, e.g., interesting, funny, disgusting, scary (Kierkels, Soleymani, & Pun, 2009; Soleymani, Pantic, & Pun, 2012; Petridis & Pantic, 2009; Koelstra et al., 2010; Silveira et al., 2013; Kurdyukova, Hammer, & André, 2012);