MULTIMODAL MECHANISMS TO ENGAGE VIEWERS IN APPRECIATING VIETNAMESE STREET FOOD IN A FOOD REVIEW VIDEO
DOI:
https://doi.org/10.63506/jilc.0903.357
Keywords:
Multimodal orchestration; verbal; visual; paralanguage; Vietnamese street food; viewer engagement
Abstract
This study investigates how verbal, visual and paralinguistic modes are orchestrated to engage viewers in appreciating Vietnamese street food in the most-viewed YouTube food review video about this cuisine. Using a multimodal discourse analytical framework informed by Systemic Functional Linguistics (SFL), the analysis reveals that meaning emerges not from any single mode but from flexible multimodal orchestration, operating through both intersemiosis (the integration of meaning across modes) and intrasemiosis (the elaboration of meaning within modes). Five key mechanisms drive viewer engagement: creating interpersonal closeness, claiming expert authority, explaining food values, using familiar comparisons and constructing a sequential narrative progression. Together, these strategies meet the genre demands of food review videos, which balance personal connection, credibility, educational clarity, accessibility and narrative flow. The study contributes theoretically by demonstrating the applicability of SFL-based multimodal analysis to video data, and practically by offering content creators insights into how deliberate multimodal coordination enhances audience engagement and cultural appreciation.