Characterizing vaping posts on Instagram by using unsupervised machine learning

Vili Ketonen, Aqdas Malik*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

10 Citations (Scopus)
79 Downloads (Pure)


Electronic cigarette (e-cigarette) use has surged across the globe, particularly among adolescents and young adults. The ever-increasing prevalence of social media makes it highly convenient to access and engage with content on numerous substances, including e-cigarettes. A comprehensive dataset of 560,414 image posts mentioning #vaping (shared from 1 June 2019 to 31 October 2019) was retrieved using the Instagram application programming interface. Deep neural networks were used to extract image features, on which unsupervised machine-learning methods were applied to cluster and subsequently categorize the images. Descriptive analysis of the associated metadata was then conducted to assess the influence of different entities and the use of hashtags within the categories. Seven distinct categories of vaping-related images were identified. The largest share of the images (40.4%) depicted e-liquids, followed by e-cigarettes (15.4%). Around one-tenth (9.9%) of the dataset consisted of photos with person(s). In terms of the number of likes and comments, images portraying person(s) gained the highest engagement. In almost every category, business accounts shared more posts on average than individual accounts. The findings illustrate the high degree of e-cigarette promotion on a social platform popular among youth. Regulatory authorities should enforce policies to restrict product promotion on youth-targeted social media and require measures to prevent underage users' access to this content. Furthermore, a stronger presence of anti-tobacco portrayals on Instagram by public health agencies and anti-tobacco campaigners is needed.
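
The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the network or the clustering algorithm, so everything here is an assumption made for illustration: plain k-means stands in for the unsupervised method, toy 2-D points stand in for the image feature vectors a deep network would produce, and the names `kmeans`, `cluster_shares`, and `toy_features` are hypothetical.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(features, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm); an assumed stand-in, not the paper's method."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(features, k)]
    labels = None
    for _step in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in features
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(features, labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


def cluster_shares(labels):
    """Fraction of items in each cluster, analogous to per-category shares."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


# Toy stand-in for image feature vectors: two well-separated groups.
toy_features = [
    [0, 0], [0, 1], [1, 0], [1, 1],
    [10, 10], [10, 11], [11, 10], [11, 11],
]
labels, centroids = kmeans(toy_features, k=2)
shares = cluster_shares(labels)
```

In a real pipeline the feature vectors would be high-dimensional embeddings, and the number of clusters and their category names would have to be chosen by inspecting the results; the sketch only shows the assign-and-update loop and how per-cluster shares follow from the labels.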

Original language: English
Article number: 104223
Number of pages: 7
Journal: International Journal of Medical Informatics
Publication status: Published - Sep 2020
MoE publication type: A1 Journal article-refereed


  • Adolescents
  • e-Cigarettes
  • Electronic cigarettes
  • Instagram
  • Machine-learning
  • Photos
  • Social media
  • Young adults

