Overview

The Social Media Prediction Dataset (SMPD) consists of two parts: SMPD-Image for social images and SMPD-Video for social videos. SMPD-Image contains 486k social image posts from 70k users, and SMPD-Video is a short-form video dataset with 6k videos from 4.5k users. Both also include anonymized photo-sharing records, user profiles, web images, text, time, location, category, and other fields. SMPD is a multi-faceted, large-scale, temporal web data collection gathered from Flickr and TikTok, which are among the largest photo-sharing and video-sharing platforms, respectively.

Dataset      #Posts   #Users   #Categories   Duration (M)   #Tags
SMPD-Image   486k     70k      756           16             250k
SMPD-Video   6k       4.5k     120           24             40k
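
As a rough illustration of how the summary table can be read, the sketch below stores the two rows in a small record type and derives the average number of posts per user. The class name `SubsetStats`, its field names, and the `posts_per_user` helper are hypothetical and not part of any SMPD release; the counts are the rounded figures from the table, so the derived averages are only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetStats:
    """One row of the summary table (hypothetical container, not an SMPD API)."""
    name: str
    posts: int
    users: int
    categories: int
    duration_m: int  # the "Duration (M)" column, copied as given
    tags: int

    def posts_per_user(self) -> float:
        # Average posts per user, based on the rounded table counts.
        return self.posts / self.users


# Values copied from the table above (k = thousand).
SMPD_STATS = [
    SubsetStats("SMPD-Image", 486_000, 70_000, 756, 16, 250_000),
    SubsetStats("SMPD-Video", 6_000, 4_500, 120, 24, 40_000),
]

for s in SMPD_STATS:
    print(f"{s.name}: about {s.posts_per_user():.1f} posts per user")
```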

Histogram of Labels


Hierarchy for 756 Category Classes

The inner circle shows the first-level categories (11 classes), the second circle shows the second-level categories (77 classes), and the last circle shows the third-level categories (668 classes).
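
The per-level counts can be checked against the 756 category classes named in the heading. The snippet below is a minimal sanity check; the dictionary `level_counts` is a hypothetical name used only for illustration.

```python
# Category counts per hierarchy level, as stated in the text.
level_counts = {"first": 11, "second": 77, "third": 668}

# Summing all three levels should give the 756 classes in the heading.
total = sum(level_counts.values())
print(total)
```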


Photo Tag Cloud

This tag cloud shows all of the custom tags provided by users, covering 250k distinct words.

Copyright © 2025. SMP Challenge Organization Committee. All rights reserved.