The SMHP dataset was collected from Flickr (a photo-sharing platform) for the headline prediction task. We split the data in time order, giving a train-to-test ratio of 10:1. The table below shows the statistics of the dataset.
Readme Document
Download Link for Train Image URLs
(Path Sample: train/77@N93/551891.jpg)
Download Link for Train Data
(includes image paths, metadata, and labels)
Download Link for Time Zone of Train Data
Download Link for Test Data
(includes image paths and metadata, without labels)
Download Link for Time Zone of Test Data
Note that during the competition the datasets will ONLY be released to participants who have registered for the challenge. After the challenge completes, we will make the data publicly available to the whole research community.
#Posts | #Users | #Categories | Temporal Range (Months) | Avg. Title Length | #Tags | #POIs | Avg. Views |
---|---|---|---|---|---|---|---|
340K | 80K | 11 | 16 | 26 | 669 | 103K | 306 |
*In the dataset, we provide the category information for each photo.
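
The time-ordered 10:1 split described above can be sketched as follows. This is only a minimal illustration, not the organizers' actual procedure: the field name `post_time` and the helper `chronological_split` are hypothetical, and the ratio is assumed to be applied by post count.

```python
from typing import Any, Dict, List, Tuple

Post = Dict[str, Any]


def chronological_split(
    posts: List[Post],
    time_key: str = "post_time",  # hypothetical field name for the post timestamp
    train_parts: int = 10,
    test_parts: int = 1,
) -> Tuple[List[Post], List[Post]]:
    """Sort posts by time and cut them so that train:test is about 10:1."""
    # Oldest posts come first, so the test set holds the most recent posts.
    ordered = sorted(posts, key=lambda p: p[time_key])
    # Index of the first test post (assumes the ratio is taken over post counts).
    cut = len(ordered) * train_parts // (train_parts + test_parts)
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    # Toy records standing in for real posts.
    toy_posts = [{"id": i, "post_time": 1000 + i} for i in range(22)]
    train, test = chronological_split(toy_posts)
    print(len(train), len(test))  # 20 2
```

Splitting by time rather than at random keeps every test post later than every training post, so a model is evaluated on posts it could not have seen in advance.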