Data Description
The dataset contains a million randomly sampled video instances listing 10 fundamental video characteristics along with the YouTube video ID.
Each video was transcoded from one format into another, and the memory usage and transcoding time were measured.
The goal is to predict the transcoding time from the input video characteristics and the desired output format.
Attribute Description
1. *id* - YouTube video ID (should be dropped for the analysis)
2. *duration* - duration of video
3. *codec* - coding standard used for the video ("mpeg4", "h264", "vp8", "flv")
4. *width* - width of video in pixels
5. *height* - height of video in pixels
6. *bitrate* - video bitrate
7. *framerate* - actual video frame rate
8. *i* - number of I-frames in the video
9. *p* - number of P-frames in the video
10. *b* - number of B-frames in the video
11. *frames* - number of frames in video
12. *i_size* - total size in bytes of the I-frames
13. *p_size* - total size in bytes of the P-frames
14. *b_size* - total size in bytes of the B-frames
15. *size* - total size of video
16. *o_codec* - output codec used for transcoding ("mpeg4", "h264", "vp8", "flv")
17. *o_bitrate* - output bitrate used for transcoding
18. *o_framerate* - output framerate used for transcoding
19. *o_width* - output width in pixels used for transcoding
20. *o_height* - output height in pixels used for transcoding
21. *umem* - total codec-allocated memory for transcoding, alternate target feature
22. *utime* - total transcoding time, target feature
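
The attribute list above suggests a simple preparation step before modelling: drop *id*, one-hot encode the two codec columns, and separate the target from the inputs. The following is a minimal sketch of that step using only the Python standard library. The tab delimiter, the helper names `prepare_row` and `load_dataset`, and the single sample record (whose values are invented purely for illustration) are assumptions, not part of the dataset documentation.

```python
import csv
import io

# Column names as listed in the attribute description.
COLUMNS = [
    "id", "duration", "codec", "width", "height", "bitrate", "framerate",
    "i", "p", "b", "frames", "i_size", "p_size", "b_size", "size",
    "o_codec", "o_bitrate", "o_framerate", "o_width", "o_height",
    "umem", "utime",
]
CODECS = ("mpeg4", "h264", "vp8", "flv")  # codec values listed above
CATEGORICAL = ("codec", "o_codec")
TARGETS = ("utime", "umem")               # neither target is used as an input
DROPPED = ("id",)


def prepare_row(row, target="utime"):
    """Turn one raw record (dict of strings) into (features, target_value)."""
    features = {}
    for name, value in row.items():
        if name in DROPPED or name in TARGETS:
            continue
        if name in CATEGORICAL:
            # One-hot encode: one 0/1 indicator per known codec.
            for codec in CODECS:
                features[f"{name}_{codec}"] = 1.0 if value == codec else 0.0
        else:
            features[name] = float(value)
    return features, float(row[target])


def load_dataset(handle, delimiter="\t", target="utime"):
    """Read a delimited file with a header row (delimiter is an assumption)."""
    X, y = [], []
    for row in csv.DictReader(handle, delimiter=delimiter):
        features, value = prepare_row(row, target)
        X.append(features)
        y.append(value)
    return X, y


# Hypothetical record: every value below is made up for illustration only.
values = [
    "abc123", "10.0", "h264", "640", "480", "500000", "25.0",
    "5", "245", "0", "250", "100000", "525000", "0", "625000",
    "vp8", "56000", "12.0", "320", "240", "25000", "1.5",
]
sample = "\t".join(COLUMNS) + "\n" + "\t".join(values) + "\n"
X, y = load_dataset(io.StringIO(sample))
```

Passing `target="umem"` to `load_dataset` selects the alternate target instead. Both targets are excluded from the inputs in either case, since they are only known after the transcoding has run.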