Handling Media on Multiple Devices / Platforms
My opinion on which road to take when transcoding media for multiple devices. There are many different devices that you want to serve with a digital application, and these devices support different audio / video codecs and different audio / video containers. This is an exploration of which road is best to take given a few options. I also talk about the challenges of rendering images and video on different devices.
Why Write This?
As mentioned above, there are many different types of devices to which you aim to serve media files when creating a digital application. In the case of creating a website or a mobile application, there are different types of browsers and devices that you have to worry about to make sure that media files - audio and video files - can be played on all devices by all users. The images below show examples of compatibility for different types of audio and video containers and codecs for different browsers:
I am writing this to try to decide which is the best path to take when trying to serve audio and video files to different types of devices. So far, I have tried:
- Not Transcoding Video and Audio Files at All
- This resulted in video / audio files not being ab
- Transcoding Video and Audio Files on the Backend after Completing Upload using ffmpeg
- Pros:
- Simple to set up
- Fast for some video / audio codecs
- Allows you to generate thumbnail, different video formats, keep track of uploaded content in database, extract audio content, and start the process of generating closed captions all in one place.
- Cons:
- This can be a somewhat lengthy process, e.g. I uploaded a 10 minute video earlier today and transcoding it took a couple of minutes. Note: Transcoding the video file to two of the video codecs / containers was quick, while transcoding it to the other two took much longer (I might be able to remove the slower codecs).
- The video and audio files take up approximately 4 times and 3 times the amount of storage of the original video, respectively
- Transcoding Video and Audio Files in an AWS Lambda Function using ffmpeg
- The pros of this process are similar to those of transcoding on the backend (the previous option), but it is more difficult to set up: ffmpeg is pretty large and Lambda functions have a storage limit, and the process requires two separate S3 buckets, one for incoming videos and one for transcoded videos, which again requires additional setup.
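Both ffmpeg-based options above come down to running a command like the one sketched here. This is a minimal sketch, not the exact command I run: the file paths, the `crf` value, the audio bitrate, and the helper names `h264_mp4_command` and `thumbnail_command` are assumptions chosen for illustration. It only builds the argument lists; running them requires ffmpeg to be installed.

```python
from pathlib import Path


def h264_mp4_command(source: Path, out_dir: Path) -> list[str]:
    """Build an ffmpeg command that transcodes `source` to H.264/AAC in an MP4 container."""
    target = out_dir / f"{source.stem}.mp4"
    return [
        "ffmpeg", "-y",             # -y: overwrite the output file if it exists
        "-i", str(source),
        "-c:v", "libx264",          # H.264 video encoder
        "-preset", "medium",        # encoding speed vs. compression trade-off
        "-crf", "23",               # constant-quality setting (assumed value)
        "-c:a", "aac",              # AAC audio encoder
        "-b:a", "128k",             # audio bitrate (assumed value)
        "-movflags", "+faststart",  # put the index at the front for web playback
        str(target),
    ]


def thumbnail_command(source: Path, out_dir: Path, at_seconds: float = 1.0) -> list[str]:
    """Build an ffmpeg command that grabs a single frame as a JPEG thumbnail."""
    target = out_dir / f"{source.stem}.jpg"
    return [
        "ffmpeg", "-y",
        "-ss", str(at_seconds),     # seek in the input before decoding
        "-i", str(source),
        "-frames:v", "1",           # write exactly one video frame
        str(target),
    ]


# Hypothetical upload path and output directory.
transcode = h264_mp4_command(Path("uploads/clip.mov"), Path("transcoded"))
thumbnail = thumbnail_command(Path("uploads/clip.mov"), Path("transcoded"))
```

To actually run a command, pass the list to `subprocess.run(transcode, check=True)`. Passing a list rather than a shell string avoids quoting problems with user-supplied file names.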
I will try to evaluate the best way to handle media files based on, in order of importance, speed, complexity of setup, and storage. Note: The video must be able to be transcoded to the H.264 codec, because services that check videos for inappropriate content often require it.
I am mainly using this as an opportunity to learn about Adaptive Bitrate Streaming and whether or not that is the best way to go about things.
Definitions
- codec
- A codec is a device or computer program that encodes or decodes a data stream or a signal
- container format
- A container format (informally, sometimes called a wrapper) or metafile is a file format that allows multiple data streams to be embedded into a single file, usually along with metadata for identifying and further detailing those streams.
- file format
- A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium.
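To make the difference between a container and a codec concrete, here is a small sketch that reads the JSON that ffprobe (shipped with ffmpeg) can print for a media file. The `SAMPLE_OUTPUT` string is an illustrative example that I wrote by hand for a typical MP4 file, not output captured from a real upload, and `describe_media` is a hypothetical helper name.

```python
import json

# ffprobe arguments that print the container name and each stream's codec as JSON.
# Append the path of the file to inspect before running.
FFPROBE_ARGS = [
    "ffprobe", "-v", "error",
    "-show_entries", "format=format_name:stream=codec_type,codec_name",
    "-of", "json",
]

# Illustrative output for an MP4 file; real output varies from file to file.
SAMPLE_OUTPUT = """
{
  "streams": [
    {"codec_name": "h264", "codec_type": "video"},
    {"codec_name": "aac", "codec_type": "audio"}
  ],
  "format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2"}
}
"""


def describe_media(ffprobe_json: str) -> dict:
    """Summarize one container and the codecs of the streams inside it."""
    data = json.loads(ffprobe_json)
    codecs: dict[str, list[str]] = {}
    for stream in data.get("streams", []):
        # Group codec names by stream type (video, audio, subtitle, ...).
        codecs.setdefault(stream["codec_type"], []).append(stream["codec_name"])
    return {"container": data["format"]["format_name"], "codecs": codecs}


info = describe_media(SAMPLE_OUTPUT)
```

The point is that a single file has one container but one codec per stream, so a device can support the container and still fail to play the file if it lacks one of the codecs.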
Notes on Adaptive Bitrate Streaming
- Idea is: no matter what device the user is using, the user's device can receive the content in the format that it needs (at any bitrate).
- Video would degrade in quality when user's bandwidth gets worse
- Packaging
- (Re)packaging compressed audio and video (a.k.a. transmuxing, rewrapping, or packetizing)
- Requires minimal hardware resources
- Transcoding
- Converting from one or more codecs, bitrates, or resolutions to others
- Typically requires significant resources
- Can be combined with (re)packaging
- Player Switching Algorithm
- How does the adaptive bitrate stream know what to send to the client? The client does calculations based on the attributes below and asks for the appropriate format. Note: The server initially lets the client know what formats are available.
- Network throughput
- Screen Size
- Rendering Speed (fps)
- Dropped Frames
- Buffer Size
- Previous ABR switches
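As a rough illustration of the client-side decision described above, the sketch below picks a rendition from three of the listed inputs: network throughput, screen size, and buffer size. It is a toy heuristic, not the algorithm of any real player. The ladder values, the `safety` factor, and the low-buffer threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average bitrate the rendition needs


# Hypothetical ladder; real ladders depend on the content and encoder settings.
LADDER = [
    Rendition("360p", 360, 800),
    Rendition("720p", 720, 2500),
    Rendition("1080p", 1080, 5000),
]


def choose_rendition(ladder, throughput_kbps, screen_height, buffer_seconds,
                     safety=0.8, low_buffer_seconds=5.0):
    """Pick the highest rendition that fits the bandwidth budget and the screen."""
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    # Only spend a fraction of the measured throughput to absorb fluctuations.
    budget = throughput_kbps * safety
    # When the buffer is nearly empty, be more conservative to avoid stalling.
    if buffer_seconds < low_buffer_seconds:
        budget *= 0.5
    # Do not fetch more pixels than the screen can show.
    candidates = [r for r in ordered if r.height <= screen_height] or ordered[:1]
    affordable = [r for r in candidates if r.bitrate_kbps <= budget]
    # Fall back to the lowest candidate when nothing fits the budget.
    return affordable[-1] if affordable else candidates[0]


choice = choose_rendition(LADDER, throughput_kbps=4000, screen_height=1080, buffer_seconds=20)
```

Real players also weigh dropped frames, rendering speed, and previous switches, typically to avoid flipping back and forth between renditions too often.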
Implementing Adaptive Bitrate Streaming on AWS
Resource: Amazon CloudFront for Media
The ABR (Adaptive Bitrate) experience is superior to delivering a video at a single bitrate because the video stream can be switched continually depending on available network bandwidth, allowing playback at different bitrates within an ABR ladder.
- The workflow that we need:
- Video on Demand (VOD) - media assets stored as encoded video files are played on request. The media assets are stored in multiple formats and bitrates for playout, or packaged at the time of the request, using just-in-time (JIT) packaging, for different client devices
- Video content has to be encoded several times in parallel to provide a set of streams at different bitrates using services such as AWS Elemental MediaConvert. This group of video streams is known as an ABR ladder
- The origin is the endpoint where viewer requests are processed and where streaming video files are served for both live and on-demand workflows - use S3 for this.
- You can use CloudFront to cache requested videos and decrease load on the origin.
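The ABR ladder is what the server advertises to the client so that the player knows which streams it can ask for. As a sketch of what that advertisement can look like, the code below writes a master playlist, assuming HLS as the streaming format. The notes above do not commit to HLS, and the ladder values and the `<height>p/index.m3u8` path layout are made up for illustration.

```python
# Hypothetical ABR ladder: (width, height, bitrate in kbps).
LADDER = [
    (640, 360, 800),
    (1280, 720, 2500),
    (1920, 1080, 5000),
]


def master_playlist(ladder) -> str:
    """Render an HLS master playlist that lists one variant stream per rung."""
    lines = ["#EXTM3U"]
    for width, height, kbps in sorted(ladder, key=lambda rung: rung[2]):
        # BANDWIDTH is expressed in bits per second.
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={kbps * 1000},RESOLUTION={width}x{height}"
        )
        # Assumed layout: each rung has its own media playlist under "<height>p/".
        lines.append(f"{height}p/index.m3u8")
    return "\n".join(lines) + "\n"


playlist = master_playlist(LADDER)
```

With a managed service such as AWS Elemental MediaConvert, the playlist and the segments are produced by the service, so this is only meant to show what ends up on the origin.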
- AWS CloudFront
- CloudFront has Regional Edge Caches to minimize latency of responses
- You can write your own code to customize the processing of HTTP requests and responses by associating an edge function with your CloudFront distribution.
- CloudFront Origin Shield is an optional feature that lets you select an additional layer of caching to reduce load on your origins and improve cache hit ratios
- CloudFront Configuration:
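- You want to maximize the cache hit ratio (CHR), i.e. the share of viewer requests that are served from the cache instead of going to the origin
- To maximize the CHR, edit the cache policies, origin request policies, and response headers policies
- Use cache policies optimized for a specific origin - e.g., using S3 as an origin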
- Avoid forwarding CORS headers to the origin
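To see why headers matter for the CHR, here is a toy simulation of a cache whose key is built from selected request fields. It is a simplified model and not how CloudFront is implemented: it assumes an unbounded cache with no expiry, and it assumes that the `origin` header becomes part of the cache key when it is included. The request data is invented.

```python
def simulate_hit_ratio(requests, key_fields):
    """Return the fraction of requests served from a cache keyed on `key_fields`."""
    seen = set()
    hits = 0
    for request in requests:
        key = tuple(request.get(field) for field in key_fields)
        if key in seen:
            hits += 1      # an identical key was requested before: cache hit
        else:
            seen.add(key)  # first time this key is seen: cache miss, then store
    return hits / len(requests) if requests else 0.0


# Four viewers on different sites request the same video segment.
REQUESTS = [
    {"path": "/video/720p/seg1.ts", "origin": "https://a.example"},
    {"path": "/video/720p/seg1.ts", "origin": "https://b.example"},
    {"path": "/video/720p/seg1.ts", "origin": "https://a.example"},
    {"path": "/video/720p/seg1.ts", "origin": "https://c.example"},
]

path_only = simulate_hit_ratio(REQUESTS, ["path"])
path_and_origin = simulate_hit_ratio(REQUESTS, ["path", "origin"])
```

In this model, keying on the path alone gives a hit ratio of 0.75, while adding the header to the key drops it to 0.25, because the same segment is cached once per distinct header value.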
- Security
- Use geographic restriction
- Access control through CloudFront
Rendering Media on Different Devices
- As mentioned above, different devices have different capabilities when playing audio, playing video, and rendering images
- Different browsers / different devices mainly cause the issues related to playing/rendering different types of media, but differently sized devices offer challenges in rendering the media in a way that is visually pleasing to users.
- As a result, for images and videos, there are different widths and heights that need to be rendered on different devices in order for the images to be viewable and also to prevent Cumulative Layout Shift (CLS) on load.
- For static HTML pages, this is not difficult, since the site owner controls the width and height attributes of <video> and <img> elements and can change the values of these attributes based on whatever device is requesting content
- For user generated content, however, there are challenges in letting the user resize images and appropriately sizing images and video for multiple devices.
- Options for content:
- Images
- The rich text editor allows users to resize images on the client. There should be different max-widths / max-heights for mobile, tablet, and desktop devices that constrain the width / height of the image during resizing.
- The resized height / width should then be scaled up given whether or not the device is a tablet, desktop computer, or phone.
- This requires that we store 3 different kinds of HTML on the server though.
- Video
- Most platforms that I have seen don't allow users to resize video, so resizing the video on the client is not an issue. However, you might want to look into resizing the video on the server using ffmpeg. You can then store three different heights for the video based on whether the device is a phone, tablet, or desktop computer.
- This requires that we store 3 different kinds of HTML on the server though (same as images).
- There is an issue with playing video inline and full-screen that needs to be addressed.
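The per-device sizing described above can be sketched as a function that fits an image or video into a maximum box for each device class while keeping the aspect ratio. The box sizes in `MAX_BOX` are placeholder values that I picked for the example, and the function names are hypothetical.

```python
# Hypothetical maximum (width, height) in CSS pixels per device class.
MAX_BOX = {
    "phone": (320, 480),
    "tablet": (768, 768),
    "desktop": (1200, 900),
}


def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, preserving the aspect ratio."""
    # Never scale up: cap the factor at 1.0 so small media keeps its size.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def sizes_for_devices(width, height):
    """Compute the width / height attributes to emit for each device class."""
    return {
        device: fit_within(width, height, max_width, max_height)
        for device, (max_width, max_height) in MAX_BOX.items()
    }


sizes = sizes_for_devices(1600, 900)
```

Writing the resulting values into the width and height attributes lets the browser reserve the right amount of space before the media loads, which is what prevents layout shift. Storing the three size pairs alongside a single copy of the HTML might also avoid storing three different versions of the HTML, though I have not tried this.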
Conclusion
- Adaptive bitrate streaming with AWS is definitely the best option for serving media content
- Need to write about finalized AWS Process HERE