Recently in Codecs Category

As we were on holiday last week, in the chilly snows of Austria, we almost missed an important announcement regarding the Schrödinger implementation of the Dirac codec.

It has been roughly eleven months since the last Schrödinger release, so this is indeed welcome news.

Don't know what either Schrödinger or Dirac are? Dirac is an advanced royalty-free video compression format, initially developed by the UK's BBC Research and Development team. To quote from the recent release announcement:

"Schrödinger is a cross-platform implementation of the Dirac video compression specification as a C library. The Dirac project maintains two encoder implementations: dirac-research, a research encoder, and Schrödinger, which is meant for user applications. As of this release, Schrödinger outperforms dirac-research in most encoding situations, both in terms of encoding speed and visual quality."

That last sentence is really important. Previous testing by Stream #0 showed that while Schrödinger was a much faster implementation than Dirac Research, the quality suffered enormously. If Schrödinger has indeed now surpassed Dirac Research in quality terms, this is exciting news.

Further information regarding enhancements in this release, and plans for a more regular release cycle, is available on the Dirac Video website.

With HTML 5 adoption accelerating, it'd be fantastic to see more browser support for Dirac, alongside Ogg Theora, as an alternative to the currently almost ubiquitous Flash/H.264 combination.

Stream #0 recently started looking at Amazon's EC2 computing offering. We created our first public AMI, based on Debian Squeeze, with FFmpeg and x264 pre-installed. Now that we can easily start instances with the necessary basics installed, it is time to compare the relative merits of the different instance sizes that Amazon offers.

EC2 Instances come in a variety of sizes, with different CPU and RAM capacities. We tested the 64-bit offerings, including the recently announced High-Memory Quadruple Extra Large instance.

These 64-bit instances are listed on the EC2 website in the following way:

Large Instance: 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform
Extra Large Instance: 15 GB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform
High-CPU Extra Large Instance: 7 GB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform
High-Memory Quadruple Extra Large Instance: 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

We'll take a closer look at the detailed specifications of each below.

Our test file was 5810 frames (a little over 4 minutes, 285MB) of the HD 1920x1080 MPEG-4 AVI version of Big Buck Bunny. The FFmpeg transcode converted this to H.264 using the following 2-pass command:

>ffmpeg -y -i big_buck_bunny_1080p_surround.avi -pass 1 -vcodec libx264 -vpre fastfirstpass -s 1920x1080 -b 2000k -bt 2000k -threads 0 -f mov -an /dev/null && ffmpeg -deinterlace -y -i big_buck_bunny_1080p_surround.avi -pass 2 -acodec libfaac -ab 128k -ac 2 -vcodec libx264 -vpre hq -s 1920x1080 -b 2000k -bt 2000k -threads 0 -f mov big_buck_bunny_1080p_stereo_x264.mov

Setting -threads to zero should mean that FFmpeg automatically uses all of the CPU cores available on each EC2 instance.
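
For anyone scripting their encodes, the two-pass command above can be assembled programmatically. This is only a minimal Python sketch, not the tooling used for these tests: the helper name two_pass_commands is hypothetical, and the options are copied from the command above, so it assumes an FFmpeg build of the same era with libx264, libfaac and the fastfirstpass and hq preset files available.

```python
import shlex


def two_pass_commands(src, dst, bitrate="2000k", size="1920x1080"):
    """Return the argument lists for both passes of the 2-pass encode.

    Hypothetical helper: every option mirrors the command quoted in the
    post, including -threads 0 to let FFmpeg pick the thread count.
    """
    # Pass 1 only gathers statistics: audio is dropped (-an) and the
    # video output is discarded to /dev/null.
    first = [
        "ffmpeg", "-y", "-i", src, "-pass", "1",
        "-vcodec", "libx264", "-vpre", "fastfirstpass",
        "-s", size, "-b", bitrate, "-bt", bitrate,
        "-threads", "0", "-f", "mov", "-an", "/dev/null",
    ]
    # Pass 2 writes the real file, adding stereo AAC audio. As in the
    # post, -deinterlace appears on this pass only.
    second = [
        "ffmpeg", "-deinterlace", "-y", "-i", src, "-pass", "2",
        "-acodec", "libfaac", "-ab", "128k", "-ac", "2",
        "-vcodec", "libx264", "-vpre", "hq",
        "-s", size, "-b", bitrate, "-bt", bitrate,
        "-threads", "0", "-f", "mov", dst,
    ]
    return first, second


if __name__ == "__main__":
    passes = two_pass_commands(
        "big_buck_bunny_1080p_surround.avi",
        "big_buck_bunny_1080p_stereo_x264.mov",
    )
    for cmd in passes:
        # Print a shell-safe rendering of each pass without running it.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) in turn, so that the second pass only starts if the first one succeeded, which is what the && in the shell version does.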

FFmpeg revealed the following information about the transcode:

Input #0, avi, from 'big_buck_bunny_1080p_surround.avi':
Duration: 00:09:56.48, start: 0.000000, bitrate: 3968 kb/s
Stream #0.0: Video: mpeg4, yuv420p, 1920x1080 [PAR 1:1 DAR 16:9], 24 tbr, 24 tbn, 24 tbc
Stream #0.1: Audio: ac3, 48000 Hz, 5.1, s16, 448 kb/s
[libx264 @ 0x6620f0]using SAR=1/1
[libx264 @ 0x6620f0]using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
[libx264 @ 0x6620f0]profile High, level 4.0
Output #0, mov, to 'big_buck_bunny_1080p_stereo_x264.mov':
Stream #0.0: Video: libx264, yuv420p, 1920x1080 [PAR 1:1 DAR 16:9], q=10-51, pass 2, 2000 kb/s, 24 tbn, 24 tbc
Stream #0.1: Audio: aac, 48000 Hz, 2 channels, s16, 128 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Stream #0.1 -> #0.1

Ignore the duration, as that's read from the file header, and we only uploaded part of the overall file.

Now to look at how each EC2 instance performed.

m1.large
Large Instance 7.5 GB of memory, 4 EC2 Compute Units

Firstly, querying the machine capacity (cat /proc/cpuinfo) returns the following information:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
stepping : 6
cpu MHz : 2659.994
cache size : 6144 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips : 5322.41
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

There are 2 of these cores available. RAM is confirmed as 7.5GB (free -g).
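
Counting the cores by eye works for one machine, but the same check is easy to script. The following is a rough Python sketch under a simple assumption: each logical core contributes one "processor" entry to /proc/cpuinfo, as in the listing above. The function name and the abbreviated two-entry sample text are illustrative only.

```python
def count_processors(cpuinfo_text):
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        # The key is everything before the first colon, e.g. "processor".
        if line.split(":")[0].strip() == "processor"
    )


# Abbreviated, illustrative sample with two entries (most fields omitted).
SAMPLE = """\
processor : 0
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz

processor : 1
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
"""

if __name__ == "__main__":
    print(count_processors(SAMPLE))
```

On an instance itself, the text would come from reading /proc/cpuinfo rather than from the sample string.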

The FFmpeg transcode showed the following:

H264 1st Pass = 11-18fps, 5 minutes 30 seconds
H264 2nd Pass = 4-5fps, 18 minutes 38 seconds

Total Time: 24 minutes, 8 seconds

m1.xlarge
Extra Large Instance 15 GB of memory, 8 EC2 Compute Units

CPU Info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
stepping : 10
cpu MHz : 2666.760
cache size : 6144 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips : 5336.15
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

There are 4 of these cores available. RAM is confirmed at 15GB.

The FFmpeg transcode showed the following:

H264 1st Pass = 11-14fps, 5 minutes 30 seconds
H264 2nd Pass = 6-7fps, 14 minutes 19 seconds

Total Time: 19 minutes, 49 seconds

c1.xlarge
High-CPU Extra Large Instance 7 GB of memory, 20 EC2 Compute Units

CPU Info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
stepping : 10
cpu MHz : 2333.414
cache size : 6144 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips : 4669.21
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

There are 8 of these cores available. RAM is confirmed at 7GB.

The FFmpeg transcode showed the following:

H264 1st Pass = 24-29fps, 3 minutes 24 seconds
H264 2nd Pass = 11-13fps, 7 minutes 8 seconds

Total Time: 10 minutes, 32 seconds

m2.4xlarge
High-Memory Quadruple Extra Large Instance 68.4 GB of memory, 26 EC2 Compute Units

CPU Info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2666.760
cache size : 8192 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca popcnt lahf_lm
bogomips : 5338.09
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

There are 8 of these cores available. RAM is confirmed at 68GB.

The FFmpeg transcode showed the following:

H264 1st Pass = 35-38fps, 2 minutes 47 seconds
H264 2nd Pass = 12-15fps, 6 minutes 30 seconds

Total Time: 9 minutes, 17 seconds

What do these figures reveal? As expected, the High-Memory Quadruple Extra Large Instance performed best, but not by much. The additional RAM certainly didn't make much of an impact, and the time saving is probably down to the slightly better CPU specification. Over a larger set of files this time saving would be more evident.
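
To put "not by much" into numbers, the total times above can be turned into speed-up factors and set against the EC2 Compute Unit ratings. This is just a small Python sketch using the figures from this post; the dictionary and function names are my own.

```python
# Total 2-pass transcode times from the tests above, in seconds.
TOTAL_SECONDS = {
    "m1.large": 24 * 60 + 8,
    "m1.xlarge": 19 * 60 + 49,
    "c1.xlarge": 10 * 60 + 32,
    "m2.4xlarge": 9 * 60 + 17,
}

# EC2 Compute Units as listed by Amazon for each instance type.
COMPUTE_UNITS = {
    "m1.large": 4,
    "m1.xlarge": 8,
    "c1.xlarge": 20,
    "m2.4xlarge": 26,
}


def speedup_over(baseline, instance):
    """How many times faster `instance` finished than `baseline`."""
    return TOTAL_SECONDS[baseline] / TOTAL_SECONDS[instance]


if __name__ == "__main__":
    for name in TOTAL_SECONDS:
        ecu_ratio = COMPUTE_UNITS[name] / COMPUTE_UNITS["m1.large"]
        print(
            f"{name}: {speedup_over('m1.large', name):.2f}x faster than "
            f"m1.large with {ecu_ratio:.1f}x the Compute Units"
        )
```

Comparing the two fastest instances directly, speedup_over("c1.xlarge", "m2.4xlarge") comes out at roughly 1.13, against 1.3 times the Compute Units.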

Let's look at which EC2 instance gives the best value for money for this test. Amazon charges per instance hour, as shown below:

m1.large: $0.40/hour
m1.xlarge: $0.80/hour
c1.xlarge: $0.80/hour
m2.4xlarge: $2.40/hour

These prices are in US dollars and are for a US-based instance (European instances are slightly more expensive). Amazon has also announced a price reduction that takes effect from November 1st 2009.

Looking at the time taken to transcode our test file, on each instance, reveals the following:

m1.large
Total Time: 24 minutes, 8 seconds
Total Cost: $0.16 ((($0.40/60)/60) x 1448 seconds)
Cost per GB: $0.57 ((1024MB/285MB) x $0.16)

m1.xlarge
Total Time: 19 minutes, 49 seconds
Total Cost: $0.26 ((($0.80/60)/60) x 1189 seconds)
Cost per GB: $0.93 ((1024MB/285MB) x $0.26)

c1.xlarge
Total Time: 10 minutes, 32 seconds
Total Cost: $0.14 ((($0.80/60)/60) x 632 seconds)
Cost per GB: $0.50 ((1024MB/285MB) x $0.14)

m2.4xlarge
Total Time: 9 minutes, 17 seconds
Total Cost: $0.37 ((($2.40/60)/60) x 557 seconds)
Cost per GB: $1.33 ((1024MB/285MB) x $0.37)
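
The arithmetic in the list above can be reproduced with a few lines of Python. This is a sketch under the same assumptions the post makes: the hourly rate is pro-rated to the second, and the total cost is rounded to the cent before being scaled up to a per-GB figure. The names are illustrative, and the High-CPU instance is keyed as c1.xlarge, as in the price list.

```python
# Hourly prices in US dollars, as quoted in the price list above.
HOURLY_RATE_USD = {
    "m1.large": 0.40,
    "m1.xlarge": 0.80,
    "c1.xlarge": 0.80,
    "m2.4xlarge": 2.40,
}

# Total transcode times from the tests, in seconds.
TOTAL_SECONDS = {
    "m1.large": 1448,
    "m1.xlarge": 1189,
    "c1.xlarge": 632,
    "m2.4xlarge": 557,
}

TEST_FILE_MB = 285  # size of the uploaded test file


def total_cost(instance):
    """Cost of the test transcode, pro-rated per second, to the cent."""
    per_second = HOURLY_RATE_USD[instance] / 60 / 60
    return round(per_second * TOTAL_SECONDS[instance], 2)


def cost_per_gb(instance):
    """Scale the rounded total cost from 285MB up to 1024MB."""
    return round(1024 / TEST_FILE_MB * total_cost(instance), 2)


if __name__ == "__main__":
    for name in HOURLY_RATE_USD:
        print(f"{name}: ${total_cost(name):.2f} total, "
              f"${cost_per_gb(name):.2f} per GB")
```

Sorting the instances with min(HOURLY_RATE_USD, key=cost_per_gb) picks out the cheapest one per GB for this test.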

Clearly the c1.xlarge instance represents the best value for money, although I was surprised how close behind the m1.large was on cost. The additional RAM and slightly better CPU specifications of the m2.4xlarge instance do not outweigh its much higher hourly cost, at least when it comes to video transcoding.

A typical HD file used for broadcast or high-end post-production purposes is around 85GB for 60 minutes (DNxHD at 185Mbps). The time taken to transcode such a file to H264 at 2Mbps could obviously differ from that of the source content we used, but extrapolating from the c1.xlarge figures above ($0.50 per GB, 632 seconds per 285MB) we can estimate that it would cost $42.50 and take approximately 53.62 hours!
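
The $42.50 and 53.62 hour estimates can be reproduced if one assumes the c1.xlarge results ($0.50 per GB and 632 seconds for the 285MB test file) and that both cost and time scale linearly with file size. The Python sketch below makes that extrapolation explicit; the constant and function names are my own.

```python
TEST_FILE_MB = 285             # size of the test file
C1_XLARGE_SECONDS = 632        # 10 minutes 32 seconds for the test file
C1_XLARGE_COST_PER_GB = 0.50   # US dollars, from the cost list above


def estimate(size_gb):
    """Return (cost in USD, time in hours) for a file of `size_gb`.

    Assumes cost and transcode time both scale linearly with file size,
    which ignores differences in source codec and bit rate.
    """
    seconds_per_gb = C1_XLARGE_SECONDS * (1024 / TEST_FILE_MB)
    cost = size_gb * C1_XLARGE_COST_PER_GB
    hours = seconds_per_gb * size_gb / 3600
    return cost, hours


if __name__ == "__main__":
    cost, hours = estimate(85)
    print(f"Estimated cost: ${cost:.2f}, estimated time: {hours:.2f} hours")
```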

Taking into account that these figures may vary for different input and output files, the above should represent a worst case scenario. For example, I would expect an SD MPEG2 50Mbps file to take proportionally much less effort to transcode than a DNxHD 185Mbps HD file. Only a further test will tell.

Is Amazon's EC2 offering worth considering for high-end video file transcoding? Compared to the prices charged by post-production facilities it is certainly a lot cheaper, as long as you have time to wait for the end result. However, that's the beauty of cloud-based computing power - if you're in a hurry, just scale up! Keep in mind, though, that content still needs to be uploaded to EC2 before transcoding can begin, which is going to take additional time and add further cost.

There's a great article called How Firefox Is Pushing Open Video Onto the Web by Michael Calore over at Webmonkey, dealing with the HTML 5 <video> tag and Firefox's native Ogg Theora support. The piece outlines the technical details of the <video> tag and includes an interview with Mozilla director of Firefox Mike Beltzner and Mozilla director of platform engineering Damon Sicore.

An excerpt from the interview:

Webmonkey: How do you see these factors -- the HTML 5 video tag, putting the Ogg codecs right into the browser, presentation techniques that mimic the plug-in player experience -- affecting video on the web? What's it going to change in six months? Or six years?

Beltzner: In six months, you're going to see more sites like DailyMotion doing things where they detect that the browser supports Ogg and the video tag, and in that case, they're going to give those users an Ogg-and-video-tag-experience.

I think you'll see content sites doing this because they'll have the ability to re-encode their entire video libraries without having to pay any licensing fees. The Ogg Theora encoders are completely license-free and patent-proof. They don't need to worry about which player you've got. They also don't need to worry about which hardware you've got. Ogg Theora will run on Windows, Mac and Linux, or any embedded device or mobile device built on the Linux platform.

Here's a beta example page from DailyMotion demonstrating use of the HTML 5 <video> tag. If you have Firefox 3.5 installed, or a reasonably new version of WebKit/Safari with the XiphQT component installed, you should get in-browser video playback - Ogg Theora and no Flash player needed.

YouTube's demonstration page is here.


Spending the last two days at the Open Video Conference has been a great experience, with lots of interesting speakers, and I've learned a few things. Perhaps I'll write more in general later, but while it's still fresh in my mind it's worth mentioning today's sessions on royalty-free codecs and the HTML 5 <video> tag.

The main focus of the Royalty Free Codecs session seemed to be Ogg Theora. Also present, though, were Sun, speaking about their new Open Media Stack, and David Schleef, representing his work on the Schrödinger Dirac library. I would have loved to hear more about what was happening with Dirac, but the crowd wanted Theora news.

A short demonstration on the projector screen showed H.263/H.264 content versus the same content in Ogg Theora at various bit rates, the highest less than 500Kbps. The results, from Theora's perspective, were very good. Visually I couldn't pick out any differences on the large screen. I would have liked to see the demonstration done at higher bit rates, greater than 1Mbps, though. A similar demonstration, though not the one used today, is available here.

Sun did not do themselves any favours at this conference. A session yesterday gave them time to discuss the process they undertook to ensure there was no IP encumbrance in their new codec and Open Media Stack, but right at the end the key revelation was that they're unable to open source their work.

David did not have much of a chance to talk in depth about Dirac, and I was disappointed not to have gained a better understanding of Dirac's current development status and the pace of community contributions. He did make the point that the BBC were using Dirac internally, which is true but only to a very small extent. In non-linear editing environments, DVCProHD, AVC-I 100 and ProRes are still the codecs of choice. In my opinion this is due to the lack of tools available for Dirac work. Dirac tool development needs a great leap forward if this codec is to gain any significant traction.

The next session had representatives from all the major browsers (Firefox, WebKit and Opera), except IE, present to talk about HTML 5 and the new <video> tag.

Firstly, I was particularly interested in the W3C draft Media Fragments specification. Amongst other things, this will allow playback of just a segment of a video, based on start and end times specified in seconds. While not currently possible, if this could be extended to read an embedded timecode track and seek in a frame-accurate manner, that would be truly powerful in an open standard.
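
As a rough illustration of what a time-based fragment looks like, here is a Python sketch that appends a temporal fragment to a video URL. It assumes the #t=start,end syntax (times in seconds) from the W3C Media Fragments URI draft; the function names and the example.com URL are hypothetical, and the frame-to-seconds helper is only a simple approximation of the frame-accurate seeking wished for above, not something the draft provides.

```python
def _format_seconds(value):
    """Render whole seconds as '10' rather than '10.0'."""
    return str(int(value)) if float(value).is_integer() else str(value)


def with_time_fragment(url, start, end=None):
    """Append a temporal fragment such as '#t=10,20' to a media URL."""
    if start < 0:
        raise ValueError("start must not be negative")
    if end is not None and end <= start:
        raise ValueError("end must be later than start")
    fragment = "t=" + _format_seconds(start)
    if end is not None:
        fragment += "," + _format_seconds(end)
    return url + "#" + fragment


def frame_to_seconds(frame, fps=24):
    """Convert a frame number to seconds at a constant frame rate."""
    return frame / fps


if __name__ == "__main__":
    # Hypothetical URL; frames 240 to 480 at 24fps are seconds 10 to 20.
    url = "http://example.com/video.ogv"
    print(with_time_fragment(url, frame_to_seconds(240), frame_to_seconds(480)))
```

Whether a given browser honours the fragment is a separate matter; the sketch only builds the URL.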

With Safari on Mac, the <video> tag can be used to play back any video format for which the user has the relevant codec and QuickTime component installed. Thus we have Theora support through the XiphQT component. In the latest versions of iMovie, QuickTime Pro and Final Cut Pro, users can now also choose to export or render in Ogg Theora. If only the Dirac QT component were ready.

Metavid developers also demonstrated a cute JavaScript library that works around IE's lack of support for the <video> tag. Full details are available on the Metavid website, as well as a demonstration of the code in action. Even if your browser doesn't currently support the HTML 5 <video> element, this script will take care of it.

The cross fade is particularly interesting. Do we no longer need to finish clips in a non-linear editor? Can we now perform hard cuts based on an edit decision list and let the browser deal with the fading or finishing element of the job?

Hopefully there are some exciting times ahead for open source, royalty-free video codecs and the ubiquity of embedded video on the Web.

We're only about two weeks late noticing that the BBC has released the second episode in their R&DTV series. Again they're providing files in a whole bunch of different video codecs, including Ogg Theora, but they're still not offering files encoded in their own Dirac codec. More information is available on the main page or the BBC Backstage blog, but a wider selection of files can also be found directly on the FTP site, where both 30 minute and 5 minute versions are available, as well as an entire asset bundle with rushes.

This episode features interviews with David Kirby on the BBC's Ingex project, Matt Biddulph CTO of Dopplr and Jason Calacanis CEO of Mahalo.com.

The BBC has released this content under a Creative Commons attribution licence, allowing everyone to remix as they see fit, providing an original BBC credit is maintained.

Our post regarding Episode 1 of R&DTV goes into more detail about the technical specifications of the available files.
