MediaElement and MediaPlayer, and the ocx control fairy tale
As some of you know, I'm always monitoring the MSDN forums. I usually only step into a conversation when no one else has, or when it's a complex subject I know something about. That leaves me with very few contributions :)
That said, I often read, from Microsoft people and from external people alike, that anything that plays in Windows Media Player will play in the WPF video elements, that WPF uses the Windows Media Player OCX under the hood, or that the content runs in WMP in another process.
So I thought I'd clear up a bit how the rendering of video is done, based on the few sources I've found and on some experimenting.
First I created a small WPF application containing a small video, and I ran Process Explorer on it to see what was actually loaded.
A few things happening there are worth noting. First, let's review the components that are loaded. wmp.dll is the core library for Windows Media Player; quartz.dll is DirectShow related; evr.dll is the Enhanced Video Renderer; wdmaud.drv and dsound.dll are audio related; d3d9.dll is part of Direct3D and is what WPF uses for rendering; and there is a bunch of filters, like msmpeg2vdec.dll and some .ax splitter files that don't show on that capture. Finally, we have MilCore.dll.
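To make that inventory easier to repeat on another machine, here is a minimal Python sketch that labels a list of loaded module paths with the roles given in this post. It assumes the module list has been copied out of Process Explorer by hand; the function name `classify_modules` and the role labels are illustrative only, and they are not part of any tool.

```python
import ntpath  # Windows path rules, usable on any OS

# Roles as described in this post; keys are lowercase file names.
MODULE_ROLES = {
    "wmp.dll": "Windows Media Player core library",
    "quartz.dll": "DirectShow",
    "evr.dll": "Enhanced Video Renderer",
    "wdmaud.drv": "audio",
    "dsound.dll": "audio",
    "d3d9.dll": "Direct3D (used for WPF rendering)",
    "msmpeg2vdec.dll": "filter",
    "milcore.dll": "MilCore (WPF)",
}


def classify_modules(module_paths):
    """Map each recognized module file name to its role; skip the rest."""
    found = {}
    for path in module_paths:
        name = ntpath.basename(path).lower()
        if name in MODULE_ROLES:
            found[name] = MODULE_ROLES[name]
        elif name.endswith(".ax"):
            # The post groups the .ax splitter files with the filters.
            found[name] = "filter (.ax)"
    return found
```

Anything the post does not mention is simply left out of the result, so the output stays limited to the media-related modules discussed here.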
We know that MilCore.dll is the unmanaged part of WPF and is responsible for composition and rendering. It's built on top of Direct3D. We can deduce that MilCore.dll is the one implementing audio / video playback, because it exports functions such as MilMediaSetIsScrubbingEnabled, MilMediaOpen, etc.
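That deduction boils down to looking for media-related names in MilCore.dll's export table. Here is a minimal sketch, assuming the export names have already been dumped to a plain list by some PE inspection tool (how that list is obtained is outside this sketch), and assuming the `MilMedia` prefix is a reasonable filter given the two names quoted above. The helper name `media_exports` is made up for illustration.

```python
def media_exports(export_names, prefix="MilMedia"):
    """Return the sorted export names that start with the given prefix."""
    return sorted(name for name in export_names if name.startswith(prefix))


# The first two names are quoted in this post; the third is a made-up
# placeholder standing in for an unrelated export.
sample = [
    "MilMediaSetIsScrubbingEnabled",
    "MilMediaOpen",
    "HypotheticalUnrelatedExport",
]
print(media_exports(sample))
```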
What we also know is that for WPF to render video inside the composited graph, it takes whatever default DirectShow graph is created automagically for a specific file, and replaces the normal rendering surface with the EVR. As explained on MSDN, the EVR is a new renderer shipping as part of Vista that can be used by both DirectShow and MediaFoundation pipelines for rendering and compositing video streams. It's been said before that the renderer also ships as part of .NET 3.0. Don't be surprised if some media center applications not built on WPF still require .NET 3.0 for exactly that reason: to leverage the EVR as a replacement for the ageing VMR9. You can find more information on these on MSDN. Finally, the EVR supports a model where you can replace Presenters, aka the surface on which the video gets rendered. And we know from an MSDN post that WPF ships its own Presenter.
In passing, that's where you may encounter scrubbing issues on XP. The presenter is the only component that knows your hardware intimately. As such, it's the one that knows about the vsync on your monitor. On XP, the custom presenter synchronizes with the composition engine, but that engine only covers one window. The windows themselves are still in GDI/USER32 land. I don't know enough about how the composition engine works, but I assume tearing still happens for exactly this reason, even though the WPF content is a retained mode scene. On Windows Vista, the same composition engine is used to compose a WPF application and the whole desktop. As I understand it, there is actually only one rendering tree, where GDI apps are just a node in the tree with a graphic, whereas a WPF app has a whole visual tree, which in turn is what lets the magnifier on Vista properly zoom on WPF content.
Well, there you are. The difference between plain old Windows Media Player and WPF is large enough (different renderer) that there's no guarantee that a video playing in WMP will play without issues in WPF. All this still raises the question of why the WPF team uses wmp.dll rather than going straight to DirectShow. I would place my bet on not duplicating engineering work: Windows Media Player now supports initializing either the MediaFoundation or the DirectShow pipeline based on the file, and leveraging that support would avoid duplicating code. Or it could be something completely different.
But what we can say is that the Windows Media Player process *is not executed*. We can also say that, while it's true that wmp.dll is linked and COM interfaces are used, it is not possible to say whether an actual ActiveX control or OCX is used, or whether some of its code is simply leveraged for DirectShow/MediaFoundation graph construction.
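The two observable facts in that conclusion can be checked separately. Here is a minimal Python sketch, assuming both lists are gathered by hand from Process Explorer, and assuming the player's executable is named wmplayer.exe (its usual name; it is not stated elsewhere in this post). The function name `wmp_usage` is hypothetical.

```python
import ntpath  # Windows path rules, usable on any OS


def wmp_usage(process_names, app_module_paths):
    """Report whether WMP runs as a process and whether the app loads wmp.dll."""
    processes = {ntpath.basename(p).lower() for p in process_names}
    modules = {ntpath.basename(m).lower() for m in app_module_paths}
    return {
        "wmp_process_running": "wmplayer.exe" in processes,
        "wmp_dll_loaded_in_app": "wmp.dll" in modules,
    }
```

A result of `False` for the process and `True` for the module matches what is described above. It says nothing about whether an OCX is actually instantiated, which stays an open question.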