Thursday, April 18, 2013
Sunday, March 24, 2013
So after a bit of IRC discussion yesterday it turns out that you can overclock Intel gpus by quite a margin. Which makes some sense now that the gfx performance of Intel chips isn't something to be completely ashamed of.
Sunday, February 17, 2013
Thursday, February 7, 2013
Now that my kernel modesetting locking rework has landed in Dave's drm-next tree and is gearing up for inclusion into 3.9, I've figured it's time to also post my little intro here:
The aim of this locking rework is that ioctls which a compositor might call for every frame (set_cursor, page_flip, addfb, rmfb and getfb/create_handle) should not be able to block on kms background activities like output detection. Each EDID read takes about 25ms (in the best case), which is longer than a single frame (roughly 16.7ms on a 60Hz display, for example), so blocking on one always means we'll drop at least one frame.
The solution is to add per-crtc locking for these ioctls, and restrict background activities to only use the global lock. Change-the-world type of events (modeset, dpms, ...) need to grab all locks.
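As a rough illustration of that scheme, here is a minimal userspace sketch in Python. It is not the kernel code: the `ModeConfig` class, its method names and the `can_page_flip` helper are all made up for this example, and the lock ordering (global lock first, then each crtc lock in index order) is an assumption of the sketch. It only models which paths take which locks.

```python
import threading
from contextlib import contextmanager


class ModeConfig:
    """Toy model of the locking scheme (hypothetical names, not the drm API)."""

    def __init__(self, num_crtcs):
        # One global lock for background activities such as output detection.
        self.global_lock = threading.Lock()
        # One lock per crtc for the per-frame ioctls.
        self.crtc_locks = [threading.Lock() for _ in range(num_crtcs)]

    @contextmanager
    def lock_crtc(self, index):
        # Per-frame paths (page flip, cursor update) take only their crtc lock.
        with self.crtc_locks[index]:
            yield

    @contextmanager
    def lock_background(self):
        # Background activities take only the global lock.
        with self.global_lock:
            yield

    @contextmanager
    def lock_all(self):
        # Change-the-world events take every lock, always in the same order
        # (an assumption of this sketch) so two such callers cannot deadlock.
        self.global_lock.acquire()
        for lock in self.crtc_locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(self.crtc_locks):
                lock.release()
            self.global_lock.release()


def can_page_flip(config, index):
    """Return True if a page flip on this crtc could take its lock right now."""
    acquired = config.crtc_locks[index].acquire(blocking=False)
    if acquired:
        config.crtc_locks[index].release()
    return acquired
```

In this model a page flip is never held up while `lock_background()` is held, since the two paths share no lock, whereas `lock_all()` stalls flips on every crtc until the modeset finishes.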
Tuesday, February 5, 2013
Monday, January 7, 2013
Part 1 talks about the different address spaces that an i915 GEM buffer object can reside in and where and how the respective page tables are set up. Then it also covers different buffer layouts as far as they're a concern for the kernel, namely how tiling, swizzling and fencing work.
Part 2 covers all the different bits and pieces required to submit work to the gpu and keep track of the gpu's progress: Command submission, relocation handling, command retiring and synchronization are the topics.
Part 3 looks at some of the details of the memory management implemented in the i915.ko driver. Specifically we look at how we handle running out of GTT space and what happens when we're generally short on memory.
Finally, part 4 discusses coherency and caches and how to most efficiently transfer between the gpu coherency domains and the cpu coherency domain under different circumstances.
In the previous installment we've taken a closer look at some details of the gpu memory management. One of the last big topics still left is all the various caches, both on the gpu (both render and display block) and the cpu, and what is required to keep the data coherent between them. Now one of the reasons gpus are so fast at processing raw amounts of data is that caches are managed through explicit instructions (cutting down massively on complexity and delays) and there are also a lot of special-purpose caches optimized for different use-cases. Since coherency management isn't automatic, we will also consider the different ways to move data between different coherency domains and what the respective up- and downsides are. See the i915/GEM crashcourse overview for links to the other parts of this series.
Wednesday, November 28, 2012
In previous installments of this series we've looked at how the gpu can access memory and how to submit a workload to the gpu. Now we will look at some of the corner cases in more detail. See the i915/GEM crashcourse overview for links to the other parts of this series.
Wednesday, November 21, 2012
So kernel 3.7 hasn't even shipped yet, but we're already lining up all the ducks for 3.8. And since feature-wise I don't expect anything massive on top any more (the feature merge period will close rsn), I've figured I might as well do the overview a bit earlier: