Optimize AttributeBuffer to OutputVertex conversion (#3283)
First I unrolled the inner loop, then I pushed semantics validation outside of the hot loop. I also added overflow slots to avoid conditional branches. Super Mario 3D Land's intro runs at almost full speed when compiled with Clang, and there's a noticeable speed increase in MSVC. GCC hasn't been tested but I'm confident in its ability to optimize this code.
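The three changes described above can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `AttributeMap`, `ValidateSemantics`, `ConvertVertex`, the slot counts, the `map_x` field at bit 0, and the use of `0x1F` as the "unused" semantic are all assumptions made for illustration, and plain `float` stands in for the emulator's own float type.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed sizes: 24 real output slots; a 5-bit semantic can index 32 slots.
constexpr std::size_t kNumSlots = 24;
constexpr std::size_t kNumSlotsWithOverflow = 32;
constexpr std::uint32_t kUnusedSemantic = 0x1F; // assumed "unused" marker

// Hypothetical stand-in for one vs_output_attributes entry, read via `raw`.
struct AttributeMap {
    std::uint32_t raw;
    std::uint32_t MapX() const { return raw & 0x1F; } // bit 0 is assumed
    std::uint32_t MapY() const { return (raw >> 8) & 0x1F; }
    std::uint32_t MapZ() const { return (raw >> 16) & 0x1F; }
    std::uint32_t MapW() const { return (raw >> 24) & 0x1F; }
};

using Vec4 = std::array<float, 4>;

// Validation hoisted out of the hot loop: intended to run once when the
// mapping changes, not once per vertex component.
bool ValidateSemantics(const AttributeMap& map) {
    const std::uint32_t semantics[4] = {map.MapX(), map.MapY(), map.MapZ(), map.MapW()};
    for (std::uint32_t s : semantics) {
        if (s >= kNumSlots && s != kUnusedSemantic) {
            return false;
        }
    }
    return true;
}

// Hot path: no bounds check and no branch per component. Any 5-bit index
// lands inside the 32-entry array; entries 24..31 are overflow slots whose
// contents are discarded.
std::array<float, kNumSlots> ConvertVertex(const AttributeMap* maps, const Vec4* input,
                                           std::size_t count) {
    std::array<float, kNumSlotsWithOverflow> slots{};
    for (std::size_t i = 0; i < count; ++i) {
        // Inner loop over the four components, unrolled by hand.
        slots[maps[i].MapX()] = input[i][0];
        slots[maps[i].MapY()] = input[i][1];
        slots[maps[i].MapZ()] = input[i][2];
        slots[maps[i].MapW()] = input[i][3];
    }
    std::array<float, kNumSlots> out{};
    std::copy_n(slots.begin(), kNumSlots, out.begin());
    return out;
}
```

The trade-off in this sketch is a few extra floats of scratch space per conversion in exchange for removing a compare-and-branch from every component store.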

committed by Yuri Kunde Schlesner

parent 3f7f2b42c0
commit 41929371dc
@@ -87,6 +87,8 @@ struct RasterizerRegs {
        BitField<8, 5, Semantic> map_y;
        BitField<16, 5, Semantic> map_z;
        BitField<24, 5, Semantic> map_w;

        u32 raw;
    } vs_output_attributes[7];

    INSERT_PADDING_WORDS(0xe);