Looking at the top 5 functions in the profiler for pavillon_barcelone_v1.2 (Ubuntu 14.04, Intel Core i7-4771 CPU, compiled with gcc -march=native):
| Function Name | CPU Time by Utilization | Instructions Retired | CPI Rate | CPU Frequency Ratio |
| --- | --- | --- | --- | --- |
As you can see from the table, the powf calls are too expensive, even on Haswell.
Each time the Cycles kernel fetches an interpolated color for pixel (x, y), it applies alpha (if the use_alpha flag from the SVM stack is set) and converts the result from sRGB to linear (if the srgb flag from the SVM stack is set) -- see svm_image_texture. As a result, the Cycles kernel produces billions of color_srgb_to_scene_linear calls, which use powf. As far as I can see, both the use_alpha and srgb flags are effectively constants: only EnvironmentTextureNode::compile/ImageTextureNode::compile set them, and only svm_node_tex_environment, svm_node_tex_image_box and svm_node_tex_image decode them from the SVM stack.
If Cycles internals work only in linear space, can we convert images to linear space before starting the raytracer? This could give a noticeable speedup for textured objects.
- In theory, interpolation between pixels gives different results in linear space (right now Cycles interpolates in sRGB space). The difference is tiny and only noticeable on extremely low-res textures.
- interpolate(premultiply(image), x, y) ≡ premultiply(interpolate(image, x, y)), AFAIK -- strictly speaking this only holds where alpha is constant across the filtered texels, since interpolating a product is not the same as multiplying two interpolated values.
- If the user places the same image in the node tree twice but with different settings (e.g. Color vs. Non-color data), a copy of the image has to be created.
No patch yet, waiting for Brecht's comment.