Compare commits

...

12 Commits

Author SHA1 Message Date
Ray Strode
7f4f9b122e wip! native: release dumb fb after flip for EGLStreams
FIXME: probably need to recreate the dumb fb any time we do
an explicit modeset
2018-11-30 14:34:10 -05:00
Ray Strode
82f03c51ac wip! renderer/native: start to add stream copy mode
At the moment, mutter only supports using GBM for
doing GPU blits to secondary video cards.

This commit starts to sketch out using EGLStreams
for doing the copy.

FIXME: in order to get memcpy fastpath we get
inverted colors and have to fix it up in the shader.

FIXME: need to move some of the globals to be in
a more structured place

Closes https://gitlab.gnome.org/GNOME/mutter/issues/205
2018-11-30 14:34:10 -05:00
Ray Strode
8f538671d6 wip! renderer/native: only flip secondary crtc if using GBM
It's possible that the secondary GPU isn't using GBM for
rendering. If that's the case, we shouldn't try to flip using
DRM APIs.

This commit checks, and only does the flip when expected.
2018-11-30 14:34:10 -05:00
Ray Strode
0540e4dc45 wip! renderer/native: generalize flip_egl_stream
We're going to need to be able to flip secondary egl streams,
so this commit parameterizes the stream and related objects.

A new wrapper function, `flip_primary_egl_stream`, takes over the
previous role of `flip_egl_stream`.

FIXME: no_egl_output_drm_flip_event needs to be handled for secondary
streams
2018-11-30 14:34:10 -05:00
Ray Strode
5d8248d65b renderer/native: initialize egldevice on secondary cards
If possible initialize the egldevice renderer on secondary
cards.
2018-11-30 14:34:10 -05:00
Ray Strode
b61c2f2120 wip! renderer/native: separate renderer gpu data initialization from creation
We're going to need to initialize both gbm and egldevice based rendering
on some machines for one renderer data, so this commit splits
initialization out from construction.
2018-11-30 14:34:10 -05:00
Ray Strode
3dacd0f6f5 wip! renderer/native: generalize meta_renderer_native_create_surface_egl_device
Eventually we want to use meta_renderer_native_create_surface_egl_device
for secondary displays, but it currently implicitly uses the primary
display and egl context.

This commit changes the function to take a display and a context
as arguments.
2018-11-30 14:34:10 -05:00
Ray Strode
336cb36d3c wip! renderer/native: pass MetaMonitor instead of MetaLogicalMonitor to meta_renderer_native_create_surface_egl_device 2018-11-30 14:34:10 -05:00
Ray Strode
bb33dad473 wip! renderer-native: use proper surface type for egldevice 2018-11-30 14:34:09 -05:00
Ray Strode
88a7b6e1da renderer/native: use GBM_FORMAT_ARGB8888 for primary rendering format
At the moment we use GBM_FORMAT_XRGB8888 which unfortunately triggers
slow read pixels code in mesa.

This commit changes it to ARGB8888 instead, which copies with memcpy.
2018-11-30 14:34:09 -05:00
Pekka Paalanen
008a12a637 cogl: pick glReadPixels format by target, not source
Presumably glReadPixels itself can be more performant with pixel format
conversions than doing a fix-up conversion on the CPU afterwards. Hence,
pick required_format based on the destination rather than the source, so
that it has a better chance to avoid the fix-up conversion.

With CoglOnscreen objects, CoglFramebuffer::internal_format (the source
format) is also wrong. It is left to a default value and never set to
reflect the reality. In other words, read-pixels had an arbitrary
intermediate pixel format that was used in glReadPixels and then fix-up
conversion made it work for the destination.

The render buffers (GBM surface) are allocated as DRM_FORMAT_XRGB8888.
If the destination buffer is allocated as the same format, the Cogl
read-pixels first converts with glReadPixels XRGB -> ABGR because of the
above default format, and then the fix-up conversion does ABGR -> XRGB.
This case was observed with DisplayLink outputs, where the native
renderer must use the CPU copy path to fill the "secondary GPU"
framebuffers.

This patch stops using internal_format and uses the desired destination
format instead.

_cogl_framebuffer_gl_read_pixels_into_bitmap() will still use
internal_format to determine alpha premultiplication state and multiply
or un-multiply as needed. Luckily all the formats involved in the
DisplayLink use case are always _PRE and so is the default
internal_format too, so things work in practise.

Furthermore, the GL texture_swizzle extension can never apply to
glReadPixels. Not even with FBOs, as found in this discussion:
https://gitlab.gnome.org/GNOME/mutter/issues/72
Therefore the target_format argument is hardcoded to something that can
never match anything, which will prevent the swizzle from being assumed.
2018-11-30 14:34:09 -05:00
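To see why the old path did redundant work, the two conversions described in this commit message can be modeled as operations on packed 32-bit pixels. The sketch below is an illustration only, with hypothetical helper names; it is not Cogl code and it ignores the unused X byte. Swapping red and blue twice returns the original color channels, so XRGB -> ABGR -> XRGB amounts to a plain copy done in two passes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the red and blue channels of a packed 8:8:8:8 pixel, keeping the
 * top byte and green in place. Maps XRGB <-> XBGR (or ARGB <-> ABGR). */
static uint32_t
swap_red_blue (uint32_t pixel)
{
  return (pixel & 0xff00ff00u) |
         ((pixel & 0x00ff0000u) >> 16) |
         ((pixel & 0x000000ffu) << 16);
}

/* Hypothetical model of the old path: one conversion while reading,
 * then a second fix-up conversion on the CPU. */
static void
read_then_fixup (uint32_t *dst, const uint32_t *src, size_t n_pixels)
{
  size_t i;

  assert (dst != NULL && src != NULL);

  for (i = 0; i < n_pixels; i++)
    dst[i] = swap_red_blue (src[i]);  /* models XRGB -> ABGR during the read */

  for (i = 0; i < n_pixels; i++)
    dst[i] = swap_red_blue (dst[i]);  /* models the ABGR -> XRGB fix-up */
}
```

Choosing the read format from the destination removes both passes in this model, leaving a straight copy.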
Pekka Paalanen
c8112a1dbc cogl: remove mesa_46631_slow_read_pixels_workaround
This function gets hit even today on relatively modern Intel systems (I
have a Haswell Desktop with Mesa 18.2.4) if the pixel format is right.
Presumably it makes things slower for a reason that no longer applies.

According to cb146dc515, this
functionality was refactored into a workaround path in 2012. The commit
message mentions the problem existing before Mesa 8.0.2. The number
refers to https://bugs.freedesktop.org/show_bug.cgi?id=46631 .

The use case where I hit this is when improving support for DisplayLink
video outputs. These are used through a "secondary GPU", and since
DisplayLink does not have a GPU, Mutter uses the CPU copy path with Cogl
read-pixels[1]. If the DisplayLink framebuffer was allocated as
DRM_FORMAT_XRGB8888 (the only format it currently handles correctly),
mesa_46631_slow_read_pixels_workaround would get hit. The render buffer is
the same format as the framebuffer, yet doing the copy XRGB -> XRGB ends
up being slower than XRGB -> XBGR which makes no sense.

This patch is not sufficient to fix the XRGB -> XRGB copy performance,
but it is required.

This patch reverts CoglGpuInfoDriverBug into what it was before
cb146dc515.

[1] This is not actually true until
    https://gitlab.gnome.org/GNOME/mutter/merge_requests/278 is
    merged.
2018-11-30 14:34:09 -05:00
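For reference, the workaround removed by this commit finished by copying rows out of the temporary PBO with memcpy, reading the source bottom-up when the framebuffer is onscreen. A standalone sketch of that flipped copy follows, with a hypothetical function name and plain byte buffers standing in for Cogl bitmaps. Unlike the removed code, it indexes the source row instead of negating the stride, so no pointer is ever formed before the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `height` rows of `bytes_per_row` bytes from src to dst.
 * If `flip` is non-zero, the source is read from its last row upward,
 * so the image ends up vertically mirrored in dst. */
static void
copy_rows (uint8_t       *dst,
           size_t         dst_rowstride,
           const uint8_t *src,
           size_t         src_rowstride,
           size_t         bytes_per_row,
           size_t         height,
           int            flip)
{
  size_t y;

  assert (bytes_per_row <= dst_rowstride);
  assert (bytes_per_row <= src_rowstride);

  for (y = 0; y < height; y++)
    {
      size_t src_y = flip ? height - 1 - y : y;

      memcpy (dst + y * dst_rowstride,
              src + src_y * src_rowstride,
              bytes_per_row);
    }
}
```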
6 changed files with 777 additions and 244 deletions


@ -74,12 +74,7 @@ typedef enum
 typedef enum
 {
-/* If this bug is present then it is faster to read pixels into a
- * PBO and then memcpy out of the PBO into system memory rather than
- * directly read into system memory.
- * https://bugs.freedesktop.org/show_bug.cgi?id=46631
- */
-COGL_GPU_INFO_DRIVER_BUG_MESA_46631_SLOW_READ_PIXELS = 1 << 0
+COGL_GPU_INFO_DRIVER_STUB
 } CoglGpuInfoDriverBug;
 typedef struct _CoglGpuInfoVersion CoglGpuInfoVersion;


@ -568,13 +568,5 @@ probed:
gpu->architecture_name);
/* Determine the driver bugs */
gpu->driver_bugs = 0;
/* In Mesa the glReadPixels implementation is really slow
when using the Intel driver. The Intel
driver has a fast blit path when reading into a PBO. Reading into
a temporary PBO and then memcpying back out to the application's
memory is faster than a regular glReadPixels in this case */
if (gpu->vendor == COGL_GPU_INFO_VENDOR_INTEL &&
gpu->driver_package == COGL_GPU_INFO_DRIVER_PACKAGE_MESA)
gpu->driver_bugs |= COGL_GPU_INFO_DRIVER_BUG_MESA_46631_SLOW_READ_PIXELS;
}


@ -4,6 +4,7 @@
* A Low Level GPU Graphics and Utilities API
*
* Copyright (C) 2007,2008,2009,2012 Intel Corporation.
* Copyright (C) 2018 DisplayLink (UK) Ltd.
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
@ -1240,96 +1241,6 @@ _cogl_framebuffer_gl_draw_indexed_attributes (CoglFramebuffer *framebuffer,
_cogl_buffer_gl_unbind (buffer);
}
static CoglBool
mesa_46631_slow_read_pixels_workaround (CoglFramebuffer *framebuffer,
int x,
int y,
CoglReadPixelsFlags source,
CoglBitmap *bitmap,
CoglError **error)
{
CoglContext *ctx;
CoglPixelFormat format;
CoglBitmap *pbo;
int width;
int height;
CoglBool res;
uint8_t *dst;
const uint8_t *src;
ctx = cogl_framebuffer_get_context (framebuffer);
width = cogl_bitmap_get_width (bitmap);
height = cogl_bitmap_get_height (bitmap);
format = cogl_bitmap_get_format (bitmap);
pbo = cogl_bitmap_new_with_size (ctx, width, height, format);
/* Read into the pbo. We need to disable the flipping because the
blit fast path in the driver does not work when
GL_PACK_INVERT_MESA is set */
res = _cogl_framebuffer_read_pixels_into_bitmap (framebuffer,
x, y,
source |
COGL_READ_PIXELS_NO_FLIP,
pbo,
error);
if (!res)
{
cogl_object_unref (pbo);
return FALSE;
}
/* Copy the pixels back into application's buffer */
dst = _cogl_bitmap_map (bitmap,
COGL_BUFFER_ACCESS_WRITE,
COGL_BUFFER_MAP_HINT_DISCARD,
error);
if (!dst)
{
cogl_object_unref (pbo);
return FALSE;
}
src = _cogl_bitmap_map (pbo,
COGL_BUFFER_ACCESS_READ,
0, /* hints */
error);
if (src)
{
int src_rowstride = cogl_bitmap_get_rowstride (pbo);
int dst_rowstride = cogl_bitmap_get_rowstride (bitmap);
int to_copy =
_cogl_pixel_format_get_bytes_per_pixel (format) * width;
int y;
/* If the framebuffer is onscreen we need to flip the
data while copying */
if (!cogl_is_offscreen (framebuffer))
{
src += src_rowstride * (height - 1);
src_rowstride = -src_rowstride;
}
for (y = 0; y < height; y++)
{
memcpy (dst, src, to_copy);
dst += dst_rowstride;
src += src_rowstride;
}
_cogl_bitmap_unmap (pbo);
}
else
res = FALSE;
_cogl_bitmap_unmap (bitmap);
cogl_object_unref (pbo);
return res;
}
CoglBool
_cogl_framebuffer_gl_read_pixels_into_bitmap (CoglFramebuffer *framebuffer,
int x,
@ -1350,40 +1261,6 @@ _cogl_framebuffer_gl_read_pixels_into_bitmap (CoglFramebuffer *framebuffer,
CoglBool pack_invert_set;
int status = FALSE;
/* Workaround for cases where it's faster to read into a temporary
* PBO. This is only worth doing if:
*
* The GPU is an Intel GPU. In that case there is a known
* fast-path when reading into a PBO that will use the blitter
* instead of the Mesa fallback code. The driver bug will only be
* set if this is the case.
* We're not already reading into a PBO.
* The target format is BGRA. The fast-path blit does not get hit
* otherwise.
* The size of the data is not trivially small. This isn't a
* requirement to hit the fast-path blit but intuitively it feels
* like if the amount of data is too small then the cost of
* allocating a PBO will outweigh the cost of temporarily
* converting the data to floats.
*/
if ((ctx->gpu.driver_bugs &
COGL_GPU_INFO_DRIVER_BUG_MESA_46631_SLOW_READ_PIXELS) &&
(width > 8 || height > 8) &&
(format & ~COGL_PREMULT_BIT) == COGL_PIXEL_FORMAT_BGRA_8888 &&
cogl_bitmap_get_buffer (bitmap) == NULL)
{
CoglError *ignore_error = NULL;
if (mesa_46631_slow_read_pixels_workaround (framebuffer,
x, y,
source,
bitmap,
&ignore_error))
return TRUE;
else
cogl_error_free (ignore_error);
}
_cogl_framebuffer_flush_state (framebuffer,
framebuffer,
COGL_FRAMEBUFFER_STATE_BIND);
@ -1397,9 +1274,12 @@ _cogl_framebuffer_gl_read_pixels_into_bitmap (CoglFramebuffer *framebuffer,
 if (!cogl_is_offscreen (framebuffer))
 y = framebuffer_height - y - height;
+/* Use target format ANY, because GL texture_swizzle extension cannot
+ * ever apply for glReadPixels.
+ */
 required_format = ctx->driver_vtable->pixel_format_to_gl_with_target (ctx,
-framebuffer->internal_format,
 format,
+COGL_PIXEL_FORMAT_ANY,
 &gl_intformat,
 &gl_format,
 &gl_type);


@ -179,6 +179,97 @@ paint_egl_image (MetaGles3 *gles3,
GL_NEAREST));
}
/* FIXME: all these declarations are just floating here
*/
struct __attribute__ ((__packed__)) position
{
float x, y;
};
struct __attribute__ ((__packed__)) texture_coordinate
{
float u, v;
};
struct __attribute__ ((__packed__)) vertex
{
struct position position;
struct texture_coordinate texture_coordinate;
};
struct __attribute__ ((__packed__)) triangle
{
unsigned int first_vertex;
unsigned int middle_vertex;
unsigned int last_vertex;
};
enum
{
RIGHT_TOP_VERTEX = 0,
BOTTOM_RIGHT_VERTEX,
BOTTOM_LEFT_VERTEX,
TOP_LEFT_VERTEX
};
static const float view_left = -1.0f, view_right = 1.0f, view_top = 1.0f, view_bottom = -1.0f;
static const float texture_left = 0.0f, texture_right = 1.0f, texture_top = 0.0f, texture_bottom = 1.0f;
static GLuint vertex_array;
static GLuint vertex_buffer;
static GLuint triangle_buffer;
struct vertex vertices[] = {
[RIGHT_TOP_VERTEX] = {{view_right, view_top},
{texture_right, texture_top}},
[BOTTOM_RIGHT_VERTEX] = {{view_right, view_bottom},
{texture_right, texture_bottom}},
[BOTTOM_LEFT_VERTEX] = {{view_left, view_bottom},
{texture_left, texture_bottom}},
[TOP_LEFT_VERTEX] = {{view_left, view_top},
{texture_left, texture_top}},
};
struct triangle triangles[] = {
{TOP_LEFT_VERTEX, BOTTOM_RIGHT_VERTEX, BOTTOM_LEFT_VERTEX},
{TOP_LEFT_VERTEX, RIGHT_TOP_VERTEX, BOTTOM_RIGHT_VERTEX},
};
gboolean
meta_renderer_native_gles3_draw_pixels (MetaEgl *egl,
MetaGles3 *gles3,
unsigned int width,
unsigned int height,
uint8_t *pixels,
GError **error)
{
meta_gles3_clear_error (gles3);
GLBAS (gles3, glClearColor, (0.0,1.0,0.0,1.0));
GLBAS (gles3, glClear, (GL_COLOR_BUFFER_BIT));
GLBAS (gles3, glViewport, (0, 0, width, height));
GLBAS (gles3, glTexImage2D, (GL_TEXTURE_2D, 0, GL_RGBA,
width, height, 0, GL_RGBA,
GL_UNSIGNED_BYTE, pixels));
GLBAS (gles3, glBindBuffer, (GL_ARRAY_BUFFER, vertex_buffer));
GLBAS (gles3, glEnableVertexAttribArray, (0));
GLBAS (gles3, glVertexAttribPointer, (0,
sizeof (struct position) / sizeof (float), GL_FLOAT,
GL_FALSE,
sizeof (struct vertex), (void *) offsetof (struct vertex, position)));
GLBAS (gles3, glEnableVertexAttribArray, (1));
GLBAS (gles3, glVertexAttribPointer, (1,
sizeof (struct texture_coordinate) / sizeof (float), GL_FLOAT,
GL_FALSE,
sizeof (struct vertex), (void *) offsetof (struct vertex, texture_coordinate)));
GLBAS (gles3, glDrawElements, (GL_TRIANGLES,
G_N_ELEMENTS (triangles) * (sizeof (struct triangle) / sizeof (unsigned int)), GL_UNSIGNED_INT,
0));
return TRUE;
}
gboolean
meta_renderer_native_gles3_blit_shared_bo (MetaEgl *egl,
MetaGles3 *gles3,
@ -238,3 +329,126 @@ meta_renderer_native_gles3_blit_shared_bo (MetaEgl *egl,
return TRUE;
}
static void
meta_renderer_native_gles3_load_basic_shaders (MetaEgl *egl,
MetaGles3 *gles3)
{
GLuint vertex_shader = 0, fragment_shader = 0, shader_program;
gboolean status = FALSE;
const char *vertex_shader_source =
"#version 330 core\n"
"layout (location = 0) in vec2 position;\n"
"layout (location = 1) in vec2 input_texture_coords;\n"
"out vec2 texture_coords;\n"
"void main()\n"
"{\n"
" gl_Position = vec4(position.x, position.y, 0.0f, 1.0f);\n"
" texture_coords = input_texture_coords;\n"
"}\n";
const char *fragment_shader_source =
"#version 330 core\n"
"uniform sampler2D input_texture;\n"
"in vec2 texture_coords;\n"
"out vec4 output_color;\n"
"void main()\n"
"{\n"
/* FIXME: cogl uses a framebuffer format of COGL_PIXEL_FORMAT_RGBA_8888_PRE
* by default which maps to DRM_FORMAT_ABGR8888 on little endian. The
* destination format is DRM_FORMAT_XRGB8888, so the color channels are
* swapped. The .bgra swizzle here swaps it back, but we should see if
* we can find a better fix (this probably breaks on big endian) */
" output_color = texture(input_texture, texture_coords).bgra;\n"
"}\n";
vertex_shader = glCreateShader (GL_VERTEX_SHADER);
glShaderSource (vertex_shader, 1, &vertex_shader_source, NULL);
glCompileShader (vertex_shader);
glGetShaderiv (vertex_shader, GL_COMPILE_STATUS, &status);
if (!status)
{
char compile_log[1024] = "";
glGetShaderInfoLog (vertex_shader, sizeof (compile_log), NULL, compile_log);
g_warning ("vertex shader compilation failed:\n %s\n", compile_log);
goto out;
}
fragment_shader = glCreateShader (GL_FRAGMENT_SHADER);
glShaderSource (fragment_shader, 1, &fragment_shader_source, NULL);
glCompileShader (fragment_shader);
glGetShaderiv (fragment_shader, GL_COMPILE_STATUS, &status);
if (!status)
{
char compile_log[1024] = "";
glGetShaderInfoLog (fragment_shader, sizeof (compile_log), NULL, compile_log);
g_warning ("fragment shader compilation failed:\n %s\n", compile_log);
goto out;
}
shader_program = glCreateProgram ();
glAttachShader (shader_program, vertex_shader);
glAttachShader (shader_program, fragment_shader);
glLinkProgram (shader_program);
glGetProgramiv (shader_program, GL_LINK_STATUS, &status);
if (!status)
{
char link_log[1024] = "";
glGetProgramInfoLog (shader_program, sizeof (link_log), NULL, link_log);
g_warning ("shader link failed:\n %s\n", link_log);
goto out;
}
glUseProgram (shader_program);
out:
if (vertex_shader)
glDeleteShader (vertex_shader);
if (fragment_shader)
glDeleteShader (fragment_shader);
}
gboolean
meta_renderer_native_gles3_prepare_for_drawing (MetaEgl *egl,
MetaGles3 *gles3,
GError **error)
{
GLuint texture;
meta_renderer_native_gles3_load_basic_shaders (egl, gles3);
meta_gles3_clear_error (gles3);
GLBAS (gles3, glGenVertexArrays, (1, &vertex_array));
GLBAS (gles3, glBindVertexArray, (vertex_array));
GLBAS (gles3, glGenBuffers, (1, &vertex_buffer));
GLBAS (gles3, glBindBuffer, (GL_ARRAY_BUFFER, vertex_buffer));
GLBAS (gles3, glBufferData, (GL_ARRAY_BUFFER, sizeof (vertices), vertices, GL_STREAM_DRAW));
GLBAS (gles3, glGenBuffers, (1, &triangle_buffer));
GLBAS (gles3, glBindBuffer, (GL_ELEMENT_ARRAY_BUFFER, triangle_buffer));
GLBAS (gles3, glBufferData, (GL_ELEMENT_ARRAY_BUFFER, sizeof (triangles), triangles, GL_STREAM_DRAW));
GLBAS (gles3, glActiveTexture, (GL_TEXTURE0));
GLBAS (gles3, glGenTextures, (1, &texture));
GLBAS (gles3, glBindTexture, (GL_TEXTURE_2D, texture));
GLBAS (gles3, glTexParameteri, (GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,
GL_NEAREST));
GLBAS (gles3, glTexParameteri, (GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
GL_NEAREST));
GLBAS (gles3, glTexParameteri, (GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,
GL_CLAMP_TO_EDGE));
GLBAS (gles3, glTexParameteri, (GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,
GL_CLAMP_TO_EDGE));
GLBAS (gles3, glTexParameteri, (GL_TEXTURE_2D, GL_TEXTURE_WRAP_R_OES,
GL_CLAMP_TO_EDGE));
return TRUE;
}
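The draw path above renders a full-screen quad as two indexed triangles that share four vertices. The following sketch reproduces only the index bookkeeping on the CPU, with no GL calls, to show how the element count passed to glDrawElements is derived. The struct and enum mirror the ones in the diff; `N_ELEMENTS`, `N_VERTICES`, `index_count` and `indices_in_range` are stand-ins introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for GLib's G_N_ELEMENTS. */
#define N_ELEMENTS(arr) (sizeof (arr) / sizeof ((arr)[0]))

struct triangle
{
  unsigned int first_vertex;
  unsigned int middle_vertex;
  unsigned int last_vertex;
};

enum
{
  RIGHT_TOP_VERTEX = 0,
  BOTTOM_RIGHT_VERTEX,
  BOTTOM_LEFT_VERTEX,
  TOP_LEFT_VERTEX,
  N_VERTICES /* illustration only: number of distinct quad corners */
};

static const struct triangle triangles[] = {
  { TOP_LEFT_VERTEX, BOTTOM_RIGHT_VERTEX, BOTTOM_LEFT_VERTEX },
  { TOP_LEFT_VERTEX, RIGHT_TOP_VERTEX, BOTTOM_RIGHT_VERTEX },
};

/* Number of indices in the element buffer: triangles times indices
 * per triangle, computed the same way as in the draw call. */
static size_t
index_count (void)
{
  assert (sizeof (struct triangle) % sizeof (unsigned int) == 0);
  return N_ELEMENTS (triangles) *
         (sizeof (struct triangle) / sizeof (unsigned int));
}

/* Check that every index refers to one of the four quad corners. */
static int
indices_in_range (void)
{
  size_t i;

  for (i = 0; i < N_ELEMENTS (triangles); i++)
    {
      if (triangles[i].first_vertex >= N_VERTICES ||
          triangles[i].middle_vertex >= N_VERTICES ||
          triangles[i].last_vertex >= N_VERTICES)
        return 0;
    }

  return 1;
}
```

Both triangles share the top-left and bottom-right corners, so four vertices are uploaded but six indices are drawn.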


@ -37,4 +37,15 @@ gboolean meta_renderer_native_gles3_blit_shared_bo (MetaEgl *egl,
struct gbm_bo *shared_bo,
GError **error);
gboolean meta_renderer_native_gles3_prepare_for_drawing (MetaEgl *egl,
MetaGles3 *gles3,
GError **error);
gboolean meta_renderer_native_gles3_draw_pixels (MetaEgl *egl,
MetaGles3 *gles3,
unsigned int width,
unsigned int height,
uint8_t *pixels,
GError **error);
#endif /* META_RENDERER_NATIVE_GLES3_H */

File diff suppressed because it is too large