Rasterizer cache refactor (#6375)

* rasterizer_cache: Remove custom texture code

* It's a hacky, buggy mess; it will be reimplemented later when the cache is in a better state

* rasterizer_cache: Refactor surface upload/download

* Switch to the texture_codec header which was written as part of the vulkan backend by steveice and me

* Move most of the upload logic to the rasterizer cache and out of the surface object

* Scaled uploads/downloads have been disabled for now since they require more runtime infrastructure

* rasterizer_cache: Refactor runtime interface

* Remove aspect enum which is the same as SurfaceType

* Replace Subresource with specific structures for each operation (blit/copy/clear). This mimics modern APIs like vulkan much better

* Pass the surface to the runtime instead of the texture

* Implement CopyTextures with glCopyImageSubData which is available on 4.3 and gles.
  This function also has an overload for cubes which will be removed later.

* rasterizer_cache: Move texture allocation to the runtime

* renderer_opengl: Remove TextureDownloaderES

* It's overly complicated and unused at the moment. Will be replaced with a simple compute shader in a later commit

* rasterizer_cache: Split CachedSurface

* This commit splits CachedSurface into two classes, SurfaceBase which contains the backend agnostic functions and Surface which is the opengl specific part

* For now the cache uses the opengl surface directly and there are a few ugly casts with watchers; those will be taken care of when the template conversion and watcher removal are added respectively

* rasterizer_cache: Move reinterpreters to the runtime

* rasterizer_cache: Move some pixel format functions to the cpp file

* rasterizer_cache: Common texture acceleration functions

* They don't contain any backend specific code so they shouldn't be duplicated

* rasterizer_cache: Remove BlitSurfaces

* It's better to prefer copy/blit in the caller anyway

* rasterizer_cache: Only allocate needed levels

* rasterizer_cache: Move texture runtime out of common dir

* Also shorten the util header filename

* surface_params: Cleanup code

* Add more comments, organize it a bit etc

* rasterizer_cache: Move texture filtering to the runtime

* rasterizer_cache: Move to VideoCore

* renderer_opengl: Reimplement scaled uploads/downloads

* Instead of looking up temporary textures, each allocation now contains both a scaled and unscaled handle.
  This allows the scale operations to be done inside the surface object itself and improves performance in general

* In particular the scaled download code has been expanded to use ARB_get_texture_sub_image when possible
  which is faster and more convenient than glReadPixels. The latter is still relevant for OpenGLES though.

* Finally allocations are now given a handy debug name that can be viewed from renderdoc.

* rasterizer_cache: Remove global state

* gl_rasterizer: Abstract common draw operations to Framebuffer

* This also allows caching framebuffer objects instead of always swapping the textures, something that particularly benefits mali gpus

* rasterizer_cache: Implement multi-level surfaces

* With this commit the cache can now directly upload and use mipmaps
  without needing to sync them with watchers. By using native mipmaps
  directly this also adds support for mipmaps on texture cubes

* Texture cubes have also been updated to drop the watcher requirement

* host_shaders: Add CMake integration for string shaders

* Improves build time shader generation, making it much less prone to errors.
  Also moves the presentation shaders here to avoid embedding them in the cpp file.

* Texture filter shaders now make explicit use of uniform bindings for better vulkan compatibility

* renderer_opengl: Emulate lod bias in the shader

* This way opengles can emulate it correctly

* gl_rasterizer: Respect GL_MAX_TEXTURE_BUFFER_SIZE

* Older Bifrost Mali GPUs only support up to 64kb texture buffers. Citra would try to allocate a much larger buffer, the first 64kb of which would work fine, but after that the driver would start misbehaving and showing various graphical glitches

* rasterizer_cache: Cleanup CopySurface

* renderer_opengl: Keep frames synchronized when using a GPU debugger

* rasterizer_cache: Rename Surface to SurfaceRef

* Makes it clear that surface is a shared_ptr and not an object

* rasterizer_cache: Cleanup

* Move constructor to the top of the file

* Move FindMatch to the top as well and remove the Invalid flag, which was redundant;
  all FindMatch calls used it except for MatchFlags::Copy, which ignores it anyway

* gl_texture_runtime: Make driver const

* gl_texture_runtime: Fix RGB8 format handling

* The texture_codec header, being written with vulkan in mind, converts RGB8 to RGBA8. The backend wasn't adjusted to account for this though and treated the data as RGB8.

* Also remove D16 conversions; both opengl and vulkan are required to support this format, so these are not needed

* gl_texture_runtime: Reduce state switches during FBO blits

* glBlitFramebuffer is only affected by the scissor rectangle so just disable scissor testing instead of resetting our entire state

* surface_params: Prevent texcopy that spans multiple levels

* It would have failed before as well; with multi-level surfaces it now triggers the assert

* renderer_opengl: Centralize texture filters

* A lot of code is shared between the filters, so it makes sense to centralize them

* Also fix an issue with partial texture uploads

* Address review comments

* rasterizer_cache: Use leading return types

* rasterizer_cache: Cleanup null checks

* renderer_opengl: Add additional logging

* externals: Actually downgrade glad

* For some reason I missed adding the files to git

* surface_params: Do not check for levels in exact match

* Some games will try to use the base level of a multi level surface. Checking for levels forces another surface to be created and a copy to be made, which is both unnecessary and breaks custom textures
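
The GL_MAX_TEXTURE_BUFFER_SIZE note above boils down to clamping the requested allocation to what the driver reports. The following is only a rough sketch of that idea, not the code from this commit: `ClampTextureBufferBytes` and its parameters are hypothetical names, and it assumes the caller has already queried the limit (which OpenGL reports as a texel count) with `glGetIntegerv`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the commit): clamps a requested texture
// buffer size in bytes to the driver limit. max_texels stands for the value
// the driver reports for GL_MAX_TEXTURE_BUFFER_SIZE, which counts texels.
std::uint64_t ClampTextureBufferBytes(std::uint64_t requested_bytes, std::uint64_t max_texels,
                                      std::uint32_t bytes_per_texel) {
    assert(bytes_per_texel != 0);
    // Largest byte size that is still addressable through the texel limit.
    const std::uint64_t max_bytes = max_texels * bytes_per_texel;
    return std::min(requested_bytes, max_bytes);
}
```

With a limit of 65536 texels and 2-byte texels, a 1 MiB request would be reduced to 128 KiB, while a 4 KiB request passes through unchanged.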

---------

Co-authored-by: bunnei <bunneidev@gmail.com>
This commit is contained in:
GPUCode
2023-04-21 10:14:55 +03:00
committed by GitHub
parent 9414db4f65
commit 26d6f9d1c6
105 changed files with 4606 additions and 4984 deletions

@@ -1,480 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/microprofile.h"
#include "common/scope_exit.h"
#include "common/texture.h"
#include "core/core.h"
#include "video_core/rasterizer_cache/cached_surface.h"
#include "video_core/rasterizer_cache/morton_swizzle.h"
#include "video_core/rasterizer_cache/rasterizer_cache.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_downloader_es.h"
#include "video_core/renderer_opengl/texture_filters/texture_filterer.h"
namespace OpenGL {
static Aspect ToAspect(SurfaceType type) {
switch (type) {
case SurfaceType::Color:
case SurfaceType::Texture:
case SurfaceType::Fill:
return Aspect::Color;
case SurfaceType::Depth:
return Aspect::Depth;
case SurfaceType::DepthStencil:
return Aspect::DepthStencil;
default:
LOG_CRITICAL(Render_OpenGL, "Unknown SurfaceType {}", type);
UNREACHABLE();
}
return Aspect::Color;
}
CachedSurface::~CachedSurface() {
if (texture.handle) {
auto tag = is_custom ? HostTextureTag{GetFormatTuple(PixelFormat::RGBA8),
custom_tex_info.width, custom_tex_info.height}
: HostTextureTag{GetFormatTuple(pixel_format), GetScaledWidth(),
GetScaledHeight()};
owner.host_texture_recycler.emplace(tag, std::move(texture));
}
}
MICROPROFILE_DEFINE(RasterizerCache_SurfaceLoad, "RasterizerCache", "Surface Load",
MP_RGB(128, 192, 64));
void CachedSurface::LoadGLBuffer(PAddr load_start, PAddr load_end) {
ASSERT(type != SurfaceType::Fill);
const bool need_swap =
GLES && (pixel_format == PixelFormat::RGBA8 || pixel_format == PixelFormat::RGB8);
const u8* const texture_src_data = VideoCore::g_memory->GetPhysicalPointer(addr);
if (texture_src_data == nullptr)
return;
if (gl_buffer.empty()) {
gl_buffer.resize(width * height * GetBytesPerPixel(pixel_format));
}
// TODO: Should probably be done in ::Memory:: and check for other regions too
if (load_start < Memory::VRAM_VADDR_END && load_end > Memory::VRAM_VADDR_END)
load_end = Memory::VRAM_VADDR_END;
if (load_start < Memory::VRAM_VADDR && load_end > Memory::VRAM_VADDR)
load_start = Memory::VRAM_VADDR;
MICROPROFILE_SCOPE(RasterizerCache_SurfaceLoad);
ASSERT(load_start >= addr && load_end <= end);
const u32 start_offset = load_start - addr;
if (!is_tiled) {
ASSERT(type == SurfaceType::Color);
if (need_swap) {
// TODO(liushuyu): check if the byteswap here is 100% correct
// cannot fully test this
if (pixel_format == PixelFormat::RGBA8) {
for (std::size_t i = start_offset; i < load_end - addr; i += 4) {
gl_buffer[i] = texture_src_data[i + 3];
gl_buffer[i + 1] = texture_src_data[i + 2];
gl_buffer[i + 2] = texture_src_data[i + 1];
gl_buffer[i + 3] = texture_src_data[i];
}
} else if (pixel_format == PixelFormat::RGB8) {
for (std::size_t i = start_offset; i < load_end - addr; i += 3) {
gl_buffer[i] = texture_src_data[i + 2];
gl_buffer[i + 1] = texture_src_data[i + 1];
gl_buffer[i + 2] = texture_src_data[i];
}
}
} else {
std::memcpy(&gl_buffer[start_offset], texture_src_data + start_offset,
load_end - load_start);
}
} else {
if (type == SurfaceType::Texture) {
Pica::Texture::TextureInfo tex_info{};
tex_info.width = width;
tex_info.height = height;
tex_info.format = static_cast<Pica::TexturingRegs::TextureFormat>(pixel_format);
tex_info.SetDefaultStride();
tex_info.physical_address = addr;
const SurfaceInterval load_interval(load_start, load_end);
const auto rect = GetSubRect(FromInterval(load_interval));
ASSERT(FromInterval(load_interval).GetInterval() == load_interval);
for (unsigned y = rect.bottom; y < rect.top; ++y) {
for (unsigned x = rect.left; x < rect.right; ++x) {
auto vec4 =
Pica::Texture::LookupTexture(texture_src_data, x, height - 1 - y, tex_info);
const std::size_t offset = (x + (width * y)) * 4;
std::memcpy(&gl_buffer[offset], vec4.AsArray(), 4);
}
}
} else {
morton_to_gl_fns[static_cast<std::size_t>(pixel_format)](stride, height, &gl_buffer[0],
addr, load_start, load_end);
}
}
}
MICROPROFILE_DEFINE(RasterizerCache_SurfaceFlush, "RasterizerCache", "Surface Flush",
MP_RGB(128, 192, 64));
void CachedSurface::FlushGLBuffer(PAddr flush_start, PAddr flush_end) {
u8* const dst_buffer = VideoCore::g_memory->GetPhysicalPointer(addr);
if (dst_buffer == nullptr)
return;
ASSERT(gl_buffer.size() == width * height * GetBytesPerPixel(pixel_format));
// TODO: Should probably be done in ::Memory:: and check for other regions too
// same as loadglbuffer()
if (flush_start < Memory::VRAM_VADDR_END && flush_end > Memory::VRAM_VADDR_END)
flush_end = Memory::VRAM_VADDR_END;
if (flush_start < Memory::VRAM_VADDR && flush_end > Memory::VRAM_VADDR)
flush_start = Memory::VRAM_VADDR;
MICROPROFILE_SCOPE(RasterizerCache_SurfaceFlush);
ASSERT(flush_start >= addr && flush_end <= end);
const u32 start_offset = flush_start - addr;
const u32 end_offset = flush_end - addr;
if (type == SurfaceType::Fill) {
const u32 coarse_start_offset = start_offset - (start_offset % fill_size);
const u32 backup_bytes = start_offset % fill_size;
std::array<u8, 4> backup_data;
if (backup_bytes)
std::memcpy(&backup_data[0], &dst_buffer[coarse_start_offset], backup_bytes);
for (u32 offset = coarse_start_offset; offset < end_offset; offset += fill_size) {
std::memcpy(&dst_buffer[offset], &fill_data[0],
std::min(fill_size, end_offset - offset));
}
if (backup_bytes)
std::memcpy(&dst_buffer[coarse_start_offset], &backup_data[0], backup_bytes);
} else if (!is_tiled) {
ASSERT(type == SurfaceType::Color);
if (pixel_format == PixelFormat::RGBA8 && GLES) {
for (std::size_t i = start_offset; i < flush_end - addr; i += 4) {
dst_buffer[i] = gl_buffer[i + 3];
dst_buffer[i + 1] = gl_buffer[i + 2];
dst_buffer[i + 2] = gl_buffer[i + 1];
dst_buffer[i + 3] = gl_buffer[i];
}
} else if (pixel_format == PixelFormat::RGB8 && GLES) {
for (std::size_t i = start_offset; i < flush_end - addr; i += 3) {
dst_buffer[i] = gl_buffer[i + 2];
dst_buffer[i + 1] = gl_buffer[i + 1];
dst_buffer[i + 2] = gl_buffer[i];
}
} else {
std::memcpy(dst_buffer + start_offset, &gl_buffer[start_offset],
flush_end - flush_start);
}
} else {
gl_to_morton_fns[static_cast<std::size_t>(pixel_format)](stride, height, &gl_buffer[0],
addr, flush_start, flush_end);
}
}
bool CachedSurface::LoadCustomTexture(u64 tex_hash) {
auto& custom_tex_cache = Core::System::GetInstance().CustomTexCache();
const auto& image_interface = Core::System::GetInstance().GetImageInterface();
if (custom_tex_cache.IsTextureCached(tex_hash)) {
custom_tex_info = custom_tex_cache.LookupTexture(tex_hash);
return true;
}
if (!custom_tex_cache.CustomTextureExists(tex_hash)) {
return false;
}
const auto& path_info = custom_tex_cache.LookupTexturePathInfo(tex_hash);
if (!image_interface->DecodePNG(custom_tex_info.tex, custom_tex_info.width,
custom_tex_info.height, path_info.path)) {
LOG_ERROR(Render_OpenGL, "Failed to load custom texture {}", path_info.path);
return false;
}
const std::bitset<32> width_bits(custom_tex_info.width);
const std::bitset<32> height_bits(custom_tex_info.height);
if (width_bits.count() != 1 || height_bits.count() != 1) {
LOG_ERROR(Render_OpenGL, "Texture {} size is not a power of 2", path_info.path);
return false;
}
LOG_DEBUG(Render_OpenGL, "Loaded custom texture from {}", path_info.path);
Common::FlipRGBA8Texture(custom_tex_info.tex, custom_tex_info.width, custom_tex_info.height);
custom_tex_cache.CacheTexture(tex_hash, custom_tex_info.tex, custom_tex_info.width,
custom_tex_info.height);
return true;
}
void CachedSurface::DumpTexture(GLuint target_tex, u64 tex_hash) {
// Make sure the texture size is a power of 2
// If not, the surface is actually a framebuffer
std::bitset<32> width_bits(width);
std::bitset<32> height_bits(height);
if (width_bits.count() != 1 || height_bits.count() != 1) {
LOG_WARNING(Render_OpenGL, "Not dumping {:016X} because size isn't a power of 2 ({}x{})",
tex_hash, width, height);
return;
}
// Dump texture to RGBA8 and encode as PNG
const auto& image_interface = Core::System::GetInstance().GetImageInterface();
auto& custom_tex_cache = Core::System::GetInstance().CustomTexCache();
std::string dump_path =
fmt::format("{}textures/{:016X}/", FileUtil::GetUserPath(FileUtil::UserPath::DumpDir),
Core::System::GetInstance().Kernel().GetCurrentProcess()->codeset->program_id);
if (!FileUtil::CreateFullPath(dump_path)) {
LOG_ERROR(Render, "Unable to create {}", dump_path);
return;
}
dump_path += fmt::format("tex1_{}x{}_{:016X}_{}.png", width, height, tex_hash, pixel_format);
if (!custom_tex_cache.IsTextureDumped(tex_hash) && !FileUtil::Exists(dump_path)) {
custom_tex_cache.SetTextureDumped(tex_hash);
LOG_INFO(Render_OpenGL, "Dumping texture to {}", dump_path);
std::vector<u8> decoded_texture;
decoded_texture.resize(width * height * 4);
OpenGLState state = OpenGLState::GetCurState();
GLuint old_texture = state.texture_units[0].texture_2d;
state.Apply();
/*
GetTexImageOES is used even if not using OpenGL ES to work around a small issue that
happens if using custom textures with texture dumping at the same time.
Let's say there's 2 textures that are both 32x32 and one of them gets replaced with a
higher quality 256x256 texture. If the 256x256 texture is displayed first and the
32x32 texture gets uploaded to the same underlying OpenGL texture, the 32x32 texture
will appear in the corner of the 256x256 texture. If texture dumping is enabled and
the 32x32 is undumped, Citra will attempt to dump it. Since the underlying OpenGL
texture is still 256x256, Citra crashes because it thinks the texture is only 32x32.
GetTexImageOES conveniently only dumps the specified region, and works on both
desktop and ES.
*/
owner.texture_downloader_es->GetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE,
height, width, &decoded_texture[0]);
state.texture_units[0].texture_2d = old_texture;
state.Apply();
Common::FlipRGBA8Texture(decoded_texture, width, height);
if (!image_interface->EncodePNG(dump_path, decoded_texture, width, height))
LOG_ERROR(Render_OpenGL, "Failed to save decoded texture");
}
}
MICROPROFILE_DEFINE(RasterizerCache_TextureUL, "RasterizerCache", "Texture Upload",
MP_RGB(128, 192, 64));
void CachedSurface::UploadGLTexture(Common::Rectangle<u32> rect) {
if (type == SurfaceType::Fill) {
return;
}
MICROPROFILE_SCOPE(RasterizerCache_TextureUL);
ASSERT(gl_buffer.size() == width * height * GetBytesPerPixel(pixel_format));
u64 tex_hash = 0;
if (Settings::values.dump_textures || Settings::values.custom_textures) {
tex_hash = Common::ComputeHash64(gl_buffer.data(), gl_buffer.size());
}
if (Settings::values.custom_textures) {
is_custom = LoadCustomTexture(tex_hash);
}
// Load data from memory to the surface
GLint x0 = static_cast<GLint>(rect.left);
GLint y0 = static_cast<GLint>(rect.bottom);
std::size_t buffer_offset = (y0 * stride + x0) * GetBytesPerPixel(pixel_format);
const FormatTuple& tuple = GetFormatTuple(pixel_format);
GLuint target_tex = texture.handle;
// If not 1x scale, create 1x texture that we will blit from to replace texture subrect in
// surface
OGLTexture unscaled_tex;
if (res_scale != 1) {
x0 = 0;
y0 = 0;
if (is_custom) {
const auto& tuple = GetFormatTuple(PixelFormat::RGBA8);
unscaled_tex =
owner.AllocateSurfaceTexture(tuple, custom_tex_info.width, custom_tex_info.height);
} else {
unscaled_tex = owner.AllocateSurfaceTexture(tuple, rect.GetWidth(), rect.GetHeight());
}
target_tex = unscaled_tex.handle;
}
OpenGLState cur_state = OpenGLState::GetCurState();
GLuint old_tex = cur_state.texture_units[0].texture_2d;
cur_state.texture_units[0].texture_2d = target_tex;
cur_state.Apply();
// Ensure no bad interactions with GL_UNPACK_ALIGNMENT
ASSERT(stride * GetBytesPerPixel(pixel_format) % 4 == 0);
if (is_custom) {
if (res_scale == 1) {
texture = owner.AllocateSurfaceTexture(GetFormatTuple(PixelFormat::RGBA8),
custom_tex_info.width, custom_tex_info.height);
cur_state.texture_units[0].texture_2d = texture.handle;
cur_state.Apply();
}
// Always going to be using rgba8
glPixelStorei(GL_UNPACK_ROW_LENGTH, static_cast<GLint>(custom_tex_info.width));
glActiveTexture(GL_TEXTURE0);
glTexSubImage2D(GL_TEXTURE_2D, 0, x0, y0, custom_tex_info.width, custom_tex_info.height,
GL_RGBA, GL_UNSIGNED_BYTE, custom_tex_info.tex.data());
} else {
glPixelStorei(GL_UNPACK_ROW_LENGTH, static_cast<GLint>(stride));
glActiveTexture(GL_TEXTURE0);
glTexSubImage2D(GL_TEXTURE_2D, 0, x0, y0, static_cast<GLsizei>(rect.GetWidth()),
static_cast<GLsizei>(rect.GetHeight()), tuple.format, tuple.type,
&gl_buffer[buffer_offset]);
}
glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);
if (Settings::values.dump_textures && !is_custom) {
DumpTexture(target_tex, tex_hash);
}
cur_state.texture_units[0].texture_2d = old_tex;
cur_state.Apply();
if (res_scale != 1) {
auto scaled_rect = rect;
scaled_rect.left *= res_scale;
scaled_rect.top *= res_scale;
scaled_rect.right *= res_scale;
scaled_rect.bottom *= res_scale;
const u32 width = is_custom ? custom_tex_info.width : rect.GetWidth();
const u32 height = is_custom ? custom_tex_info.height : rect.GetHeight();
const Common::Rectangle<u32> from_rect{0, height, width, 0};
if (is_custom ||
!owner.texture_filterer->Filter(unscaled_tex, from_rect, texture, scaled_rect, type)) {
const Aspect aspect = ToAspect(type);
runtime.BlitTextures(unscaled_tex, {aspect, from_rect}, texture, {aspect, scaled_rect});
}
}
InvalidateAllWatcher();
}
MICROPROFILE_DEFINE(RasterizerCache_TextureDL, "RasterizerCache", "Texture Download",
MP_RGB(128, 192, 64));
void CachedSurface::DownloadGLTexture(const Common::Rectangle<u32>& rect) {
if (type == SurfaceType::Fill) {
return;
}
MICROPROFILE_SCOPE(RasterizerCache_TextureDL);
if (gl_buffer.empty()) {
gl_buffer.resize(width * height * GetBytesPerPixel(pixel_format));
}
OpenGLState state = OpenGLState::GetCurState();
OpenGLState prev_state = state;
SCOPE_EXIT({ prev_state.Apply(); });
const FormatTuple& tuple = GetFormatTuple(pixel_format);
// Ensure no bad interactions with GL_PACK_ALIGNMENT
ASSERT(stride * GetBytesPerPixel(pixel_format) % 4 == 0);
glPixelStorei(GL_PACK_ROW_LENGTH, static_cast<GLint>(stride));
const std::size_t buffer_offset =
(rect.bottom * stride + rect.left) * GetBytesPerPixel(pixel_format);
// If not 1x scale, blit scaled texture to a new 1x texture and use that to flush
const Aspect aspect = ToAspect(type);
if (res_scale != 1) {
auto scaled_rect = rect;
scaled_rect.left *= res_scale;
scaled_rect.top *= res_scale;
scaled_rect.right *= res_scale;
scaled_rect.bottom *= res_scale;
const Common::Rectangle<u32> unscaled_tex_rect{0, rect.GetHeight(), rect.GetWidth(), 0};
auto unscaled_tex = owner.AllocateSurfaceTexture(tuple, rect.GetWidth(), rect.GetHeight());
// Blit scaled texture to the unscaled one
runtime.BlitTextures(texture, {aspect, scaled_rect}, unscaled_tex,
{aspect, unscaled_tex_rect});
state.texture_units[0].texture_2d = unscaled_tex.handle;
state.Apply();
glActiveTexture(GL_TEXTURE0);
if (GLES) {
owner.texture_downloader_es->GetTexImage(GL_TEXTURE_2D, 0, tuple.format, tuple.type,
rect.GetHeight(), rect.GetWidth(),
&gl_buffer[buffer_offset]);
} else {
glGetTexImage(GL_TEXTURE_2D, 0, tuple.format, tuple.type, &gl_buffer[buffer_offset]);
}
} else {
runtime.ReadTexture(texture, {aspect, rect}, tuple, gl_buffer.data());
}
glPixelStorei(GL_PACK_ROW_LENGTH, 0);
}
bool CachedSurface::CanFill(const SurfaceParams& dest_surface,
SurfaceInterval fill_interval) const {
if (type == SurfaceType::Fill && IsRegionValid(fill_interval) &&
boost::icl::first(fill_interval) >= addr &&
boost::icl::last_next(fill_interval) <= end && // dest_surface is within our fill range
dest_surface.FromInterval(fill_interval).GetInterval() ==
fill_interval) { // make sure interval is a rectangle in dest surface
if (fill_size * 8 != dest_surface.GetFormatBpp()) {
// Check if bits repeat for our fill_size
const u32 dest_bytes_per_pixel = std::max(dest_surface.GetFormatBpp() / 8, 1u);
std::vector<u8> fill_test(fill_size * dest_bytes_per_pixel);
for (u32 i = 0; i < dest_bytes_per_pixel; ++i)
std::memcpy(&fill_test[i * fill_size], &fill_data[0], fill_size);
for (u32 i = 0; i < fill_size; ++i)
if (std::memcmp(&fill_test[dest_bytes_per_pixel * i], &fill_test[0],
dest_bytes_per_pixel) != 0)
return false;
if (dest_surface.GetFormatBpp() == 4 && (fill_test[0] & 0xF) != (fill_test[0] >> 4))
return false;
}
return true;
}
return false;
}
bool CachedSurface::CanCopy(const SurfaceParams& dest_surface,
SurfaceInterval copy_interval) const {
SurfaceParams subrect_params = dest_surface.FromInterval(copy_interval);
ASSERT(subrect_params.GetInterval() == copy_interval);
if (CanSubRect(subrect_params))
return true;
if (CanFill(dest_surface, copy_interval))
return true;
return false;
}
} // namespace OpenGL
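
The Fill branch of `FlushGLBuffer` above is easier to follow in isolation. Below is a minimal sketch of the same idea, assuming a plain byte vector in place of emulated memory: `FlushFillPattern` is a hypothetical name, and the logic mirrors the removed code (round the start down to a pattern boundary, save the bytes in front of the requested range, write the repeating pattern, then restore the saved bytes).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical standalone version of the Fill flush: writes a repeating
// pattern of fill_size bytes into dst[start_offset, end_offset) while keeping
// the pattern phase aligned to multiples of fill_size from offset 0.
void FlushFillPattern(std::vector<std::uint8_t>& dst, const std::array<std::uint8_t, 4>& fill_data,
                      std::uint32_t fill_size, std::uint32_t start_offset,
                      std::uint32_t end_offset) {
    assert(fill_size > 0 && fill_size <= fill_data.size());
    assert(start_offset <= end_offset && end_offset <= dst.size());

    // Bytes between the aligned start and the requested start must survive.
    const std::uint32_t backup_bytes = start_offset % fill_size;
    const std::uint32_t coarse_start = start_offset - backup_bytes;
    std::array<std::uint8_t, 4> backup{};
    if (backup_bytes != 0) {
        std::memcpy(backup.data(), dst.data() + coarse_start, backup_bytes);
    }

    // Write whole patterns, truncating the last one at end_offset.
    for (std::uint32_t offset = coarse_start; offset < end_offset; offset += fill_size) {
        std::memcpy(dst.data() + offset, fill_data.data(),
                    std::min(fill_size, end_offset - offset));
    }

    // Put back the bytes that were in front of the requested range.
    if (backup_bytes != 0) {
        std::memcpy(dst.data() + coarse_start, backup.data(), backup_bytes);
    }
}
```

Writing from the aligned start and restoring afterwards keeps the loop free of special cases for a partial first pattern.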

@@ -1,137 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <list>
#include "common/assert.h"
#include "core/custom_tex_cache.h"
#include "video_core/rasterizer_cache/surface_params.h"
#include "video_core/rasterizer_cache/texture_runtime.h"
namespace OpenGL {
/**
* A watcher that notifies whether a cached surface has been changed. This is useful for caching
* surface collection objects, including texture cube and mipmap.
*/
class SurfaceWatcher {
friend class CachedSurface;
public:
explicit SurfaceWatcher(std::weak_ptr<CachedSurface>&& surface) : surface(std::move(surface)) {}
/// Checks whether the surface has been changed.
bool IsValid() const {
return !surface.expired() && valid;
}
/// Marks that the content of the referencing surface has been updated to the watcher user.
void Validate() {
ASSERT(!surface.expired());
valid = true;
}
/// Gets the referencing surface. Returns null if the surface has been destroyed
Surface Get() const {
return surface.lock();
}
private:
std::weak_ptr<CachedSurface> surface;
bool valid = false;
};
class RasterizerCacheOpenGL;
class CachedSurface : public SurfaceParams, public std::enable_shared_from_this<CachedSurface> {
public:
CachedSurface(SurfaceParams params, RasterizerCacheOpenGL& owner, TextureRuntime& runtime)
: SurfaceParams(params), owner(owner), runtime(runtime) {}
~CachedSurface();
/// Read/Write data in 3DS memory to/from gl_buffer
void LoadGLBuffer(PAddr load_start, PAddr load_end);
void FlushGLBuffer(PAddr flush_start, PAddr flush_end);
/// Custom texture loading and dumping
bool LoadCustomTexture(u64 tex_hash);
void DumpTexture(GLuint target_tex, u64 tex_hash);
/// Upload/Download data in gl_buffer in/to this surface's texture
void UploadGLTexture(Common::Rectangle<u32> rect);
void DownloadGLTexture(const Common::Rectangle<u32>& rect);
bool CanFill(const SurfaceParams& dest_surface, SurfaceInterval fill_interval) const;
bool CanCopy(const SurfaceParams& dest_surface, SurfaceInterval copy_interval) const;
bool IsRegionValid(SurfaceInterval interval) const {
return (invalid_regions.find(interval) == invalid_regions.end());
}
bool IsSurfaceFullyInvalid() const {
auto interval = GetInterval();
return *invalid_regions.equal_range(interval).first == interval;
}
std::shared_ptr<SurfaceWatcher> CreateWatcher() {
auto watcher = std::make_shared<SurfaceWatcher>(weak_from_this());
watchers.push_front(watcher);
return watcher;
}
void InvalidateAllWatcher() {
for (const auto& watcher : watchers) {
if (auto locked = watcher.lock()) {
locked->valid = false;
}
}
}
void UnlinkAllWatcher() {
for (const auto& watcher : watchers) {
if (auto locked = watcher.lock()) {
locked->valid = false;
locked->surface.reset();
}
}
watchers.clear();
}
public:
bool registered = false;
SurfaceRegions invalid_regions;
std::vector<u8> gl_buffer;
// Number of bytes to read from fill_data
u32 fill_size = 0;
std::array<u8, 4> fill_data;
OGLTexture texture;
// level_watchers[i] watches the (i+1)-th level mipmap source surface
std::array<std::shared_ptr<SurfaceWatcher>, 7> level_watchers;
u32 max_level = 0;
// Information about custom textures
bool is_custom = false;
Core::CustomTexInfo custom_tex_info;
private:
RasterizerCacheOpenGL& owner;
TextureRuntime& runtime;
std::list<std::weak_ptr<SurfaceWatcher>> watchers;
};
struct CachedTextureCube {
OGLTexture texture;
u16 res_scale = 1;
std::shared_ptr<SurfaceWatcher> px;
std::shared_ptr<SurfaceWatcher> nx;
std::shared_ptr<SurfaceWatcher> py;
std::shared_ptr<SurfaceWatcher> ny;
std::shared_ptr<SurfaceWatcher> pz;
std::shared_ptr<SurfaceWatcher> nz;
};
} // namespace OpenGL
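
For readers unfamiliar with the watcher pattern that this commit starts phasing out, here is a stripped-down sketch of how it behaves. It is an illustration only: `WatchedSurface` and `Watcher` are hypothetical stand-ins for `CachedSurface` and `SurfaceWatcher`, with all texture state removed.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <utility>

struct WatchedSurface;

// Hypothetical stand-in for SurfaceWatcher: valid only while the surface is
// alive and has not been modified since the last Validate() call.
struct Watcher {
    explicit Watcher(std::weak_ptr<WatchedSurface> s) : surface(std::move(s)) {}

    bool IsValid() const {
        return !surface.expired() && valid;
    }

    void Validate() {
        assert(!surface.expired());
        valid = true;
    }

    std::weak_ptr<WatchedSurface> surface;
    bool valid = false;
};

// Hypothetical stand-in for CachedSurface, reduced to watcher bookkeeping.
struct WatchedSurface : std::enable_shared_from_this<WatchedSurface> {
    std::shared_ptr<Watcher> CreateWatcher() {
        auto watcher = std::make_shared<Watcher>(weak_from_this());
        watchers.push_front(watcher);
        return watcher;
    }

    // Called whenever the surface contents change (e.g. after an upload).
    void InvalidateAllWatchers() {
        for (const auto& weak : watchers) {
            if (auto locked = weak.lock()) {
                locked->valid = false;
            }
        }
    }

    std::list<std::weak_ptr<Watcher>> watchers;
};
```

A consumer such as a texture cube would hold one watcher per face, re-copy the face whenever `IsValid()` returns false, and then call `Validate()`. Destroying the surface also makes every watcher report invalid, since the weak pointer expires.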

@@ -0,0 +1,73 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.

#include "video_core/rasterizer_cache/framebuffer_base.h"
#include "video_core/rasterizer_cache/surface_base.h"
#include "video_core/regs.h"

namespace VideoCore {

FramebufferBase::FramebufferBase() = default;

FramebufferBase::FramebufferBase(const Pica::Regs& regs, const SurfaceBase* const color,
                                 u32 color_level, const SurfaceBase* const depth_stencil,
                                 u32 depth_level, Common::Rectangle<u32> surfaces_rect) {
    res_scale = color ? color->res_scale : (depth_stencil ? depth_stencil->res_scale : 1u);

    // Determine the draw rectangle (render area + scissor)
    const Common::Rectangle viewport_rect = regs.rasterizer.GetViewportRect();
    draw_rect.left =
        std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.left * res_scale,
                        surfaces_rect.left, surfaces_rect.right);
    draw_rect.top =
        std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) + viewport_rect.top * res_scale,
                        surfaces_rect.bottom, surfaces_rect.top);
    draw_rect.right =
        std::clamp<s32>(static_cast<s32>(surfaces_rect.left) + viewport_rect.right * res_scale,
                        surfaces_rect.left, surfaces_rect.right);
    draw_rect.bottom =
        std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) + viewport_rect.bottom * res_scale,
                        surfaces_rect.bottom, surfaces_rect.top);

    // Update viewport
    viewport.x = static_cast<f32>(surfaces_rect.left + viewport_rect.left * res_scale);
    viewport.y = static_cast<f32>(surfaces_rect.bottom + viewport_rect.bottom * res_scale);
    viewport.width = static_cast<f32>(viewport_rect.GetWidth() * res_scale);
    viewport.height = static_cast<f32>(viewport_rect.GetHeight() * res_scale);

    // Scissor checks are window-, not viewport-relative, which means that if the cached texture
    // sub-rect changes, the scissor bounds also need to be updated.
    scissor_rect.left =
        static_cast<s32>(surfaces_rect.left + regs.rasterizer.scissor_test.x1 * res_scale);
    scissor_rect.bottom =
        static_cast<s32>(surfaces_rect.bottom + regs.rasterizer.scissor_test.y1 * res_scale);

    // x2, y2 have +1 added to cover the entire pixel area, otherwise you might get cracks when
    // scaling or doing multisampling.
    scissor_rect.right =
        static_cast<s32>(surfaces_rect.left + (regs.rasterizer.scissor_test.x2 + 1) * res_scale);
    scissor_rect.top =
        static_cast<s32>(surfaces_rect.bottom + (regs.rasterizer.scissor_test.y2 + 1) * res_scale);

    // Rendering to mipmaps is something quite rare so log it when it occurs.
    if (color_level != 0) {
        LOG_WARNING(HW_GPU, "Game is rendering to color mipmap {}", color_level);
    }
    if (depth_level != 0) {
        LOG_WARNING(HW_GPU, "Game is rendering to depth mipmap {}", depth_level);
    }

    // Query surface invalidation intervals
    const Common::Rectangle draw_rect_unscaled{draw_rect / res_scale};
    if (color) {
        color_params = *color;
        intervals[0] = color->GetSubRectInterval(draw_rect_unscaled, color_level);
    }
    if (depth_stencil) {
        depth_params = *depth_stencil;
        intervals[1] = depth_stencil->GetSubRectInterval(draw_rect_unscaled, depth_level);
    }
}

} // namespace VideoCore
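
The scissor computation in the constructor above can be restated as a small standalone function. This is a minimal sketch rather than the commit's code: `Rect`, `ScissorRegs`, and `ScaleScissor` are hypothetical stand-ins for `Common::Rectangle<s32>`, `regs.rasterizer.scissor_test`, and the constructor logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Common::Rectangle<s32>.
struct Rect {
    std::int32_t left;
    std::int32_t top;
    std::int32_t right;
    std::int32_t bottom;
};

// Hypothetical stand-in for the scissor registers. x2/y2 are treated as
// inclusive bounds, matching the +1 applied in the source.
struct ScissorRegs {
    std::uint32_t x1;
    std::uint32_t y1;
    std::uint32_t x2;
    std::uint32_t y2;
};

// Converts guest scissor bounds into host coordinates: scale by the
// resolution factor, then offset by the surface origin inside the texture.
Rect ScaleScissor(std::uint32_t surface_left, std::uint32_t surface_bottom,
                  const ScissorRegs& regs, std::uint32_t res_scale) {
    assert(res_scale >= 1);
    Rect out{};
    out.left = static_cast<std::int32_t>(surface_left + regs.x1 * res_scale);
    out.bottom = static_cast<std::int32_t>(surface_bottom + regs.y1 * res_scale);
    // Adding 1 before scaling covers the whole last pixel at any scale.
    out.right = static_cast<std::int32_t>(surface_left + (regs.x2 + 1) * res_scale);
    out.top = static_cast<std::int32_t>(surface_bottom + (regs.y2 + 1) * res_scale);
    return out;
}
```

Adding 1 before multiplying matters: scaling an inclusive bound directly would leave the last `res_scale - 1` host pixels outside the scissor.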

@@ -0,0 +1,87 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.

#pragma once

#include "common/math_util.h"
#include "video_core/rasterizer_cache/surface_params.h"

namespace Pica {
struct Regs;
}

namespace VideoCore {

class SurfaceBase;

struct ViewportInfo {
    f32 x;
    f32 y;
    f32 width;
    f32 height;
};

/**
 * A framebuffer is a lightweight abstraction over a pair of surfaces and provides
 * metadata about them.
 */
class FramebufferBase {
public:
    FramebufferBase();
    FramebufferBase(const Pica::Regs& regs, const SurfaceBase* const color, u32 color_level,
                    const SurfaceBase* const depth_stencil, u32 depth_level,
                    Common::Rectangle<u32> surfaces_rect);

    SurfaceParams ColorParams() const noexcept {
        return color_params;
    }

    SurfaceParams DepthParams() const noexcept {
        return depth_params;
    }

    SurfaceInterval Interval(SurfaceType type) const noexcept {
        return intervals[Index(type)];
    }

    u32 ResolutionScale() const noexcept {
        return res_scale;
    }

    Common::Rectangle<u32> DrawRect() const noexcept {
        return draw_rect;
    }

    Common::Rectangle<s32> Scissor() const noexcept {
        return scissor_rect;
    }

    ViewportInfo Viewport() const noexcept {
        return viewport;
    }

protected:
    u32 Index(VideoCore::SurfaceType type) const noexcept {
        switch (type) {
        case VideoCore::SurfaceType::Color:
            return 0;
        case VideoCore::SurfaceType::DepthStencil:
            return 1;
        default:
            LOG_CRITICAL(HW_GPU, "Unknown surface type in framebuffer");
            return 0;
        }
    }

protected:
    SurfaceParams color_params{};
    SurfaceParams depth_params{};
    std::array<SurfaceInterval, 2> intervals{};
    Common::Rectangle<s32> scissor_rect{};
    Common::Rectangle<u32> draw_rect{};
    ViewportInfo viewport;
    u32 res_scale{1};
};

} // namespace VideoCore

@@ -1,171 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/alignment.h"
#include "core/memory.h"
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/renderer_opengl/gl_vars.h"
#include "video_core/utils.h"
#include "video_core/video_core.h"
namespace OpenGL {
template <bool morton_to_gl, PixelFormat format>
static void MortonCopyTile(u32 stride, u8* tile_buffer, u8* gl_buffer) {
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
constexpr u32 aligned_bytes_per_pixel = GetBytesPerPixel(format);
for (u32 y = 0; y < 8; ++y) {
for (u32 x = 0; x < 8; ++x) {
u8* tile_ptr = tile_buffer + VideoCore::MortonInterleave(x, y) * bytes_per_pixel;
u8* gl_ptr = gl_buffer + ((7 - y) * stride + x) * aligned_bytes_per_pixel;
if constexpr (morton_to_gl) {
if constexpr (format == PixelFormat::D24S8) {
gl_ptr[0] = tile_ptr[3];
std::memcpy(gl_ptr + 1, tile_ptr, 3);
} else if (format == PixelFormat::RGBA8 && GLES) {
// because GLES does not have ABGR format
// so we will do byteswapping here
gl_ptr[0] = tile_ptr[3];
gl_ptr[1] = tile_ptr[2];
gl_ptr[2] = tile_ptr[1];
gl_ptr[3] = tile_ptr[0];
} else if (format == PixelFormat::RGB8 && GLES) {
gl_ptr[0] = tile_ptr[2];
gl_ptr[1] = tile_ptr[1];
gl_ptr[2] = tile_ptr[0];
} else {
std::memcpy(gl_ptr, tile_ptr, bytes_per_pixel);
}
} else {
if constexpr (format == PixelFormat::D24S8) {
std::memcpy(tile_ptr, gl_ptr + 1, 3);
tile_ptr[3] = gl_ptr[0];
} else if (format == PixelFormat::RGBA8 && GLES) {
// because GLES does not have ABGR format
// so we will do byteswapping here
tile_ptr[0] = gl_ptr[3];
tile_ptr[1] = gl_ptr[2];
tile_ptr[2] = gl_ptr[1];
tile_ptr[3] = gl_ptr[0];
} else if (format == PixelFormat::RGB8 && GLES) {
tile_ptr[0] = gl_ptr[2];
tile_ptr[1] = gl_ptr[1];
tile_ptr[2] = gl_ptr[0];
} else {
std::memcpy(tile_ptr, gl_ptr, bytes_per_pixel);
}
}
}
}
}
template <bool morton_to_gl, PixelFormat format>
static void MortonCopy(u32 stride, u32 height, u8* gl_buffer, PAddr base, PAddr start, PAddr end) {
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
constexpr u32 tile_size = bytes_per_pixel * 64;
constexpr u32 aligned_bytes_per_pixel = GetBytesPerPixel(format);
static_assert(aligned_bytes_per_pixel >= bytes_per_pixel, "");
gl_buffer += aligned_bytes_per_pixel - bytes_per_pixel;
const PAddr aligned_down_start = base + Common::AlignDown(start - base, tile_size);
const PAddr aligned_start = base + Common::AlignUp(start - base, tile_size);
const PAddr aligned_end = base + Common::AlignDown(end - base, tile_size);
ASSERT(!morton_to_gl || (aligned_start == start && aligned_end == end));
const u32 begin_pixel_index = (aligned_down_start - base) / bytes_per_pixel;
u32 x = (begin_pixel_index % (stride * 8)) / 8;
u32 y = (begin_pixel_index / (stride * 8)) * 8;
gl_buffer += ((height - 8 - y) * stride + x) * aligned_bytes_per_pixel;
auto glbuf_next_tile = [&] {
x = (x + 8) % stride;
gl_buffer += 8 * aligned_bytes_per_pixel;
if (!x) {
y += 8;
gl_buffer -= stride * 9 * aligned_bytes_per_pixel;
}
};
u8* tile_buffer = VideoCore::g_memory->GetPhysicalPointer(start);
if (start < aligned_start && !morton_to_gl) {
std::array<u8, tile_size> tmp_buf;
MortonCopyTile<morton_to_gl, format>(stride, &tmp_buf[0], gl_buffer);
std::memcpy(tile_buffer, &tmp_buf[start - aligned_down_start],
std::min(aligned_start, end) - start);
tile_buffer += aligned_start - start;
glbuf_next_tile();
}
const u8* const buffer_end = tile_buffer + aligned_end - aligned_start;
PAddr current_paddr = aligned_start;
while (tile_buffer < buffer_end) {
// Pokemon Super Mystery Dungeon will try to use textures that go beyond
// the end address of VRAM. Stop reading if it reaches an invalid address
if (!VideoCore::g_memory->IsValidPhysicalAddress(current_paddr) ||
!VideoCore::g_memory->IsValidPhysicalAddress(current_paddr + tile_size)) {
LOG_ERROR(Render_OpenGL, "Out of bound texture");
break;
}
MortonCopyTile<morton_to_gl, format>(stride, tile_buffer, gl_buffer);
tile_buffer += tile_size;
current_paddr += tile_size;
glbuf_next_tile();
}
if (end > std::max(aligned_start, aligned_end) && !morton_to_gl) {
std::array<u8, tile_size> tmp_buf;
MortonCopyTile<morton_to_gl, format>(stride, &tmp_buf[0], gl_buffer);
std::memcpy(tile_buffer, &tmp_buf[0], end - aligned_end);
}
}
static constexpr std::array<void (*)(u32, u32, u8*, PAddr, PAddr, PAddr), 18> morton_to_gl_fns = {
MortonCopy<true, PixelFormat::RGBA8>, // 0
MortonCopy<true, PixelFormat::RGB8>, // 1
MortonCopy<true, PixelFormat::RGB5A1>, // 2
MortonCopy<true, PixelFormat::RGB565>, // 3
MortonCopy<true, PixelFormat::RGBA4>, // 4
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr, // 5 - 13
MortonCopy<true, PixelFormat::D16>, // 14
nullptr, // 15
MortonCopy<true, PixelFormat::D24>, // 16
MortonCopy<true, PixelFormat::D24S8> // 17
};
static constexpr std::array<void (*)(u32, u32, u8*, PAddr, PAddr, PAddr), 18> gl_to_morton_fns = {
MortonCopy<false, PixelFormat::RGBA8>, // 0
MortonCopy<false, PixelFormat::RGB8>, // 1
MortonCopy<false, PixelFormat::RGB5A1>, // 2
MortonCopy<false, PixelFormat::RGB565>, // 3
MortonCopy<false, PixelFormat::RGBA4>, // 4
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
nullptr, // 5 - 13
MortonCopy<false, PixelFormat::D16>, // 14
nullptr, // 15
MortonCopy<false, PixelFormat::D24>, // 16
MortonCopy<false, PixelFormat::D24S8> // 17
};
} // namespace OpenGL
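
The removed `MortonCopyTile` relies on two index computations: the position of a pixel inside an 8x8 tile in guest memory, and the position of the same pixel in the linear host buffer. The sketch below illustrates both. It assumes the conventional Z-order layout with `x` in the even bits and `y` in the odd bits; the real `VideoCore::MortonInterleave` is not shown in this diff, so treat `MortonInterleaveSketch` and `LinearOffsetSketch` as hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Interleaves the low three bits of x and y into a 6-bit index inside an 8x8 tile:
// x occupies the even bit positions, y the odd ones (assumed layout).
constexpr std::uint32_t MortonInterleaveSketch(std::uint32_t x, std::uint32_t y) {
    std::uint32_t index = 0;
    for (std::uint32_t bit = 0; bit < 3; ++bit) {
        index |= ((x >> bit) & 1u) << (2 * bit);
        index |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return index;
}

// Byte offset of tile pixel (x, y) in the linear buffer. Mirrors the
// `((7 - y) * stride + x) * aligned_bytes_per_pixel` expression in the removed
// code: rows are flipped vertically within the tile.
constexpr std::uint32_t LinearOffsetSketch(std::uint32_t x, std::uint32_t y,
                                           std::uint32_t stride,
                                           std::uint32_t bytes_per_pixel) {
    return ((7 - y) * stride + x) * bytes_per_pixel;
}
```

With this layout, stepping `x` from 0 to 1 moves one pixel forward in the tile, while stepping `y` from 0 to 1 moves two pixels forward.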

@@ -0,0 +1,154 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "video_core/rasterizer_cache/pixel_format.h"
namespace VideoCore {
std::string_view PixelFormatAsString(PixelFormat format) {
switch (format) {
case PixelFormat::RGBA8:
return "RGBA8";
case PixelFormat::RGB8:
return "RGB8";
case PixelFormat::RGB5A1:
return "RGB5A1";
case PixelFormat::RGB565:
return "RGB565";
case PixelFormat::RGBA4:
return "RGBA4";
case PixelFormat::IA8:
return "IA8";
case PixelFormat::RG8:
return "RG8";
case PixelFormat::I8:
return "I8";
case PixelFormat::A8:
return "A8";
case PixelFormat::IA4:
return "IA4";
case PixelFormat::I4:
return "I4";
case PixelFormat::A4:
return "A4";
case PixelFormat::ETC1:
return "ETC1";
case PixelFormat::ETC1A4:
return "ETC1A4";
case PixelFormat::D16:
return "D16";
case PixelFormat::D24:
return "D24";
case PixelFormat::D24S8:
return "D24S8";
default:
return "NotReal";
}
}
bool CheckFormatsBlittable(PixelFormat source_format, PixelFormat dest_format) {
SurfaceType source_type = GetFormatType(source_format);
SurfaceType dest_type = GetFormatType(dest_format);
if ((source_type == SurfaceType::Color || source_type == SurfaceType::Texture) &&
(dest_type == SurfaceType::Color || dest_type == SurfaceType::Texture)) {
return true;
}
if (source_type == SurfaceType::Depth && dest_type == SurfaceType::Depth) {
return true;
}
if (source_type == SurfaceType::DepthStencil && dest_type == SurfaceType::DepthStencil) {
return true;
}
LOG_WARNING(HW_GPU, "Unblittable format pair detected {} and {}",
PixelFormatAsString(source_format), PixelFormatAsString(dest_format));
return false;
}
PixelFormat PixelFormatFromTextureFormat(Pica::TexturingRegs::TextureFormat format) {
switch (format) {
case Pica::TexturingRegs::TextureFormat::RGBA8:
return PixelFormat::RGBA8;
case Pica::TexturingRegs::TextureFormat::RGB8:
return PixelFormat::RGB8;
case Pica::TexturingRegs::TextureFormat::RGB5A1:
return PixelFormat::RGB5A1;
case Pica::TexturingRegs::TextureFormat::RGB565:
return PixelFormat::RGB565;
case Pica::TexturingRegs::TextureFormat::RGBA4:
return PixelFormat::RGBA4;
case Pica::TexturingRegs::TextureFormat::IA8:
return PixelFormat::IA8;
case Pica::TexturingRegs::TextureFormat::RG8:
return PixelFormat::RG8;
case Pica::TexturingRegs::TextureFormat::I8:
return PixelFormat::I8;
case Pica::TexturingRegs::TextureFormat::A8:
return PixelFormat::A8;
case Pica::TexturingRegs::TextureFormat::IA4:
return PixelFormat::IA4;
case Pica::TexturingRegs::TextureFormat::I4:
return PixelFormat::I4;
case Pica::TexturingRegs::TextureFormat::A4:
return PixelFormat::A4;
case Pica::TexturingRegs::TextureFormat::ETC1:
return PixelFormat::ETC1;
case Pica::TexturingRegs::TextureFormat::ETC1A4:
return PixelFormat::ETC1A4;
default:
return PixelFormat::Invalid;
}
}
PixelFormat PixelFormatFromColorFormat(Pica::FramebufferRegs::ColorFormat format) {
switch (format) {
case Pica::FramebufferRegs::ColorFormat::RGBA8:
return PixelFormat::RGBA8;
case Pica::FramebufferRegs::ColorFormat::RGB8:
return PixelFormat::RGB8;
case Pica::FramebufferRegs::ColorFormat::RGB5A1:
return PixelFormat::RGB5A1;
case Pica::FramebufferRegs::ColorFormat::RGB565:
return PixelFormat::RGB565;
case Pica::FramebufferRegs::ColorFormat::RGBA4:
return PixelFormat::RGBA4;
default:
return PixelFormat::Invalid;
}
}
PixelFormat PixelFormatFromDepthFormat(Pica::FramebufferRegs::DepthFormat format) {
switch (format) {
case Pica::FramebufferRegs::DepthFormat::D16:
return PixelFormat::D16;
case Pica::FramebufferRegs::DepthFormat::D24:
return PixelFormat::D24;
case Pica::FramebufferRegs::DepthFormat::D24S8:
return PixelFormat::D24S8;
default:
return PixelFormat::Invalid;
}
}
PixelFormat PixelFormatFromGPUPixelFormat(GPU::Regs::PixelFormat format) {
switch (format) {
case GPU::Regs::PixelFormat::RGBA8:
return PixelFormat::RGBA8;
case GPU::Regs::PixelFormat::RGB8:
return PixelFormat::RGB8;
case GPU::Regs::PixelFormat::RGB565:
return PixelFormat::RGB565;
case GPU::Regs::PixelFormat::RGB5A1:
return PixelFormat::RGB5A1;
case GPU::Regs::PixelFormat::RGBA4:
return PixelFormat::RGBA4;
default:
return PixelFormat::Invalid;
}
}
} // namespace VideoCore
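
The switch-based conversions above replace the index arithmetic from the old header (for depth formats, `format_index + 14`). The sketch below shows why an explicit switch is a little safer. It assumes the register encoding `D16 = 0`, `D24 = 2`, `D24S8 = 3`, which is inferred from the empty second slot in `depth_format_tuples` and is not stated in this diff; the `*Sketch` enums are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Assumed register encoding of Pica::FramebufferRegs::DepthFormat.
enum class DepthFormatSketch : std::uint32_t { D16 = 0, D24 = 2, D24S8 = 3 };

// Subset of VideoCore::PixelFormat; numeric values copied from the enum in this PR.
enum class PixelFormatSketch : std::uint32_t {
    D16 = 14,
    D24 = 16,
    D24S8 = 17,
    Invalid = 0xFFFFFFFFu,
};

// Explicit mapping: any register value without a named case yields Invalid.
constexpr PixelFormatSketch FromDepthFormatSketch(DepthFormatSketch format) {
    switch (format) {
    case DepthFormatSketch::D16:
        return PixelFormatSketch::D16;
    case DepthFormatSketch::D24:
        return PixelFormatSketch::D24;
    case DepthFormatSketch::D24S8:
        return PixelFormatSketch::D24S8;
    default:
        return PixelFormatSketch::Invalid;
    }
}
```

Under the old arithmetic, a register value of 1 would have produced pixel format 15, which has no name in the enum; here it becomes `Invalid`.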

@@ -1,25 +1,23 @@
// Copyright 2022 Citra Emulator Project
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <limits>
#include <string_view>
#include "core/hw/gpu.h"
#include "video_core/regs_framebuffer.h"
#include "video_core/regs_texturing.h"
namespace OpenGL {
namespace VideoCore {
constexpr u32 PIXEL_FORMAT_COUNT = 18;
enum class PixelFormat : u8 {
// First 5 formats are shared between textures and color buffers
enum class PixelFormat : u32 {
RGBA8 = 0,
RGB8 = 1,
RGB5A1 = 2,
RGB565 = 3,
RGBA4 = 4,
// Texture-only formats
IA8 = 5,
RG8 = 6,
I8 = 7,
@@ -29,168 +27,88 @@ enum class PixelFormat : u8 {
A4 = 11,
ETC1 = 12,
ETC1A4 = 13,
// Depth buffer-only formats
D16 = 14,
D24 = 16,
D24S8 = 17,
Invalid = 255,
MaxPixelFormat = 18,
Invalid = std::numeric_limits<u32>::max(),
};
constexpr std::size_t PIXEL_FORMAT_COUNT = static_cast<std::size_t>(PixelFormat::MaxPixelFormat);
enum class SurfaceType {
enum class SurfaceType : u32 {
Color = 0,
Texture = 1,
Depth = 2,
DepthStencil = 3,
Fill = 4,
Invalid = 5
Invalid = 5,
};
constexpr std::string_view PixelFormatAsString(PixelFormat format) {
switch (format) {
case PixelFormat::RGBA8:
return "RGBA8";
case PixelFormat::RGB8:
return "RGB8";
case PixelFormat::RGB5A1:
return "RGB5A1";
case PixelFormat::RGB565:
return "RGB565";
case PixelFormat::RGBA4:
return "RGBA4";
case PixelFormat::IA8:
return "IA8";
case PixelFormat::RG8:
return "RG8";
case PixelFormat::I8:
return "I8";
case PixelFormat::A8:
return "A8";
case PixelFormat::IA4:
return "IA4";
case PixelFormat::I4:
return "I4";
case PixelFormat::A4:
return "A4";
case PixelFormat::ETC1:
return "ETC1";
case PixelFormat::ETC1A4:
return "ETC1A4";
case PixelFormat::D16:
return "D16";
case PixelFormat::D24:
return "D24";
case PixelFormat::D24S8:
return "D24S8";
default:
return "NotReal";
}
}
enum class TextureType : u32 {
Texture2D = 0,
CubeMap = 1,
};
constexpr PixelFormat PixelFormatFromTextureFormat(Pica::TexturingRegs::TextureFormat format) {
const u32 format_index = static_cast<u32>(format);
return (format_index < 14) ? static_cast<PixelFormat>(format) : PixelFormat::Invalid;
}
struct PixelFormatInfo {
SurfaceType type;
u32 bits_per_block;
u32 bytes_per_pixel;
};
constexpr PixelFormat PixelFormatFromColorFormat(Pica::FramebufferRegs::ColorFormat format) {
const u32 format_index = static_cast<u32>(format);
return (format_index < 5) ? static_cast<PixelFormat>(format) : PixelFormat::Invalid;
}
constexpr PixelFormat PixelFormatFromDepthFormat(Pica::FramebufferRegs::DepthFormat format) {
const u32 format_index = static_cast<u32>(format);
return (format_index < 4) ? static_cast<PixelFormat>(format_index + 14) : PixelFormat::Invalid;
}
constexpr PixelFormat PixelFormatFromGPUPixelFormat(GPU::Regs::PixelFormat format) {
const u32 format_index = static_cast<u32>(format);
switch (format) {
// RGB565 and RGB5A1 are switched in PixelFormat compared to ColorFormat
case GPU::Regs::PixelFormat::RGB565:
return PixelFormat::RGB565;
case GPU::Regs::PixelFormat::RGB5A1:
return PixelFormat::RGB5A1;
default:
return (format_index < 5) ? static_cast<PixelFormat>(format) : PixelFormat::Invalid;
}
}
constexpr SurfaceType GetFormatType(PixelFormat pixel_format) {
const u32 format_index = static_cast<u32>(pixel_format);
if (format_index < 5) {
return SurfaceType::Color;
}
if (format_index < 14) {
return SurfaceType::Texture;
}
if (pixel_format == PixelFormat::D16 || pixel_format == PixelFormat::D24) {
return SurfaceType::Depth;
}
if (pixel_format == PixelFormat::D24S8) {
return SurfaceType::DepthStencil;
}
return SurfaceType::Invalid;
}
constexpr bool CheckFormatsBlittable(PixelFormat source_format, PixelFormat dest_format) {
SurfaceType source_type = GetFormatType(source_format);
SurfaceType dest_type = GetFormatType(dest_format);
if ((source_type == SurfaceType::Color || source_type == SurfaceType::Texture) &&
(dest_type == SurfaceType::Color || dest_type == SurfaceType::Texture)) {
return true;
}
if (source_type == SurfaceType::Depth && dest_type == SurfaceType::Depth) {
return true;
}
if (source_type == SurfaceType::DepthStencil && dest_type == SurfaceType::DepthStencil) {
return true;
}
return false;
}
/**
* Lookup table for querying pixel format properties (type, name, etc)
* @note Modern GPUs require 4 byte alignment for D24
* @note Texture formats are automatically converted to RGBA8
**/
constexpr std::array<PixelFormatInfo, PIXEL_FORMAT_COUNT> FORMAT_MAP = {{
{SurfaceType::Color, 32, 4},
{SurfaceType::Color, 24, 3},
{SurfaceType::Color, 16, 2},
{SurfaceType::Color, 16, 2},
{SurfaceType::Color, 16, 2},
{SurfaceType::Texture, 16, 4},
{SurfaceType::Texture, 16, 4},
{SurfaceType::Texture, 8, 4},
{SurfaceType::Texture, 8, 4},
{SurfaceType::Texture, 8, 4},
{SurfaceType::Texture, 4, 4},
{SurfaceType::Texture, 4, 4},
{SurfaceType::Texture, 4, 4},
{SurfaceType::Texture, 8, 4},
{SurfaceType::Depth, 16, 2},
{SurfaceType::Invalid, 0, 0},
{SurfaceType::Depth, 24, 4},
{SurfaceType::DepthStencil, 32, 4},
}};
constexpr u32 GetFormatBpp(PixelFormat format) {
switch (format) {
case PixelFormat::RGBA8:
case PixelFormat::D24S8:
return 32;
case PixelFormat::RGB8:
case PixelFormat::D24:
return 24;
case PixelFormat::RGB5A1:
case PixelFormat::RGB565:
case PixelFormat::RGBA4:
case PixelFormat::IA8:
case PixelFormat::RG8:
case PixelFormat::D16:
return 16;
case PixelFormat::I8:
case PixelFormat::A8:
case PixelFormat::IA4:
case PixelFormat::ETC1A4:
return 8;
case PixelFormat::I4:
case PixelFormat::A4:
case PixelFormat::ETC1:
return 4;
default:
return 0;
}
const std::size_t index = static_cast<std::size_t>(format);
ASSERT(index < FORMAT_MAP.size());
return FORMAT_MAP[index].bits_per_block;
}
constexpr u32 GetBytesPerPixel(PixelFormat format) {
// OpenGL needs 4 bpp alignment for D24 since using GL_UNSIGNED_INT as type
if (format == PixelFormat::D24 || GetFormatType(format) == SurfaceType::Texture) {
return 4;
}
return GetFormatBpp(format) / 8;
constexpr u32 GetFormatBytesPerPixel(PixelFormat format) {
const std::size_t index = static_cast<std::size_t>(format);
ASSERT(index < FORMAT_MAP.size());
return FORMAT_MAP[index].bytes_per_pixel;
}
} // namespace OpenGL
constexpr SurfaceType GetFormatType(PixelFormat format) {
const std::size_t index = static_cast<std::size_t>(format);
ASSERT(index < FORMAT_MAP.size());
return FORMAT_MAP[index].type;
}
std::string_view PixelFormatAsString(PixelFormat format);
bool CheckFormatsBlittable(PixelFormat source_format, PixelFormat dest_format);
PixelFormat PixelFormatFromTextureFormat(Pica::TexturingRegs::TextureFormat format);
PixelFormat PixelFormatFromColorFormat(Pica::FramebufferRegs::ColorFormat format);
PixelFormat PixelFormatFromDepthFormat(Pica::FramebufferRegs::DepthFormat format);
PixelFormat PixelFormatFromGPUPixelFormat(GPU::Regs::PixelFormat format);
} // namespace VideoCore
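
The new header replaces the per-format `switch` statements with a single lookup table indexed by the numeric pixel format. A standalone sketch of that pattern follows, with the rows copied from `FORMAT_MAP` above. The names `kFormatMapSketch`, `BitsPerBlockSketch`, `BytesPerPixelSketch` and `StorageBytesSketch` are hypothetical, and `StorageBytesSketch` treats `bits_per_block` as bits per pixel, which is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class SurfaceTypeSketch : std::uint32_t { Color, Texture, Depth, DepthStencil, Fill, Invalid };

struct FormatInfoSketch {
    SurfaceTypeSketch type;
    std::uint32_t bits_per_block;  // size of the format in guest memory
    std::uint32_t bytes_per_pixel; // size after conversion for the host
};

// Rows copied from FORMAT_MAP in this PR; index 15 is the unused slot.
constexpr std::array<FormatInfoSketch, 18> kFormatMapSketch = {{
    {SurfaceTypeSketch::Color, 32, 4},   {SurfaceTypeSketch::Color, 24, 3},
    {SurfaceTypeSketch::Color, 16, 2},   {SurfaceTypeSketch::Color, 16, 2},
    {SurfaceTypeSketch::Color, 16, 2},   {SurfaceTypeSketch::Texture, 16, 4},
    {SurfaceTypeSketch::Texture, 16, 4}, {SurfaceTypeSketch::Texture, 8, 4},
    {SurfaceTypeSketch::Texture, 8, 4},  {SurfaceTypeSketch::Texture, 8, 4},
    {SurfaceTypeSketch::Texture, 4, 4},  {SurfaceTypeSketch::Texture, 4, 4},
    {SurfaceTypeSketch::Texture, 4, 4},  {SurfaceTypeSketch::Texture, 8, 4},
    {SurfaceTypeSketch::Depth, 16, 2},   {SurfaceTypeSketch::Invalid, 0, 0},
    {SurfaceTypeSketch::Depth, 24, 4},   {SurfaceTypeSketch::DepthStencil, 32, 4},
}};

inline std::uint32_t BitsPerBlockSketch(std::size_t format_index) {
    assert(format_index < kFormatMapSketch.size());
    return kFormatMapSketch[format_index].bits_per_block;
}

inline std::uint32_t BytesPerPixelSketch(std::size_t format_index) {
    assert(format_index < kFormatMapSketch.size());
    return kFormatMapSketch[format_index].bytes_per_pixel;
}

// Guest storage for `pixels` pixels, assuming bits_per_block counts bits per pixel.
inline std::uint32_t StorageBytesSketch(std::size_t format_index, std::uint32_t pixels) {
    return pixels * BitsPerBlockSketch(format_index) / 8;
}
```

The two size columns differ on purpose: D24 (index 16) occupies 24 bits in guest memory but is padded to 4 bytes on the host, as the `@note` in the table's comment says.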

File diff suppressed because it is too large

@@ -3,17 +3,24 @@
// Refer to the license.txt file included.
#pragma once
#include <unordered_map>
#include "video_core/rasterizer_cache/cached_surface.h"
#include "video_core/rasterizer_cache/rasterizer_cache_utils.h"
#include <boost/icl/interval_map.hpp>
#include <boost/icl/interval_set.hpp>
#include "video_core/rasterizer_cache/surface_base.h"
#include "video_core/rasterizer_cache/surface_params.h"
#include "video_core/renderer_opengl/gl_texture_runtime.h"
#include "video_core/texture/texture_decode.h"
namespace VideoCore {
class RendererBase;
namespace Memory {
class MemorySystem;
}
namespace OpenGL {
namespace Pica {
struct Regs;
}
namespace VideoCore {
enum class ScaleMatch {
Exact, // only accept same res scale
@@ -21,26 +28,60 @@ enum class ScaleMatch {
Ignore // accept every scaled res
};
class TextureDownloaderES;
class TextureFilterer;
class FormatReinterpreterOpenGL;
class RendererBase;
class RasterizerCacheOpenGL : NonCopyable {
class RasterizerCache : NonCopyable {
public:
RasterizerCacheOpenGL(VideoCore::RendererBase& renderer);
~RasterizerCacheOpenGL();
using SurfaceRef = std::shared_ptr<OpenGL::Surface>;
/// Blit one surface's texture to another
bool BlitSurfaces(const Surface& src_surface, const Common::Rectangle<u32>& src_rect,
const Surface& dst_surface, const Common::Rectangle<u32>& dst_rect);
// Declare rasterizer interval types
using SurfaceSet = std::set<SurfaceRef>;
using SurfaceMap = boost::icl::interval_map<PAddr, SurfaceRef, boost::icl::partial_absorber,
std::less, boost::icl::inplace_plus,
boost::icl::inter_section, SurfaceInterval>;
using SurfaceCache = boost::icl::interval_map<PAddr, SurfaceSet, boost::icl::partial_absorber,
std::less, boost::icl::inplace_plus,
boost::icl::inter_section, SurfaceInterval>;
static_assert(std::is_same<SurfaceRegions::interval_type, SurfaceCache::interval_type>() &&
std::is_same<SurfaceMap::interval_type, SurfaceCache::interval_type>(),
"Incorrect interval types");
using SurfaceRect_Tuple = std::tuple<SurfaceRef, Common::Rectangle<u32>>;
using PageMap = boost::icl::interval_map<u32, int>;
struct RenderTargets {
SurfaceRef color_surface;
SurfaceRef depth_surface;
};
struct TextureCube {
SurfaceRef surface{};
std::array<SurfaceRef, 6> faces{};
std::array<u64, 6> ticks{};
};
public:
RasterizerCache(Memory::MemorySystem& memory, OpenGL::TextureRuntime& runtime, Pica::Regs& regs,
RendererBase& renderer);
~RasterizerCache();
/// Perform hardware accelerated texture copy according to the provided configuration
bool AccelerateTextureCopy(const GPU::Regs::DisplayTransferConfig& config);
/// Perform hardware accelerated display transfer according to the provided configuration
bool AccelerateDisplayTransfer(const GPU::Regs::DisplayTransferConfig& config);
/// Perform hardware accelerated memory fill according to the provided configuration
bool AccelerateFill(const GPU::Regs::MemoryFillConfig& config);
/// Copy one surface's region to another
void CopySurface(const Surface& src_surface, const Surface& dst_surface,
void CopySurface(const SurfaceRef& src_surface, const SurfaceRef& dst_surface,
SurfaceInterval copy_interval);
/// Load a texture from 3DS memory to OpenGL and cache it (if not already cached)
Surface GetSurface(const SurfaceParams& params, ScaleMatch match_res_scale,
bool load_if_create);
SurfaceRef GetSurface(const SurfaceParams& params, ScaleMatch match_res_scale,
bool load_if_create);
/// Attempt to find a subrect (resolution scaled) of a surface, otherwise loads a texture from
/// 3DS memory to OpenGL and caches it (if not already cached)
@@ -48,27 +89,26 @@ public:
bool load_if_create);
/// Get a surface based on the texture configuration
Surface GetTextureSurface(const Pica::TexturingRegs::FullTextureConfig& config);
Surface GetTextureSurface(const Pica::Texture::TextureInfo& info, u32 max_level = 0);
SurfaceRef GetTextureSurface(const Pica::TexturingRegs::FullTextureConfig& config);
SurfaceRef GetTextureSurface(const Pica::Texture::TextureInfo& info, u32 max_level = 0);
/// Get a texture cube based on the texture configuration
const CachedTextureCube& GetTextureCube(const TextureCubeConfig& config);
SurfaceRef GetTextureCube(const TextureCubeConfig& config);
/// Get the color and depth surfaces based on the framebuffer configuration
SurfaceSurfaceRect_Tuple GetFramebufferSurfaces(bool using_color_fb, bool using_depth_fb,
const Common::Rectangle<s32>& viewport_rect);
OpenGL::Framebuffer GetFramebufferSurfaces(bool using_color_fb, bool using_depth_fb);
/// Get a surface that matches the fill config
Surface GetFillSurface(const GPU::Regs::MemoryFillConfig& config);
/// Marks the draw rectangle defined in framebuffer as invalid
void InvalidateFramebuffer(const OpenGL::Framebuffer& framebuffer);
/// Get a surface that matches a "texture copy" display transfer config
SurfaceRect_Tuple GetTexCopySurface(const SurfaceParams& params);
/// Write any cached resources overlapping the region back to memory (if dirty)
void FlushRegion(PAddr addr, u32 size, Surface flush_surface = nullptr);
void FlushRegion(PAddr addr, u32 size, SurfaceRef flush_surface = nullptr);
/// Mark region as being invalidated by region_owner (nullptr if 3DS memory)
void InvalidateRegion(PAddr addr, u32 size, const Surface& region_owner);
void InvalidateRegion(PAddr addr, u32 size, const SurfaceRef& region_owner);
/// Flush all cached resources tracked by this cache manager
void FlushAll();
@@ -76,61 +116,59 @@ public:
/// Clear all cached resources tracked by this cache manager
void ClearAll(bool flush);
// Textures from destroyed surfaces are stored here to be recycled to reduce allocation overhead
// in the driver
// this must be placed above the surface_cache to ensure all cached surfaces are destroyed
// before destroying the recycler
std::unordered_multimap<HostTextureTag, OGLTexture> host_texture_recycler;
private:
void DuplicateSurface(const Surface& src_surface, const Surface& dest_surface);
/// Transfers ownership of a memory region from src_surface to dest_surface
void DuplicateSurface(const SurfaceRef& src_surface, const SurfaceRef& dest_surface);
/// Update surface's texture for given region when necessary
void ValidateSurface(const Surface& surface, PAddr addr, u32 size);
void ValidateSurface(const SurfaceRef& surface, PAddr addr, u32 size);
// Returns false if there is a surface in the cache at the interval with the same bit-width,
bool NoUnimplementedReinterpretations(const OpenGL::Surface& surface,
OpenGL::SurfaceParams& params,
const OpenGL::SurfaceInterval& interval);
/// Copies pixel data in interval from the guest VRAM to the host GPU surface
void UploadSurface(const SurfaceRef& surface, SurfaceInterval interval);
// Return true if a surface with an invalid pixel format exists at the interval
bool IntervalHasInvalidPixelFormat(SurfaceParams& params, const SurfaceInterval& interval);
/// Copies pixel data in interval from the host GPU surface to the guest VRAM
void DownloadSurface(const SurfaceRef& surface, SurfaceInterval interval);
// Attempt to find a reinterpretable surface in the cache and use it to copy for validation
bool ValidateByReinterpretation(const Surface& surface, SurfaceParams& params,
/// Downloads a fill surface to guest VRAM
void DownloadFillSurface(const SurfaceRef& surface, SurfaceInterval interval);
/// Returns false if there is a surface in the cache at the interval with the same bit-width,
bool NoUnimplementedReinterpretations(const SurfaceRef& surface, SurfaceParams params,
const SurfaceInterval& interval);
/// Return true if a surface with an invalid pixel format exists at the interval
bool IntervalHasInvalidPixelFormat(const SurfaceParams& params,
const SurfaceInterval& interval);
/// Attempt to find a reinterpretable surface in the cache and use it to copy for validation
bool ValidateByReinterpretation(const SurfaceRef& surface, SurfaceParams params,
const SurfaceInterval& interval);
/// Create a new surface
Surface CreateSurface(const SurfaceParams& params);
SurfaceRef CreateSurface(const SurfaceParams& params);
/// Register surface into the cache
void RegisterSurface(const Surface& surface);
void RegisterSurface(const SurfaceRef& surface);
/// Remove surface from the cache
void UnregisterSurface(const Surface& surface);
void UnregisterSurface(const SurfaceRef& surface);
/// Increase/decrease the number of surfaces in pages touching the specified region
void UpdatePagesCachedCount(PAddr addr, u32 size, int delta);
VideoCore::RendererBase& renderer;
TextureRuntime runtime;
private:
Memory::MemorySystem& memory;
OpenGL::TextureRuntime& runtime;
Pica::Regs& regs;
RendererBase& renderer;
SurfaceCache surface_cache;
PageMap cached_pages;
SurfaceMap dirty_regions;
SurfaceSet remove_surfaces;
u16 resolution_scale_factor;
std::unordered_map<TextureCubeConfig, CachedTextureCube> texture_cube_cache;
std::recursive_mutex mutex;
public:
OGLTexture AllocateSurfaceTexture(const FormatTuple& format_tuple, u32 width, u32 height);
std::unique_ptr<TextureFilterer> texture_filterer;
std::unique_ptr<FormatReinterpreterOpenGL> format_reinterpreter;
std::unique_ptr<TextureDownloaderES> texture_downloader_es;
std::vector<SurfaceRef> remove_surfaces;
u32 resolution_scale_factor;
RenderTargets render_targets;
std::unordered_map<TextureCubeConfig, TextureCube> texture_cube_cache;
bool use_filter{};
};
} // namespace OpenGL
} // namespace VideoCore
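
The header only declares `UpdatePagesCachedCount`; its body is in the suppressed part of the diff. As a rough illustration of the bookkeeping the doc comment describes, the sketch below keeps a per-page counter of cached surfaces. It uses a plain `std::map` instead of `boost::icl::interval_map`, and the 4 KiB page size (`kPageBits = 12`) is an assumption; `PageCounterSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

constexpr std::uint32_t kPageBits = 12; // assumed 4 KiB pages

class PageCounterSketch {
public:
    // Adds `delta` to the counter of every page touched by [addr, addr + size).
    void Update(std::uint32_t addr, std::uint32_t size, int delta) {
        if (size == 0) {
            return;
        }
        const std::uint32_t first = addr >> kPageBits;
        const std::uint32_t last = (addr + size - 1) >> kPageBits;
        for (std::uint32_t page = first; page <= last; ++page) {
            const int count = (pages[page] += delta);
            assert(count >= 0);
            if (count == 0) {
                pages.erase(page); // page no longer backs any cached surface
            }
        }
    }

    // Number of cached surfaces touching the page that contains `addr`.
    int Count(std::uint32_t addr) const {
        const auto it = pages.find(addr >> kPageBits);
        return it == pages.end() ? 0 : it->second;
    }

private:
    std::map<std::uint32_t, int> pages;
};
```

A caller would pass `+1` when a surface is registered and `-1` when it is unregistered, so a page with a non-zero count is one that still overlaps cached data.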

@@ -1,38 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <memory>
#include <set>
#include <tuple>
#include <boost/icl/interval_map.hpp>
#include <boost/icl/interval_set.hpp>
#include "common/common_types.h"
#include "common/math_util.h"
namespace OpenGL {
class CachedSurface;
using Surface = std::shared_ptr<CachedSurface>;
// Declare rasterizer interval types
using SurfaceInterval = boost::icl::right_open_interval<PAddr>;
using SurfaceSet = std::set<Surface>;
using SurfaceRegions = boost::icl::interval_set<PAddr, std::less, SurfaceInterval>;
using SurfaceMap =
boost::icl::interval_map<PAddr, Surface, boost::icl::partial_absorber, std::less,
boost::icl::inplace_plus, boost::icl::inter_section, SurfaceInterval>;
using SurfaceCache =
boost::icl::interval_map<PAddr, SurfaceSet, boost::icl::partial_absorber, std::less,
boost::icl::inplace_plus, boost::icl::inter_section, SurfaceInterval>;
static_assert(std::is_same<SurfaceRegions::interval_type, SurfaceCache::interval_type>() &&
std::is_same<SurfaceMap::interval_type, SurfaceCache::interval_type>(),
"Incorrect interval types");
using SurfaceRect_Tuple = std::tuple<Surface, Common::Rectangle<u32>>;
using SurfaceSurfaceRect_Tuple = std::tuple<Surface, Surface, Common::Rectangle<u32>>;
using PageMap = boost::icl::interval_map<u32, int>;
} // namespace OpenGL

@@ -1,56 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <glad/glad.h>
#include "video_core/rasterizer_cache/rasterizer_cache_utils.h"
#include "video_core/renderer_opengl/gl_vars.h"
namespace OpenGL {
constexpr FormatTuple tex_tuple = {GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE};
static constexpr std::array<FormatTuple, 4> depth_format_tuples = {{
{GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT}, // D16
{},
{GL_DEPTH_COMPONENT24, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT}, // D24
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8}, // D24S8
}};
static constexpr std::array<FormatTuple, 5> fb_format_tuples = {{
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8}, // RGBA8
{GL_RGB8, GL_BGR, GL_UNSIGNED_BYTE}, // RGB8
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_5_5_5_1}, // RGB5A1
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5}, // RGB565
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4}, // RGBA4
}};
// Same as above, with minor changes for OpenGL ES. Replaced
// GL_UNSIGNED_INT_8_8_8_8 with GL_UNSIGNED_BYTE and
// GL_BGR with GL_RGB
static constexpr std::array<FormatTuple, 5> fb_format_tuples_oes = {{
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE}, // RGBA8
{GL_RGB8, GL_RGB, GL_UNSIGNED_BYTE}, // RGB8
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_5_5_5_1}, // RGB5A1
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5}, // RGB565
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4}, // RGBA4
}};
const FormatTuple& GetFormatTuple(PixelFormat pixel_format) {
const SurfaceType type = GetFormatType(pixel_format);
const std::size_t format_index = static_cast<std::size_t>(pixel_format);
if (type == SurfaceType::Color) {
ASSERT(format_index < fb_format_tuples.size());
return (GLES ? fb_format_tuples_oes : fb_format_tuples)[format_index];
} else if (type == SurfaceType::Depth || type == SurfaceType::DepthStencil) {
const std::size_t tuple_idx = format_index - 14;
ASSERT(tuple_idx < depth_format_tuples.size());
return depth_format_tuples[tuple_idx];
}
return tex_tuple;
}
} // namespace OpenGL

@@ -1,73 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <functional>
#include "common/hash.h"
#include "video_core/rasterizer_cache/pixel_format.h"
namespace OpenGL {
struct FormatTuple {
int internal_format;
u32 format;
u32 type;
};
const FormatTuple& GetFormatTuple(PixelFormat pixel_format);
struct HostTextureTag {
FormatTuple format_tuple{};
u32 width = 0;
u32 height = 0;
bool operator==(const HostTextureTag& rhs) const noexcept {
return std::memcmp(this, &rhs, sizeof(HostTextureTag)) == 0;
};
const u64 Hash() const {
return Common::ComputeHash64(this, sizeof(HostTextureTag));
}
};
struct TextureCubeConfig {
PAddr px;
PAddr nx;
PAddr py;
PAddr ny;
PAddr pz;
PAddr nz;
u32 width;
Pica::TexturingRegs::TextureFormat format;
bool operator==(const TextureCubeConfig& rhs) const {
return std::memcmp(this, &rhs, sizeof(TextureCubeConfig)) == 0;
}
bool operator!=(const TextureCubeConfig& rhs) const {
return std::memcmp(this, &rhs, sizeof(TextureCubeConfig)) != 0;
}
const u64 Hash() const {
return Common::ComputeHash64(this, sizeof(TextureCubeConfig));
}
};
} // namespace OpenGL
namespace std {
template <>
struct hash<OpenGL::HostTextureTag> {
std::size_t operator()(const OpenGL::HostTextureTag& tag) const noexcept {
return tag.Hash();
}
};
template <>
struct hash<OpenGL::TextureCubeConfig> {
std::size_t operator()(const OpenGL::TextureCubeConfig& config) const noexcept {
return config.Hash();
}
};
} // namespace std
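
The removed `TextureCubeConfig` is used as an `std::unordered_map` key by comparing and hashing its raw bytes. That pattern only works if the struct has no padding, which the sketch below checks with a `static_assert`. `CubeKeySketch` is a hypothetical stand-in with plain integer fields, and FNV-1a is swapped in for `Common::ComputeHash64`, whose implementation is not part of this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>

struct CubeKeySketch {
    std::uint32_t px, nx, py, ny, pz, nz; // guest addresses of the six faces
    std::uint32_t width;
    std::uint32_t format;

    bool operator==(const CubeKeySketch& rhs) const noexcept {
        return std::memcmp(this, &rhs, sizeof(CubeKeySketch)) == 0;
    }

    // FNV-1a over the object bytes (stand-in for Common::ComputeHash64).
    std::uint64_t Hash() const noexcept {
        const auto* bytes = reinterpret_cast<const unsigned char*>(this);
        std::uint64_t hash = 14695981039346656037ull;
        for (std::size_t i = 0; i < sizeof(CubeKeySketch); ++i) {
            hash ^= bytes[i];
            hash *= 1099511628211ull;
        }
        return hash;
    }
};

// Byte-wise comparison and hashing are only meaningful without padding bytes.
static_assert(sizeof(CubeKeySketch) == 8 * sizeof(std::uint32_t), "unexpected padding");

namespace std {
template <>
struct hash<CubeKeySketch> {
    std::size_t operator()(const CubeKeySketch& key) const noexcept {
        return static_cast<std::size_t>(key.Hash());
    }
};
} // namespace std
```

With the specialization in place, `std::unordered_map<CubeKeySketch, T>` works without passing an explicit hasher, which is how `texture_cube_cache` is declared in the cache header.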

@@ -0,0 +1,156 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/alignment.h"
#include "video_core/rasterizer_cache/surface_base.h"
#include "video_core/texture/texture_decode.h"
namespace VideoCore {
SurfaceBase::SurfaceBase(const SurfaceParams& params) : SurfaceParams{params} {}
SurfaceBase::~SurfaceBase() = default;
bool SurfaceBase::CanFill(const SurfaceParams& dest_surface, SurfaceInterval fill_interval) const {
if (type == SurfaceType::Fill && IsRegionValid(fill_interval) &&
boost::icl::first(fill_interval) >= addr &&
boost::icl::last_next(fill_interval) <= end && // dest_surface is within our fill range
dest_surface.FromInterval(fill_interval).GetInterval() ==
fill_interval) { // make sure interval is a rectangle in dest surface
if (fill_size * 8 != dest_surface.GetFormatBpp()) {
// Check if bits repeat for our fill_size
const u32 dest_bytes_per_pixel = std::max(dest_surface.GetFormatBpp() / 8, 1u);
std::vector<u8> fill_test(fill_size * dest_bytes_per_pixel);
for (u32 i = 0; i < dest_bytes_per_pixel; ++i) {
std::memcpy(&fill_test[i * fill_size], &fill_data[0], fill_size);
}
for (u32 i = 0; i < fill_size; ++i) {
if (std::memcmp(&fill_test[dest_bytes_per_pixel * i], &fill_test[0],
dest_bytes_per_pixel) != 0) {
return false;
}
}
if (dest_surface.GetFormatBpp() == 4 && (fill_test[0] & 0xF) != (fill_test[0] >> 4)) {
return false;
}
}
return true;
}
return false;
}
bool SurfaceBase::CanCopy(const SurfaceParams& dest_surface, SurfaceInterval copy_interval) const {
SurfaceParams subrect_params = dest_surface.FromInterval(copy_interval);
ASSERT(subrect_params.GetInterval() == copy_interval);
if (CanSubRect(subrect_params))
return true;
if (CanFill(dest_surface, copy_interval))
return true;
return false;
}
SurfaceInterval SurfaceBase::GetCopyableInterval(const SurfaceParams& params) const {
SurfaceInterval result{};
const u32 tile_align = params.BytesInPixels(params.is_tiled ? 8 * 8 : 1);
const auto valid_regions =
SurfaceRegions{params.GetInterval() & GetInterval()} - invalid_regions;
for (auto& valid_interval : valid_regions) {
const SurfaceInterval aligned_interval{
params.addr +
Common::AlignUp(boost::icl::first(valid_interval) - params.addr, tile_align),
params.addr +
Common::AlignDown(boost::icl::last_next(valid_interval) - params.addr, tile_align)};
if (tile_align > boost::icl::length(valid_interval) ||
boost::icl::length(aligned_interval) == 0) {
continue;
}
// Get the rectangle within aligned_interval
const u32 stride_bytes = params.BytesInPixels(params.stride) * (params.is_tiled ? 8 : 1);
SurfaceInterval rect_interval{
params.addr +
Common::AlignUp(boost::icl::first(aligned_interval) - params.addr, stride_bytes),
params.addr + Common::AlignDown(boost::icl::last_next(aligned_interval) - params.addr,
stride_bytes),
};
if (boost::icl::first(rect_interval) > boost::icl::last_next(rect_interval)) {
// 1 row
rect_interval = aligned_interval;
} else if (boost::icl::length(rect_interval) == 0) {
// 2 rows that do not make a rectangle, return the larger one
const SurfaceInterval row1{boost::icl::first(aligned_interval),
boost::icl::first(rect_interval)};
const SurfaceInterval row2{boost::icl::first(rect_interval),
boost::icl::last_next(aligned_interval)};
rect_interval = (boost::icl::length(row1) > boost::icl::length(row2)) ? row1 : row2;
}
if (boost::icl::length(rect_interval) > boost::icl::length(result)) {
result = rect_interval;
}
}
return result;
}
ClearValue SurfaceBase::MakeClearValue(PAddr copy_addr, PixelFormat dst_format) {
const SurfaceType dst_type = GetFormatType(dst_format);
const std::array fill_buffer = MakeFillBuffer(copy_addr);
ClearValue result{};
switch (dst_type) {
case SurfaceType::Color:
case SurfaceType::Texture:
case SurfaceType::Fill: {
Pica::Texture::TextureInfo tex_info{};
tex_info.format = static_cast<Pica::TexturingRegs::TextureFormat>(dst_format);
const auto color = Pica::Texture::LookupTexture(fill_buffer.data(), 0, 0, tex_info);
result.color = color / 255.f;
break;
}
case SurfaceType::Depth: {
u32 depth_uint = 0;
if (dst_format == PixelFormat::D16) {
std::memcpy(&depth_uint, fill_buffer.data(), 2);
result.depth = depth_uint / 65535.0f; // 2^16 - 1
} else if (dst_format == PixelFormat::D24) {
std::memcpy(&depth_uint, fill_buffer.data(), 3);
result.depth = depth_uint / 16777215.0f; // 2^24 - 1
}
break;
}
case SurfaceType::DepthStencil: {
u32 clear_value_uint;
std::memcpy(&clear_value_uint, fill_buffer.data(), sizeof(u32));
result.depth = (clear_value_uint & 0xFFFFFF) / 16777215.0f; // 2^24 - 1
result.stencil = (clear_value_uint >> 24);
break;
}
default:
UNREACHABLE_MSG("Invalid surface type!");
}
return result;
}
std::array<u8, 4> SurfaceBase::MakeFillBuffer(PAddr copy_addr) {
const PAddr fill_offset = (copy_addr - addr) % fill_size;
std::array<u8, 4> fill_buffer;
u32 fill_buff_pos = fill_offset;
for (std::size_t i = 0; i < fill_buffer.size(); i++) {
fill_buffer[i] = fill_data[fill_buff_pos++ % fill_size];
}
return fill_buffer;
}
} // namespace VideoCore
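
The fill handling above can be illustrated with a minimal standalone sketch. It assumes a little-endian host, and the names `MakeFillBufferSketch`, `UnpackD24S8` and `DepthStencil` are hypothetical and not part of this commit. The arithmetic is meant to mirror `MakeFillBuffer` and the `DepthStencil` branch of `MakeClearValue`: the fill pattern repeats every `fill_size` bytes, so the four bytes read at `copy_addr` start at the pattern offset `(copy_addr - addr) % fill_size`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for SurfaceBase::MakeFillBuffer: reads four bytes of a
// repeating fill pattern, starting at the pattern offset that matches copy_addr.
std::array<std::uint8_t, 4> MakeFillBufferSketch(const std::array<std::uint8_t, 4>& fill_data,
                                                 std::uint32_t fill_size,
                                                 std::uint32_t surface_addr,
                                                 std::uint32_t copy_addr) {
    std::array<std::uint8_t, 4> out{};
    std::uint32_t pos = (copy_addr - surface_addr) % fill_size;
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Wrap around so that only the first fill_size bytes of fill_data are used
        out[i] = fill_data[pos++ % fill_size];
    }
    return out;
}

// Hypothetical result type, used only by this sketch
struct DepthStencil {
    float depth;
    std::uint8_t stencil;
};

// Mirrors the DepthStencil branch: the low 24 bits hold depth, the high 8 bits stencil.
// Assumes a little-endian host, so bytes[3] ends up in the top byte of the word.
DepthStencil UnpackD24S8(const std::array<std::uint8_t, 4>& bytes) {
    std::uint32_t word = 0;
    std::memcpy(&word, bytes.data(), sizeof(word));
    return {(word & 0xFFFFFF) / 16777215.0f, static_cast<std::uint8_t>(word >> 24)};
}
```

With a 3-byte pattern `AA BB CC` and a copy address 4 bytes into the surface, the sampled buffer starts at pattern offset 1 and reads `BB CC AA BB`.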


@@ -0,0 +1,66 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <boost/icl/interval_set.hpp>
#include "video_core/rasterizer_cache/surface_params.h"
namespace VideoCore {
using SurfaceRegions = boost::icl::interval_set<PAddr, std::less, SurfaceInterval>;
class SurfaceBase : public SurfaceParams {
public:
SurfaceBase(const SurfaceParams& params);
~SurfaceBase();
/// Returns true when this surface can be used to fill the fill_interval of dest_surface
bool CanFill(const SurfaceParams& dest_surface, SurfaceInterval fill_interval) const;
/// Returns true when surface can validate copy_interval of dest_surface
bool CanCopy(const SurfaceParams& dest_surface, SurfaceInterval copy_interval) const;
/// Returns the region of the biggest valid rectangle within interval
SurfaceInterval GetCopyableInterval(const SurfaceParams& params) const;
/// Returns the clear value used to validate another surface from this fill surface
ClearValue MakeClearValue(PAddr copy_addr, PixelFormat dst_format);
u64 ModificationTick() const noexcept {
return modification_tick;
}
bool IsRegionValid(SurfaceInterval interval) const {
return (invalid_regions.find(interval) == invalid_regions.end());
}
void MarkValid(SurfaceInterval interval) {
invalid_regions.erase(interval);
modification_tick++;
}
void MarkInvalid(SurfaceInterval interval) {
invalid_regions.insert(interval);
modification_tick++;
}
bool IsFullyInvalid() const {
auto interval = GetInterval();
return *invalid_regions.equal_range(interval).first == interval;
}
private:
/// Returns the fill buffer value starting from copy_addr
std::array<u8, 4> MakeFillBuffer(PAddr copy_addr);
public:
bool registered = false;
SurfaceRegions invalid_regions;
u32 fill_size = 0;
std::array<u8, 4> fill_data;
u64 modification_tick = 1;
};
} // namespace VideoCore


@@ -3,133 +3,9 @@
// Refer to the license.txt file included.
#include "common/alignment.h"
#include "video_core/rasterizer_cache/rasterizer_cache.h"
#include "video_core/rasterizer_cache/surface_params.h"
namespace OpenGL {
SurfaceParams SurfaceParams::FromInterval(SurfaceInterval interval) const {
SurfaceParams params = *this;
const u32 tiled_size = is_tiled ? 8 : 1;
const u32 stride_tiled_bytes = BytesInPixels(stride * tiled_size);
PAddr aligned_start =
addr + Common::AlignDown(boost::icl::first(interval) - addr, stride_tiled_bytes);
PAddr aligned_end =
addr + Common::AlignUp(boost::icl::last_next(interval) - addr, stride_tiled_bytes);
if (aligned_end - aligned_start > stride_tiled_bytes) {
params.addr = aligned_start;
params.height = (aligned_end - aligned_start) / BytesInPixels(stride);
} else {
// 1 row
ASSERT(aligned_end - aligned_start == stride_tiled_bytes);
const u32 tiled_alignment = BytesInPixels(is_tiled ? 8 * 8 : 1);
aligned_start =
addr + Common::AlignDown(boost::icl::first(interval) - addr, tiled_alignment);
aligned_end =
addr + Common::AlignUp(boost::icl::last_next(interval) - addr, tiled_alignment);
params.addr = aligned_start;
params.width = PixelsInBytes(aligned_end - aligned_start) / tiled_size;
params.stride = params.width;
params.height = tiled_size;
}
params.UpdateParams();
return params;
}
SurfaceInterval SurfaceParams::GetSubRectInterval(Common::Rectangle<u32> unscaled_rect) const {
if (unscaled_rect.GetHeight() == 0 || unscaled_rect.GetWidth() == 0) {
return {};
}
if (is_tiled) {
unscaled_rect.left = Common::AlignDown(unscaled_rect.left, 8) * 8;
unscaled_rect.bottom = Common::AlignDown(unscaled_rect.bottom, 8) / 8;
unscaled_rect.right = Common::AlignUp(unscaled_rect.right, 8) * 8;
unscaled_rect.top = Common::AlignUp(unscaled_rect.top, 8) / 8;
}
const u32 stride_tiled = !is_tiled ? stride : stride * 8;
const u32 pixel_offset =
stride_tiled * (!is_tiled ? unscaled_rect.bottom : (height / 8) - unscaled_rect.top) +
unscaled_rect.left;
const u32 pixels = (unscaled_rect.GetHeight() - 1) * stride_tiled + unscaled_rect.GetWidth();
return {addr + BytesInPixels(pixel_offset), addr + BytesInPixels(pixel_offset + pixels)};
}
SurfaceInterval SurfaceParams::GetCopyableInterval(const Surface& src_surface) const {
SurfaceInterval result{};
const auto valid_regions =
SurfaceRegions(GetInterval() & src_surface->GetInterval()) - src_surface->invalid_regions;
for (auto& valid_interval : valid_regions) {
const SurfaceInterval aligned_interval{
addr + Common::AlignUp(boost::icl::first(valid_interval) - addr,
BytesInPixels(is_tiled ? 8 * 8 : 1)),
addr + Common::AlignDown(boost::icl::last_next(valid_interval) - addr,
BytesInPixels(is_tiled ? 8 * 8 : 1))};
if (BytesInPixels(is_tiled ? 8 * 8 : 1) > boost::icl::length(valid_interval) ||
boost::icl::length(aligned_interval) == 0) {
continue;
}
// Get the rectangle within aligned_interval
const u32 stride_bytes = BytesInPixels(stride) * (is_tiled ? 8 : 1);
SurfaceInterval rect_interval{
addr + Common::AlignUp(boost::icl::first(aligned_interval) - addr, stride_bytes),
addr + Common::AlignDown(boost::icl::last_next(aligned_interval) - addr, stride_bytes),
};
if (boost::icl::first(rect_interval) > boost::icl::last_next(rect_interval)) {
// 1 row
rect_interval = aligned_interval;
} else if (boost::icl::length(rect_interval) == 0) {
// 2 rows that do not make a rectangle, return the larger one
const SurfaceInterval row1{boost::icl::first(aligned_interval),
boost::icl::first(rect_interval)};
const SurfaceInterval row2{boost::icl::first(rect_interval),
boost::icl::last_next(aligned_interval)};
rect_interval = (boost::icl::length(row1) > boost::icl::length(row2)) ? row1 : row2;
}
if (boost::icl::length(rect_interval) > boost::icl::length(result)) {
result = rect_interval;
}
}
return result;
}
Common::Rectangle<u32> SurfaceParams::GetSubRect(const SurfaceParams& sub_surface) const {
const u32 begin_pixel_index = PixelsInBytes(sub_surface.addr - addr);
if (is_tiled) {
const int x0 = (begin_pixel_index % (stride * 8)) / 8;
const int y0 = (begin_pixel_index / (stride * 8)) * 8;
// Top to bottom
return Common::Rectangle<u32>(x0, height - y0, x0 + sub_surface.width,
height - (y0 + sub_surface.height));
}
const int x0 = begin_pixel_index % stride;
const int y0 = begin_pixel_index / stride;
// Bottom to top
return Common::Rectangle<u32>(x0, y0 + sub_surface.height, x0 + sub_surface.width, y0);
}
Common::Rectangle<u32> SurfaceParams::GetScaledSubRect(const SurfaceParams& sub_surface) const {
auto rect = GetSubRect(sub_surface);
rect.left = rect.left * res_scale;
rect.right = rect.right * res_scale;
rect.top = rect.top * res_scale;
rect.bottom = rect.bottom * res_scale;
return rect;
}
namespace VideoCore {
bool SurfaceParams::ExactMatch(const SurfaceParams& other_surface) const {
return std::tie(other_surface.addr, other_surface.width, other_surface.height,
@@ -157,6 +33,7 @@ bool SurfaceParams::CanExpand(const SurfaceParams& expanded_surface) const {
}
bool SurfaceParams::CanTexCopy(const SurfaceParams& texcopy_params) const {
const SurfaceInterval copy_interval = texcopy_params.GetInterval();
if (pixel_format == PixelFormat::Invalid || addr > texcopy_params.addr ||
end < texcopy_params.end) {
return false;
@@ -170,7 +47,180 @@ bool SurfaceParams::CanTexCopy(const SurfaceParams& texcopy_params) const {
((texcopy_params.addr - addr) % tile_stride) + texcopy_params.width <= tile_stride;
}
return FromInterval(texcopy_params.GetInterval()).GetInterval() == texcopy_params.GetInterval();
const u32 target_level = LevelOf(texcopy_params.addr);
if ((LevelInterval(target_level) & copy_interval) != copy_interval) {
return false;
}
return FromInterval(copy_interval).GetInterval() == copy_interval;
}
} // namespace OpenGL
void SurfaceParams::UpdateParams() {
if (stride == 0) {
stride = width;
}
type = GetFormatType(pixel_format);
if (levels != 1) {
ASSERT(stride == width);
CalculateMipLevelOffsets();
size = CalculateSurfaceSize();
} else {
mipmap_offsets[0] = addr;
size = !is_tiled ? BytesInPixels(stride * (height - 1) + width)
: BytesInPixels(stride * 8 * (height / 8 - 1) + width * 8);
}
end = addr + size;
}
Common::Rectangle<u32> SurfaceParams::GetSubRect(const SurfaceParams& sub_surface) const {
const u32 level = LevelOf(sub_surface.addr);
const u32 begin_pixel_index = PixelsInBytes(sub_surface.addr - mipmap_offsets[level]);
ASSERT(stride == width || level == 0);
const u32 stride_lod = stride >> level;
if (is_tiled) {
const u32 x0 = (begin_pixel_index % (stride_lod * 8)) / 8;
const u32 y0 = (begin_pixel_index / (stride_lod * 8)) * 8;
const u32 height_lod = height >> level;
// Top to bottom
return {x0, height_lod - y0, x0 + sub_surface.width,
height_lod - (y0 + sub_surface.height)};
}
const u32 x0 = begin_pixel_index % stride_lod;
const u32 y0 = begin_pixel_index / stride_lod;
// Bottom to top
return {x0, y0 + sub_surface.height, x0 + sub_surface.width, y0};
}
Common::Rectangle<u32> SurfaceParams::GetScaledSubRect(const SurfaceParams& sub_surface) const {
return GetSubRect(sub_surface) * res_scale;
}
SurfaceParams SurfaceParams::FromInterval(SurfaceInterval interval) const {
SurfaceParams params = *this;
const u32 level = LevelOf(interval.lower());
const PAddr end_addr = interval.upper();
// Ensure provided interval is contained in a single level
ASSERT(level == LevelOf(end_addr) || end_addr == end || end_addr == mipmap_offsets[level + 1]);
params.width >>= level;
params.stride >>= level;
const u32 tiled_size = is_tiled ? 8 : 1;
const u32 stride_tiled_bytes = BytesInPixels(params.stride * tiled_size);
ASSERT(stride == width || level == 0);
const PAddr start = mipmap_offsets[level];
PAddr aligned_start =
start + Common::AlignDown(boost::icl::first(interval) - start, stride_tiled_bytes);
PAddr aligned_end =
start + Common::AlignUp(boost::icl::last_next(interval) - start, stride_tiled_bytes);
if (aligned_end - aligned_start > stride_tiled_bytes) {
params.addr = aligned_start;
params.height = (aligned_end - aligned_start) / BytesInPixels(params.stride);
} else {
// 1 row
ASSERT(aligned_end - aligned_start == stride_tiled_bytes);
const u32 tiled_alignment = BytesInPixels(is_tiled ? 8 * 8 : 1);
aligned_start =
start + Common::AlignDown(boost::icl::first(interval) - start, tiled_alignment);
aligned_end =
start + Common::AlignUp(boost::icl::last_next(interval) - start, tiled_alignment);
params.addr = aligned_start;
params.width = PixelsInBytes(aligned_end - aligned_start) / tiled_size;
params.stride = params.width;
params.height = tiled_size;
}
params.levels = 1;
params.UpdateParams();
return params;
}
SurfaceInterval SurfaceParams::GetSubRectInterval(Common::Rectangle<u32> unscaled_rect,
u32 level) const {
if (unscaled_rect.GetHeight() == 0 || unscaled_rect.GetWidth() == 0) [[unlikely]] {
return {};
}
if (is_tiled) {
unscaled_rect.left = Common::AlignDown(unscaled_rect.left, 8) * 8;
unscaled_rect.bottom = Common::AlignDown(unscaled_rect.bottom, 8) / 8;
unscaled_rect.right = Common::AlignUp(unscaled_rect.right, 8) * 8;
unscaled_rect.top = Common::AlignUp(unscaled_rect.top, 8) / 8;
}
const u32 stride_tiled = (!is_tiled ? stride : stride * 8) >> level;
const u32 pixels = (unscaled_rect.GetHeight() - 1) * stride_tiled + unscaled_rect.GetWidth();
const u32 pixel_offset =
stride_tiled * (!is_tiled ? unscaled_rect.bottom : (height / 8) - unscaled_rect.top) +
unscaled_rect.left;
const PAddr start = mipmap_offsets[level];
return {start + BytesInPixels(pixel_offset), start + BytesInPixels(pixel_offset + pixels)};
}
void SurfaceParams::CalculateMipLevelOffsets() {
ASSERT(levels <= MAX_PICA_LEVELS && stride == width);
u32 level_width = width;
u32 level_height = height;
u32 offset = addr;
for (u32 level = 0; level < levels; level++) {
mipmap_offsets[level] = offset;
offset += BytesInPixels(level_width * level_height);
level_width >>= 1;
level_height >>= 1;
}
}
u32 SurfaceParams::CalculateSurfaceSize() const {
ASSERT(levels <= MAX_PICA_LEVELS && stride == width);
u32 level_width = width;
u32 level_height = height;
u32 size = 0;
for (u32 level = 0; level < levels; level++) {
size += BytesInPixels(level_width * level_height);
level_width >>= 1;
level_height >>= 1;
}
return size;
}
SurfaceInterval SurfaceParams::LevelInterval(u32 level) const {
ASSERT(levels > level);
const PAddr start_addr = mipmap_offsets[level];
const PAddr end_addr = level == (levels - 1) ? end : mipmap_offsets[level + 1];
return {start_addr, end_addr};
}
u32 SurfaceParams::LevelOf(PAddr level_addr) const {
ASSERT(level_addr >= addr && level_addr <= end);
u32 level = levels - 1;
while (mipmap_offsets[level] > level_addr) {
level--;
}
return level;
}
std::string SurfaceParams::DebugName(bool scaled) const noexcept {
const u32 scaled_width = scaled ? GetScaledWidth() : width;
const u32 scaled_height = scaled ? GetScaledHeight() : height;
return fmt::format("Surface: {}x{} {} {} levels from {:#x} to {:#x} ({})", scaled_width,
scaled_height, PixelFormatAsString(pixel_format), levels, addr, end,
scaled ? "scaled" : "unscaled");
}
} // namespace VideoCore
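
The mipmap bookkeeping above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from this commit: `MipLayout`, `ComputeMipLayout` and the free function `LevelOf` are hypothetical names, and the bits per pixel are passed in directly instead of being derived from a `PixelFormat`. Levels are packed back to back, each one half the width and height of the previous, and an address is mapped to its level by walking down from the last level.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxLevels = 8; // same value as MAX_PICA_LEVELS above

// Hypothetical container for the per-level start addresses and the end address
struct MipLayout {
    std::array<std::uint32_t, kMaxLevels> offsets{};
    std::uint32_t end = 0;
    std::uint32_t levels = 0;
};

// Packs the levels back to back, halving width and height at each step,
// in the same way as CalculateMipLevelOffsets and CalculateSurfaceSize.
MipLayout ComputeMipLayout(std::uint32_t addr, std::uint32_t width, std::uint32_t height,
                           std::uint32_t bits_per_pixel, std::uint32_t levels) {
    assert(levels >= 1 && levels <= kMaxLevels);
    MipLayout layout;
    layout.levels = levels;
    std::uint32_t offset = addr;
    for (std::uint32_t level = 0; level < levels; ++level) {
        layout.offsets[level] = offset;
        offset += width * height * bits_per_pixel / 8;
        width >>= 1;
        height >>= 1;
    }
    layout.end = offset;
    return layout;
}

// Walks down from the last level until the level start is not past the address.
std::uint32_t LevelOf(const MipLayout& layout, std::uint32_t level_addr) {
    assert(level_addr >= layout.offsets[0] && level_addr <= layout.end);
    std::uint32_t level = layout.levels - 1;
    while (layout.offsets[level] > level_addr) {
        --level;
    }
    return level;
}
```

For an 8x8 surface at 32 bits per pixel with 3 levels, the levels occupy 256, 64 and 16 bytes, so the level starts sit at `addr`, `addr + 0x100` and `addr + 0x140`, and the surface ends at `addr + 0x150`.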


@@ -4,75 +4,89 @@
#pragma once
#include <array>
#include <climits>
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/rasterizer_cache/rasterizer_cache_types.h"
#include "video_core/rasterizer_cache/utils.h"
namespace OpenGL {
namespace VideoCore {
constexpr std::size_t MAX_PICA_LEVELS = 8;
class SurfaceParams {
public:
// Surface match traits
/// Returns true if other_surface matches exactly params
bool ExactMatch(const SurfaceParams& other_surface) const;
/// Returns true if sub_surface is a subrect of params
bool CanSubRect(const SurfaceParams& sub_surface) const;
/// Returns true if params can be expanded to match expanded_surface
bool CanExpand(const SurfaceParams& expanded_surface) const;
/// Returns true if params can be used for texcopy
bool CanTexCopy(const SurfaceParams& texcopy_params) const;
/// Updates remaining members from the already set addr, width, height and pixel_format
void UpdateParams();
/// Returns the unscaled rectangle referenced by sub_surface
Common::Rectangle<u32> GetSubRect(const SurfaceParams& sub_surface) const;
/// Returns the scaled rectangle referenced by sub_surface
Common::Rectangle<u32> GetScaledSubRect(const SurfaceParams& sub_surface) const;
// Returns the outer rectangle containing "interval"
/// Returns the outer rectangle containing interval
SurfaceParams FromInterval(SurfaceInterval interval) const;
SurfaceInterval GetSubRectInterval(Common::Rectangle<u32> unscaled_rect) const;
// Returns the region of the biggest valid rectangle within interval
SurfaceInterval GetCopyableInterval(const Surface& src_surface) const;
/// Returns the address interval referenced by unscaled_rect
SurfaceInterval GetSubRectInterval(Common::Rectangle<u32> unscaled_rect, u32 level = 0) const;
/// Updates remaining members from the already set addr, width, height and pixel_format
void UpdateParams() {
if (stride == 0) {
stride = width;
}
/// Returns the address interval of the provided level
SurfaceInterval LevelInterval(u32 level) const;
type = GetFormatType(pixel_format);
size = !is_tiled ? BytesInPixels(stride * (height - 1) + width)
: BytesInPixels(stride * 8 * (height / 8 - 1) + width * 8);
end = addr + size;
/// Returns the level of the provided address
u32 LevelOf(PAddr addr) const;
/// Returns a string identifier of the params object
std::string DebugName(bool scaled) const noexcept;
[[nodiscard]] SurfaceInterval GetInterval() const noexcept {
return SurfaceInterval{addr, end};
}
SurfaceInterval GetInterval() const {
return SurfaceInterval(addr, end);
[[nodiscard]] u32 GetFormatBpp() const noexcept {
return VideoCore::GetFormatBpp(pixel_format);
}
u32 GetFormatBpp() const {
return OpenGL::GetFormatBpp(pixel_format);
}
u32 GetScaledWidth() const {
[[nodiscard]] u32 GetScaledWidth() const noexcept {
return width * res_scale;
}
u32 GetScaledHeight() const {
[[nodiscard]] u32 GetScaledHeight() const noexcept {
return height * res_scale;
}
Common::Rectangle<u32> GetRect() const {
[[nodiscard]] Common::Rectangle<u32> GetRect() const noexcept {
return {0, height, width, 0};
}
Common::Rectangle<u32> GetScaledRect() const {
[[nodiscard]] Common::Rectangle<u32> GetScaledRect() const noexcept {
return {0, GetScaledHeight(), GetScaledWidth(), 0};
}
u32 PixelsInBytes(u32 size) const {
[[nodiscard]] u32 PixelsInBytes(u32 size) const noexcept {
return size * 8 / GetFormatBpp();
}
u32 BytesInPixels(u32 pixels) const {
[[nodiscard]] u32 BytesInPixels(u32 pixels) const noexcept {
return pixels * GetFormatBpp() / 8;
}
private:
/// Computes the offset of each mipmap level
void CalculateMipLevelOffsets();
/// Calculates total surface size taking mipmaps into account
u32 CalculateSurfaceSize() const;
public:
PAddr addr = 0;
PAddr end = 0;
@@ -81,11 +95,15 @@ public:
u32 width = 0;
u32 height = 0;
u32 stride = 0;
u16 res_scale = 1;
u32 levels = 1;
u32 res_scale = 1;
bool is_tiled = false;
TextureType texture_type = TextureType::Texture2D;
PixelFormat pixel_format = PixelFormat::Invalid;
SurfaceType type = SurfaceType::Invalid;
std::array<u32, MAX_PICA_LEVELS> mipmap_offsets{};
};
} // namespace OpenGL
} // namespace VideoCore
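
As a rough illustration of the size arithmetic used by `SurfaceParams`, the sketch below restates `BytesInPixels`, `PixelsInBytes` and the single-level size formula from `UpdateParams` as free functions. The free-function signatures and the `SurfaceSize` name are assumptions made for this example, and the bits per pixel are passed explicitly rather than looked up from `pixel_format`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function versions of SurfaceParams::BytesInPixels / PixelsInBytes
constexpr std::uint32_t BytesInPixels(std::uint32_t pixels, std::uint32_t bpp) {
    return pixels * bpp / 8;
}

constexpr std::uint32_t PixelsInBytes(std::uint32_t bytes, std::uint32_t bpp) {
    return bytes * 8 / bpp;
}

// Size of a single-level surface, using the same formula as UpdateParams.
// Every row (or row of 8x8 tiles) except the last spans a full stride;
// the last one only spans the surface width.
constexpr std::uint32_t SurfaceSize(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t stride, std::uint32_t bpp, bool is_tiled) {
    return !is_tiled ? BytesInPixels(stride * (height - 1) + width, bpp)
                     : BytesInPixels(stride * 8 * (height / 8 - 1) + width * 8, bpp);
}
```

Multiplying by the bit depth before dividing by 8 is what keeps 4-bit formats exact: 64 pixels at 4 bits per pixel come out as 32 bytes. When `stride == width` the linear formula reduces to `width * height` pixels.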


@@ -0,0 +1,548 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <algorithm>
#include <bit>
#include <span>
#include "common/alignment.h"
#include "common/color.h"
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/texture/etc1.h"
#include "video_core/utils.h"
namespace VideoCore {
template <typename T>
inline T MakeInt(const u8* bytes) {
T integer{};
std::memcpy(&integer, bytes, sizeof(T));
return integer;
}
template <PixelFormat format, bool converted>
constexpr void DecodePixel(const u8* source, u8* dest) {
using namespace Common::Color;
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
if constexpr (format == PixelFormat::RGBA8 && converted) {
const auto abgr = DecodeRGBA8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::RGB8 && converted) {
const auto abgr = DecodeRGB8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::RGB565 && converted) {
const auto abgr = DecodeRGB565(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::RGB5A1 && converted) {
const auto abgr = DecodeRGB5A1(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::RGBA4 && converted) {
const auto abgr = DecodeRGBA4(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::IA8) {
const auto abgr = DecodeIA8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::RG8) {
const auto abgr = DecodeRG8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::I8) {
const auto abgr = DecodeI8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::A8) {
const auto abgr = DecodeA8(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::IA4) {
const auto abgr = DecodeIA4(source);
std::memcpy(dest, abgr.AsArray(), 4);
} else if constexpr (format == PixelFormat::D24 && converted) {
const auto d32 = DecodeD24(source) / 16777215.f;
std::memcpy(dest, &d32, sizeof(d32));
} else if constexpr (format == PixelFormat::D24S8) {
const u32 d24s8 = std::rotl(MakeInt<u32>(source), 8);
std::memcpy(dest, &d24s8, sizeof(u32));
} else {
std::memcpy(dest, source, bytes_per_pixel);
}
}
template <PixelFormat format>
constexpr void DecodePixel4(u32 x, u32 y, const u8* source_tile, u8* dest_pixel) {
const u32 morton_offset = VideoCore::MortonInterleave(x, y);
const u8 value = source_tile[morton_offset >> 1];
const u8 pixel = Common::Color::Convert4To8((morton_offset % 2) ? (value >> 4) : (value & 0xF));
if constexpr (format == PixelFormat::I4) {
std::memset(dest_pixel, pixel, 3);
dest_pixel[3] = 255;
} else {
std::memset(dest_pixel, 0, 3);
dest_pixel[3] = pixel;
}
}
template <PixelFormat format>
constexpr void DecodePixelETC1(u32 x, u32 y, const u8* source_tile, u8* dest_pixel) {
constexpr u32 subtile_width = 4;
constexpr u32 subtile_height = 4;
constexpr bool has_alpha = format == PixelFormat::ETC1A4;
constexpr std::size_t subtile_size = has_alpha ? 16 : 8;
const u32 subtile_index = (x / subtile_width) + 2 * (y / subtile_height);
x %= subtile_width;
y %= subtile_height;
const u8* subtile_ptr = source_tile + subtile_index * subtile_size;
u8 alpha = 255;
if constexpr (has_alpha) {
u64_le packed_alpha;
std::memcpy(&packed_alpha, subtile_ptr, sizeof(u64));
subtile_ptr += sizeof(u64);
alpha = Common::Color::Convert4To8((packed_alpha >> (4 * (x * subtile_width + y))) & 0xF);
}
const u64_le subtile_data = MakeInt<u64_le>(subtile_ptr);
const auto rgb = Pica::Texture::SampleETC1Subtile(subtile_data, x, y);
// Copy the uncompressed pixel to the destination
std::memcpy(dest_pixel, rgb.AsArray(), 3);
dest_pixel[3] = alpha;
}
template <PixelFormat format, bool converted>
constexpr void EncodePixel(const u8* source, u8* dest) {
using namespace Common::Color;
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
if constexpr (format == PixelFormat::RGBA8 && converted) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRGBA8(rgba, dest);
} else if constexpr (format == PixelFormat::RGB8 && converted) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRGB8(rgba, dest);
} else if constexpr (format == PixelFormat::RGB565 && converted) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRGB565(rgba, dest);
} else if constexpr (format == PixelFormat::RGB5A1 && converted) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRGB5A1(rgba, dest);
} else if constexpr (format == PixelFormat::RGBA4 && converted) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRGBA4(rgba, dest);
} else if constexpr (format == PixelFormat::IA8) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeIA8(rgba, dest);
} else if constexpr (format == PixelFormat::RG8) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeRG8(rgba, dest);
} else if constexpr (format == PixelFormat::I8) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeI8(rgba, dest);
} else if constexpr (format == PixelFormat::A8) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeA8(rgba, dest);
} else if constexpr (format == PixelFormat::IA4) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source, 4);
EncodeIA4(rgba, dest);
} else if constexpr (format == PixelFormat::D24 && converted) {
float d32;
std::memcpy(&d32, source, sizeof(d32));
EncodeD24(d32 * 0xFFFFFF, dest);
} else if constexpr (format == PixelFormat::D24S8) {
const u32 s8d24 = std::rotr(MakeInt<u32>(source), 8);
std::memcpy(dest, &s8d24, sizeof(u32));
} else {
std::memcpy(dest, source, bytes_per_pixel);
}
}
template <PixelFormat format>
constexpr void EncodePixel4(u32 x, u32 y, const u8* source_pixel, u8* dest_tile_buffer) {
Common::Vec4<u8> rgba;
std::memcpy(rgba.AsArray(), source_pixel, 4);
u8 pixel;
if constexpr (format == PixelFormat::I4) {
pixel = Common::Color::AverageRgbComponents(rgba);
} else {
pixel = rgba.a();
}
const u32 morton_offset = VideoCore::MortonInterleave(x, y);
const u32 byte_offset = morton_offset >> 1;
const u8 current_values = dest_tile_buffer[byte_offset];
const u8 new_value = Common::Color::Convert8To4(pixel);
if (morton_offset % 2) {
dest_tile_buffer[byte_offset] = (new_value << 4) | (current_values & 0x0F);
} else {
dest_tile_buffer[byte_offset] = (current_values & 0xF0) | new_value;
}
}
template <bool morton_to_linear, PixelFormat format, bool converted>
constexpr void MortonCopyTile(u32 stride, std::span<u8> tile_buffer, std::span<u8> linear_buffer) {
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
constexpr u32 linear_bytes_per_pixel = converted ? 4 : GetFormatBytesPerPixel(format);
constexpr bool is_compressed = format == PixelFormat::ETC1 || format == PixelFormat::ETC1A4;
constexpr bool is_4bit = format == PixelFormat::I4 || format == PixelFormat::A4;
for (u32 y = 0; y < 8; y++) {
for (u32 x = 0; x < 8; x++) {
const auto tiled_pixel = tile_buffer.subspan(
VideoCore::MortonInterleave(x, y) * bytes_per_pixel, bytes_per_pixel);
const auto linear_pixel = linear_buffer.subspan(
((7 - y) * stride + x) * linear_bytes_per_pixel, linear_bytes_per_pixel);
if constexpr (morton_to_linear) {
if constexpr (is_compressed) {
DecodePixelETC1<format>(x, y, tile_buffer.data(), linear_pixel.data());
} else if constexpr (is_4bit) {
DecodePixel4<format>(x, y, tile_buffer.data(), linear_pixel.data());
} else {
DecodePixel<format, converted>(tiled_pixel.data(), linear_pixel.data());
}
} else {
if constexpr (is_4bit) {
EncodePixel4<format>(x, y, linear_pixel.data(), tile_buffer.data());
} else {
EncodePixel<format, converted>(linear_pixel.data(), tiled_pixel.data());
}
}
}
}
}
/**
* @brief Performs morton to/from linear conversions on the provided pixel data
* @tparam converted If true, converts all color formats to/from RGBA8
* @param width, height The dimensions of the rectangular region of pixels in linear_buffer
* @param start_offset The number of bytes from the start of the first tile to the start of
* tiled_buffer
* @param end_offset The number of bytes from the start of the first tile to the end of tiled_buffer
* @param linear_buffer The linear pixel data
* @param tiled_buffer The tiled pixel data
*
* MortonCopy is at the heart of the PICA texture implementation, as it is responsible for
* converting between linear and morton tiled layouts. The function handles both conversions, but
* the paths and inputs differ slightly for each:
*
* Morton to Linear:
* During uploads, tiled_buffer is always aligned to the tile or scanline boundary, depending on
* whether the linear rectangle spans multiple vertical tiles. linear_buffer does not reference the
* entire texture area, but rather the specific rectangle affected by the upload.
*
* Linear to Morton:
* This is similar to the other conversion, with some differences. In this case tiled_buffer is
* not required to be aligned to any specific boundary, which requires special care.
* start_offset/end_offset are useful here as they tell us exactly where the data should be placed
* in the linear_buffer.
*/
template <bool morton_to_linear, PixelFormat format, bool converted = false>
static constexpr void MortonCopy(u32 width, u32 height, u32 start_offset, u32 end_offset,
std::span<u8> linear_buffer, std::span<u8> tiled_buffer) {
constexpr u32 bytes_per_pixel = GetFormatBpp(format) / 8;
constexpr u32 aligned_bytes_per_pixel = converted ? 4 : GetFormatBytesPerPixel(format);
constexpr u32 tile_size = GetFormatBpp(format) * 64 / 8;
static_assert(aligned_bytes_per_pixel >= bytes_per_pixel, "");
const u32 linear_tile_stride = (7 * width + 8) * aligned_bytes_per_pixel;
const u32 aligned_down_start_offset = Common::AlignDown(start_offset, tile_size);
const u32 aligned_start_offset = Common::AlignUp(start_offset, tile_size);
const u32 aligned_end_offset = Common::AlignDown(end_offset, tile_size);
ASSERT(!morton_to_linear ||
(aligned_start_offset == start_offset && aligned_end_offset == end_offset));
// In OpenGL the texture origin is in the bottom left corner as opposed to other
// APIs that have it at the top left. To avoid flipping texture coordinates in
// the shader we read/write the linear buffer from the bottom up
u32 linear_offset = ((height - 8) * width) * aligned_bytes_per_pixel;
u32 tiled_offset = 0;
u32 x = 0;
u32 y = 0;
const auto LinearNextTile = [&] {
x = (x + 8) % width;
linear_offset += 8 * aligned_bytes_per_pixel;
if (!x) {
y = (y + 8) % height;
if (!y) {
return;
}
linear_offset -= width * 9 * aligned_bytes_per_pixel;
}
};
// If during a texture download the start coordinate is not tile aligned, swizzle
// the tile affected to a temporary buffer and copy the part we are interested in
if (start_offset < aligned_start_offset && !morton_to_linear) {
std::array<u8, tile_size> tmp_buf;
auto linear_data = linear_buffer.subspan(linear_offset, linear_tile_stride);
MortonCopyTile<morton_to_linear, format, converted>(width, tmp_buf, linear_data);
std::memcpy(tiled_buffer.data(), tmp_buf.data() + start_offset - aligned_down_start_offset,
std::min(aligned_start_offset, end_offset) - start_offset);
tiled_offset += aligned_start_offset - start_offset;
LinearNextTile();
}
// If the copy spans multiple tiles, copy the fully aligned tiles in between.
if (aligned_start_offset < aligned_end_offset) {
const u32 buffer_end = tiled_offset + aligned_end_offset - aligned_start_offset;
while (tiled_offset < buffer_end) {
auto linear_data = linear_buffer.subspan(linear_offset, linear_tile_stride);
auto tiled_data = tiled_buffer.subspan(tiled_offset, tile_size);
MortonCopyTile<morton_to_linear, format, converted>(width, tiled_data, linear_data);
tiled_offset += tile_size;
LinearNextTile();
}
}
// If during a texture download the end coordinate is not tile aligned, swizzle
// the tile affected to a temporary buffer and copy the part we are interested in
if (end_offset > std::max(aligned_start_offset, aligned_end_offset) && !morton_to_linear) {
std::array<u8, tile_size> tmp_buf;
auto linear_data = linear_buffer.subspan(linear_offset, linear_tile_stride);
MortonCopyTile<morton_to_linear, format, converted>(width, tmp_buf, linear_data);
std::memcpy(tiled_buffer.data() + tiled_offset, tmp_buf.data(),
end_offset - aligned_end_offset);
}
}
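The three phases of `MortonCopy` (partial first tile, whole tiles, partial last tile) come down to a little offset arithmetic. The following is a minimal sketch of that arithmetic only. `AlignDown`, `AlignUp`, `CopySplit` and `SplitTiledRange` are hypothetical stand-ins written for this example; they are not part of the commit and perform no swizzling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for Common::AlignDown / Common::AlignUp.
constexpr std::uint32_t AlignDown(std::uint32_t value, std::uint32_t size) {
    return value - value % size;
}
constexpr std::uint32_t AlignUp(std::uint32_t value, std::uint32_t size) {
    return AlignDown(value + size - 1, size);
}

// Byte counts handled by each phase of a partial tiled copy.
struct CopySplit {
    std::uint32_t head;   // bytes taken from the partially covered first tile
    std::uint32_t middle; // bytes copied as whole 8x8 tiles
    std::uint32_t tail;   // bytes taken from the partially covered last tile
};

// Mirrors the three conditions used in MortonCopy for a byte range
// [start_offset, end_offset) inside a tiled surface.
inline CopySplit SplitTiledRange(std::uint32_t start_offset, std::uint32_t end_offset,
                                 std::uint32_t bits_per_pixel) {
    assert(start_offset <= end_offset);
    const std::uint32_t tile_size = bits_per_pixel * 64 / 8; // one 8x8 tile in bytes
    const std::uint32_t aligned_start = AlignUp(start_offset, tile_size);
    const std::uint32_t aligned_end = AlignDown(end_offset, tile_size);

    CopySplit split{0, 0, 0};
    if (start_offset < aligned_start) {
        split.head = std::min(aligned_start, end_offset) - start_offset;
    }
    if (aligned_start < aligned_end) {
        split.middle = aligned_end - aligned_start;
    }
    if (end_offset > std::max(aligned_start, aligned_end)) {
        split.tail = end_offset - aligned_end;
    }
    return split;
}
```

For a 32 bpp format the tile size is 256 bytes, so the range [100, 700) splits into 156 head bytes, one whole tile of 256 bytes and 188 tail bytes, which together cover all 600 bytes of the range.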
/**
* Performs a linear copy, converting pixel formats if required.
* @tparam decode If true, decodes the texture if needed. Otherwise, encodes if needed.
* @tparam format Pixel format to copy.
* @tparam converted If true, converts the texture to/from the appropriate format.
* @param src_buffer The source pixel data
* @param dst_buffer The destination pixel data
*/
template <bool decode, PixelFormat format, bool converted = false>
static constexpr void LinearCopy(std::span<u8> src_buffer, std::span<u8> dst_buffer) {
const std::size_t src_size = src_buffer.size();
const std::size_t dst_size = dst_buffer.size();
if constexpr (converted) {
constexpr u32 encoded_bytes_per_pixel = GetFormatBpp(format) / 8;
constexpr u32 decoded_bytes_per_pixel = 4;
constexpr u32 src_bytes_per_pixel =
decode ? encoded_bytes_per_pixel : decoded_bytes_per_pixel;
constexpr u32 dst_bytes_per_pixel =
decode ? decoded_bytes_per_pixel : encoded_bytes_per_pixel;
for (std::size_t src_index = 0, dst_index = 0; src_index < src_size && dst_index < dst_size;
src_index += src_bytes_per_pixel, dst_index += dst_bytes_per_pixel) {
const auto src_pixel = src_buffer.subspan(src_index, src_bytes_per_pixel);
const auto dst_pixel = dst_buffer.subspan(dst_index, dst_bytes_per_pixel);
if constexpr (decode) {
DecodePixel<format, converted>(src_pixel.data(), dst_pixel.data());
} else {
EncodePixel<format, converted>(src_pixel.data(), dst_pixel.data());
}
}
} else {
std::memcpy(dst_buffer.data(), src_buffer.data(), std::min(src_size, dst_size));
}
}
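The converted branch of `LinearCopy` walks the source and destination with different per-pixel strides. A minimal sketch of that loop for one format follows. It assumes RGB565 stored little-endian and an R, G, B, A byte order for the 4-byte result. `DecodeRgb565` and `DecodeLinearRgb565` are hypothetical helpers for this example, not the `DecodePixel` used by the commit, whose output layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical decoder: expands one little-endian RGB565 pixel to 4 bytes (R, G, B, A).
inline void DecodeRgb565(const std::uint8_t* src, std::uint8_t* dst) {
    const std::uint16_t pixel = static_cast<std::uint16_t>(src[0] | (src[1] << 8));
    const std::uint32_t r = (pixel >> 11) & 0x1F;
    const std::uint32_t g = (pixel >> 5) & 0x3F;
    const std::uint32_t b = pixel & 0x1F;
    // Replicate the high bits into the low bits so the maximum value maps to 255.
    dst[0] = static_cast<std::uint8_t>((r << 3) | (r >> 2));
    dst[1] = static_cast<std::uint8_t>((g << 2) | (g >> 4));
    dst[2] = static_cast<std::uint8_t>((b << 3) | (b >> 2));
    dst[3] = 255; // RGB565 carries no alpha, so the pixel is opaque
}

// Steps 2 bytes at a time through the source and 4 bytes at a time through the
// destination, like the converted loop in LinearCopy.
inline std::vector<std::uint8_t> DecodeLinearRgb565(const std::vector<std::uint8_t>& src) {
    constexpr std::size_t src_bytes_per_pixel = 2;
    constexpr std::size_t dst_bytes_per_pixel = 4;
    std::vector<std::uint8_t> dst(src.size() / src_bytes_per_pixel * dst_bytes_per_pixel);
    for (std::size_t src_index = 0, dst_index = 0;
         src_index + src_bytes_per_pixel <= src.size();
         src_index += src_bytes_per_pixel, dst_index += dst_bytes_per_pixel) {
        DecodeRgb565(src.data() + src_index, dst.data() + dst_index);
    }
    return dst;
}
```

Three input pixels (6 bytes) produce 12 output bytes; a trailing odd byte in the source is ignored rather than read out of bounds.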
using MortonFunc = void (*)(u32, u32, u32, u32, std::span<u8>, std::span<u8>);
static constexpr std::array<MortonFunc, 18> UNSWIZZLE_TABLE = {
MortonCopy<true, PixelFormat::RGBA8>, // 0
MortonCopy<true, PixelFormat::RGB8>, // 1
MortonCopy<true, PixelFormat::RGB5A1>, // 2
MortonCopy<true, PixelFormat::RGB565>, // 3
MortonCopy<true, PixelFormat::RGBA4>, // 4
MortonCopy<true, PixelFormat::IA8>, // 5
MortonCopy<true, PixelFormat::RG8>, // 6
MortonCopy<true, PixelFormat::I8>, // 7
MortonCopy<true, PixelFormat::A8>, // 8
MortonCopy<true, PixelFormat::IA4>, // 9
MortonCopy<true, PixelFormat::I4>, // 10
MortonCopy<true, PixelFormat::A4>, // 11
MortonCopy<true, PixelFormat::ETC1>, // 12
MortonCopy<true, PixelFormat::ETC1A4>, // 13
MortonCopy<true, PixelFormat::D16>, // 14
nullptr, // 15
MortonCopy<true, PixelFormat::D24>, // 16
MortonCopy<true, PixelFormat::D24S8>, // 17
};
static constexpr std::array<MortonFunc, 18> UNSWIZZLE_TABLE_CONVERTED = {
MortonCopy<true, PixelFormat::RGBA8, true>, // 0
MortonCopy<true, PixelFormat::RGB8, true>, // 1
MortonCopy<true, PixelFormat::RGB5A1, true>, // 2
MortonCopy<true, PixelFormat::RGB565, true>, // 3
MortonCopy<true, PixelFormat::RGBA4, true>, // 4
// The following formats are implicitly converted to RGBA regardless, so ignore them.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
MortonCopy<true, PixelFormat::D16, true>, // 14
nullptr, // 15
MortonCopy<true, PixelFormat::D24, true>, // 16
// No conversion here as we need to do a special deinterleaving conversion elsewhere.
nullptr, // 17
};
static constexpr std::array<MortonFunc, 18> SWIZZLE_TABLE = {
MortonCopy<false, PixelFormat::RGBA8>, // 0
MortonCopy<false, PixelFormat::RGB8>, // 1
MortonCopy<false, PixelFormat::RGB5A1>, // 2
MortonCopy<false, PixelFormat::RGB565>, // 3
MortonCopy<false, PixelFormat::RGBA4>, // 4
MortonCopy<false, PixelFormat::IA8>, // 5
MortonCopy<false, PixelFormat::RG8>, // 6
MortonCopy<false, PixelFormat::I8>, // 7
MortonCopy<false, PixelFormat::A8>, // 8
MortonCopy<false, PixelFormat::IA4>, // 9
MortonCopy<false, PixelFormat::I4>, // 10
MortonCopy<false, PixelFormat::A4>, // 11
nullptr, // 12
nullptr, // 13
MortonCopy<false, PixelFormat::D16>, // 14
nullptr, // 15
MortonCopy<false, PixelFormat::D24>, // 16
MortonCopy<false, PixelFormat::D24S8>, // 17
};
static constexpr std::array<MortonFunc, 18> SWIZZLE_TABLE_CONVERTED = {
MortonCopy<false, PixelFormat::RGBA8, true>, // 0
MortonCopy<false, PixelFormat::RGB8, true>, // 1
MortonCopy<false, PixelFormat::RGB5A1, true>, // 2
MortonCopy<false, PixelFormat::RGB565, true>, // 3
MortonCopy<false, PixelFormat::RGBA4, true>, // 4
// The following formats are implicitly converted from RGBA regardless, so ignore them.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
MortonCopy<false, PixelFormat::D16, true>, // 14
nullptr, // 15
MortonCopy<false, PixelFormat::D24, true>, // 16
// No conversion here as we need to do a special interleaving conversion elsewhere.
nullptr, // 17
};
using LinearFunc = void (*)(std::span<u8>, std::span<u8>);
static constexpr std::array<LinearFunc, 18> LINEAR_DECODE_TABLE = {
LinearCopy<true, PixelFormat::RGBA8>, // 0
LinearCopy<true, PixelFormat::RGB8>, // 1
LinearCopy<true, PixelFormat::RGB5A1>, // 2
LinearCopy<true, PixelFormat::RGB565>, // 3
LinearCopy<true, PixelFormat::RGBA4>, // 4
// These formats cannot be used linearly and can be ignored.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
LinearCopy<true, PixelFormat::D16>, // 14
nullptr, // 15
LinearCopy<true, PixelFormat::D24>, // 16
LinearCopy<true, PixelFormat::D24S8>, // 17
};
static constexpr std::array<LinearFunc, 18> LINEAR_DECODE_TABLE_CONVERTED = {
LinearCopy<true, PixelFormat::RGBA8, true>, // 0
LinearCopy<true, PixelFormat::RGB8, true>, // 1
LinearCopy<true, PixelFormat::RGB5A1, true>, // 2
LinearCopy<true, PixelFormat::RGB565, true>, // 3
LinearCopy<true, PixelFormat::RGBA4, true>, // 4
// These formats cannot be used linearly and can be ignored.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
LinearCopy<true, PixelFormat::D16, true>, // 14
nullptr, // 15
LinearCopy<true, PixelFormat::D24, true>, // 16
// No conversion here as we need to do a special deinterleaving conversion elsewhere.
nullptr, // 17
};
static constexpr std::array<LinearFunc, 18> LINEAR_ENCODE_TABLE = {
LinearCopy<false, PixelFormat::RGBA8>, // 0
LinearCopy<false, PixelFormat::RGB8>, // 1
LinearCopy<false, PixelFormat::RGB5A1>, // 2
LinearCopy<false, PixelFormat::RGB565>, // 3
LinearCopy<false, PixelFormat::RGBA4>, // 4
// These formats cannot be used linearly and can be ignored.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
LinearCopy<false, PixelFormat::D16>, // 14
nullptr, // 15
LinearCopy<false, PixelFormat::D24>, // 16
LinearCopy<false, PixelFormat::D24S8>, // 17
};
static constexpr std::array<LinearFunc, 18> LINEAR_ENCODE_TABLE_CONVERTED = {
LinearCopy<false, PixelFormat::RGBA8, true>, // 0
LinearCopy<false, PixelFormat::RGB8, true>, // 1
LinearCopy<false, PixelFormat::RGB5A1, true>, // 2
LinearCopy<false, PixelFormat::RGB565, true>, // 3
LinearCopy<false, PixelFormat::RGBA4, true>, // 4
// These formats cannot be used linearly and can be ignored.
nullptr, // 5
nullptr, // 6
nullptr, // 7
nullptr, // 8
nullptr, // 9
nullptr, // 10
nullptr, // 11
nullptr, // 12
nullptr, // 13
LinearCopy<false, PixelFormat::D16, true>, // 14
nullptr, // 15
LinearCopy<false, PixelFormat::D24, true>, // 16
// No conversion here as we need to do a special interleaving conversion elsewhere.
nullptr, // 17
};
} // namespace VideoCore
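The `LinearNextTile` lambda in `MortonCopy` visits the linear buffer one 8x8 tile at a time, starting at the bottom tile row and moving up. The sketch below records only the byte offsets that this traversal produces, assuming width and height are non-zero multiples of 8. `TileOffsets` is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the linear-buffer byte offset of every 8x8 tile in the order the
// traversal visits them: left to right within a tile row, bottom row first.
inline std::vector<std::uint32_t> TileOffsets(std::uint32_t width, std::uint32_t height,
                                              std::uint32_t bytes_per_pixel) {
    assert(width >= 8 && height >= 8 && width % 8 == 0 && height % 8 == 0);
    std::vector<std::uint32_t> offsets;
    // Start at the first pixel row of the bottom tile row.
    std::uint32_t linear_offset = (height - 8) * width * bytes_per_pixel;
    std::uint32_t x = 0;
    std::uint32_t y = 0;
    const std::uint32_t tile_count = (width / 8) * (height / 8);
    for (std::uint32_t i = 0; i < tile_count; ++i) {
        offsets.push_back(linear_offset);
        // Same stepping as LinearNextTile: advance 8 pixels to the right...
        x = (x + 8) % width;
        linear_offset += 8 * bytes_per_pixel;
        if (x == 0) {
            // ...and at the end of a tile row jump to the row 8 pixels above.
            y = (y + 8) % height;
            if (y == 0) {
                break; // all tile rows visited
            }
            linear_offset -= width * 9 * bytes_per_pixel;
        }
    }
    return offsets;
}
```

For a 16x16 surface with 1 byte per pixel the offsets are 128, 136, 0, 8: the two tiles of the bottom row come first, followed by the two tiles of the top row.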

@@ -1,198 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/scope_exit.h"
#include "video_core/rasterizer_cache/rasterizer_cache_utils.h"
#include "video_core/rasterizer_cache/texture_runtime.h"
#include "video_core/renderer_opengl/gl_state.h"
namespace OpenGL {
GLbitfield ToBufferMask(Aspect aspect) {
switch (aspect) {
case Aspect::Color:
return GL_COLOR_BUFFER_BIT;
case Aspect::Depth:
return GL_DEPTH_BUFFER_BIT;
case Aspect::DepthStencil:
return GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT;
}
}
TextureRuntime::TextureRuntime() {
read_fbo.Create();
draw_fbo.Create();
}
void TextureRuntime::ReadTexture(const OGLTexture& tex, Subresource subresource,
const FormatTuple& tuple, u8* pixels) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
OpenGLState state;
state.ResetTexture(tex.handle);
state.draw.read_framebuffer = read_fbo.handle;
state.Apply();
const u32 level = subresource.level;
switch (subresource.aspect) {
case Aspect::Color:
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex.handle,
level);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
break;
case Aspect::Depth:
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, tex.handle,
level);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
break;
case Aspect::DepthStencil:
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
tex.handle, level);
break;
}
const auto& rect = subresource.region;
glReadPixels(rect.left, rect.bottom, rect.GetWidth(), rect.GetHeight(), tuple.format,
tuple.type, pixels);
}
bool TextureRuntime::ClearTexture(const OGLTexture& tex, Subresource subresource,
ClearValue value) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
// Setup scissor rectangle according to the clear rectangle
const auto& clear_rect = subresource.region;
OpenGLState state;
state.scissor.enabled = true;
state.scissor.x = clear_rect.left;
state.scissor.y = clear_rect.bottom;
state.scissor.width = clear_rect.GetWidth();
state.scissor.height = clear_rect.GetHeight();
state.draw.draw_framebuffer = draw_fbo.handle;
state.Apply();
const u32 level = subresource.level;
switch (subresource.aspect) {
case Aspect::Color:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex.handle,
level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
state.color_mask.red_enabled = true;
state.color_mask.green_enabled = true;
state.color_mask.blue_enabled = true;
state.color_mask.alpha_enabled = true;
state.Apply();
glClearBufferfv(GL_COLOR, 0, value.color.AsArray());
break;
case Aspect::Depth:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, tex.handle,
level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
state.depth.write_mask = GL_TRUE;
state.Apply();
glClearBufferfv(GL_DEPTH, 0, &value.depth);
break;
case Aspect::DepthStencil:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
tex.handle, level);
state.depth.write_mask = GL_TRUE;
state.stencil.write_mask = -1;
state.Apply();
glClearBufferfi(GL_DEPTH_STENCIL, 0, value.depth, value.stencil);
break;
}
return true;
}
bool TextureRuntime::CopyTextures(const OGLTexture& src_tex, Subresource src_subresource,
const OGLTexture& dst_tex, Subresource dst_subresource) {
return true;
}
bool TextureRuntime::BlitTextures(const OGLTexture& src_tex, Subresource src_subresource,
const OGLTexture& dst_tex, Subresource dst_subresource,
bool dst_cube) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
OpenGLState state;
state.draw.read_framebuffer = read_fbo.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.Apply();
auto BindAttachment =
[dst_cube, src_level = src_subresource.level, dst_level = dst_subresource.level,
dst_layer = dst_subresource.layer](GLenum target, u32 src_tex, u32 dst_tex) -> void {
GLenum dst_target = dst_cube ? GL_TEXTURE_CUBE_MAP_POSITIVE_X + dst_layer : GL_TEXTURE_2D;
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, target, GL_TEXTURE_2D, src_tex, src_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, target, dst_target, dst_tex, dst_level);
};
// Sanity check; Can't blit a color texture to a depth buffer
ASSERT(src_subresource.aspect == dst_subresource.aspect);
switch (src_subresource.aspect) {
case Aspect::Color:
// Bind only color
BindAttachment(GL_COLOR_ATTACHMENT0, src_tex.handle, dst_tex.handle);
BindAttachment(GL_DEPTH_STENCIL_ATTACHMENT, 0, 0);
break;
case Aspect::Depth:
// Bind only depth
BindAttachment(GL_COLOR_ATTACHMENT0, 0, 0);
BindAttachment(GL_DEPTH_ATTACHMENT, src_tex.handle, dst_tex.handle);
BindAttachment(GL_STENCIL_ATTACHMENT, 0, 0);
break;
case Aspect::DepthStencil:
// Bind to combined depth + stencil
BindAttachment(GL_COLOR_ATTACHMENT0, 0, 0);
BindAttachment(GL_DEPTH_STENCIL_ATTACHMENT, src_tex.handle, dst_tex.handle);
break;
}
// TODO (wwylele): use GL_NEAREST for shadow map texture
// Note: shadow map is treated as RGBA8 format in PICA, as well as in the rasterizer cache, but
// doing linear interpolation componentwise would produce incorrect values. However, for a
// well-programmed game this code path should rarely be executed for a shadow map with
// inconsistent scale.
const GLenum filter = src_subresource.aspect == Aspect::Color ? GL_LINEAR : GL_NEAREST;
const auto& src_rect = src_subresource.region;
const auto& dst_rect = dst_subresource.region;
glBlitFramebuffer(src_rect.left, src_rect.bottom, src_rect.right, src_rect.top, dst_rect.left,
dst_rect.bottom, dst_rect.right, dst_rect.top,
ToBufferMask(src_subresource.aspect), filter);
return true;
}
void TextureRuntime::GenerateMipmaps(const OGLTexture& tex, u32 max_level) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
OpenGLState state;
state.texture_units[0].texture_2d = tex.handle;
state.Apply();
glActiveTexture(GL_TEXTURE0);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, max_level);
glGenerateMipmap(GL_TEXTURE_2D);
}
} // namespace OpenGL

@@ -1,70 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/math_util.h"
#include "common/vector_math.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
namespace OpenGL {
// Describes the type of data a texture holds
enum class Aspect { Color = 0, Depth = 1, DepthStencil = 2 };
// A union for both color and depth/stencil clear values
union ClearValue {
Common::Vec4f color;
struct {
float depth;
u8 stencil;
};
};
struct Subresource {
Subresource(Aspect aspect, Common::Rectangle<u32> region, u32 level = 0, u32 layer = 0)
: aspect(aspect), region(region), level(level), layer(layer) {}
Aspect aspect;
Common::Rectangle<u32> region;
u32 level = 0;
u32 layer = 0;
};
struct FormatTuple;
/**
* Provides texture manipulation functions to the rasterizer cache
* Separating this into a class makes it easier to abstract graphics API code
*/
class TextureRuntime {
public:
TextureRuntime();
~TextureRuntime() = default;
// Copies the GPU pixel data to the provided pixels buffer
void ReadTexture(const OGLTexture& tex, Subresource subresource, const FormatTuple& tuple,
u8* pixels);
// Fills the rectangle of the texture with the clear value provided
bool ClearTexture(const OGLTexture& texture, Subresource subresource, ClearValue value);
// Copies a rectangle of src_tex to another rectangle of dst_tex
// NOTE: The width and height of the rectangles must be equal
bool CopyTextures(const OGLTexture& src_tex, Subresource src_subresource,
const OGLTexture& dst_tex, Subresource dst_subresource);
// Copies a rectangle of src_tex to another rectangle of dst_tex performing
// scaling and format conversions
bool BlitTextures(const OGLTexture& src_tex, Subresource src_subresource,
const OGLTexture& dst_tex, Subresource dst_subresource,
bool dst_cube = false);
// Generates mipmaps for all the available levels of the texture
void GenerateMipmaps(const OGLTexture& tex, u32 max_level);
private:
OGLFramebuffer read_fbo, draw_fbo;
};
} // namespace OpenGL

@@ -0,0 +1,76 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "video_core/rasterizer_cache/surface_params.h"
#include "video_core/rasterizer_cache/texture_codec.h"
#include "video_core/rasterizer_cache/utils.h"
namespace VideoCore {
u32 MipLevels(u32 width, u32 height, u32 max_level) {
u32 levels = 1;
while (width > 8 && height > 8) {
levels++;
width >>= 1;
height >>= 1;
}
return std::min(levels, max_level + 1);
}
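A few worked values make the behaviour of `MipLevels` easier to check. The block below repeats the function unchanged so that it compiles on its own; the `u32` alias is an assumption standing in for the project's common type header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t; // assumed alias, normally provided by the common headers

// Copy of the function above: halves both dimensions until either one reaches
// 8 pixels, then clamps the count to max_level + 1.
u32 MipLevels(u32 width, u32 height, u32 max_level) {
    u32 levels = 1;
    while (width > 8 && height > 8) {
        levels++;
        width >>= 1;
        height >>= 1;
    }
    return std::min(levels, max_level + 1);
}
```

A 64x64 texture yields the levels 64, 32, 16 and 8, so `MipLevels(64, 64, 7)` is 4, while `MipLevels(64, 64, 2)` is clamped to 3. A 128x32 texture stops once its height reaches 8 and gives 3 levels, and an 8x8 texture always gives 1.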
void EncodeTexture(const SurfaceParams& surface_info, PAddr start_addr, PAddr end_addr,
std::span<u8> source, std::span<u8> dest, bool convert) {
const PixelFormat format = surface_info.pixel_format;
const u32 func_index = static_cast<u32>(format);
if (surface_info.is_tiled) {
const MortonFunc SwizzleImpl =
(convert ? SWIZZLE_TABLE_CONVERTED : SWIZZLE_TABLE)[func_index];
if (SwizzleImpl) {
SwizzleImpl(surface_info.width, surface_info.height, start_addr - surface_info.addr,
end_addr - surface_info.addr, source, dest);
return;
}
} else {
const LinearFunc LinearEncodeImpl =
(convert ? LINEAR_ENCODE_TABLE_CONVERTED : LINEAR_ENCODE_TABLE)[func_index];
if (LinearEncodeImpl) {
LinearEncodeImpl(source, dest);
return;
}
}
LOG_ERROR(HW_GPU, "Unimplemented texture encode function for pixel format = {}, tiled = {}",
func_index, surface_info.is_tiled);
UNIMPLEMENTED();
}
void DecodeTexture(const SurfaceParams& surface_info, PAddr start_addr, PAddr end_addr,
std::span<u8> source, std::span<u8> dest, bool convert) {
const PixelFormat format = surface_info.pixel_format;
const u32 func_index = static_cast<u32>(format);
if (surface_info.is_tiled) {
const MortonFunc UnswizzleImpl =
(convert ? UNSWIZZLE_TABLE_CONVERTED : UNSWIZZLE_TABLE)[func_index];
if (UnswizzleImpl) {
UnswizzleImpl(surface_info.width, surface_info.height, start_addr - surface_info.addr,
end_addr - surface_info.addr, dest, source);
return;
}
} else {
const LinearFunc LinearDecodeImpl =
(convert ? LINEAR_DECODE_TABLE_CONVERTED : LINEAR_DECODE_TABLE)[func_index];
if (LinearDecodeImpl) {
LinearDecodeImpl(source, dest);
return;
}
}
LOG_ERROR(HW_GPU, "Unimplemented texture decode function for pixel format = {}, tiled = {}",
func_index, surface_info.is_tiled);
UNIMPLEMENTED();
}
} // namespace VideoCore
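`EncodeTexture` and `DecodeTexture` both select an implementation from a table indexed by the pixel format, where a `nullptr` slot marks an unsupported combination. A reduced sketch of that dispatch pattern follows. The `Format` enum, `CopyPixels`, `ENCODE_TABLE` and `Encode` are hypothetical and cover only three made-up slots, and the sketch returns `false` where the commit logs an error and hits `UNIMPLEMENTED()`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical subset of pixel formats; the values double as table indices.
enum class Format : std::uint32_t { RGBA8 = 0, RGB8 = 1, ETC1 = 2 };

using CopyFunc = void (*)(const std::uint8_t* src, std::uint8_t* dst, std::size_t pixels);

// Stand-in for a per-format routine: a plain byte copy with a compile-time pixel size.
template <std::size_t bytes_per_pixel>
void CopyPixels(const std::uint8_t* src, std::uint8_t* dst, std::size_t pixels) {
    for (std::size_t i = 0; i < pixels * bytes_per_pixel; ++i) {
        dst[i] = src[i];
    }
}

static constexpr std::array<CopyFunc, 3> ENCODE_TABLE = {
    CopyPixels<4>, // 0: RGBA8
    CopyPixels<3>, // 1: RGB8
    nullptr,       // 2: ETC1, no encoder available
};

// Looks up the routine for the format and runs it; reports failure for empty slots.
inline bool Encode(Format format, const std::uint8_t* src, std::uint8_t* dst,
                   std::size_t pixels) {
    const auto index = static_cast<std::size_t>(format);
    if (index >= ENCODE_TABLE.size()) {
        return false;
    }
    const CopyFunc impl = ENCODE_TABLE[index];
    if (!impl) {
        return false;
    }
    impl(src, dst, pixels);
    return true;
}
```

Keeping one table per direction and per `convert` flag means the format is resolved with a single array lookup, and adding a format only requires filling its slot.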

@@ -0,0 +1,142 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <compare>
#include <span>
#include <boost/icl/right_open_interval.hpp>
#include "common/hash.h"
#include "common/math_util.h"
#include "common/vector_math.h"
#include "video_core/rasterizer_cache/pixel_format.h"
namespace VideoCore {
using SurfaceInterval = boost::icl::right_open_interval<PAddr>;
struct Offset {
constexpr auto operator<=>(const Offset&) const noexcept = default;
u32 x = 0;
u32 y = 0;
};
struct Extent {
constexpr auto operator<=>(const Extent&) const noexcept = default;
u32 width = 1;
u32 height = 1;
};
union ClearValue {
Common::Vec4f color;
struct {
float depth;
u8 stencil;
};
};
struct TextureClear {
u32 texture_level;
Common::Rectangle<u32> texture_rect;
ClearValue value;
};
struct TextureCopy {
u32 src_level;
u32 dst_level;
u32 src_layer;
u32 dst_layer;
Offset src_offset;
Offset dst_offset;
Extent extent;
};
struct TextureBlit {
u32 src_level;
u32 dst_level;
u32 src_layer;
u32 dst_layer;
Common::Rectangle<u32> src_rect;
Common::Rectangle<u32> dst_rect;
};
struct BufferTextureCopy {
u32 buffer_offset;
u32 buffer_size;
Common::Rectangle<u32> texture_rect;
u32 texture_level;
};
struct StagingData {
u32 size = 0;
std::span<u8> mapped{};
u64 buffer_offset = 0;
};
struct TextureCubeConfig {
PAddr px;
PAddr nx;
PAddr py;
PAddr ny;
PAddr pz;
PAddr nz;
u32 width;
u32 levels;
Pica::TexturingRegs::TextureFormat format;
bool operator==(const TextureCubeConfig& rhs) const {
return std::memcmp(this, &rhs, sizeof(TextureCubeConfig)) == 0;
}
bool operator!=(const TextureCubeConfig& rhs) const {
return std::memcmp(this, &rhs, sizeof(TextureCubeConfig)) != 0;
}
u64 Hash() const {
return Common::ComputeHash64(this, sizeof(TextureCubeConfig));
}
};
class SurfaceParams;
u32 MipLevels(u32 width, u32 height, u32 max_level);
/**
* Encodes a linear texture to the expected linear or tiled format.
*
* @param surface_info Structure used to query the surface information.
* @param start_addr The start address of the dest data. Used if tiled.
* @param end_addr The end address of the dest data. Used if tiled.
* @param source The source linear texture data.
* @param dest The output buffer where the encoded linear or tiled data will be written to.
* @param convert Whether the pixel format needs to be converted.
*/
void EncodeTexture(const SurfaceParams& surface_info, PAddr start_addr, PAddr end_addr,
std::span<u8> source, std::span<u8> dest, bool convert = false);
/**
* Decodes a linear or tiled texture to the expected linear format.
*
* @param surface_info Structure used to query the surface information.
* @param start_addr The start address of the source data. Used if tiled.
* @param end_addr The end address of the source data. Used if tiled.
* @param source The source linear or tiled texture data.
* @param dest The output buffer where the decoded linear data will be written to.
* @param convert Whether the pixel format needs to be converted.
*/
void DecodeTexture(const SurfaceParams& surface_info, PAddr start_addr, PAddr end_addr,
std::span<u8> source, std::span<u8> dest, bool convert = false);
} // namespace VideoCore
namespace std {
template <>
struct hash<VideoCore::TextureCubeConfig> {
std::size_t operator()(const VideoCore::TextureCubeConfig& config) const noexcept {
return config.Hash();
}
};
} // namespace std
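The `std::hash` specialization is what allows `TextureCubeConfig` to be used as a key in unordered containers. Below is a minimal sketch of that pattern with a reduced key type. `CubeKey` is hypothetical, and FNV-1a is swapped in for `Common::ComputeHash64` purely to keep the example self-contained. Every member is 4 bytes wide, so the struct has no padding and the byte-wise comparison is well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>

// Hypothetical, reduced version of TextureCubeConfig.
struct CubeKey {
    std::uint32_t px;
    std::uint32_t nx;
    std::uint32_t width;
    std::uint32_t levels;

    bool operator==(const CubeKey& rhs) const {
        return std::memcmp(this, &rhs, sizeof(CubeKey)) == 0;
    }

    // FNV-1a over the raw bytes, standing in for Common::ComputeHash64.
    std::uint64_t Hash() const {
        const auto* bytes = reinterpret_cast<const unsigned char*>(this);
        std::uint64_t hash = 14695981039346656037ULL;
        for (std::size_t i = 0; i < sizeof(CubeKey); ++i) {
            hash ^= bytes[i];
            hash *= 1099511628211ULL;
        }
        return hash;
    }
};
static_assert(sizeof(CubeKey) == 16, "CubeKey must not contain padding");

namespace std {
template <>
struct hash<CubeKey> {
    std::size_t operator()(const CubeKey& key) const noexcept {
        return static_cast<std::size_t>(key.Hash());
    }
};
} // namespace std
```

With the specialization in place, `std::unordered_map<CubeKey, int>` works without a custom hasher argument. A struct that does contain padding bytes would need them zeroed before hashing or comparing, since `memcmp` and a byte-wise hash also read the padding.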