Add a std::bit_cast-like function archiving the same runtime results as
the standard function, without compile time support.
This allows us to use bit_cast while we wait for compiler support, it
can be trivially replaced in the future.
VirtualBuffer makes use of VirtualAlloc (on Windows) and mmap() (on
other platforms). Neither of these ensure that non-trivial objects are
properly constructed in the allocated memory.
To prevent potential undefined behavior occurring due to that, we can
add a static assert to loudly complain about cases where that is done.
Makes page tables and virtual buffers able to be moved, but not copied,
making the interface more flexible.
Previously, with the destructor specified, but no move assignment or
constructor specified, they wouldn't be implicitly generated.
Hides all of the implementation details for users of the class. This has
the benefit of reducing includes and also making the fiber classes
movable again.
Allows our CI to catch more potential bugs. This also removes the
[[nodiscard]] attribute of IOFile's Open member function. There are
cases where a file may want to be opened, but have the status of it
checked at a later time.
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
Makes our error coverage a little more consistent across the board by
applying it to Linux side of things as well. This also makes it more
consistent with the warning settings in other libraries in the project.
This also updates httplib to 0.7.9, as there are several warning
cleanups made that allow us to enable several warnings as errors.
From -fsanitize=address, this code wasn't calling the proper destructor.
Adding virtual destructors for each inherited class and the base class
fixes this bug.
While we are at it, mark the functions as final.
I made a request on the Xbyak issue tracker to allow some constructors
to be constexpr in order to avoid static constructors from needing to
execute for some of our register constants.
This request was implemented, so this updates Xbyak so that we can make
use of it.
The extended logging option is automatically disabled on boot but can be enabled afterwards, allowing the log file to go up to 1 GB during that session.
This commit also fixes a few errors that are present in the general debug menu.
This is the only place it's actively used. It's also more appropriate
for web-related structures to be within the web service target.
Especially given this one doesn't rely on anything in the common
library.
Migrates a remaining common file over to the Common namespace, making it
consistent with the rest of common files.
This also allows for high-traffic FS related code to alias the
filesystem function namespace as
namespace FS = Common::FS;
for more concise typing.
These are intentionally discarded internally, since the rest of the
public API allows querying success. We want all non-internal uses of
these functions to be explicitly checked, so we can signify that we
intentionally want to discard the return values here.
We can simplify this function down into a single line with the use of
fmt. A benefit with the fmt approach is that the fmt variant of
localtime is thread-safe as well, making GetOsTimeZoneOffset()
thread-safe as well.
Now that clang-format makes [[nodiscard]] attributes format sensibly, we
can apply them to several functions within the common library to allow
the compiler to complain about any misuses of the functions.
This makes it more inline with its currently unavailable standardized
analogue std::derived_from.
While we're at it, we can also make the template match the requirements
of the standardized variant as well.
Previously the constructor for all of these would run at program
startup, consuming time before the application can enter main().
This is also particularly dangerous, given the logging system wouldn't
have been initialized properly yet, yet the program would use the logs
to signify an error.
To rectify this, we can replace the literals with constexpr functions
that perform the conversion at compile-time, completely eliminating the
runtime cost of initializing these arrays.
- In `SetCurrentThreadName`, when on Linux, truncate to 15 bytes, as (at
least on glibc) `pthread_set_name_np` will otherwise return `ERANGE` and
do nothing.
- Also, add logging in case `pthread_set_name_np` returns an error
anyway. This is Linux-specific, as the Apple and BSD versions of
`pthread_set_name_np return `void`.
- Change the name for CPU threads in multi-core mode from
"yuzu:CoreCPUThread_N" (19 bytes) to "yuzu:CPUCore_N" (14 bytes) so it
fits into the Linux limit. Some other thread names are also cut off,
but I didn't bother addressing them as you can guess them from the
truncated versions. For a CPU thread, truncation means you can't see
which core it is!
On DragonFly and NetBSD build fails with
src/common/virtual_buffer.cpp
src/common/virtual_buffer.cpp:16:10: fatal error: sys/sysinfo.h: No such file or directory
#include <sys/sysinfo.h>
^~~~~~~~~~~~~~~
* ipc: Allow all trivially copyable objects to be passed directly into WriteBuffer
With the support of C++20, we can use concepts to deduce if a type is an STL container or not.
* More agressive concept for stl containers
* Add -fconcepts
* Move to common namespace
* Add Common::IsBaseOf
In file included from src/core/hle/kernel/memory/page_table.cpp:5:
src/./common/alignment.h:67:68: error: no member named 'align_val_t' in namespace 'std'
return static_cast<T*>(::operator new (n * sizeof(T), std::align_val_t{Align}));
~~~~~^
src/./common/alignment.h:71:51: error: no member named 'align_val_t' in namespace 'std'
::operator delete (p, n * sizeof(T), std::align_val_t{Align});
~~~~~^
In cases where the size is not a known constant when inlining, AlignUp<std::size_t> currently generates two 64-bit div instructions.
This generates one div and a cmov which is significantly cheaper.
This is a new attempt at #4206 that shouldn't break windows builds.
If someone else could test on windows, it would be much appreciated.
Previously, the build bot passed but the actual builds failed.
On gcc/ld, and clang/lld, fmt::v6 symbols are excluded, so linking
fails. This fixes the issue.
Note: This was included in the FindBoost changes I shared with
BlinkHawk, however only they were merged. I'm not sure if it was missed,
or if there was an issue with this part of the change.
This commit: Implements CPU Interrupts, Replaces Cycle Timing for Host
Timing, Reworks the Kernel's Scheduler, Introduce Idle State and
Suspended State, Recreates the bootmanager, Initializes Multicore
system.
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
* Remove git submodules that will be loaded through conan
* Move custom Find modules to their own folder
* Use conan for downloading missing external dependencies
* CI: Change the yuzu source folder user to the user that the containers run on
* Attempt to remove dirty mingw build hack
* Install conan on the msvc build
* Only set release build type when using not using multi config generator
* Re-add qt bundled to workaround an issue with conan qt not downloading prebuilt binaries
* Add workaround for submodules that use legacy CMAKE variables
* Re-add USE_BUNDLED_QT on the msvc build bot
It's undefined behavior to pass a null pointer to std::fread and
std::fwrite, even if the length passed in is zero, so we must perform
the precondition checking ourselves.
A common case where this can occur is when passing in the data of an
empty std::vector and size, as an empty vector will typically have a
null internal buffer.
While we're at it, we can move the implementation out of line and add
debug checks against passing in nullptr to std::fread and std::fwrite.
On Windows, network shares use paths like \\server\share\file which were
being broken by FileUtil::SanitizePath() removing double slashes.
Changed the code in SanitizePath to permit a double-backslash if it
occurs at the start of a filepath (on Windows only).
* IOFile: Make the move constructor and move assignment operator noexcept
Certain parts of the standard library try to determine whether or not a
transfer operation should either be a copy or a move. The prevalent notion
of move constructors/assignment operators is that they should not throw,
they simply move an already existing resource somewhere else.
This is typically done with 'std::move_if_noexcept'. Like the name says,
if a type's move constructor is noexcept, then the functions retrieves an
r-value reference (for move semantics), or an l-value (for copy semantics)
if it is not noexcept.
As IOFile deletes the copy constructor and copy assignment operators,
using IOFile with certain parts of the standard library can fail in
unexcepted ways (especially when used with various container
implementations). This prevents that.
* fix various instances of -1 being assigned to unsigned types
* do not assign in conditional statements
* File/IOFile: Check _tfopen_s properly
* common/file_util.cpp: address review comments
Co-authored-by: Lioncash <mathew1800@gmail.com>
Co-authored-by: Shawn Hoffman <godisgovernment@gmail.com>
Co-authored-by: Sepalani <sepalani@hotmail.fr>
This PR aims to reduce the memory usage in the CPU page table by moving
GPU specific parameters into a child class. This saves 1Gb of Memory for
most games.
An implementation of the cemuhook motion/touch protocol, this adds the
ability for users to connect several different devices to citra to send
direct motion and touch data to citra.
Co-Authored-By: jroweboy <jroweboy@gmail.com>
We relies on UNREACHABLE's noreturn attribute to eliminate parent's "no return value" warning. However, this was wrapped in a `if(!false)` block, which compilers may not unfold to recognize the noreturn nature.
Makes the header more general for other potential algorithms in the
future. While we're at it, include a missing <functional> include to
satisfy the use of std::less.
This was related to the source allocator being passed into the
constructor potentially having a different type than allocator being
constructed.
We simply need to provide a constructor to handle this case.
This resolves issues related to the allocator causing debug builds on
MSVC to fail.
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't match exactly to Nvidia's code
VOTE.ALL Rd, PT, PT;
Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
This commit ensures that all backing memory allocated for the Guest CPU
is aligned to 256 bytes. This due to how gpu memory works and the heavy
constraints it has in the alignment of physical memory.
Instead of storing all block width, height and depths in their shifted
form:
block_width = 1U << block_shift;
Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
Avoids potentially performing multiple reallocations (depending on the
size of the input data) by reserving the necessary amount of memory
ahead of time.
This is trivially doable, so there's no harm in it.