Releases: Devsh-Graphics-Programming/Nabla
v0.7.2-alpha1
What's Changed
New Features
- Almost all SPIR-V intrinsics for Raytracing Pipeline exposed (Except Ray Terminators and ReportHit) so you don't need to rely on HLSL intrinsics getting translated/codegenned properly
- AABB computation for Polygon Geometries
- Non-List Triangle Indexings to Triangle List Indexing conversion
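To illustrate what converting a non-list indexing into a triangle list involves, here is a minimal sketch for strips and fans. The names `stripToList` and `fanToList` are hypothetical and are not Nabla's API. Primitive restart and degenerate triangles are not handled, and the strip winding follows the Vulkan convention of swapping the last two vertices of every odd triangle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands a triangle-strip index buffer into a triangle list.
// Odd triangles swap their last two indices so every triangle keeps the same winding.
std::vector<std::uint32_t> stripToList(const std::vector<std::uint32_t>& strip)
{
    std::vector<std::uint32_t> list;
    if (strip.size() < 3)
        return list;
    list.reserve((strip.size() - 2) * 3);
    for (std::size_t i = 0; i + 2 < strip.size(); ++i)
    {
        if (i % 2 == 0)
            list.insert(list.end(), {strip[i], strip[i + 1], strip[i + 2]});
        else
            list.insert(list.end(), {strip[i], strip[i + 2], strip[i + 1]});
    }
    return list;
}

// Hypothetical helper: expands a triangle-fan index buffer into a triangle list.
// Every triangle shares the first index of the fan.
std::vector<std::uint32_t> fanToList(const std::vector<std::uint32_t>& fan)
{
    std::vector<std::uint32_t> list;
    if (fan.size() < 3)
        return list;
    list.reserve((fan.size() - 2) * 3);
    for (std::size_t i = 1; i + 1 < fan.size(); ++i)
        list.insert(list.end(), {fan[0], fan[i], fan[i + 1]});
    return list;
}
```

A strip of N indices yields N-2 triangles, so the output holds 3*(N-2) indices; the same count applies to fans.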
Bugfixes
- Fix small bug in matrix preventing product of unorthodox matrix sizes by @Fletterio in #908
Known Bugs
- Unused SPIR-V built-in variables still emit their OpCapability and Extensions into the SPIR-V module (see microsoft/DirectXShaderCompiler#7715)
Full Changelog: v0.7.1-alpha1...v0.7.2-alpha1
v0.7.1-alpha1
What's Changed
New Features
- GitHub CI now tests Examples which are not added with `EXCLUDE_FROM_ALL` to the meta-example project
- `IGeometry`, `IPolygonGeometry` classes
- Polygon Geometry can be used with Asset Converter from day one
- `CGeometryCreator` making basic geometries like cubes, cones, disks, etc.
- `IGeometryLoader` base class (Mesh loaders are back!)
- PLY Geometry Loader
- Mitsuba Serialized Geometry Loader
- CAD Example can now do Digital Terrain Models and display isolines and height shading on triangular and grid meshes
- Precompile your Shaders to SPIR-V using NSC (with all our STL headers) and also as a CMake command!
- Precompile Shader Permutations with different JSON generated Device Capability Traits (we also have C++ autogen utilities that let you resolve the keys to them depending on device capabilities)
- Normal and Quaternion quantization caches done with our HLSL types
- `CGeometryManipulator` stub, more functionality from deprecated `MeshManipulator` to come
- finished the `AABB.hlsl` shape - added transform, union and intersect functions which can be specialized for shapes in the HLSL library
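For readers unfamiliar with the AABB operations named above, the following is a minimal host-side sketch of union, intersection and bounds computation from vertex positions. The struct layout and the function names (`unionOf`, `intersectionOf`, `isEmpty`, `fromPositions`) are assumptions made for illustration and do not mirror the actual `AABB.hlsl` interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical axis-aligned bounding box: per-axis minimum and maximum.
struct AABB
{
    float lo[3];
    float hi[3];
};

// Smallest box containing both inputs.
AABB unionOf(const AABB& a, const AABB& b)
{
    AABB r;
    for (int i = 0; i < 3; ++i)
    {
        r.lo[i] = std::min(a.lo[i], b.lo[i]);
        r.hi[i] = std::max(a.hi[i], b.hi[i]);
    }
    return r;
}

// Overlap of both inputs; the result is empty (lo > hi on some axis) if they are disjoint.
AABB intersectionOf(const AABB& a, const AABB& b)
{
    AABB r;
    for (int i = 0; i < 3; ++i)
    {
        r.lo[i] = std::max(a.lo[i], b.lo[i]);
        r.hi[i] = std::min(a.hi[i], b.hi[i]);
    }
    return r;
}

bool isEmpty(const AABB& box)
{
    return box.lo[0] > box.hi[0] || box.lo[1] > box.hi[1] || box.lo[2] > box.hi[2];
}

// Bounds of `vertexCount` tightly packed xyz positions; requires at least one vertex.
AABB fromPositions(const float* xyz, std::size_t vertexCount)
{
    assert(vertexCount > 0);
    AABB r = {{xyz[0], xyz[1], xyz[2]}, {xyz[0], xyz[1], xyz[2]}};
    for (std::size_t v = 1; v < vertexCount; ++v)
        for (int i = 0; i < 3; ++i)
        {
            r.lo[i] = std::min(r.lo[i], xyz[3 * v + i]);
            r.hi[i] = std::max(r.hi[i], xyz[3 * v + i]);
        }
    return r;
}
```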
Removals
- `IRenderpassIndependentPipeline`
- `IMesh` and `IMeshBuffer`
- `MeshPacker`s V1 and V2, we encourage programmable pulling from BDA now
Improvements
- made the `refctd_memory_resource::allocate` and `deallocate` virtual so they can be overridden (the `std::pmr` didn't make sense here)
- created an `adoption_memory_resource`
- most of `algorithm.hlsl` also compiles as C++ Host code
- Builtin resource and PCH improvements
- LRU Cache and Doubly Linked List Container improvements for resizability etc.
- Replaced Parallel Hashmap with Greg's Template Library as the submodule
- skip duplicate validation for SPIR-V optimizer
- Updated DXC
- Boost-Wave Shader preprocessor now its own Translation Unit, can be always compiled with optimizations (so even Debug builds don't take minutes preprocessing the input to DXC)
- target SPIR-V version is now a shader preprocessor option (because of `__SPIR_V_MAJOR__` and friends)
- `fast_affine.hlsl` for doing mathematical abominations like multiplying 3x4 matrices with 3x1 vectors as if they're padded 4x4 and 4x1
- cofactor and fast inverse HLSL utilities (useful for nice fast normal matrix calc)
- `IUtilities` has a `create` factory which can fail if it can't allocate the amount of HOST_VISIBLE memory you requested
- split `MonoAssetManagerAndBuiltinResourceApplication` into two classes
- `allPreviousStages` and `allLaterStages` sync utility functions
- `emulated_vector` now has `length_helper` specialization, and `inversesqrt` for `emulated_float64`
- constexpr `findLSB` variant
- 3x3 matrix from quaternion HLSL utility
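The padded 3x4 multiplication mentioned for `fast_affine.hlsl` can be sketched on the host as follows. This is only an illustration: the types and function names (`float3x4`, `mulPoint`, `concat`) are hypothetical, and it assumes the vector is padded with w = 1 (a point) and the matrix with an implicit last row of (0, 0, 0, 1).

```cpp
#include <cassert>

struct float3 { float x, y, z; };
struct float3x4 { float m[3][4]; }; // 3 rows, 4 columns; column 3 is the translation

// Multiplies as if `a` were 4x4 with last row (0,0,0,1) and `v` were (x,y,z,1),
// without ever materializing the padding.
float3 mulPoint(const float3x4& a, const float3& v)
{
    const float in[3] = {v.x, v.y, v.z};
    float out[3];
    for (int r = 0; r < 3; ++r)
        out[r] = a.m[r][0] * in[0] + a.m[r][1] * in[1] + a.m[r][2] * in[2] + a.m[r][3];
    return {out[0], out[1], out[2]};
}

// Product a*b of two affine transforms as if both were padded to 4x4.
float3x4 concat(const float3x4& a, const float3x4& b)
{
    float3x4 out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 4; ++c)
        {
            out.m[r][c] = a.m[r][0] * b.m[0][c] + a.m[r][1] * b.m[1][c] + a.m[r][2] * b.m[2][c];
            if (c == 3)
                out.m[r][c] += a.m[r][3]; // contribution of b's implicit (0,0,0,1) row
        }
    return out;
}

// Usage: translate by (5,6,7) after scaling by 2.
inline void exampleUsage()
{
    const float3x4 T = {{{1.f, 0.f, 0.f, 5.f}, {0.f, 1.f, 0.f, 6.f}, {0.f, 0.f, 1.f, 7.f}}};
    const float3x4 S = {{{2.f, 0.f, 0.f, 0.f}, {0.f, 2.f, 0.f, 0.f}, {0.f, 0.f, 2.f, 0.f}}};
    const float3 p = mulPoint(concat(T, S), {1.f, 2.f, 3.f});
    assert(p.x == 7.f && p.y == 10.f && p.z == 13.f);
}
```

Skipping the padded row and column saves the multiplications by known zeros and ones, which is the point of treating affine transforms specially.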
Bugfixes
- `DeferredFreeFunctor` actually tested and works now
- `AccelerationStructure::validBuildFlags` infinite recursive call
- Shader compiler `adopt_memory` typo
- keep boost-wave compile options consistent and encapsulate/don't leak it
Full Changelog: v0.7.0-alpha1...v0.7.1-alpha1
Multi-Entry Point SPIR-V Shaders - Removal of IGPUShader and ICPUShader
What's Changed
There's now only a single `IShader`.
Shader Stage and Specialization Info are now provided directly as Pipeline Creation Parameters.
This means that each shader's SPIR-V gets its capabilities and extensions trimmed based on the entry points used by a single pipeline.
Aggressive dead code elimination SPIR-V optimization is necessary for this to function.
Full Changelog: v0.6.2-alpha2...v0.7.0-alpha1
Bugfix: Missing `tuple.hlsl` from embed
Default build of examples 23 and 29 didn't work
Workgroup2 Reductions and Scans
What's Changed
Workgroup Scans
`nbl::hlsl::workgroup2` reduce + scan by @keptsecret in #876
Highly performant: the subgroup-emulated variant (a Kogge-Stone adder made of `subgroupShuffleUp`) is up to 200% faster than native (`subgroupInclusiveAdd`) on Nvidia RTX GPUs.
Blogpost incoming.
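As background on the Kogge-Stone pattern mentioned above, here is a CPU-side sketch that emulates one subgroup, with each vector element standing in for a lane. It is not the `nbl::hlsl::workgroup2` implementation; the function name is hypothetical, and the copy of the previous step only models the fact that a shuffle reads every lane's old value before any lane writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive prefix sum in ceil(log2(n)) steps: at each step, lane i adds the value
// held by lane (i - step), which is what a shuffle-up by `step` would deliver.
std::vector<unsigned> koggeStoneInclusiveScan(std::vector<unsigned> lanes)
{
    const std::size_t n = lanes.size();
    for (std::size_t step = 1; step < n; step <<= 1)
    {
        const std::vector<unsigned> prev = lanes; // values before this step's shuffle
        for (std::size_t i = step; i < n; ++i)    // lanes below `step` have no source lane
            lanes[i] = prev[i] + prev[i - step];
    }
    return lanes;
}
```

For 8 lanes holding 1..8, the three steps (offsets 1, 2, 4) produce the running sums 1, 3, 6, 10, 15, 21, 28, 36.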
Full Changelog: v0.6.1-alpha1...v0.6.2-alpha1
Godbolt NSC Docker Image, YML Workflows for MSVC, and Image Asset Converter fix
Fixed stack overflow by not resetting empty overflow callbacks on the transfer `SIntendedSubmitInfo` when uploading images in `CAssetConverter`
Full Changelog: v0.6.0-alpha1...v0.6.1-alpha1
Asset Converter Automated TLAS and BLAS builds & compactions, Clang MSVC builds
Asset Converter now covers 100% of asset types (except for renderpasses and framebuffers) at its inception; the last outstanding feature is the RT Pipeline, coming in #871
See examples of usage with and without compaction and ReBAR fast-path:
- https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/e30938c2615dd5d3ab69cadca3ba11d1e03f8233/67_RayQueryGeometry/main.cpp
- https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/e30938c2615dd5d3ab69cadca3ba11d1e03f8233/71_RayTracingPipeline/main.cpp
What's Changed
- minor improve to exclusive scan (less registers) by @keptsecret in #875
- Acceleration Structure Asset Conversion by @devshgraphicsprogramming in #872
- build: Add ClangCL profiles by @alichraghi in #791
- Working and Tested Asset Converter for Acceleration Structures by @devshgraphicsprogramming in #878
Full Changelog: v0.5.9-alpha2...v0.6.0-alpha1
Fix of v0.5.9-alpha1
What's Changed
- Update "Join our team" section of readme by @YasInvolved in #866
- Quick fix to subgroup arithmetic by @keptsecret in #874
Full Changelog: v0.5.9-alpha1...v0.5.9-alpha2
New Optimized Subgroup Arithmetic Utilities
Our emulated Subgroup Scans are 2x faster than Nvidia's implementation of KHR_shader_subgroup_arithmetic
Certain SPIR-V and GLSL intrinsics got fixed.
Acceleration Structure API refactor and Asset Converter ReBAR support
Cleaned up the inheritance hierarchy in TLAS and BLAS classes.
Now BLAS is a proper `IPreHashed` and TLAS has utility static methods to convert a Polymorphic Instance to a non-polymorphic one where the type is embedded in the lower bits of the Aligned Pointer when doing a build where the Instance input buffer is a span of pointers.
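The idea of embedding a type in the low bits of an aligned pointer can be sketched as below. Everything here is an assumption for illustration (the 16-byte alignment, the `InstanceType` enumerators and the `pack`/`unpack` helpers); it does not reproduce the TLAS utility methods.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical instance kinds; values must fit in the free low bits.
enum class InstanceType : std::uintptr_t { Static = 0, MatrixMotion = 1, SRTMotion = 2 };

// Assumed alignment of every instance: 16 bytes leaves the low 4 bits always zero.
constexpr std::uintptr_t kAlignment = 16;
constexpr std::uintptr_t kTagMask = kAlignment - 1;

// Stores the type in the pointer's unused low bits.
inline std::uintptr_t pack(const void* instance, InstanceType type)
{
    const auto bits = reinterpret_cast<std::uintptr_t>(instance);
    assert((bits & kTagMask) == 0); // pointer must honour the assumed alignment
    return bits | static_cast<std::uintptr_t>(type);
}

inline InstanceType unpackType(std::uintptr_t packed)
{
    return static_cast<InstanceType>(packed & kTagMask);
}

inline const void* unpackPointer(std::uintptr_t packed)
{
    return reinterpret_cast<const void*>(packed & ~kTagMask);
}
```

Because the tag travels inside the pointer, an array of such values needs no separate per-instance type field.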
Added a `demote_promote_writer_readers_lock` which is well thought out and actually works.
Asset Converter can now be overridden to assume ReBAR support and create Buffers over DEVICE_LOCAL and HOST_VISIBLE memory, sidestepping the use of a transfer queue to upload the buffer data.
Queue Families declared when creating a Buffer or Image with concurrent sharing mode can now be queried from the object.
Fixed a number of bugs in Descriptor Lifetime Tracking and general inheritance patterns.
Fixed a bug in `IUtilities` where overflow submits and intended submits signalling timeline semaphores, upon which staging buffer deferred memory deallocations were latched, would not get patched to include the `COPY_BIT` in the signal stage mask.
Minor additions of `fma` to HLSL/C++ tgmath library.