
Conversation

@futz12 futz12 (Contributor) commented Jul 31, 2025

  • File-level pipeline cache
  • Platform information validation, etc.
  • SPIR-V (spv) caching

(A hedged usage sketch follows the benchmark output below.)

Performance:
CPU: 13900HX
GPU: the corresponding integrated GPU

==================================================
              Verification and Summary
==================================================
Output verification: SUCCESS
--------------------------------------------------
Performance Summary:
  - Without Cache: 423.934 ms
  - With Cache:    99.2737 ms
  - Speedup:       76.5828%
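
For context, here is a minimal sketch of how such a persistent pipeline cache might be used from application code. The Option::pipeline_cache field, the PipelineCache constructor, and get_gpu_device() are existing ncnn APIs; the save_to_file()/load_from_file() method names are assumptions for illustration only, not the exact API added by this PR.

```cpp
// Hypothetical usage sketch -- save_to_file()/load_from_file() are illustrative names.
#include "net.h"
#include "gpu.h"
#include "pipelinecache.h"

int run_with_persistent_cache(const char* cache_path)
{
    ncnn::VulkanDevice* vkdev = ncnn::get_gpu_device(0);

    // Create a pipeline cache bound to the device and try to warm it from disk.
    ncnn::PipelineCache* cache = new ncnn::PipelineCache(vkdev);
    cache->load_from_file(cache_path); // hypothetical name; a miss on the first run is fine

    ncnn::Net net;
    net.opt.use_vulkan_compute = true;
    net.opt.pipeline_cache = cache; // Option::pipeline_cache is an existing ncnn field

    // ... load param/model and run inference as usual ...

    // Persist the compiled SPIR-V and Vulkan pipeline cache data for the next run.
    cache->save_to_file(cache_path); // hypothetical name

    net.clear();
    delete cache; // the Option field is a raw pointer; the caller owns the cache here
    return 0;
}
```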

@tencent-adm tencent-adm (Member) commented Jul 31, 2025

CLA assistant check
All committers have signed the CLA.

github-actions bot commented Jul 31, 2025

The binary size change of libncnn.so (bytes):

architecture    base size    pr size      difference
x86_64          15643128     15657200     +14072 ⚠️
armhf            6648220      6653924      +5704 ⚠️
aarch64          9986896      9988816      +1920 ⚠️

@nihui nihui closed this Aug 1, 2025
@nihui nihui reopened this Aug 1, 2025
codecov-commenter commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 73.14286% with 47 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.95%. Comparing base (8c5e23b) to head (183ecb9).
⚠️ Report is 27 commits behind head on master.

Files with missing lines    Patch %    Lines
src/pipelinecache.cpp       73.05%     45 Missing ⚠️
src/gpu.cpp                 75.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6221      +/-   ##
==========================================
+ Coverage   95.72%   95.95%   +0.23%     
==========================================
  Files         747      836      +89     
  Lines      249693   265392   +15699     
==========================================
+ Hits       239014   254662   +15648     
- Misses      10679    10730      +51     


@nihui nihui requested a review from Copilot August 1, 2025 02:57


@futz12 futz12 requested review from Copilot and nihui August 2, 2025 04:40
Copilot AI left a comment

Pull Request Overview

This PR implements a persistent Vulkan pipeline cache feature to optimize pipeline compilation and loading times. The cache stores compiled SPIR-V modules and Vulkan pipeline cache data to disk, providing significant performance improvements for subsequent application runs.

  • Added file-level pipeline cache persistence with platform validation (a hedged sketch of a possible cache header layout follows this list)
  • Implemented SPIR-V module caching to avoid recompilation
  • Added a performance benchmark test showing roughly a 76% reduction in pipeline setup time (about 4.3× faster)
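
The platform validation mentioned above implies the on-disk cache starts with a device-identity header. Below is a minimal sketch of what such a header could look like; the struct name and field order are assumptions, and only the four checked fields (vendorID, deviceID, driverVersion, pipeline cache UUID) are confirmed by the compatibility check quoted later in this review.

```cpp
// Hypothetical header sketch -- names and layout are illustrative only;
// the validated fields come from the compatibility check in src/pipelinecache.cpp.
#include <stdint.h>
#include <vulkan/vulkan.h> // for VK_UUID_SIZE

struct pipeline_cache_file_header
{
    uint32_t vendorID;          // must match vkdev->info.vendor_id()
    uint32_t deviceID;          // must match vkdev->info.device_id()
    uint32_t driverVersion;     // must match vkdev->info.driver_version()
    uint8_t uuid[VK_UUID_SIZE]; // must match vkdev->info.pipeline_cache_uuid()
};
```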

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

File                       Description
tests/test_pipecache.cpp   Comprehensive test suite for pipeline cache functionality and performance validation
tests/CMakeLists.txt       Added pipeline cache test to CMake build configuration
src/pipelinecache.h        Extended PipelineCache API with save/load methods for file and buffer operations
src/pipelinecache.cpp      Core implementation of cache serialization, platform validation, and SPIR-V caching
src/gpu.h                  Added pipeline cache parameter to create_pipeline method
src/gpu.cpp                Updated pipeline creation to use Vulkan pipeline cache

return ret;
}

int PipelineCache::create_shader_module(int shader_type_index, const Option& opt,
Copilot AI commented Aug 2, 2025

The goto statement on line 702 creates non-linear control flow that reduces code readability. Consider restructuring this logic using an if-else statement or extracting the cache lookup into a separate function.
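
As an illustration of that suggestion, the lookup could be factored into a small helper that returns an index, so the call site becomes a plain if/else. This is a hedged sketch: the struct and helper names are illustrative, and only cache_spirv_module appears in the PR diff.

```cpp
// Hypothetical refactoring sketch -- names are illustrative only.
#include <stdint.h>
#include <vector>

struct cached_spirv_entry
{
    uint32_t key;                // e.g. shader_type_index combined with opt_bits
    std::vector<uint32_t> spirv; // compiled SPIR-V words
};

// Returns the index of a cached module, or -1 if it is not cached.
// Replaces the goto-based early exit with a simple lookup result.
static int find_cached_spirv_index(const std::vector<cached_spirv_entry>& cache, uint32_t key)
{
    for (size_t i = 0; i < cache.size(); i++)
    {
        if (cache[i].key == key)
            return (int)i;
    }
    return -1;
}
```

The call site then reuses the cached words when the returned index is non-negative, and otherwise calls compile_spirv_module() and appends the result to the cache.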

| opt.use_fp16_arithmetic << 4
| opt.use_int8_storage << 3
| opt.use_int8_arithmetic << 2;

Copilot AI commented Aug 2, 2025

The bit manipulation for opt_bits uses magic numbers and could be error-prone. Consider defining named constants for bit positions or using an enum to make the code more maintainable and self-documenting.

Suggested change:

// Define bit positions for opt_bits
enum OptBits {
    OPT_BIT_UNUSED = 7,
    OPT_BIT_FP16_PACKED = 6,
    OPT_BIT_FP16_STORAGE = 5,
    OPT_BIT_FP16_ARITHMETIC = 4,
    OPT_BIT_INT8_STORAGE = 3,
    OPT_BIT_INT8_ARITHMETIC = 2
};
uint32_t opt_bits = (0 << OPT_BIT_UNUSED)
                  | (opt.use_fp16_packed << OPT_BIT_FP16_PACKED)
                  | (opt.use_fp16_storage << OPT_BIT_FP16_STORAGE)
                  | (opt.use_fp16_arithmetic << OPT_BIT_FP16_ARITHMETIC)
                  | (opt.use_int8_storage << OPT_BIT_INT8_STORAGE)
                  | (opt.use_int8_arithmetic << OPT_BIT_INT8_ARITHMETIC);

int retc = compile_spirv_module(shader_type_index, opt, spirv);
int retc = 0;

for (int i = 0; i < d->cache_spirv_module.size(); i++)
Copilot AI commented Aug 2, 2025

The linear search through cache_spirv_module could become inefficient with large numbers of cached modules. Consider using a hash map or other efficient lookup structure for better performance.
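
As a hedged illustration of that suggestion, the cache could be keyed by the shader index and option bits in a std::unordered_map. The key layout and names below are assumptions, not the PR's code.

```cpp
// Hypothetical sketch of an O(1) lookup -- key layout and names are illustrative.
#include <stdint.h>
#include <unordered_map>
#include <vector>

// Pack the shader type index and the option bits into one 64-bit key.
static inline uint64_t make_spirv_cache_key(int shader_type_index, uint32_t opt_bits)
{
    return ((uint64_t)(uint32_t)shader_type_index << 32) | opt_bits;
}

// Maps cache key -> compiled SPIR-V words; replaces the linear scan.
typedef std::unordered_map<uint64_t, std::vector<uint32_t> > spirv_cache_map;

static const std::vector<uint32_t>* find_spirv(const spirv_cache_map& cache,
                                               int shader_type_index, uint32_t opt_bits)
{
    spirv_cache_map::const_iterator it = cache.find(make_spirv_cache_key(shader_type_index, opt_bits));
    return it == cache.end() ? 0 : &it->second;
}
```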

fprintf(stderr, "Test failed: extraction with cache failed.\n");
return -1;
}
delete net_with_cache.opt.pipeline_cache; // clean up pipeline cache by hand
Copilot AI commented Aug 2, 2025

Manual memory management with delete is error-prone. Consider using smart pointers (std::unique_ptr) or RAII patterns to ensure automatic cleanup and prevent potential memory leaks.

Suggested change
delete net_with_cache.opt.pipeline_cache; // clean up pipeline cache by hand
// pipeline_cache_ptr will automatically clean up the pipeline cache here
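
A hedged sketch of the RAII variant the reviewer has in mind: a std::unique_ptr owns the cache while Option::pipeline_cache keeps only a non-owning raw pointer. Variable names are illustrative; only Option::pipeline_cache and the PipelineCache constructor are existing ncnn APIs.

```cpp
// Hypothetical sketch -- the unique_ptr owns the cache; net.opt keeps a non-owning view.
#include <memory>
#include "net.h"
#include "pipelinecache.h"

void run_with_scoped_cache(const ncnn::VulkanDevice* vkdev)
{
    std::unique_ptr<ncnn::PipelineCache> pipeline_cache_ptr(new ncnn::PipelineCache(vkdev));

    ncnn::Net net_with_cache;
    net_with_cache.opt.use_vulkan_compute = true;
    net_with_cache.opt.pipeline_cache = pipeline_cache_ptr.get(); // non-owning

    // ... load the model, run extraction, verify outputs ...

    net_with_cache.clear(); // release pipelines before the cache goes away
    // pipeline_cache_ptr frees the PipelineCache automatically at scope exit
}
```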

}

// Validate platform information for compatibility
if (header.vendorID != vkdev->info.vendor_id() || header.deviceID != vkdev->info.device_id() || header.driverVersion != vkdev->info.driver_version() || memcmp(header.uuid, vkdev->info.pipeline_cache_uuid(), VK_UUID_SIZE) != 0)
Copilot AI commented Aug 2, 2025

This long conditional statement with multiple OR operations is difficult to read and maintain. Consider breaking it into separate boolean variables or a helper validation function for better readability.

Suggested change
if (header.vendorID != vkdev->info.vendor_id() || header.deviceID != vkdev->info.device_id() || header.driverVersion != vkdev->info.driver_version() || memcmp(header.uuid, vkdev->info.pipeline_cache_uuid(), VK_UUID_SIZE) != 0)
if (!is_cache_platform_compatible(header, vkdev))
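
For reference, a hedged sketch of what such a helper could look like, reusing the exact checks from the original condition. The helper name follows the suggestion above; the header struct is the hypothetical layout sketched earlier in this thread.

```cpp
// Hypothetical helper sketch -- the checks mirror the original condition;
// pipeline_cache_file_header is the assumed header layout sketched earlier.
#include <string.h>
#include <vulkan/vulkan.h> // VK_UUID_SIZE
#include "gpu.h"           // ncnn::VulkanDevice / GpuInfo

struct pipeline_cache_file_header
{
    uint32_t vendorID;
    uint32_t deviceID;
    uint32_t driverVersion;
    uint8_t uuid[VK_UUID_SIZE];
};

static bool is_cache_platform_compatible(const pipeline_cache_file_header& header,
                                         const ncnn::VulkanDevice* vkdev)
{
    const bool same_vendor = header.vendorID == vkdev->info.vendor_id();
    const bool same_device = header.deviceID == vkdev->info.device_id();
    const bool same_driver = header.driverVersion == vkdev->info.driver_version();
    const bool same_uuid = memcmp(header.uuid, vkdev->info.pipeline_cache_uuid(), VK_UUID_SIZE) == 0;
    return same_vendor && same_device && same_driver && same_uuid;
}
```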

@nihui nihui (Member) commented Aug 21, 2025

Thank you for your work. Please write up your implementation notes and insights, the difficulties you encountered, and how you solved them as an article and publish it in the Discussions section; it will serve as a knowledge summary. https://github.com/Tencent/ncnn/discussions
