diff --git a/README.md b/README.md index 110697c..d2054bd 100644 --- a/README.md +++ b/README.md @@ -3,11 +3,72 @@ CUDA Path Tracer **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Yuru Wang +* Tested on: Windows 10, i7-7700HQ @ 2.5GHz 128GB, GTX 1050 Ti 8GB (personal computer) +* Modified CMakeLists.txt: changed sm_20 to sm_61 inside cuda_add_library -### (TODO: Your README) +## Project Description ## +This project implements a CUDA-based path tracer capable of rendering globally-illuminated images very quickly. -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +The features implemented in this project are: +* A shading kernel with BSDF evaluation +* Path continuation/termination using Stream Compaction (with thrust::remove_if) +* Sorting paths by material types (with thrust::sort_by_key) +* Caching first-bounce intersections for re-use +* **Physically-based depth-of-field** by jittering rays within an aperture +* **Stochastic Sampled Antialiasing** +* **Refraction with Snell's law** +* **Motion blur** by averaging samples at different times in the animation +## Results ## +All images below are rendered with 5000 samples and a maximum trace depth of 8. + +### Basic Path Tracer +![](img/basic.png) + +### Different Materials +_Perfect surfaces_ +![](img/three_materials.png) + +Left to right: ideal diffuse surface, perfectly specular-reflective surface, and fully refractive surface + +_Composite materials_ +![](img/composite_materials.png) + +Left to right (reflect/refract/diffuse): 0.3/0.0/0.7, 0.3/0.3/0.4, 0.0/0.5/0.5. 
+Light emittance: 10 + +### Depth of field +![](img/DOF.png) + +Reference for the Concentric Sample Disk function: https://www.dartdocs.org/documentation/dartray/0.0.1/core/ConcentricSampleDisk.html + +### Motion blur +| Without Motion Blur | With Motion Blur | +|------|------| +| ![](img/without_motionBlur.png) | ![](img/motion_blur.png) | + +The cube has a velocity of 1 along the y axis and an angular velocity of 45 degrees about its z axis. + +### Antialiasing +| Without AA | With AA | +|------|------| +| ![](img/without_antialiazing.png) | ![](img/with_antialiazing.png) | + +## Performance Analysis and Questions ## +![](img/numOfPathsRemaining.png) + +The horizontal axis of the diagram above represents the remaining trace depth. As the diagram shows, the number of remaining rays decreases after each bounce. The performance impact of stream compaction is explained below. + +![](img/performanceWithOptimizations.png) + +The figure above shows the performance of each optimization. Caching the first-bounce intersections performs slightly better than no caching, which is expected because it eliminates one kernel call (the first-bounce intersection computation) per iteration. Unexpectedly, however, stream compaction and material sorting did not improve performance; they actually slowed down the rendering. My guess is that the thrust::remove_if and thrust::sort_by_key calls used for stream compaction and sorting carry a large overhead, which costs more time than directly shading simple materials and geometry. I expect stream compaction and ray sorting to pay off when rendering many more rays at greater depths, or scenes with complex materials. + +![](img/open_closed.png) + +The diagram above shows that performance with stream compaction is worse than without it in both the open and the closed scene. 
As mentioned above, thrust::remove_if has overhead that outweighs its benefit in my simple scenes. However, with stream compaction enabled, performance in the open scene is still much better than in the closed scene. In the open scene, many rays escape the box at every bounce, so the number of active rays decreases faster than in the closed box. In the closed scene, all rays keep bouncing inside the box until they hit the light source or reach the maximum number of bounces, so stream compaction eliminates fewer rays than in the open scene. + +## Debugging Log ## +![](img/bug.png) + +This bug was caused by forgetting to add a small offset to the origin of each new ray. Without the offset, the intersection test for a new ray returns the same intersection as the previous bounce, so those rays can never reach the light source. diff --git a/img/DOF.png b/img/DOF.png new file mode 100644 index 0000000..a580a24 Binary files /dev/null and b/img/DOF.png differ diff --git a/img/REFERENCE_cornell.5000samp.png b/img/basic.png similarity index 100% rename from img/REFERENCE_cornell.5000samp.png rename to img/basic.png diff --git a/img/bug.png b/img/bug.png new file mode 100644 index 0000000..793291b Binary files /dev/null and b/img/bug.png differ diff --git a/img/composite_materials.png b/img/composite_materials.png new file mode 100644 index 0000000..530a898 Binary files /dev/null and b/img/composite_materials.png differ diff --git a/img/cornell.2018-09-29_01-50-57z.5000samp.png b/img/cornell.2018-09-29_01-50-57z.5000samp.png new file mode 100644 index 0000000..c8fca6a Binary files /dev/null and b/img/cornell.2018-09-29_01-50-57z.5000samp.png differ diff --git a/img/motion_blur.png b/img/motion_blur.png new file mode 100644 index 0000000..fa9cd17 Binary files /dev/null and b/img/motion_blur.png differ diff --git a/img/numOfPathsRemaining.png 
b/img/numOfPathsRemaining.png new file mode 100644 index 0000000..87da47b Binary files /dev/null and b/img/numOfPathsRemaining.png differ diff --git a/img/open_closed.png b/img/open_closed.png new file mode 100644 index 0000000..92116a6 Binary files /dev/null and b/img/open_closed.png differ diff --git a/img/performanceWithOptimizations.png b/img/performanceWithOptimizations.png new file mode 100644 index 0000000..462aca9 Binary files /dev/null and b/img/performanceWithOptimizations.png differ diff --git a/img/three_materials.png b/img/three_materials.png new file mode 100644 index 0000000..6a5eb0b Binary files /dev/null and b/img/three_materials.png differ diff --git a/img/with_antialiazing.png b/img/with_antialiazing.png new file mode 100644 index 0000000..5b66cd1 Binary files /dev/null and b/img/with_antialiazing.png differ diff --git a/img/without_antialiazing.png b/img/without_antialiazing.png new file mode 100644 index 0000000..1281b79 Binary files /dev/null and b/img/without_antialiazing.png differ diff --git a/img/without_motionBlur.png b/img/without_motionBlur.png new file mode 100644 index 0000000..d23a8ac Binary files /dev/null and b/img/without_motionBlur.png differ diff --git a/scenes/cornell-3materials.txt b/scenes/cornell-3materials.txt new file mode 100644 index 0000000..5828fa7 --- /dev/null +++ b/scenes/cornell-3materials.txt @@ -0,0 +1,145 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Refract +MATERIAL 5 +RGB .98 
.98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 1.2 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 1 +TRANS 0 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 7 +sphere +material 4 +TRANS -3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 8 +sphere +material 5 +TRANS 3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 diff --git a/scenes/cornell-DOF.txt b/scenes/cornell-DOF.txt new file mode 100644 index 0000000..27a06d0 --- /dev/null +++ b/scenes/cornell-DOF.txt @@ -0,0 +1,171 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 10 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 2.5 +LENS_RADIUS 0.2 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 2 2 2 +LOOKAT 0 2 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// 
Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 4 +TRANS 0 1 -3 +ROTAT 0 0 0 +SCALE 2 2 2 + +// Sphere +OBJECT 7 +sphere +material 4 +TRANS 0 1 -0 +ROTAT 0 0 0 +SCALE 2 2 2 + + +// Sphere +OBJECT 8 +sphere +material 4 +TRANS 0 1 3 +ROTAT 0 0 0 +SCALE 2 2 2 + + +// Sphere +OBJECT 9 +sphere +material 4 +TRANS 0 1 5 +ROTAT 0 0 0 +SCALE 2 2 2 + + +// Sphere +OBJECT 10 +sphere +material 4 +TRANS 0 1 -5 +ROTAT 0 0 0 +SCALE 2 2 2 + +// Sphere +OBJECT 11 +sphere +material 4 +TRANS 2.5 1 3.5 +ROTAT 0 0 0 +SCALE 2 2 2 + +// Sphere +OBJECT 12 +sphere +material 4 +TRANS -4 1 -1 +ROTAT 0 0 0 +SCALE 2 2 2 + diff --git a/scenes/cornell-Motion.txt b/scenes/cornell-Motion.txt new file mode 100644 index 0000000..d14b7e4 --- /dev/null +++ b/scenes/cornell-Motion.txt @@ -0,0 +1,140 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Refract +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 1.2 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 +ITERATIONS 5000 +DEPTH 8 +FILE cornell 
+EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 5 +TRANS -1 4 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Cube +OBJECT 7 +cube +material 4 +TRANS 2 4 0 +ROTAT 0 45 0 +SCALE 1 4 1 +MOVING 0 +VELOCITY 0 1 0 +ANGULAR_VEL 0 0 45 diff --git a/scenes/cornell-closed.txt b/scenes/cornell-closed.txt new file mode 100644 index 0000000..2e01cb8 --- /dev/null +++ b/scenes/cornell-closed.txt @@ -0,0 +1,127 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 4 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 
+ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 1 +TRANS -1 4 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Front wall +OBJECT 7 +cube +material 1 +TRANS 0 5 5 +ROTAT 0 90 0 +SCALE .01 10 10 \ No newline at end of file diff --git a/scenes/cornell-prob.txt b/scenes/cornell-prob.txt new file mode 100644 index 0000000..7f7bee8 --- /dev/null +++ b/scenes/cornell-prob.txt @@ -0,0 +1,175 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 10 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Refract +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 1.2 +EMITTANCE 0 + +// composite_1 +MATERIAL 6 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 0.3 +REFR 0.3 +REFRIOR 1.5 +EMITTANCE 0 + +// composite_2 +MATERIAL 7 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 0.3 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// composite_3 +MATERIAL 8 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 0 +REFR 0.5 +REFRIOR 1.5 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 
1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 6 +TRANS 0 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 7 +sphere +material 7 +TRANS -3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 8 +sphere +material 8 +TRANS 3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 diff --git a/scenes/cornell-refraction.txt b/scenes/cornell-refraction.txt new file mode 100644 index 0000000..19520b8 --- /dev/null +++ b/scenes/cornell-refraction.txt @@ -0,0 +1,165 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Refraction_1 +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 1 +EMITTANCE 0 + +// Refraction_2 +MATERIAL 6 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 1.5 +EMITTANCE 0 + +// Refraction_3 +MATERIAL 7 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 1 +REFRIOR 2 +EMITTANCE 0 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + 
+// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 6 +sphere +material 5 +TRANS 0 3 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 7 +sphere +material 6 +TRANS -3 3 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere +OBJECT 8 +sphere +material 7 +TRANS 3 3 -1 +ROTAT 0 0 0 +SCALE 3 3 3 diff --git a/scenes/cornell.txt b/scenes/cornell.txt index 83ff820..0142845 100644 --- a/scenes/cornell.txt +++ b/scenes/cornell.txt @@ -52,6 +52,8 @@ EMITTANCE 0 CAMERA RES 800 800 FOVY 45 +FOCUS_DIST 11 +LENS_RADIUS 1 ITERATIONS 5000 DEPTH 8 FILE cornell @@ -111,7 +113,7 @@ SCALE .01 10 10 // Sphere OBJECT 6 sphere -material 4 +material 1 TRANS -1 4 -1 ROTAT 0 0 0 SCALE 3 3 3 diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index a1cb3fb..3ca4297 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -19,5 +19,5 @@ set(SOURCE_FILES cuda_add_library(src ${SOURCE_FILES} - OPTIONS -arch=sm_20 + OPTIONS -arch=sm_61 ) diff --git a/src/interactions.h b/src/interactions.h index 5ce3628..5740f04 100644 --- a/src/interactions.h +++ b/src/interactions.h @@ -1,7 +1,7 @@ #pragma once #include "intersections.h" - +#include // CHECKITOUT /** * Computes a cosine-weighted random direction in a hemisphere. @@ -76,4 +76,33 @@ void scatterRay( // TODO: implement this. // A basic implementation of pure-diffuse shading will just call the // calculateRandomDirectionInHemisphere defined above. 
+ + thrust::uniform_real_distribution u01(0, 1); + float prob = u01(rng); + glm::vec3 direction; + glm::vec3 color; + + if (prob < m.hasReflective) { + direction = glm::reflect(pathSegment.ray.direction, normal); + color = m.specular.color; + + } else if (prob < m.hasReflective + m.hasRefractive) { + direction = glm::refract(pathSegment.ray.direction, normal, m.refractRatio); + color = pathSegment.color; + + } else { + direction = calculateRandomDirectionInHemisphere(normal, rng); + color = m.color; + } + + pathSegment.ray.direction = glm::normalize(direction); + pathSegment.ray.origin = intersect + direction * 0.001f; + + pathSegment.color *= color; + pathSegment.remainingBounces--; + + if (pathSegment.remainingBounces == 0) + { + pathSegment.color = glm::vec3(0.0f); + } } diff --git a/src/main.cpp b/src/main.cpp index fe8e85e..b0b42a2 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -142,7 +142,7 @@ void runCuda() { saveImage(); pathtraceFree(); cudaDeviceReset(); - exit(EXIT_SUCCESS); + //exit(EXIT_SUCCESS); } } diff --git a/src/pathtrace.cu b/src/pathtrace.cu index c1ec122..9b53433 100644 --- a/src/pathtrace.cu +++ b/src/pathtrace.cu @@ -4,6 +4,9 @@ #include #include #include +#include + +#include #include "sceneStructs.h" #include "scene.h" @@ -15,9 +18,26 @@ #include "interactions.h" #define ERRORCHECK 1 +#define ANTIALIASING 1 +#define DOF 0 +#define CACHE_FIRST_INTERSECTIONS 0 +#define SORTMATERIAL 0 +#define STREAM_COMPACTION 1 +#define TIMER 0 + + #define FILENAME (strrchr(__FILE__, '/') ? 
strrchr(__FILE__, '/') + 1 : __FILE__) #define checkCUDAError(msg) checkCUDAErrorFn(msg, FILENAME, __LINE__) + + +using utilityCore::PerformanceTimer; +PerformanceTimer& timer() +{ + static PerformanceTimer timer; + return timer; +} + void checkCUDAErrorFn(const char *msg, const char *file, int line) { #if ERRORCHECK cudaDeviceSynchronize(); @@ -38,12 +58,79 @@ void checkCUDAErrorFn(const char *msg, const char *file, int line) { #endif } +struct isTerminate +{ + __host__ __device__ + bool operator()(const PathSegment& path) + { + return path.remainingBounces == 0; + } +}; + +struct comparator +{ + __host__ __device__ + bool operator()(const ShadeableIntersection& intersection1, const ShadeableIntersection& intersection2) + { + return (intersection1.materialId > intersection2.materialId); + } +}; + __host__ __device__ thrust::default_random_engine makeSeededRandomEngine(int iter, int index, int depth) { int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index); return thrust::default_random_engine(h); } +__host__ __device__ +glm::vec2 ConcentricSampleDisk(glm::vec2 offset) { + //https://www.dartdocs.org/documentation/dartray/0.0.1/core/ConcentricSampleDisk.html + float r, theta; + // Map uniform random numbers to $[-1,1]^2$ + float sx = 2 * offset.x - 1; + float sy = 2 * offset.y - 1; + + // Map square to $(r,\theta)$ + + // Handle degeneracy at the origin + if (sx == 0.f && sy == 0.f) { + return glm::vec2(0.f); + } + + if (sx >= -sy) { + if (sx > sy) { + // Handle first region of disk + r = sx; + if (sy > 0.f) { + theta = sy / r; + } + else { + theta = 8.f + sy / r; + } + } + else { + // Handle second region of disk + r = sy; + theta = 2.f - sx / r; + } + } + else { + if (sx <= sy) { + // Handle third region of disk + r = -sx; + theta = 4.f - sy / r; + } + else { + // Handle fourth region of disk + r = -sy; + theta = 6.f + sx / r; + } + } + + theta *= PI / 4.f; + return glm::vec2(r * cos(theta), r * sin(theta)); +} + //Kernel that writes the image to the 
OpenGL PBO directly. __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution, int iter, glm::vec3* image) { @@ -75,6 +162,8 @@ static PathSegment * dev_paths = NULL; static ShadeableIntersection * dev_intersections = NULL; // TODO: static variables for device memory, any extra info you need, etc // ... +static ShadeableIntersection * dev_firstIntersections = NULL; + void pathtraceInit(Scene *scene) { hst_scene = scene; @@ -96,6 +185,8 @@ void pathtraceInit(Scene *scene) { cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); // TODO: initialize any extra device memeory you need + cudaMalloc(&dev_firstIntersections, pixelcount * sizeof(ShadeableIntersection)); + cudaMemset(dev_firstIntersections, 0, pixelcount * sizeof(ShadeableIntersection)); checkCUDAError("pathtraceInit"); } @@ -107,6 +198,7 @@ void pathtraceFree() { cudaFree(dev_materials); cudaFree(dev_intersections); // TODO: clean up any extra device memory you created + cudaFree(dev_firstIntersections); checkCUDAError("pathtraceFree"); } @@ -129,13 +221,39 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path PathSegment & segment = pathSegments[index]; segment.ray.origin = cam.position; - segment.color = glm::vec3(1.0f, 1.0f, 1.0f); + segment.color = glm::vec3(1.0f, 1.0f, 1.0f); + + thrust::default_random_engine rng = makeSeededRandomEngine(iter, x, y); +#if ANTIALIASING // TODO: implement antialiasing by jittering the ray + thrust::uniform_real_distribution uA01(-0.5f, 0.5f); + + segment.ray.direction = glm::normalize(cam.view + - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f + uA01(rng)) + - cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f + uA01(rng)) + ); +#else segment.ray.direction = glm::normalize(cam.view - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f) - cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f) - ); + ); +#endif + +#if DOF + 
thrust::uniform_real_distribution uA02(0, 1); + glm::vec2 offset = glm::vec2(uA02(rng), uA02(rng)); + glm::vec2 rayOffset = cam.lensRadius * squareToDiskConcentric(offset); + segment.ray.origin = cam.position + cam.right * rayOffset.x + cam.up * rayOffset.y; + + float scale = glm::abs(cam.focusDist / segment.ray.direction.z); + glm::vec3 focusP = cam.position + segment.ray.direction * scale; + segment.ray.direction = glm::normalize(focusP - segment.ray.origin); + +#endif + + + segment.pixelIndex = index; segment.remainingBounces = traceDepth; @@ -208,6 +326,7 @@ __global__ void computeIntersections( intersections[path_index].t = t_min; intersections[path_index].materialId = geoms[hit_geom_index].materialid; intersections[path_index].surfaceNormal = normal; + intersections[path_index].intersectP = intersect_point; } } } @@ -233,11 +352,12 @@ __global__ void shadeFakeMaterial ( if (idx < num_paths) { ShadeableIntersection intersection = shadeableIntersections[idx]; + if (intersection.t > 0.0f) { // if the intersection exists... // Set up the RNG // LOOK: this is how you use thrust's RNG! Please look at // makeSeededRandomEngine as well. - thrust::default_random_engine rng = makeSeededRandomEngine(iter, idx, 0); + thrust::default_random_engine rng = makeSeededRandomEngine(iter, idx, pathSegments[idx].remainingBounces); thrust::uniform_real_distribution u01(0, 1); Material material = materials[intersection.materialId]; @@ -246,21 +366,25 @@ __global__ void shadeFakeMaterial ( // If the material indicates that the object was a light, "light" the ray if (material.emittance > 0.0f) { pathSegments[idx].color *= (materialColor * material.emittance); + pathSegments[idx].remainingBounces = 0; } // Otherwise, do some pseudo-lighting computation. This is actually more // like what you would expect from shading in a rasterizer like OpenGL. // TODO: replace this! 
you should be able to start with basically a one-liner else { - float lightTerm = glm::dot(intersection.surfaceNormal, glm::vec3(0.0f, 1.0f, 0.0f)); - pathSegments[idx].color *= (materialColor * lightTerm) * 0.3f + ((1.0f - intersection.t * 0.02f) * materialColor) * 0.7f; - pathSegments[idx].color *= u01(rng); // apply some noise because why not - } + //float lightTerm = glm::dot(intersection.surfaceNormal, glm::vec3(0.0f, 1.0f, 0.0f)); + //pathSegments[idx].color *= (materialColor * lightTerm) * 0.3f + ((1.0f - intersection.t * 0.02f) * materialColor) * 0.7f; + //pathSegments[idx].color *= u01(rng); // apply some noise because why not + scatterRay(pathSegments[idx], intersection.intersectP, intersection.surfaceNormal, material, rng); + } // If there was no intersection, color the ray black. // Lots of renderers use 4 channel color, RGBA, where A = alpha, often // used for opacity, in which case they can indicate "no opacity". // This can be useful for post-processing and image compositing. 
} else { pathSegments[idx].color = glm::vec3(0.0f); + pathSegments[idx].remainingBounces = 0; + } } } @@ -273,7 +397,11 @@ __global__ void finalGather(int nPaths, glm::vec3 * image, PathSegment * iterati if (index < nPaths) { PathSegment iterationPath = iterationPaths[index]; - image[iterationPath.pixelIndex] += iterationPath.color; + + if (iterationPath.remainingBounces == 0) { + image[iterationPath.pixelIndex] += iterationPath.color; + + } } } @@ -295,6 +423,20 @@ void pathtrace(uchar4 *pbo, int frame, int iter) { // 1D block for path tracing const int blockSize1d = 128; + float delta_t = 1.0f / hst_scene->state.iterations; + for (int i = 0; i < hst_scene->geoms.size(); ++i) { + Geom &geom = hst_scene->geoms[i]; + if (geom.moving > 0) { + geom.translation += geom.velocity * delta_t; + geom.rotation += geom.angularVel * delta_t; + hst_scene->geoms[i].transform = utilityCore::buildTransformationMatrix(hst_scene->geoms[i].translation, hst_scene->geoms[i].rotation, hst_scene->geoms[i].scale); + hst_scene->geoms[i].inverseTransform = glm::inverse(hst_scene->geoms[i].transform); + hst_scene->geoms[i].invTranspose = glm::inverseTranspose(hst_scene->geoms[i].transform); + } + } + //cudaMemcpy(dev_geoms, &(hst_scene->geoms)[0], hst_scene->geoms.size() * sizeof(Geom), cudaMemcpyHostToDevice); + cudaMemcpy(dev_geoms, hst_scene->geoms.data(), hst_scene->geoms.size() * sizeof(Geom), cudaMemcpyHostToDevice); + /////////////////////////////////////////////////////////////////////////// // Recap: @@ -326,6 +468,10 @@ void pathtrace(uchar4 *pbo, int frame, int iter) { // TODO: perform one iteration of path tracing +#if TIMER + timer().startGpuTimer(); +#endif + generateRayFromCamera <<>>(cam, iter, traceDepth, dev_paths); checkCUDAError("generate camera ray"); @@ -336,23 +482,60 @@ void pathtrace(uchar4 *pbo, int frame, int iter) { // --- PathSegment Tracing Stage --- // Shoot ray into scene, bounce between objects, push shading chunks - bool iterationComplete = false; + dim3 
numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d; + bool iterationComplete = false; while (!iterationComplete) { - // clean shading chunks - cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); - - // tracing - dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d; - computeIntersections <<>> ( - depth - , num_paths - , dev_paths - , dev_geoms - , hst_scene->geoms.size() - , dev_intersections - ); - checkCUDAError("trace one bounce"); +#if CACHE_FIRST_INTERSECTIONS + // save first bounce intersections + if (depth == 0 && iter == 1) { + // tracing + computeIntersections << > > ( + depth + , num_paths + , dev_paths + , dev_geoms + , hst_scene->geoms.size() + , dev_firstIntersections + ); + checkCUDAError("trace one bounce"); + cudaMemcpy(dev_intersections, dev_firstIntersections, num_paths * sizeof(ShadeableIntersection), cudaMemcpyDeviceToDevice); + } + else if (depth == 0 && iter > 1) { + cudaMemcpy(dev_intersections, dev_firstIntersections, num_paths * sizeof(ShadeableIntersection), cudaMemcpyDeviceToDevice); + } + else { + // clean shading chunks + cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); + + // tracing + computeIntersections << > > ( + depth + , num_paths + , dev_paths + , dev_geoms + , hst_scene->geoms.size() + , dev_intersections + ); + checkCUDAError("trace more than bounce"); + } +#else + // clean shading chunks + cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); + + // tracing + computeIntersections << > > ( + depth + , num_paths + , dev_paths + , dev_geoms + , hst_scene->geoms.size() + , dev_intersections + ); + checkCUDAError("trace more than bounce"); +#endif + + cudaDeviceSynchronize(); depth++; @@ -366,6 +549,10 @@ void pathtrace(uchar4 *pbo, int frame, int iter) { // TODO: compare between directly shading the path segments and shading // path segments that have been reshuffled to be contiguous in memory. 
+#if SORTMATERIAL + thrust::sort_by_key(thrust::device, dev_intersections, dev_intersections + num_paths, dev_paths, comparator()); +#endif + shadeFakeMaterial<<>> ( iter, num_paths, @@ -373,12 +560,32 @@ void pathtrace(uchar4 *pbo, int frame, int iter) { dev_paths, dev_materials ); - iterationComplete = true; // TODO: should be based off stream compaction results. - } + + // Assemble this iteration and apply it to the image - dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d; - finalGather<<>>(num_paths, dev_image, dev_paths); + dim3 numBlocksPixels = (num_paths + blockSize1d - 1) / blockSize1d; + finalGather << > >(num_paths, dev_image, dev_paths); + +#if STREAM_COMPACTION + // stream compaction + dev_path_end = thrust::remove_if(thrust::device, dev_paths, dev_paths + num_paths, isTerminate()); + num_paths = dev_path_end - dev_paths; + //std::cout << " remaining paths " << num_paths << std::endl; + iterationComplete = (depth > traceDepth || num_paths == 0); // TODO: should be based off stream compaction results. 
+
+#else
+        iterationComplete = (depth > traceDepth);
+#endif
+
+
+    }
+
+#if TIMER
+    timer().endGpuTimer();
+    std::cout << " elapsed time: " << timer().getGpuElapsedTimeForPreviousOperation() << "ms " << std::endl;
+#endif
+
     ///////////////////////////////////////////////////////////////////////////
diff --git a/src/scene.cpp b/src/scene.cpp
index cbae043..9358a3d 100644
--- a/src/scene.cpp
+++ b/src/scene.cpp
@@ -74,7 +74,13 @@ int Scene::loadGeom(string objectid) {
                 newGeom.rotation = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
             } else if (strcmp(tokens[0].c_str(), "SCALE") == 0) {
                 newGeom.scale = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
-            }
+            } else if (strcmp(tokens[0].c_str(), "MOVING") == 0) {
+                newGeom.moving = atoi(tokens[1].c_str());
+            } else if (strcmp(tokens[0].c_str(), "VELOCITY") == 0) {
+                newGeom.velocity = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+            } else if (strcmp(tokens[0].c_str(), "ANGULAR_VEL") == 0) {
+                newGeom.angularVel = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+            }
             utilityCore::safeGetline(fp_in, line);
         }
@@ -96,7 +102,7 @@ int Scene::loadCamera() {
     float fovy;
     //load static properties
-    for (int i = 0; i < 5; i++) {
+    for (int i = 0; i < 7; i++) { // originally i < 5
         string line;
         utilityCore::safeGetline(fp_in, line);
         vector<string> tokens = utilityCore::tokenizeString(line);
@@ -111,7 +117,12 @@ int Scene::loadCamera() {
             state.traceDepth = atoi(tokens[1].c_str());
         } else if (strcmp(tokens[0].c_str(), "FILE") == 0) {
             state.imageName = tokens[1];
-        }
+        } else if (strcmp(tokens[0].c_str(), "LENS_RADIUS") == 0) {
+            camera.lensRadius = atof(tokens[1].c_str());
+        } else if (strcmp(tokens[0].c_str(), "FOCUS_DIST") == 0) {
+            camera.focusDist = atof(tokens[1].c_str());
+        }
+
     }
     string line;
@@ -180,7 +191,9 @@ int Scene::loadMaterial(string materialid) {
                 newMaterial.indexOfRefraction =
atof(tokens[1].c_str());
             } else if (strcmp(tokens[0].c_str(), "EMITTANCE") == 0) {
                 newMaterial.emittance = atof(tokens[1].c_str());
-            }
+            } else if (strcmp(tokens[0].c_str(), "REFRIOR") == 0) {
+                newMaterial.refractRatio = atof(tokens[1].c_str());
+            }
         }
         materials.push_back(newMaterial);
         return 1;
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index b38b820..1e693fa 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -26,6 +26,10 @@ struct Geom {
     glm::mat4 transform;
     glm::mat4 inverseTransform;
     glm::mat4 invTranspose;
+    int moving;
+    glm::vec3 velocity;
+    glm::vec3 angularVel;
+
 };
 struct Material {
@@ -38,6 +42,7 @@ struct Material {
     float hasRefractive;
     float indexOfRefraction;
     float emittance;
+    float refractRatio;
 };
 struct Camera {
@@ -49,6 +54,8 @@ struct Camera {
     glm::vec3 right;
     glm::vec2 fov;
     glm::vec2 pixelLength;
+    float focusDist;
+    float lensRadius;
 };
 struct RenderState {
@@ -73,4 +80,5 @@ struct ShadeableIntersection {
     float t;
     glm::vec3 surfaceNormal;
     int materialId;
+    glm::vec3 intersectP;
 };
diff --git a/src/utilities.cpp b/src/utilities.cpp
index 9c06c68..e8fd4ff 100644
--- a/src/utilities.cpp
+++ b/src/utilities.cpp
@@ -110,3 +110,4 @@ std::istream& utilityCore::safeGetline(std::istream& is, std::string& t) {
         }
     }
 }
+
diff --git a/src/utilities.h b/src/utilities.h
index abb4f27..c75697b 100644
--- a/src/utilities.h
+++ b/src/utilities.h
@@ -8,6 +8,8 @@
 #include
 #include
 #include
+#include
+#include
 #define PI 3.1415926535897932384626422832795028841971f
 #define TWO_PI 6.2831853071795864769252867665590057683943f
@@ -23,4 +25,92 @@ namespace utilityCore {
     extern glm::mat4 buildTransformationMatrix(glm::vec3 translation, glm::vec3 rotation, glm::vec3 scale);
     extern std::string convertIntToString(int number);
     extern std::istream& safeGetline(std::istream& is, std::string& t); //Thanks to http://stackoverflow.com/a/6089413
+
+    class PerformanceTimer
+    {
+    public:
+        PerformanceTimer()
+        {
+            cudaEventCreate(&event_start);
+
cudaEventCreate(&event_end);
+        }
+
+        ~PerformanceTimer()
+        {
+            cudaEventDestroy(event_start);
+            cudaEventDestroy(event_end);
+        }
+
+        void startCpuTimer()
+        {
+            if (cpu_timer_started) { throw std::runtime_error("CPU timer already started"); }
+            cpu_timer_started = true;
+
+            time_start_cpu = std::chrono::high_resolution_clock::now();
+        }
+
+        void endCpuTimer()
+        {
+            time_end_cpu = std::chrono::high_resolution_clock::now();
+
+            if (!cpu_timer_started) { throw std::runtime_error("CPU timer not started"); }
+
+            std::chrono::duration<double, std::milli> duro = time_end_cpu - time_start_cpu;
+            prev_elapsed_time_cpu_milliseconds =
+                static_cast<decltype(prev_elapsed_time_cpu_milliseconds)>(duro.count());
+
+            cpu_timer_started = false;
+        }
+
+        void startGpuTimer()
+        {
+            if (gpu_timer_started) { throw std::runtime_error("GPU timer already started"); }
+            gpu_timer_started = true;
+
+            cudaEventRecord(event_start);
+        }
+
+        void endGpuTimer()
+        {
+            cudaEventRecord(event_end);
+            cudaEventSynchronize(event_end);
+
+            if (!gpu_timer_started) { throw std::runtime_error("GPU timer not started"); }
+
+            cudaEventElapsedTime(&prev_elapsed_time_gpu_milliseconds, event_start, event_end);
+            gpu_timer_started = false;
+        }
+
+        float getCpuElapsedTimeForPreviousOperation() //noexcept //(damn I need VS 2015
+        {
+            return prev_elapsed_time_cpu_milliseconds;
+        }
+
+        float getGpuElapsedTimeForPreviousOperation() //noexcept
+        {
+            return prev_elapsed_time_gpu_milliseconds;
+        }
+
+        // remove copy and move functions
+        PerformanceTimer(const PerformanceTimer&) = delete;
+        PerformanceTimer(PerformanceTimer&&) = delete;
+        PerformanceTimer& operator=(const PerformanceTimer&) = delete;
+        PerformanceTimer& operator=(PerformanceTimer&&) = delete;
+
+    private:
+        cudaEvent_t event_start = nullptr;
+        cudaEvent_t event_end = nullptr;
+
+        using time_point_t = std::chrono::high_resolution_clock::time_point;
+        time_point_t time_start_cpu;
+        time_point_t time_end_cpu;
+
+        bool cpu_timer_started = false;
+        bool gpu_timer_started = false;
+
+        float prev_elapsed_time_cpu_milliseconds
= 0.f;
+        float prev_elapsed_time_gpu_milliseconds = 0.f;
+    };
+
+
 }
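
Note on the `STREAM_COMPACTION` branch in `pathtrace`: the compaction step and the loop-exit test can be sketched on the CPU with `std::remove_if`, which has the same "keep survivors at the front, return the new end" contract as `thrust::remove_if`. This is only an illustration under assumptions: `PathSegmentSketch`, `IsTerminatedSketch`, `compactPaths`, and `iterationCompleteSketch` are hypothetical names, and the real `isTerminate` functor is not shown in this diff, so the predicate body here is a guess at its intent.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the path tracer's PathSegment; only the fields
// needed for the termination test are modelled.
struct PathSegmentSketch {
    int pixelIndex;
    int remainingBounces;
};

// Assumed behaviour of the diff's isTerminate functor: a path is removed
// once it has no bounces left.
struct IsTerminatedSketch {
    bool operator()(const PathSegmentSketch& p) const {
        return p.remainingBounces <= 0;
    }
};

// Moves live paths to the front of the first numPaths entries and returns
// how many remain. Relative order of the survivors is preserved; entries
// past the returned count are left in an unspecified state.
std::size_t compactPaths(std::vector<PathSegmentSketch>& paths, std::size_t numPaths) {
    auto first = paths.begin();
    auto last = first + static_cast<std::ptrdiff_t>(numPaths);
    auto newEnd = std::remove_if(first, last, IsTerminatedSketch{});
    return static_cast<std::size_t>(newEnd - first);
}

// Same exit condition as the STREAM_COMPACTION branch of the diff.
bool iterationCompleteSketch(int depth, int traceDepth, std::size_t numPaths) {
    return depth > traceDepth || numPaths == 0;
}
```

Because entries past the new end are unspecified after compaction, any per-path result has to be read before the removal, which is presumably why the diff moves `finalGather` inside the loop ahead of `thrust::remove_if`.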