Skip to content

Latest commit

 

History

History
32 lines (17 loc) · 1.74 KB

File metadata and controls

32 lines (17 loc) · 1.74 KB

simpleStreams - simpleStreams

Description

This sample uses CUDA streams to overlap kernel executions with memory copies between the host and a GPU device. This sample uses a new CUDA 4.0 feature that supports pinning of generic host memory. Requires Compute Capability 2.0 or higher.

Key Concepts

Asynchronous Data Transfers, CUDA Streams and Events

Supported SM Architectures

SM 5.0 SM 5.2 SM 5.3 SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0

Supported OSes

Linux, Windows

Supported CPU Architecture

x86_64, armv7l

CUDA APIs involved

cudaMemcpy, cudaSetDeviceFlags, cudaSetDevice, cudaEventDestroy, cudaStreamCreate, cudaMallocHost, cudaEventCreateWithFlags, cudaFreeHost, cudaMemcpyAsync, cudaGetDeviceCount, cudaStreamDestroy, cudaMemset, cudaEventElapsedTime, cudaHostAlloc, cudaFree, cudaHostRegister, cudaEventSynchronize, cudaEventRecord, cudaMalloc, cudaGetDeviceProperties, cudaHostUnregister

Prerequisites

Download and install the CUDA Toolkit for your corresponding platform.

References (for more details)