GPU Accelerated OCT Processing at MHz A-Scan Rate and High Resolution Video Rate Volumetric Rendering


In this paper, we describe how to highly optimize a CUDA-based platform to perform real-time processing of optical coherence tomography (OCT) interferometric data and 3D volumetric rendering using commercially available, cost-effective graphics processing units (GPUs). The maximum complete attainable axial scan processing rate (including memory transfer and display of the B-scan frame) was 2.24 megahertz for 16-bit pixel depth and an FFT size of 2048; the maximum 3D volumetric rendering rate (including B-scan, en face view display, and 3D rendering) was ~23 volumes/second (volume size: 1024x256x200). To the best of our knowledge, this is the fastest processing rate reported to date with a single-chip GPU and the first implementation of real-time video-rate volumetric OCT processing and rendering that is capable of matching the acquisition rates of ultrahigh-speed OCT.

Open Source Projects


The open source projects currently include working versions of File Read and Basler Acquisition. The Alazar and Dalsa code will be uploaded very soon! These projects can be found at:

http://code.google.com/p/fdoct-gpu-code

Please send an e-mail to ksw10@sfu.ca or yjian@sfu.ca for further information.

Figures and Videos



Figure 1: Flow chart for OCT acquisition and GPU processing.


Figure 2: CUDA Profiler timeline for the SS-OCT processing and volume rendering pipeline. Memcpy HtoD [async] transfers the raw data from host to device. subDC_PadCom is a kernel that performs type conversion, DC subtraction, and zero padding. spVector2048C_kernelTex is a kernel called by the cuFFT library to perform the FFT. CropMLS performs the modulus and log. Memcpy DtoA [sync] copies the processed OCT data into a CUDA 3D array for volume rendering.
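The stages named in Figure 2 can be illustrated with a small CPU reference. This is a minimal sketch in plain Python, not the project's CUDA kernels. The function names `fft` and `process_batch` are hypothetical. Two further choices are assumptions rather than details from the paper: the DC term is estimated as the mean across the batch, and the log output is scaled to dB. Cropping to the first half of the FFT output is also assumed, on the grounds that a real-valued input produces a mirrored spectrum.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def process_batch(raw, fft_size):
    """raw: list of A-scans, each a list of unsigned 16-bit samples."""
    n_samples = len(raw[0])
    # DC estimate (assumption): mean across the batch at each sample index.
    dc = [sum(a[i] for a in raw) / len(raw) for i in range(n_samples)]
    images = []
    for a in raw:
        # Type conversion and DC subtraction, then zero padding to fft_size.
        padded = [complex(float(a[i]) - dc[i], 0.0) for i in range(n_samples)]
        padded += [0j] * (fft_size - n_samples)
        spectrum = fft(padded)
        # Crop to the first half, then modulus and log (dB scaling assumed).
        half = spectrum[: fft_size // 2]
        images.append([20.0 * math.log10(abs(c) + 1e-12) for c in half])
    return images
```

On the GPU each of these loops would be a kernel or a batched cuFFT call over the whole frame; the sketch only shows the per-A-scan arithmetic in the order given in the caption.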


Figure 3: OCT A-scan batch processing rate. For SS-OCT, the processing pipeline includes DC subtraction, FFT, modulus, and log. For SD-OCT, the processing pipeline includes linear interpolation, DC subtraction, dispersion compensation, FFT, modulus, and log.
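The two extra SD-OCT steps listed in Figure 3, linear interpolation and dispersion compensation, can be sketched as follows. This is a hedged illustration in plain Python, not the released code. The names `resample_linear`, `dispersion_phase`, and `precondition_sd` are hypothetical, as are the coefficients `a2` and `a3`. Two modelling choices are assumptions: resampling positions are fractional sample indices, and dispersion is corrected by multiplying the spectrum by a polynomial phase term on a normalized axis.

```python
import cmath

def resample_linear(spectrum, positions):
    """Linearly interpolate `spectrum` at fractional sample indices.

    Positions are assumed to lie in [0, len(spectrum) - 1].
    """
    out = []
    last = len(spectrum) - 1
    for p in positions:
        i = min(int(p), last - 1)  # left neighbour, clamped so i + 1 is valid
        frac = p - i
        out.append(spectrum[i] * (1.0 - frac) + spectrum[i + 1] * frac)
    return out

def dispersion_phase(n, a2, a3):
    """Unit-magnitude correction exp(-j*(a2*k^2 + a3*k^3)), k in [-1, 1]."""
    phase = []
    for i in range(n):
        k = 2.0 * i / (n - 1) - 1.0
        phase.append(cmath.exp(-1j * (a2 * k ** 2 + a3 * k ** 3)))
    return phase

def precondition_sd(spectrum, dc, positions, phase):
    """Interpolate, subtract DC, apply the dispersion phase (order per Figure 3).

    `dc` and `phase` are given on the resampled grid (same length as positions).
    """
    resampled = resample_linear(spectrum, positions)
    return [(s - d) * w for s, d, w in zip(resampled, dc, phase)]
```

The complex output of `precondition_sd` would then be zero padded and passed to the FFT, modulus, and log stages shared with the SS-OCT pipeline. Because the positions and the phase vector do not change between A-scans, both can be computed once and reused for every batch.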



Figure 4: FD-OCT acquisition simulation at a 2.24 megahertz processing rate, with volume rendering at video rate (24 FPS) for a volume size of 1024x256x200.



Figure 5: GPU-accelerated processing, rendering, and display for 1060 nm swept-source OCT.



Figure 6: GPU-accelerated processing, rendering, and display for 1310 nm swept-source OCT.



Figure 7: Simulation of ultrahigh-speed OCT acquisition, processing, and display.


Reference


Y. Jian, K. Wong, and M. V. Sarunic, "GPU Accelerated OCT Processing at Megahertz Axial Scan Rate and High Resolution Video Rate Volumetric Rendering," Journal of Biomedical Optics (2012).