
The Hessenberg reduction (HR) algorithm plays a significant role in solving nonsymmetric eigenvalue problems. By porting it to the GPU, Tomov and Dongarra, of the University of Tennessee and Oak Ridge National Laboratory, report a 16X performance improvement (in double-precision arithmetic) over the latest LAPACK 3.1 algorithm running on current multicore CPUs alone. The paper also shows a way to accelerate a large and important class of dense linear algebra (DLA) algorithms, namely the two-sided factorizations.

 

See http://www.nvidia.com/object/cuda_home.html#state=detailsOpen;aid=f3c0c426-1df9-4fb3-b9f4-1bd7f62a6978 on CUDA Zone.


07/30/2009 | Direct link

The CUDA Toolkit and SDK v2.3 are now released and available to all developers.

Hardware debugging, significant performance improvements in single-precision transforms, new samples, and many other new features can be found in this release.

You can find more details about this release, as well as information on how to download the SDK for Windows, OS X, or Linux, at this location.


 


07/22/2009 | Direct link

We are kicking off a new series of online seminars on GPU computing with CUDA, including new sessions on OpenCL and best practice guidance.  Advance registration is required. More details can be found at http://developer.nvidia.com/object/gpu_computing_online.html


07/16/2009 | Direct link

The "CUDA C Programming Best Practices Guide" is now available to all GPU Computing Registered Developers.  This helpful guide includes chapters on the following topics and more:

  • Introduction to Parallel Computing with CUDA

  • Performance Metrics

  • Memory Optimizations

  • Execution Configuration Optimizations

  • Instruction Optimizations

  • Control Flow

  • Debugging

  • Numerical Accuracy and Precision

  • Performance Optimization Strategies

Follow this link for more details and discussion.


07/08/2009 | Direct link

PerfHUD