The following content was posted to the registered developer website on January 31st, 2013.
(new: [url]https://developer.nvidia.com/user/register[/url]; old: [url]https://partners.nvidia.com[/url]):
SIMD-in-a-word functions
simd_functions.h contains a collection of inline functions for processing byte and half-word data packed into 32-bit words. The functions are hardware-accelerated on Kepler platforms. Efficient emulation code is provided for earlier platforms, so the functions are portable across all compute capabilities. The functionality provided should be useful for image processing tasks and other application areas.
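To give a rough idea of what emulation with plain 32-bit operations can look like, here is a minimal host-side C++ sketch of two well-known SIMD-within-a-word tricks. It is not taken from simd_functions.h, and the names swar_add4 and swar_hadd4 are hypothetical; the header's actual emulation code may work differently.

```cpp
#include <cstdint>

// Hypothetical helper (not from simd_functions.h): add four packed bytes
// with wrap-around, using only 32-bit integer operations.
inline std::uint32_t swar_add4(std::uint32_t a, std::uint32_t b)
{
    // Add the low 7 bits of every byte; carries cannot cross a byte boundary.
    std::uint32_t low = (a & 0x7f7f7f7fu) + (b & 0x7f7f7f7fu);
    // Fold in the top bit of every byte with XOR, which discards the carry out.
    return low ^ ((a ^ b) & 0x80808080u);
}

// Hypothetical helper: per-byte unsigned average rounded down, (a + b) / 2,
// computed without overflow as (a & b) + ((a ^ b) >> 1).
inline std::uint32_t swar_hadd4(std::uint32_t a, std::uint32_t b)
{
    // Mask after the shift so bits do not leak from one byte into the next.
    return (a & b) + (((a ^ b) >> 1) & 0x7f7f7f7fu);
}
```

Both helpers process all four bytes at once without unpacking them, which is the general idea behind emulating such operations on hardware that lacks dedicated instructions.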
The list of supported functions is as follows:
[code]
vabsdiffu2(a,b) per-halfword unsigned absolute difference: |a - b|
vadd2(a,b) per-halfword (un)signed addition, with wrap-around: a + b
vavgu2(a,b) per-halfword unsigned rounded average: (a + b + 1) / 2
vcmpeq2(a,b) per-halfword (un)signed comparison: a == b ? 0xffff : 0
vcmpgeu2(a,b) per-halfword unsigned comparison: a >= b ? 0xffff : 0
vcmpgtu2(a,b) per-halfword unsigned comparison: a > b ? 0xffff : 0
vcmpleu2(a,b) per-halfword unsigned comparison: a <= b ? 0xffff : 0
vcmpltu2(a,b) per-halfword unsigned comparison: a < b ? 0xffff : 0
vcmpne2(a,b) per-halfword (un)signed comparison: a != b ? 0xffff : 0
vhaddu2(a,b) per-halfword unsigned average: (a + b) / 2
vmaxu2(a,b) per-halfword unsigned maximum: max(a, b)
vminu2(a,b) per-halfword unsigned minimum: min(a, b)
vseteq2(a,b) per-halfword (un)signed comparison: a == b ? 1 : 0
vsetgeu2(a,b) per-halfword unsigned comparison: a >= b ? 1 : 0
vsetgtu2(a,b) per-halfword unsigned comparison: a > b ? 1 : 0
vsetleu2(a,b) per-halfword unsigned comparison: a <= b ? 1 : 0
vsetltu2(a,b) per-halfword unsigned comparison: a < b ? 1 : 0
vsetne2(a,b) per-halfword (un)signed comparison: a != b ? 1 : 0
vsub2(a,b) per-halfword (un)signed subtraction, with wrap-around: a - b
vabsdiffu4(a,b) per-byte unsigned absolute difference: |a - b|
vadd4(a,b) per-byte (un)signed addition, with wrap-around: a + b
vavgu4(a,b) per-byte unsigned rounded average: (a + b + 1) / 2
vcmpeq4(a,b) per-byte (un)signed comparison: a == b ? 0xff : 0
vcmpgeu4(a,b) per-byte unsigned comparison: a >= b ? 0xff : 0
vcmpgtu4(a,b) per-byte unsigned comparison: a > b ? 0xff : 0
vcmpleu4(a,b) per-byte unsigned comparison: a <= b ? 0xff : 0
vcmpltu4(a,b) per-byte unsigned comparison: a < b ? 0xff : 0
vcmpne4(a,b) per-byte (un)signed comparison: a != b ? 0xff : 0
vhaddu4(a,b) per-byte unsigned average: (a + b) / 2
vmaxu4(a,b) per-byte unsigned maximum: max(a, b)
vminu4(a,b) per-byte unsigned minimum: min(a, b)
vseteq4(a,b) per-byte (un)signed comparison: a == b ? 1 : 0
vsetgeu4(a,b) per-byte unsigned comparison: a >= b ? 1 : 0
vsetgtu4(a,b) per-byte unsigned comparison: a > b ? 1 : 0
vsetleu4(a,b) per-byte unsigned comparison: a <= b ? 1 : 0
vsetltu4(a,b) per-byte unsigned comparison: a < b ? 1 : 0
vsetne4(a,b) per-byte (un)signed comparison: a != b ? 1 : 0
vsub4(a,b) per-byte (un)signed subtraction, with wrap-around: a - b
[/code]
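The per-byte formulas in the list can be read as applying to each 8-bit lane independently. The following host-side C++ sketch spells that out for a few entries. It is a reference for the semantics as described in the list, not the code from simd_functions.h; per_byte and the ref_ prefixed names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical reference helper, not the code from simd_functions.h.
// Applies op to each of the four bytes of a and b and repacks the results.
template <typename Op>
inline std::uint32_t per_byte(std::uint32_t a, std::uint32_t b, Op op)
{
    std::uint32_t r = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        std::uint32_t x = (a >> shift) & 0xffu;
        std::uint32_t y = (b >> shift) & 0xffu;
        r |= (op(x, y) & 0xffu) << shift;  // keep only the low 8 bits of the lane result
    }
    return r;
}

// |a - b| per byte, unsigned.
inline std::uint32_t ref_vabsdiffu4(std::uint32_t a, std::uint32_t b)
{
    return per_byte(a, b, [](std::uint32_t x, std::uint32_t y) { return x > y ? x - y : y - x; });
}

// (a + b + 1) / 2 per byte, unsigned rounded average.
inline std::uint32_t ref_vavgu4(std::uint32_t a, std::uint32_t b)
{
    return per_byte(a, b, [](std::uint32_t x, std::uint32_t y) { return (x + y + 1u) / 2u; });
}

// a == b ? 0xff : 0 per byte (mask-style result).
inline std::uint32_t ref_vcmpeq4(std::uint32_t a, std::uint32_t b)
{
    return per_byte(a, b, [](std::uint32_t x, std::uint32_t y) { return x == y ? 0xffu : 0u; });
}

// a == b ? 1 : 0 per byte (boolean-style result).
inline std::uint32_t ref_vseteq4(std::uint32_t a, std::uint32_t b)
{
    return per_byte(a, b, [](std::uint32_t x, std::uint32_t y) { return x == y ? 1u : 0u; });
}
```

The vcmp and vset variants differ only in the value written to a matching lane: vcmp produces an all-ones mask that can be used directly with bitwise AND/OR for selection, while vset produces 1, which is convenient for counting matches.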

(1) Log in at https://developer.nvidia.com/user
(2) Click green link "CUDA/GPU Computing Registered Developer Program"
(3) Click green link "Download" after "CUDA SIMD-within-a-word functions"
(4) Click green link "simd_functions.tar"
(5) Confirm legal notice by clicking "Agree & Download" button at the bottom of the notice
The download should start automatically at that point.

Sorry for the inconvenience. I do not have Chrome here to reproduce the problem (just IE and Firefox). I will notify the relevant team so they can look into this. Thanks for alerting us to this issue.

We have been unable to reproduce the problem with the download page in Chrome. Maybe the problem is tied to a specific version of Chrome or to specific security settings? You might want to file a bug with details of the exact Chrome configuration. It would also be of interest to hear whether other Chrome users have encountered issues with the download page.

I am a registered developer but cannot find this under developer.nvidia.com.

Thanks

I tried with Chrome on both OS X and Linux.