Question about installation of PFAC library

Dear all,

I am trying to achieve high-speed DNA sequence data analysis using our original algorithm, and I think that CUDA and the PFAC library should be useful for this.
I have a machine with a Tesla K20c, on which Ubuntu and CUDA 7.0 are already installed. I downloaded the package from GitHub (pfac-lib/PFAC) and ran make in the PFAC directory. To solve the compile errors that occurred on the first attempt, I modified common.mk and src/Makefile as follows:

% diff common.mk common.mk~
63,64c63,64
< sm_21_support := $(if $(filter $(nvcc_version), 3.2 4.0 4.1 4.2 7.0),1,)
< sm_30_support := $(if $(filter $(nvcc_version), 4.2 7.0),1,)
---
> sm_21_support := $(if $(filter $(nvcc_version), 3.2 4.0 4.1 4.2),1,)
> sm_30_support := $(if $(filter $(nvcc_version), 4.2),1,)
% diff Makefile~ Makefile
51,53c51,53
< #cu_cpp_sm13_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm13_%.cpp,$(CU_CPP))
< #cu_cpp_sm12_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm12_%.cpp,$(CU_CPP))
< #cu_cpp_sm11_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm11_%.cpp,$(CU_CPP))
---
> cu_cpp_sm13_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm13_%.cpp,$(CU_CPP))
> cu_cpp_sm12_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm12_%.cpp,$(CU_CPP))
> cu_cpp_sm11_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm11_%.cpp,$(CU_CPP))
58,60c58,60
< #cu_cpp_obj_sm13_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm13_%.cpp.o,$(CU_CPP))
< #cu_cpp_obj_sm12_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm12_%.cpp.o,$(CU_CPP))
< #cu_cpp_obj_sm11_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm11_%.cpp.o,$(CU_CPP))
---
> cu_cpp_obj_sm13_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm13_%.cpp.o,$(CU_CPP))
> cu_cpp_obj_sm12_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm12_%.cpp.o,$(CU_CPP))
> cu_cpp_obj_sm11_loc = $(patsubst %.cpp,$(OBJ_DIR)/sm11_%.cpp.o,$(CU_CPP))
71,73c71
< mk_libso_no21: $(cu_cpp_sm20_loc)
< $(CXX) -shared -o $(LIB_DIR)/libpfac_sm20.so $(LIBS) $(cu_cpp_obj_sm20_loc)
< #mk_libso_no21: $(cu_cpp_sm20_loc) $(cu_cpp_sm13_loc) $(cu_cpp_sm12_loc) $(cu_cpp_sm11_loc)
---
> #mk_libso_no21: $(cu_cpp_sm20_loc)
75,77c73,77
< # $(CXX) -shared -o $(LIB_DIR)/libpfac_sm13.so $(LIBS) $(cu_cpp_obj_sm13_loc)
< # $(CXX) -shared -o $(LIB_DIR)/libpfac_sm12.so $(LIBS) $(cu_cpp_obj_sm12_loc)
< # $(CXX) -shared -o $(LIB_DIR)/libpfac_sm11.so $(LIBS) $(cu_cpp_obj_sm11_loc)
---
> mk_libso_no21: $(cu_cpp_sm20_loc) $(cu_cpp_sm13_loc) $(cu_cpp_sm12_loc) $(cu_cpp_sm11_loc)
> $(CXX) -shared -o $(LIB_DIR)/libpfac_sm20.so $(LIBS) $(cu_cpp_obj_sm20_loc)
> $(CXX) -shared -o $(LIB_DIR)/libpfac_sm13.so $(LIBS) $(cu_cpp_obj_sm13_loc)
> $(CXX) -shared -o $(LIB_DIR)/libpfac_sm12.so $(LIBS) $(cu_cpp_obj_sm12_loc)
> $(CXX) -shared -o $(LIB_DIR)/libpfac_sm11.so $(LIBS) $(cu_cpp_obj_sm11_loc)

After running make again, the test program included in the archive, simple_example.exe, appeared to build successfully. However, running it aborts at the first assertion:
simple_example.exe: simple_example.cpp:62: int main(int, char**): Assertion `PFAC_STATUS_SUCCESS == PFAC_status' failed.
Using gdb, I confirmed that the value of PFAC_status was PFAC_STATUS_ARCH_MISMATCH when the error occurred.
How can I solve this problem?
Thank you for any help.
Koji Doi

Your Tesla K20c is a cc3.5 device. PFAC was apparently written to support only cc1.1, cc1.2, cc1.3, cc2.0, cc2.1 and cc3.0 devices. Take a look in PFAC.cpp (starting at line 153):

    cudaDeviceProp deviceProp;
    cudaGetDeviceProperties(&deviceProp, device);

    PFAC_PRINTF("major = %d, minor = %d, name=%s\n", deviceProp.major, deviceProp.minor, deviceProp.name );

    int device_no = 10*deviceProp.major + deviceProp.minor ;
    if ( 30 == device_no ){
        strcpy (modulepath, "libpfac_sm30.so");    
    }else if ( 21 == device_no ){
        strcpy (modulepath, "libpfac_sm21.so");    
    }else if ( 20 == device_no ){
        strcpy (modulepath, "libpfac_sm20.so");
    }else if ( 13 == device_no ){
        strcpy (modulepath, "libpfac_sm13.so");
    }else if ( 12 == device_no ){
        strcpy (modulepath, "libpfac_sm12.so");
    }else if ( 11 == device_no ){
        strcpy (modulepath, "libpfac_sm11.so");
    }else{
        return PFAC_STATUS_ARCH_MISMATCH ;
    }

Your cc3.5 device will put 35 into the above test, so none of the branches match and the function returns PFAC_STATUS_ARCH_MISMATCH.

You could try modifying the above test in PFAC.cpp to:

    cudaDeviceProp deviceProp;
    cudaGetDeviceProperties(&deviceProp, device);

    PFAC_PRINTF("major = %d, minor = %d, name=%s\n", deviceProp.major, deviceProp.minor, deviceProp.name );

    int device_no = 10*deviceProp.major + deviceProp.minor ;
    if (( 30 == device_no ) || ( 35 == device_no)){ // ****modification here on this line only ****
        strcpy (modulepath, "libpfac_sm30.so");    
    }else if ( 21 == device_no ){
        strcpy (modulepath, "libpfac_sm21.so");    
    }else if ( 20 == device_no ){
        strcpy (modulepath, "libpfac_sm20.so");
    }else if ( 13 == device_no ){
        strcpy (modulepath, "libpfac_sm13.so");
    }else if ( 12 == device_no ){
        strcpy (modulepath, "libpfac_sm12.so");
    }else if ( 11 == device_no ){
        strcpy (modulepath, "libpfac_sm11.so");
    }else{
        return PFAC_STATUS_ARCH_MISMATCH ;
    }

Then see what happens. Your cc3.5 device should work fine with a cc3.0 library, but I haven’t actually tested that with this application.
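For reference, here is a minimal sketch of the same lookup written as a standalone function, so the mapping can be checked without a GPU. The function name select_pfac_module is hypothetical and not part of PFAC. The sketch also assumes that every cc3.x device, not just cc3.0 and cc3.5, can use the sm30 library, which has not been tested with this application.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of PFAC: maps a compute capability to the
// module file name used in the PFAC.cpp excerpt above. Returns an empty
// string when no library matches (PFAC.cpp returns PFAC_STATUS_ARCH_MISMATCH
// in that case).
std::string select_pfac_module(int major, int minor) {
    assert(major >= 0 && minor >= 0);
    const int device_no = 10 * major + minor;

    // Assumption: any cc3.x device may load the library built for sm_30.
    if (major == 3) {
        return "libpfac_sm30.so";
    }
    switch (device_no) {
        case 21: return "libpfac_sm21.so";
        case 20: return "libpfac_sm20.so";
        case 13: return "libpfac_sm13.so";
        case 12: return "libpfac_sm12.so";
        case 11: return "libpfac_sm11.so";
        default: return "";  // unsupported architecture
    }
}
```

With this mapping, select_pfac_module(3, 5) yields the same file name as select_pfac_module(3, 0), which is the effect of the one-line change suggested above.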

Thank you very much, txbob.
I modified PFAC.cpp as suggested and updated all the libraries. The problem at the first assertion seems to be solved.

However, simple_example.exe now aborts at step 4, on the following line:

PFAC_status = PFAC_matchFromHost( handle, h_inputString, input_size, h_matched_result ) ;

PFAC_matchFromHost returns PFAC_STATUS_CUDA_ALLOC_FAILED.
I could not find any information explaining what this error code means or how to resolve it.
Does anyone have suggestions?

Koji Doi

Since CUDA 4.1 the function cudaGetTextureReference() is marked as deprecated.
Link: http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__TEXTURE__DEPRECATED_gc86c8aca553c7d6dcd8c57e5b93882a8.html

The PFAC library uses this function in all *.cu kernel files to store the transition tables for the state machine in the texture memory of the device. To fix the problem, change this call in all *.cu files from:

cudaGetTextureReference( (const struct textureReference**)&texRefTable, "tex_PFAC_table" );

to

cudaGetTextureReference( (const struct textureReference**)&texRefTable, &tex_PFAC_table );

Note that I have not yet verified the results!

Hi there,

I have a question about the installation on Mac OS. I ran the installation of PFAC from the GitHub repository.
I have CUDA 6.0 installed and it works well. During the installation with the command “make”, I get the following error:

g++ -m64 -fopenmp -shared -o ../lib/libpfac_sm20.so -L/Developer/NVIDIA/CUDA-6.0/lib64 -lcudart -ldl -lpthread ../obj/sm20_PFAC_kernel.cu.cpp.o ../obj/sm20_PFAC_reduce_kernel.cu.cpp.o ../obj/sm20_PFAC_reduce_inplace_kernel.cu.cpp.o ../obj/sm20_PFAC_kernel_spaceDriven.cu.cpp.o
ld: warning: directory not found for option '-L/Developer/NVIDIA/CUDA-6.0/lib64'
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [mk_libso_no21] Error 1
make: *** [src] Error 2

The point is, I never had a “lib64” folder in my CUDA 6.0 directory…

What am I supposed to do? I tried changing my CUDA version: I tried version 3.2, which is mentioned in the PFAC documentation, and version 7.5, to check whether “lib64” appears. Neither has it.

Any kind of help will be appreciated, and thank you for your time. :)

Hi Wocka,
this library uses OpenMP to parallelize the CPU version and, I think, also to use more than one GPU. I don’t use Mac OS, but I found something on Stack Overflow that may help you.

Error enabling openmp - “ld: library not found for -lgomp” and Clang errors:
http://stackoverflow.com/questions/20321988/error-enabling-openmp-ld-library-not-found-for-lgomp-and-clang-errors

Thanks a lot, falCon1394, for this link. I had seen threads on this topic before, and I will try to get a step further with your link.

A quick question for you: are you using Linux? Do you have this lib64 folder with your CUDA version?

Yes, I am using Linux with CUDA 7.5 and I have this folder. But on Linux there are many symlinks to the real objects on my system.

All right. I am trying to install another GCC version, as your link suggested. Then I will try to build the PFAC library again with that compiler.

Are you using PFAC with this CUDA version? Are there no problems with it? I ask because I read that CUDA 3.2 is the officially tested version, right?

Thank you for your advice. :)

The original library (Git master branch) doesn’t run on my system. But with the changes described above it works on Linux.

Good luck and have fun with the lib!


A little side note for this thread: the authors wrote a subsequent paper on accelerating regular expression matching with this library.

Paper name: Accelerating Regular Expression Matching Using Hierarchical Parallel Machines on GPU
http://cial.csie.ncku.edu.tw/st2011/pdf/Accelerating%20Regular%20Expression%20Matching%20Using.pdf

Hi Guys,

I know this is a long shot (given how old the thread is), but I seem to have the same error as Kojidoi.

simple_example.exe: simple_example.cpp:62: int main(int, char**): Assertion `PFAC_STATUS_SUCCESS == PFAC_status' failed.

I seem to have a problem with LD_LIBRARY_PATH, where PFAC_LIB_ROOT does not seem to be set properly.

I tried

export LD_LIBRARY_PATH=$(PFAC_LIB_ROOT)/lib:$LD_LIBRARY_PATH

This returns an error.

I then tried exporting the full paths:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/alex/PFAC-master/PFAC/lib:/usr/local/cuda-4.0/lib:/usr/local/cuda-7.0/lib:/usr/local/cuda-7.0/lib64:/usr/local/cuda-4.0/lib64

However, when running the simple example again, I get the following error:

PFAC_STATUS_LIB_NOT_EXIST: cannot find PFAC library, please check LD_LIBRARY_PATH 
simple_example.exe: simple_example.cpp:63: int main(int, char**): Assertion `PFAC_STATUS_SUCCESS == PFAC_status' failed.
Aborted (core dumped)

As you can see, it can’t find the PFAC library. I have looked at the manual (PFAC_userGuide_r1.1.pdf in the pfac-lib/PFAC repository on GitHub) but failed to find a solution.

Any advice would be welcome.
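
One thing worth checking first: $(PFAC_LIB_ROOT) is Makefile syntax. In a bash shell, $(...) is command substitution, so the shell tries to run a command named PFAC_LIB_ROOT; the shell form of a variable reference is $PFAC_LIB_ROOT or ${PFAC_LIB_ROOT}. Beyond that, a minimal sketch like the following can show which directory in a colon-separated search path actually contains a given library file. The helper names are hypothetical, and it assumes the library is located through LD_LIBRARY_PATH, as the error message suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Split a colon-separated search path (e.g. the value of LD_LIBRARY_PATH)
// into its directories. Empty entries are skipped.
std::vector<std::string> split_search_path(const std::string& path_list) {
    std::vector<std::string> dirs;
    std::istringstream stream(path_list);
    std::string dir;
    while (std::getline(stream, dir, ':')) {
        if (!dir.empty()) {
            dirs.push_back(dir);
        }
    }
    return dirs;
}

// Return the first directory in path_list in which file_name can be opened
// for reading, or an empty string if no directory contains it.
std::string find_library_dir(const std::string& path_list,
                             const std::string& file_name) {
    assert(!file_name.empty());
    const std::vector<std::string> dirs = split_search_path(path_list);
    for (std::size_t i = 0; i < dirs.size(); ++i) {
        const std::string candidate = dirs[i] + "/" + file_name;
        std::ifstream probe(candidate.c_str());
        if (probe.good()) {
            return dirs[i];
        }
    }
    return "";
}
```

To use it, pass the value of std::getenv("LD_LIBRARY_PATH") (after checking it is not a null pointer) together with a file name such as libpfac_sm30.so, taken from the PFAC.cpp excerpt earlier in this thread. An empty result would mean that none of the listed directories holds that file, for example because the library for that architecture was never built.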
