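PacBio (Pacific Biosciences) is a company that develops third-generation sequencing technology. Its latest instrument at the time of writing is the RS II, which can produce reads up to about 30 kb long.
The raw data comes in HDF5 format, in two kinds of file: bas.h5 and bax.h5. PacBio's documentation describes them as follows: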
"Due to the increased throughput and read lengths achieved by the PacBio® RS II upgrade, this information is now contained in one bas.h5 file and three bax.h5 files."
http://files.pacb.com/software/instrument/2.0.0/bas.h5%20Reference%20Guide.pdf
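HDF5 files cannot be viewed directly. To look at the sequences, the files first have to be converted to another format, and PacBio provides a tool called pbh5tools for this conversion.
pbh5tools is a package written in Python, so several other Python packages have to be installed before it. pip is the better choice for installing them, because it takes care of the versions of dependent packages; avoid easy_install where possible.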
http://blog.longwin.com.tw/2014/08/python-setup-pip-package-2014/
Besides the Python packages, you also need python-devel, plus hdf5 and its related packages.
The commands to install these requirements are:
sudo yum install python-pip python-devel.x86_64 hdf5-devel.x86_64 hdf5.x86_64 hdf5-mpich-devel.x86_64 hdf5-mpich.x86_64 hdf5-openmpi.x86_64 hdf5-openmpi-devel.x86_64
sudo pip install --upgrade pip
sudo pip install --upgrade nose
sudo pip install --upgrade Cython
sudo pip install --upgrade numpy
sudo pip install --upgrade six
sudo pip install --upgrade setuptools
sudo pip install --upgrade pkgconfig
sudo pip install --upgrade pysam
sudo pip install --upgrade h5py
sudo pip install --upgrade pbcore
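After the installs finish, it can be worth checking that each package actually imports under the interpreter you plan to use. The following is only a minimal sketch: the helper name missing_modules and the PYTHON variable are made up for illustration and are not part of pip or pbh5tools.

```shell
#!/bin/sh
# Print the name of every module in the argument list that fails to import.
# PYTHON is a hypothetical override for the interpreter; it defaults to "python".
missing_modules() {
    for mod in "$@"; do
        "${PYTHON:-python}" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

# Import names of the packages installed above.
missing=$(missing_modules nose Cython numpy six setuptools pkgconfig pysam h5py pbcore)

if [ -z "$missing" ]; then
    echo "all modules import cleanly"
else
    echo "modules that failed to import:"
    echo "$missing"
fi
```

Setting PYTHON=python2.7 before running it checks a manually installed interpreter instead of the system one.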
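Note that pbcore requires Python 2.7. If the system ships Python 2.6 and cannot be upgraded to 2.7 through yum, you have to install Python 2.7 manually, taking care that the two versions coexist on the system.
References: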
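To find out which case applies, the interpreter version can be compared against 2.7 before installing pbcore. This is a rough sketch rather than a definitive check: the function names version_at_least and python_version are invented for this example, and only the major and minor version numbers are compared.

```shell
#!/bin/sh
# Succeed (exit status 0) when version $1 (e.g. "2.6.6") is at least
# MAJOR.MINOR $2 (e.g. "2.7"). Only the first two components are compared.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    have_minor=${have_minor%%[!0-9]*}   # drop suffixes such as "rc1"
    want_major=${2%%.*}
    want_minor=${2#*.}
    if [ "$have_major" -gt "$want_major" ]; then
        return 0
    fi
    [ "$have_major" -eq "$want_major" ] && [ "${have_minor:-0}" -ge "$want_minor" ]
}

# Print the version of the interpreter named in $1, or nothing if it is absent.
# Python 2 writes "Python X.Y.Z" to stderr, hence the 2>&1.
python_version() {
    command -v "$1" >/dev/null 2>&1 || return 0
    "$1" --version 2>&1 | awk '{print $2}'
}

current=$(python_version python)
if [ -z "$current" ]; then
    echo "no python found on PATH"
elif version_at_least "$current" 2.7; then
    echo "python $current is new enough for pbcore"
else
    echo "python $current is older than 2.7; install 2.7 alongside it"
fi
```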
http://ruiaylin.github.io/2014/12/12/python%20update/
https://github.com/h2oai/h2o-2/wiki/Installing-python-2.7-on-centos-6.3.-Follow-this-sequence-exactly-for-centos-machine-only
http://toomuchdata.com/2014/02/16/how-to-install-python-on-centos/
http://tecadmin.net/install-python-2-7-on-centos-rhel/