Installing and using the HIT (Harbin Institute of Technology) LTP word segmenter on Linux

Installation
    py@ubuntu:~/models/syntaxnet$ sudo pip install pyltp

[sudo] password for py:
The directory '/home/py/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/py/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pyltp
  Downloading pyltp-0.1.9.1.tar.gz (3.3MB)
    100% |████████████████████████████████| 3.3MB 211kB/s
Installing collected packages: pyltp
  Running setup.py install for pyltp ... done
Successfully installed pyltp-0.1.9.1

Download the model files from https://pan.baidu.com/share/link?shareid=1988562907&uk=2738088569#list/path=%2F . There are many files there;
I downloaded ltp_data_v3.4.0.tar.gz and copied it to /usr/local.

    py@ubuntu:~/models/syntaxnet$ sudo cp ~/Desktop/ltp_data_v3.4.0.tar.gz  /usr/local

[sudo] password for py:
py@ubuntu:~/models/syntaxnet$ cd /usr/local
py@ubuntu:/usr/local$ ls
bin  etc  games  include  lib  ltp_data_v3.4.0.tar.gz  man  sbin  share  src
Extract the archive
    py@ubuntu:/usr/local$ sudo tar -xzvf ltp_data_v3.4.0.tar.gz
ltp_data_v3.4.0/
ltp_data_v3.4.0/cws.model
ltp_data_v3.4.0/md5.txt
ltp_data_v3.4.0/ner.model
ltp_data_v3.4.0/parser.model
ltp_data_v3.4.0/pisrl.model
ltp_data_v3.4.0/pos.model
ltp_data_v3.4.0/version
py@ubuntu:/usr/local$ ls
bin  games    lib              ltp_data_v3.4.0.tar.gz  sbin   src
etc  include  ltp_data_v3.4.0  man                     share
Rename the directory
    py@ubuntu:/usr/local$ sudo mv ltp_data_v3.4.0 ltp

py@ubuntu:/usr/local$ ls
bin  etc  games  include  lib  ltp  ltp_data_v3.4.0.tar.gz  man  sbin  share  src
py@ubuntu:/usr/local$ sudo rm -rf ltp_data_v3.4.0.tar.gz
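
Before moving on to the test, it can help to confirm that the model files ended up where the later steps expect them. The following is a minimal sketch, assuming the directory was renamed to /usr/local/ltp as above; the helper name missing_models and the EXPECTED_MODELS list are illustrative, with the file names taken from the tar output shown earlier.

```python
import os

# Model files listed in the tar output above (md5.txt and version are metadata).
EXPECTED_MODELS = ["cws.model", "ner.model", "parser.model",
                   "pisrl.model", "pos.model"]


def missing_models(model_dir="/usr/local/ltp"):
    """Return the expected model files that are absent from model_dir.

    Hypothetical helper for illustration; it is not part of pyltp.
    """
    return [name for name in EXPECTED_MODELS
            if not os.path.isfile(os.path.join(model_dir, name))]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing model files: " + ", ".join(missing))
    else:
        print("All model files found.")
```

An empty result means all five models are in place; only cws.model is needed for the segmentation test below.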

Test
py@ubuntu:/usr/local$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyltp import Segmentor
>>> segmentor = Segmentor()
>>> segmentor.load("/usr/local/ltp/cws.model")
>>> words = segmentor.segment("这本书很好")
>>> print ' '.join(words)
这    本    书    很    好
>>> segmentor.release()
>>>
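
The interactive session above can also be wrapped in a small reusable function. This is a sketch rather than an official recipe: it uses only the Segmentor calls shown in the session (load, segment, release) as they exist in pyltp 0.1.9.1, and assumes the model lives at /usr/local/ltp/cws.model. The function name segment and the segmentor_cls parameter are hypothetical; the latter exists only so the function can be exercised without pyltp installed.

```python
# -*- coding: utf-8 -*-


def segment(text, model_path="/usr/local/ltp/cws.model", segmentor_cls=None):
    """Segment text into a list of words, always releasing the model.

    segmentor_cls is a hypothetical hook for substituting a stand-in class;
    by default the real pyltp Segmentor is imported and used.
    """
    if segmentor_cls is None:
        # Imported lazily so the module loads even where pyltp is absent.
        from pyltp import Segmentor as segmentor_cls
    segmentor = segmentor_cls()
    segmentor.load(model_path)
    try:
        # Copy the result into a plain list before the model is released.
        return list(segmentor.segment(text))
    finally:
        segmentor.release()


# Example call (requires pyltp and the model file):
#   print(" ".join(segment("这本书很好")))
```

Putting release() in a finally block means the model is freed even if segmentation raises an error, which matters when the function is called repeatedly in a long-running process.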

Original post: https://www.cnblogs.com/herosoft/p/8134213.html