Useful and Fun Python Packages

click

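The nicest way to build a command-line interface; its parser is derived from optparse.
optparse and argparse are part of the Python standard library, but they are less convenient to use.
docopt builds the command-line interface by parsing the usage text in the docstring.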

Fire

A non-intrusive command-line tool, in the same category as click and argparse.

ujson

A very fast JSON encoder and decoder implemented in C.
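A minimal sketch of how ujson is commonly used as an optional drop-in replacement. It assumes that ujson's dumps and loads behave like the standard library's for basic data; the alias json_impl and the sample payload are made up for illustration.

```python
# Prefer ujson when it is installed, otherwise fall back to the stdlib.
try:
    import ujson as json_impl  # third-party, optional
except ImportError:
    import json as json_impl

# Hypothetical sample data, only used to show a round trip.
payload = {"name": "user0", "age": 0, "tags": ["a", "b"]}

encoded = json_impl.dumps(payload)   # Python object -> JSON string
decoded = json_impl.loads(encoded)   # JSON string -> Python object

print(json_impl.__name__, decoded == payload)
```

The exact whitespace of the encoded string can differ between the two libraries, so code should compare decoded objects rather than raw strings.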

prettytable

Displays data as a table in the console.

import prettytable
t=prettytable.PrettyTable(['name','age'])
for i in range(10):
    t.add_row(['user%s'%i,i])
print(t)

tabulate

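The tabulate package is built around a single function, tabulate.

The data argument

Supported data types

a 2D list or any other 2D iterable
a list of dicts, each dict representing one row
a 2D numpy array
a numpy record array
a pandas.DataFrame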

headers

"firstrow"
"keys"
a list of strings

showindex

a boolean: whether to show each row's index
a list of row IDs: sets the ID of each row explicitly

tablefmt

"plain"
"simple"
"github"
"grid"
"fancy_grid"
"pipe"
"orgtbl"
"jira"
"presto"
"psql"
"rst"
"mediawiki"
"moinmoin"
"youtrack"
"html"
"latex"
"latex_raw"
"latex_booktabs"
"textile"

wget

>>> import wget
>>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
>>> filename = wget.download(url)
100% [................................................] 3841532 / 3841532
>>> filename
'razorback.mp3'

Linux and macOS users have another option: from sh import wget, which calls the system wget command.

progressbar

Everyone already knows tqdm, so there is no need to cover it; here is a more flexible progress bar.

from progressbar import ProgressBar
import time
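# The classic progressbar package requires start() before update();
# progressbar2 names this parameter max_value instead of maxval.
pbar = ProgressBar(maxval=10).start()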
for i in range(1, 11):
    pbar.update(i)
    time.sleep(1)
pbar.finish()

colorama

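Prints colored text to the console. The library is clean and elegant, and consists mainly of four things:

Fore: sets the foreground color
Back: sets the background color
Style: sets the text brightness; the values are BRIGHT, DIM, NORMAL and RESET_ALL
Cursor: moves the cursor; it offers five functions, UP, DOWN, FORWARD, BACK and POS. The direction functions take one argument n, the number of cells to move; POS takes x and y, the position to move the cursor to.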

import colorama

colorama.init(autoreset=True)  # reset colors automatically after each print
print(dir(colorama))
for t in "Fore Back Style".split():
    ty = getattr(colorama, t)
    for i in dir(ty):
        if i.startswith('__'): continue
        print(t, i, getattr(ty, i) + "everything is under my control")

print("haha" + colorama.Cursor.BACK(2) + "baga")  # prints habaga

The following code shows how colored printing works under the hood:

class Colorize:
    color_map = {
        'black': 0,
        'red': 1,
        'green': 2,
        'yellow': 3,
        'blue': 4,
        'magenta': 5,
        'cyan': 6,
        'white': 7
    }
    format_buffer = dict(
        bg_color=None,
        text_color=None,
        is_bold=None,
    )

    @classmethod
    def text(cls, color):
        cls.format_buffer['text_color'] = cls.color_map.get(color.lower(), None)
        if cls.format_buffer['text_color'] is not None:
            cls.format_buffer['text_color'] += 30
        return cls

    @classmethod
    def bg(cls, color):
        cls.format_buffer['bg_color'] = cls.color_map.get(color.lower(), None)
        if cls.format_buffer['bg_color'] is not None:
            cls.format_buffer['bg_color'] += 40
        return cls

    @classmethod
    def bold(cls):
        cls.format_buffer['is_bold'] = 1
        return cls

    def __new__(cls, *message, delimiter=' '):
        # '\033[' starts an ANSI escape sequence; '\033[0m' resets all attributes
        result = '\033[{}m{}\033[0m'.format(';'.join([str(x) for x in cls.format_buffer.values() if x is not None]),
                                            delimiter.join([str(m) for m in message]))
        cls.format_buffer['text_color'] = None
        cls.format_buffer['bg_color'] = None
        cls.format_buffer['is_bold'] = None
        return result

Usage: print(Colorize.text(color)(s))
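The same principle can be sketched as a stateless function, which avoids the shared class-level buffer. The names colorize and COLOR_CODES are made up for illustration; the numeric codes are the standard ANSI SGR values (30-37 for text color, 40-47 for background, 1 for bold).

```python
# Offsets of the eight basic ANSI colors.
COLOR_CODES = {
    "black": 0, "red": 1, "green": 2, "yellow": 3,
    "blue": 4, "magenta": 5, "cyan": 6, "white": 7,
}


def colorize(message, text=None, bg=None, bold=False):
    """Wrap message in an ANSI escape sequence and a trailing reset."""
    codes = []
    if bold:
        codes.append(1)                       # 1 = bold / bright
    if text is not None:
        codes.append(30 + COLOR_CODES[text])  # 30-37 = text color
    if bg is not None:
        codes.append(40 + COLOR_CODES[bg])    # 40-47 = background color
    if not codes:
        return message                        # nothing to apply
    return "\033[{}m{}\033[0m".format(";".join(map(str, codes)), message)


sample = colorize("hello", text="red", bg="white", bold=True)
print(sample)        # rendered in color by ANSI-capable terminals
print(repr(sample))  # shows the raw escape sequence
```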

functools.lru_cache

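A least-recently-used (LRU) cache decorator that caches a function's results.
The decorator takes two parameters, maxsize and typed. With maxsize=None the cache grows without bound; otherwise maxsize must be an int giving the maximum number of distinct calls (argument combinations) whose results are kept. With typed=True, arguments of different types are cached separately, so 3.0 and 3 are different keys; with typed=False, arguments that compare equal may share one entry.
All arguments of the decorated function must be hashable.
If f() is decorated, f.cache_info() reports cache statistics: hits, misses, maximum size and current size.
f.cache_clear() empties the cache.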

import functools


@functools.lru_cache(maxsize=2, typed=False)
def haha(x):
    print(haha.__name__, x, 'is called')
    return str(x)


haha(1)
haha(2)
haha(3)
print(haha.cache_info())
haha(1)
haha(3)
haha.cache_clear()
haha(3)

"""Output
haha 1 is called
haha 2 is called
haha 3 is called
CacheInfo(hits=0, misses=3, maxsize=2, currsize=2)
haha 1 is called
haha 3 is called
"""
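The effect of the typed parameter can be sketched as follows. The function names are made up for illustration. The sketch uses two arguments on purpose: the Python documentation notes that some types such as str and int may be cached separately even when typed is false, and a two-argument call avoids relying on that detail.

```python
import functools


@functools.lru_cache(maxsize=None, typed=True)
def add_typed(a, b):
    return a + b


@functools.lru_cache(maxsize=None, typed=False)
def add_untyped(a, b):
    return a + b


# typed=True: (3, 1) and (3.0, 1) are different keys, so both calls miss.
add_typed(3, 1)
add_typed(3.0, 1)
typed_info = add_typed.cache_info()

# typed=False: the argument tuples compare equal, so the second call
# is served from the cache and returns the result of the first call.
first = add_untyped(3, 1)
second = add_untyped(3.0, 1)
untyped_info = add_untyped.cache_info()

print(typed_info)
print(untyped_info)
print(type(first).__name__, type(second).__name__)
```

With typed=False the second call returns the int computed by the first call, which is worth keeping in mind when the result type depends on the argument type.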

Before I knew about this decorator, I wrote one of my own with the same purpose, though it is certainly less thorough than lru_cache.

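import functools
import json
import time


def simple_cache(timeout=3):
    """
    In-memory cache with expiry.
    :param timeout: float, time in seconds after which a cached value expires
    :return: a decorator; the decorated function returns the cached or freshly computed value

    Notes:
    * The arguments of the decorated function must be JSON-serializable, otherwise the call fails
    * Do not use this decorator on functions called with many distinct arguments, or memory use can blow up
    * This decorator is suited to functions like the following:
    >>> @simple_cache(timeout=3)
    ... def haha(user_id):
    ...     print("haha", user_id)
    >>> haha(0)  # first call with user_id=0: haha is executed
    >>> haha(0)  # second call with user_id=0: served from the cache, haha is not executed
    >>> haha(1)  # first call with user_id=1: cache miss, haha is executed
    >>> time.sleep(5)
    >>> haha(1)  # the entry has expired by now: cache miss, haha is executed again
    """

    def decorator(f):
        @functools.wraps(f)  # keep the name and docstring of the decorated function
        def ff(*args, **kwargs):
            # Serialize the arguments into a string key; sort_keys makes
            # the key independent of the keyword argument order.
            arg = json.dumps([args, kwargs], sort_keys=True)
            res = None
            key = f.__module__ + f.__name__ + arg
            # Cached entries are stored as attributes on the function object.
            if hasattr(f, key):
                res = getattr(f, key)
                if time.time() - res['last_time'] > timeout:
                    res = None  # entry expired
            if res is None:
                res = {'last_time': time.time(), 'data': f(*args, **kwargs)}
                setattr(f, key, res)
            return res['data']

        return ff

    return decorator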

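Using git submodules

When one repo depends on another repo that cannot be installed with pip, the dependency can be added as a git submodule, which is recorded in a .gitmodules file like this: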

[submodule "vendor/libbpp"]
	path = vendor/libbpp
	url = git@git-core.megvii-inc.com:SkunkWorks/libbpp.git

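The command git submodule update --init --recursive initializes the submodules.

Using make

A project may have a lot of commands, each of them short. Creating a separate .bat or .sh file for every one is tedious, and this is where make comes in. In Node, commands can be configured in package.json, but each script there holds a single command line. With make, several long commands can be replaced by one short target.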

deploy:
	make -j 64 -f Makefile.work deploy

update:
	git pull
	git submodule update --init --recursive
	pip3 install --user -r requirements.txt

A simple process supervisor

supervisor.sh

#!/bin/bash
set -x

while true; do
    date
    "$@"
    sleep 1
done

To use it, run supervisor.sh haha; whenever haha exits, it is started again one second later.
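For environments without bash, the same restart loop can be sketched in Python. The function supervise and its parameters max_runs and delay are made up for illustration; unlike the shell script, this sketch stops after a fixed number of runs so that the example terminates.

```python
import subprocess
import sys
import time


def supervise(argv, max_runs=None, delay=1.0):
    """Run argv again and again, sleeping delay seconds between runs.

    max_runs=None loops forever like the shell script; an integer
    limits the number of runs. Returns the exit codes observed.
    """
    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        print(time.strftime("%Y-%m-%d %H:%M:%S"), "starting", argv)
        exit_codes.append(subprocess.call(argv))  # blocks until the child exits
        time.sleep(delay)
    return exit_codes


# Demo: a child process that exits immediately with status 3, run twice.
codes = supervise([sys.executable, "-c", "raise SystemExit(3)"],
                  max_runs=2, delay=0.1)
print(codes)
```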

pony

pony is a Python ORM framework. Users do not have to write SQL; the framework automatically translates Python syntax into SQL statements.
Pony is a Python ORM with beautiful query syntax.
https://ponyorm.org/

Send2Trash

Cross-platform: sends files to the trash (recycle bin).