Notes on developing command-line programs in Python

#==========================
# Modules you need to know: os, os.path, and shutil
#==========================
Command-line programs frequently have to deal with the operating system and with files.

An introduction to the os, os.path, and shutil modules:
http://www.cnblogs.com/lovemo1314/archive/2010/11/08/1871781.html

http://docs.python.org/library/os.html
http://docs.python.org/library/shutil.html#module-shutil
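
As a rough illustration of how the three modules fit together, here is a minimal sketch that copies a file into a backup folder. The helper name backup_file and the file names are made up for this example.

```python
import os
import shutil
import tempfile

def backup_file(src_path, backup_dir):
    """Copy src_path into backup_dir, creating the directory if needed."""
    if not os.path.isdir(backup_dir):
        os.makedirs(backup_dir)
    dest_path = os.path.join(backup_dir, os.path.basename(src_path))
    shutil.copy2(src_path, dest_path)  # copy2 also copies timestamps
    return dest_path

# Demonstration in a throwaway directory (hypothetical file names).
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "data.txt")
with open(src, "w") as f:
    f.write("hello")
copied = backup_file(src, os.path.join(work_dir, "backup"))
copied_exists = os.path.isfile(copied)
shutil.rmtree(work_dir)  # remove the whole tree when done
```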

#==========================
# The subprocess module
#==========================
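From Python 2.6 onward, the subprocess module is the recommended way to call external programs, including the shell.

subprocess.call(*popenargs, **kwargs)
Runs a command, waits until the child process finishes, and returns the process's returncode.

p=subprocess.Popen(*popenargs, **kwargs)
Popen creates a child process and supports complex interaction with it, which makes it quite powerful. Unlike call(), it returns to the parent process without waiting for the child to finish.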
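
The difference between the two calls can be sketched as follows. To keep the example portable, the child process is the Python interpreter itself (sys.executable) running a one-line script; the scripts are invented for this illustration.

```python
import subprocess
import sys

# call() blocks until the child exits and returns its exit code.
exit_code = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

# Popen() returns immediately. communicate() sends data to the child's
# stdin, collects its stdout, and waits for the child to finish.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
output, _ = child.communicate(b"hello")
```

Passing the command as a list avoids shell quoting problems; shell=True is only needed when the command line relies on shell features such as wildcards.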

Advanced Python 3.1 [creating processes]
http://www.cnblogs.com/itech/archive/2010/12/31/1922427.html

Using subprocess.Popen in Python
http://apps.hi.baidu.com/share/detail/20257791

#==========================
# Other common file operations
#==========================
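To check whether a file exists:
os.path.isfile(flg2File)

To check whether a path is a directory that exists:
os.path.isdir(path)

To check whether a path exists (file or directory):
os.path.exists(self.backupPath)

To get the size of a file in bytes:
os.path.getsize(newGzipFullName2)

To join path components:
newGzipFullName2=os.path.join(self.dataPath, newGzipFile)

To traverse all files and directories under path (including subdirectories):
for root, dirs, files in os.walk(path):
os.walk() is essentially a generator, so it has to be consumed with a for loop. On each iteration, root is the directory currently being visited, dirs is the list of root's immediate subdirectories, and files is the list of files directly inside root.

To get only the files at the top level of path: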
    def getShortFilesInFolder(self, path=""):
        result=[]
        for root, dirs, files in os.walk(path):
            if root==path:
                result=files
                break
        return(result)
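
The two traversals above can be sketched as standalone functions. The function names and the demo files are made up for this example, and the top-level variant uses os.listdir() with os.path.isfile() as an alternative to breaking out of os.walk().

```python
import os
import shutil
import tempfile

def list_all_files(top):
    """Return paths of all files under top, relative to top, sorted."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            full = os.path.join(root, name)
            found.append(os.path.relpath(full, top))
    return sorted(found)

def list_top_level_files(top):
    """Return names of regular files directly inside top (no recursion)."""
    return sorted(name for name in os.listdir(top)
                  if os.path.isfile(os.path.join(top, name)))

# Build a small throwaway tree: a.txt and sub/b.txt
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "sub"))
for rel in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(demo_dir, rel), "w") as f:
        f.write("x")
all_files = list_all_files(demo_dir)
top_files = list_top_level_files(demo_dir)
shutil.rmtree(demo_dir)
```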

To extract a tar.gz, use the tarfile module:
   if tarfile.is_tarfile(fileFullName):
       tar =tarfile.open(fileFullName)
       tar.extractall(workingPath)
       tar.close()

Merging several files into one archive is also easy with the tarfile module.
   tar =tarfile.open(fileFullName,"w")
   tar.add(file1,"fileA") #"fileA" is the name shown in the tarinfo; if omitted, it defaults to the full path of file1
   tar.add(file2,"fileB")
   tar.close()

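In principle, the tarfile module can also compress several files into a tar.gz; you only need to pass "w|gz" as the mode to open, i.e. tarfile.open(fileFullName,"w|gz"). With that approach, however, the name shown for the tar file inside the gz is always the tar file's full path. With the method below, the name shown for the tar file inside the gz is its short file name.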
   os.chdir(workingPath)
   cmd ="tar -zcf newfile.tar.gz newfile*"
   subprocess.call(cmd, shell=True)
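
For comparison, here is a sketch that stays inside the tarfile module, using the "w:gz" mode and explicit archive names for the members. The helper names and file names are made up for this example, and the sketch does not try to control the name recorded in the gzip header discussed above.

```python
import os
import shutil
import tarfile
import tempfile

def make_tar_gz(archive_path, members):
    """members is a list of (path_on_disk, name_inside_archive) pairs."""
    tar = tarfile.open(archive_path, "w:gz")
    try:
        for disk_path, arcname in members:
            tar.add(disk_path, arcname=arcname)
    finally:
        tar.close()

def list_tar_gz(archive_path):
    """Return the member names stored in a gzip-compressed tar archive."""
    tar = tarfile.open(archive_path, "r:gz")
    try:
        return tar.getnames()
    finally:
        tar.close()

# Demonstration with two throwaway files (hypothetical names).
demo_dir = tempfile.mkdtemp()
file1 = os.path.join(demo_dir, "part1.dat")
file2 = os.path.join(demo_dir, "part2.dat")
for path in (file1, file2):
    with open(path, "w") as f:
        f.write("sample data\n")
archive = os.path.join(demo_dir, "newfile.tar.gz")
make_tar_gz(archive, [(file1, "fileA"), (file2, "fileB")])
member_names = list_tar_gz(archive)
shutil.rmtree(demo_dir)
```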


#==========================
# Two ways to print to the console and how they differ:
#==========================
1. print() automatically appends a newline
   print("abc")
   print("def")
2. If you do not want the automatic newline, use sys.stdout.write()
   import sys
   sys.stdout.write('abc')
   sys.stdout.write('def')
   sys.stdout.write('\n') #new line
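
A typical use of write() without newlines is a progress indicator that stays on one line. The sketch below assumes Python 3 (for io.StringIO with plain strings); the function name render_progress is made up, and the stream is passed in as a parameter so the same code works with sys.stdout or an in-memory buffer.

```python
import io
import sys

def render_progress(stream, steps):
    """Write one dot per step on a single line, then end the line."""
    for _ in range(steps):
        stream.write(".")
    stream.write(" done\n")

# Capture the output in memory to inspect exactly what was written.
buffer = io.StringIO()
render_progress(buffer, 3)
progress_text = buffer.getvalue()

# The same function writes straight to the console.
render_progress(sys.stdout, 3)
```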

#==========================
# Waiting for user input on the console
#==========================
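Command-line programs sometimes need the user to type something in:
   guess = raw_input('guess a letter: ') #string (Python 2; in Python 3 use input())
   nb = input('Choose a number') #number (Python 2 evaluates the typed text; in Python 3 use int(input(...)))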


#==========================
# Command-line argument parsers
#==========================
Using the python optparse module - Learning Python - cnblogs (博客园)
http://zoomquiet.org/res/scrapbook/ZqFLOSS/data/20100903115917/
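
A minimal optparse sketch is shown here for orientation. The option names (--data-path, --table, --backup) are invented for this example, and the argument list is passed in explicitly instead of being read from sys.argv.

```python
from optparse import OptionParser

def build_parser():
    """Build a parser with three hypothetical options."""
    parser = OptionParser(usage="usage: %prog [options]")
    parser.add_option("-d", "--data-path", dest="data_path",
                      help="directory containing the data files")
    parser.add_option("-t", "--table", dest="table",
                      help="table name")
    parser.add_option("-b", "--backup", dest="backup",
                      action="store_true", default=False,
                      help="back up files before renaming")
    return parser

parser = build_parser()
# parse_args() defaults to sys.argv[1:]; an explicit list is used here.
options, positional = parser.parse_args(
    ["--data-path", "/tmp/data", "-t", "table_a", "--backup"])
```

optparse generates the -h/--help text from the help strings, which saves writing a usage message by hand.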

Alternatively, see the simple code below:
import os
import sys

class TableFilesBatchRename:
    def parseArguments(self):
        cmdln_args=sys.argv[1:]
        #print(cmdln_args)
        #Sample is: bulkinit.py dataPath=D:\1 table=SDB_TB_EQP_GROUP backup=True

        #Split on the first "=" only, so a value may itself contain "="
        argKeyValues=dict([arg.split("=", 1) for arg in cmdln_args])
        """
        for arg in argKeyValues.items():
            print(arg)
        """
        self.dataPath=argKeyValues["dataPath"]
        self.backupPath=os.path.join(self.dataPath,"backup")

        self.tableName=argKeyValues["table"]

        #backup is optional; its value stays a string such as "True"
        if "backup" in argKeyValues:
            self.backup=argKeyValues["backup"]

    def printUsage(self):
        usage="""
=======================================
A utility for the edw project
=======================================
bulkInit batch-renames all data files and flag files related to one table.
Usage:
python bulkInit.py dataPath=your_data_Path table=your_table backup=True|False
Example:
python bulkInit.py dataPath=/bac/kkk/working table=table_a backup=True
"""
        print(usage)

if __name__ == '__main__':
    processor=TableFilesBatchRename()
    parsed=False
    try:
        processor.parseArguments()
        parsed=True
    except Exception as ex:
        print("Argument parse failed.")
        processor.printUsage()
    if(parsed):
        processor.execute()
        print("............done")


#==========================
# Working with text files:
#==========================
   text_file = open("Output.txt", "w")
   text_file.write("Purchase Amount: %s"%TotalAmount)
   text_file.close()
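Python 2.5 introduced context managers and the with statement (in 2.5 itself it has to be enabled with from __future__ import with_statement). When the with block ends, the file handle is closed automatically.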
   with open("Output.txt", "w") as text_file:
       text_file.write("Purchase Amount: %s"%TotalAmount)
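
A self-contained version of the with pattern, including reading the file back, might look like this. The value of total_amount and the temporary location are made up for this example.

```python
import os
import tempfile

total_amount = 42.5  # hypothetical value
out_path = os.path.join(tempfile.mkdtemp(), "Output.txt")

with open(out_path, "w") as text_file:
    text_file.write("Purchase Amount: %s\n" % total_amount)
# The handle is already closed here, even if write() had raised.
file_closed = text_file.closed

# Iterating over a file object yields one line at a time.
with open(out_path) as text_file:
    lines = [line.rstrip("\n") for line in text_file]
```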



#==========================
# How to sleep
#==========================
import time
time.sleep(5)

#==========================
# Accessing XML the DOM way
#==========================
Import parse() and parseString() from xml.dom.minidom to parse XML; getDOMImplementation() can be used when building or modifying XML documents.
http://docs.python.org/library/xml.dom.minidom.html
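
A short sketch of the three minidom entry points follows: parsing a string, modifying the parsed tree, and creating a new document. The element and attribute names are invented for this example.

```python
from xml.dom.minidom import parseString, getDOMImplementation

# Parse XML from a string and read attributes.
doc = parseString('<tables><table name="a"/><table name="b"/></tables>')
table_names = [node.getAttribute("name")
               for node in doc.getElementsByTagName("table")]

# Modify the parsed tree: append one more <table> element.
new_table = doc.createElement("table")
new_table.setAttribute("name", "c")
doc.documentElement.appendChild(new_table)
modified_xml = doc.documentElement.toxml()

# Create a brand-new document with getDOMImplementation().
impl = getDOMImplementation()
new_doc = impl.createDocument(None, "config", None)
new_doc.documentElement.appendChild(new_doc.createTextNode("hello"))
new_xml = new_doc.documentElement.toxml()
```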

#==========================
# More efficient XML libraries
#==========================
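lxml is very fast. As for XPath, its find()/findall() methods support a reduced XPath dialect called ElementPath, and full XPath is available through the xpath() method.
libxml2 is another efficient package, but it is much harder to use than lxml, so lxml is recommended.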
http://www.iteye.com/topic/890812
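
To give a feel for ElementPath, the sketch below uses the standard library's xml.etree.ElementTree, which needs no extra install. lxml.etree is designed to be compatible with this API, so the same find()/findall() calls are expected to work there too, but that is an assumption of this example and not something it demonstrates. The sample XML is made up.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<db>"
    "<table name='a'><col>id</col><col>name</col></table>"
    "<table name='b'><col>id</col></table>"
    "</db>"
)

# Predicate on an attribute, then step down to the child elements.
cols_of_a = [c.text for c in root.findall("./table[@name='a']/col")]

# ".//" searches all descendants, at any depth.
all_cols = [c.text for c in root.findall(".//col")]

# find() returns only the first match.
first_table = root.find("table").get("name")
```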

#==========================
# How to write logs
#==========================
Use the logging module to write logs; its usage is similar to log4j.
http://docs.python.org/howto/logging.html#logging-basic-tutorial
import logging
logging.basicConfig(filename='example.log',level=logging.DEBUG)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')
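
For programs with more than one module, a named logger with its own handler and format is the closer analogue to log4j. The sketch below assumes Python 3 and writes to an in-memory stream so the result can be inspected; the logger name "bulkinit.demo" and the format string are made up for this example.

```python
import io
import logging

# A handler decides where records go; a formatter decides how they look.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("bulkinit.demo")  # hypothetical logger name
logger.setLevel(logging.INFO)   # records below INFO are dropped
logger.addHandler(handler)
logger.propagate = False        # keep records out of the root logger

logger.debug("not recorded, below INFO")
logger.info("renamed %d files", 3)
logger.warning("backup folder missing")
log_output = log_stream.getvalue()
```

Replacing the StreamHandler with logging.FileHandler("example.log") would send the same records to a file.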