String operations, file operations, and preprocessing for English word-frequency counting

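1. String operations:

Parse an ID card number: birthday, sex, place of birth, and so on.
Caesar cipher encoding and decoding
URL observation and batch generation

(1) Parsing an ID card number: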

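ID = input('Enter the 18-character ID number: ')
if len(ID) == 18:
    print("Your ID number is " + ID)
else:
    # Stop here: the slicing below assumes exactly 18 characters.
    raise SystemExit("Invalid ID number")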

ID_add = ID[0:6]
ID_birth = ID[6:14]
ID_sex = ID[14:17]
ID_check = ID[17]

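# ID_add is the region code on the ID card; with a dictionary of
# administrative division codes it could be mapped to an approximate address.
print("Place of birth (region code): " + ID_add)
year = ID_birth[0:4]
month = ID_birth[4:6]
day = ID_birth[6:8]
print("Birthday: " + year + '-' + month + '-' + day)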

# The sequence code is even for female and odd for male.
if int(ID_sex) % 2 == 0:
    print('Sex: female')
else:
    print('Sex: male')

# This is the validity check. Ideally it would run first, so that an
# invalid number produces none of the output above.
W = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]  # weight for each of the first 17 digits
ID_CHECK = ['1', '0', 'X', '9', '8', '7', '6', '5', '4', '3', '2']  # check character indexed by (weighted sum % 11)
ID_aXw = 0
for i in range(len(W)):
    ID_aXw = ID_aXw + int(ID[i]) * W[i]

remainder = ID_aXw % 11
if ID_check.upper() == ID_CHECK[remainder]:
    print('Valid ID number')
else:
    print('Invalid ID number')
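
One way to address the question raised in the comment (how to keep an invalid number from producing any output) is to run the checksum first and only then slice out the fields. The sketch below is a minimal illustration under that assumption. `is_valid_id` and `parse_id` are hypothetical helper names that are not part of the original script; the weights and check characters are copied from it.

```python
# Weights and check characters, same values as in the script above.
W = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CODES = ['1', '0', 'X', '9', '8', '7', '6', '5', '4', '3', '2']


def is_valid_id(id_number):
    """Return True if the string has 18 characters and a matching check character."""
    body = id_number[:17]
    if len(id_number) != 18 or not (body.isascii() and body.isdigit()):
        return False
    total = sum(int(digit) * weight for digit, weight in zip(body, W))
    return id_number[17].upper() == CHECK_CODES[total % 11]


def parse_id(id_number):
    """Hypothetical helper: validate first, then slice out the fields."""
    if not is_valid_id(id_number):
        return None  # nothing is parsed for an invalid number
    return {
        'region': id_number[0:6],
        'birth': id_number[6:14],
        'sex': 'female' if int(id_number[16]) % 2 == 0 else 'male',
    }


# A made-up sample number used only for illustration.
print(parse_id('11010519491231002X'))
print(parse_id('123'))
```

Returning `None` for an invalid number lets the caller decide what to print, so no partial output can appear before the check has passed.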

Parsing result:

(2) Caesar cipher encoding and decoding

plaincode = input('letter:')
# Encode by shifting every character forward by 3 code points (no wraparound at 'z').
for i in plaincode:
    print(chr(ord(i) + 3), end='')
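
The loop above only encodes, and because it shifts raw code points, 'x', 'y' and 'z' turn into '{', '|' and '}'. A classic Caesar cipher wraps around inside the alphabet, and decoding is the same shift in the opposite direction. The following is a minimal sketch of that variant; the function name `caesar` and its `shift` parameter are assumptions made for illustration, not part of the original code.

```python
import string


def caesar(text, shift):
    """Shift ASCII letters by `shift` positions, wrapping within the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')))
        else:
            result.append(ch)  # digits, spaces and punctuation are left unchanged
    return ''.join(result)


encoded = caesar('xyz abc', 3)
decoded = caesar(encoded, -3)  # decoding is encoding with the negated shift
print(encoded)
print(decoded)
```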

Result:

(3) URL observation and batch generation

for i in range(2,10):
    url='http://news.gzcc.cn/html/xiaoyuanxinwen/{}.html'.format(i)
    print(url)

Opening the URLs:

import webbrowser as web
url='http://news.gzcc.cn/html/xiaoyuanxinwen/'
web.open_new_tab(url)
for i in range(2,4):
    web.open_new_tab('http://news.gzcc.cn/html/xiaoyuanxinwen/'+str(i)+'.html')
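
Both snippets suggest the same pattern: the first list page is the bare directory URL, and later pages append `{n}.html`. Assuming that pattern holds, the URLs can be collected into a list rather than printed or opened one by one. `page_urls` is a hypothetical helper name introduced here.

```python
BASE = 'http://news.gzcc.cn/html/xiaoyuanxinwen/'


def page_urls(last_page):
    """Return the list-page URLs for pages 1..last_page."""
    urls = [BASE]  # page 1 is assumed to be the bare directory URL
    urls += ['{}{}.html'.format(BASE, i) for i in range(2, last_page + 1)]
    return urls


for url in page_urls(4):
    print(url)
```

Keeping the URLs in a list makes it easy to reuse them later, for example to open or download each page in turn.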

URL observation result:

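2. Preprocessing for English word-frequency counting

Download the lyrics of an English song, or an English article or novel.
Convert all uppercase letters to lowercase.
Replace all other separator characters (,.?!) with spaces.
Split the text into individual words.
Count how many times each word occurs.

Code: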

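text = '''I'm a big, big girl in a big, big world

It's not a big, big thing if you leave me

But I do, do feel that I do, do will miss you much

Miss you much '''
text = text.lower()
separators = ',.!?'
for c in separators:
    text = text.replace(c, ' ')  # replace separators with spaces, as the task asks

words = text.split()
print(words)
print(words.count('big'))  # count whole words, not substrings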
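
The snippet above counts a single word. To count every word, as the last step of the task asks, one option is `collections.Counter` from the standard library. This is a minimal sketch on a shortened copy of the lyrics; the variable names are chosen here for illustration.

```python
from collections import Counter

lyrics = '''I'm a big, big girl in a big, big world
It's not a big, big thing if you leave me'''

text = lyrics.lower()
for sep in ',.!?':
    text = text.replace(sep, ' ')  # separators become spaces before splitting

word_counts = Counter(text.split())

# Print the three most frequent words with their counts.
for word, count in word_counts.most_common(3):
    print(word, count)
```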

Result:

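3. File operations

Same directory, absolute path, relative path
Caesar cipher: read a secret message from a file, encrypt or decrypt it, and save the result to a file.
Word-frequency counting: download the lyrics of an English song, or an article or novel, and save it as a UTF-8 file. Read the text from the file and process it.

Reading a file in the same directory: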

f=open('yw.txt','r',encoding='utf8')
text=f.read()
f.close()
print(text)

Reading a file by absolute path:

f=open(r'C:\Users\Administrator\PycharmProjects\d\venv\ljxd.txt','r',encoding='utf8')
text=f.read()
f.close()
print(text)

Reading a file by relative path:

f=open(r'..\venv\ljxd.txt','r',encoding='utf8')
text=f.read()
f.close()
print(text)
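
The difference between the three cases is only how the path is resolved: a bare file name and a path starting with `..` are both relative to the current working directory, while an absolute path does not depend on it. A small sketch with `pathlib` can make this visible; the file name `yw.txt` is reused from the example above, and the file does not need to exist for the paths to be resolved.

```python
from pathlib import Path

same_dir = Path('yw.txt')             # relative: resolved against the current working directory
one_level_up = Path('..') / 'yw.txt'  # relative: '..' means the parent directory

print(same_dir.is_absolute())         # a bare file name is not an absolute path
print(same_dir.resolve())             # absolute form; depends on where the script is run
print(one_level_up.resolve())         # same file name, one directory higher
```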

Result of reading the file:

Caesar cipher:

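# Read the plaintext; `with` closes the file even if reading fails.
with open("caesar.txt", encoding='utf8') as file:
    caesar = file.read()
print("Text before encryption:", caesar)
cipher = ''
for i in caesar:
    cipher = cipher + chr(ord(i) + 3)  # shift each character forward by 3 code points
print("Text after encryption:", cipher)
with open("cipher.txt", 'w', encoding='utf8') as file:
    file.write(cipher)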
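
The task also mentions decrypting from a file, which is the same operation with the shift negated. The sketch below wraps the read, shift and write steps in one hypothetical function, `shift_file`, and runs a round trip in a temporary directory so that no real files are touched. The sample message and file names are made up. Opening the files with `newline=''` is a choice made here so that a shifted line break (`'\n'` becomes `'\r'`) is not translated on the way in or out.

```python
import os
import tempfile


def shift_file(src_path, dst_path, shift):
    """Read src_path, shift every character by `shift` code points, write dst_path."""
    with open(src_path, 'r', encoding='utf8', newline='') as src:
        text = src.read()
    shifted = ''.join(chr(ord(ch) + shift) for ch in text)
    with open(dst_path, 'w', encoding='utf8', newline='') as dst:
        dst.write(shifted)
    return shifted


# Round trip: write a sample message, encrypt it, then decrypt the result.
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, 'caesar.txt')
    cipher_path = os.path.join(tmp, 'cipher.txt')
    decoded_path = os.path.join(tmp, 'decoded.txt')
    with open(plain_path, 'w', encoding='utf8') as f:
        f.write('attack at dawn')
    encrypted = shift_file(plain_path, cipher_path, 3)
    decrypted = shift_file(cipher_path, decoded_path, -3)

print(encrypted)
print(decrypted)
```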

Encryption result:

File before encryption:

File after encryption:

4. Function definitions

Encryption function

def get_text():
    # Encrypt a fixed sample string by shifting each character forward by 3 code points.
    plaincode = 'abcd'
    cipher = ''
    for i in plaincode:
        cipher = cipher + chr(ord(i) + 3)
    return cipher


bigstr = get_text()
print(bigstr)

Decryption function

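def get_text():
    # Decrypt a fixed sample string by shifting each character back by 3 code points.
    ciphertext = 'defg'
    plaintext = ''
    for i in ciphertext:
        plaintext = plaintext + chr(ord(i) - 3)
    return plaintext


bigstr = get_text()
print(bigstr)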
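
Both functions above share the name `get_text` and hard-code their input, so they cannot be reused on other text. One possible refactoring, sketched here with the hypothetical names `encrypt` and `decrypt` and an assumed default key of 3, passes the text and the key as parameters and keeps the same code-point shift.

```python
def encrypt(text, key=3):
    """Shift every character forward by `key` code points (no wraparound)."""
    return ''.join(chr(ord(ch) + key) for ch in text)


def decrypt(text, key=3):
    """Undo encrypt() by shifting every character back by `key` code points."""
    return encrypt(text, -key)


secret = encrypt('abcd')
print(secret)
print(decrypt(secret))
```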

Text-reading function

def get_text():
    with open('yw.txt', 'r', encoding='utf8',errors='ignore') as f:
        text = f.read()
    return text
bigstr = get_text()
print(bigstr)