作业来源于:https://edu.cnblogs.com/campus/gzcc/GZCC-16SE2/homework/2696
1.列表,元组,字典,集合分别如何增删改查及遍历。
(1)列表
list = ['aaa','bbb','ccc'] list.append('ddd') print(list) #末尾插入元素 list = ['aaa','bbb','ccc'] list.insert(2,'ddd') print(list) #元素插入指定位置 list = ['aaa','bbb','ccc'] list.remove('ccc') print(list) #按名称删除元素 list = ['aaa','bbb','ccc'] list.pop(1) print(list) #按位置删除元素 list = ['aaa','bbb','ccc'] list[1] = 'ddd' print(list) #按位置修改元素 list = ['aaa','bbb','ccc'] print(list[1]) #查找元素 list = ['aaa','bbb','ccc'] for bianli in list: print("序号:{} {}".format(list.index(bianli),bianli)) #遍历
显示结果:
(2)元组
ob = ('aaa','bbb') ob2 = ('ccc','ddd') ob3 = ob + ob2 print(ob3) #添加元素 ob3 = ('aaa','bbb','ccc','ddd') print("第一个:{} 第二个:{}".format(ob3[0],ob3[1])) #查找指定元素 ob = ('aaa','bbb') print("已删除元组ob") del ob #元组删除 ob3 = ('你','是','真','滴','皮') for bl in ob3: print(bl) #遍历元组
显示结果:
(3)字典
dict = {'aaa':100,'bbb':90,'ccc':80} dict['ddd'] = 70 print(dict) #添加元素 dict = {'aaa':100,'bbb':90,'ccc':80} del dict['aaa'] print(dict) #删除元素1 dict = {'aaa':100,'bbb':90,'ccc':80} dict.pop('aaa') print(dict) #删除元素2 dict = {'aaa':100,'bbb':90,'ccc':80} dict['aaa'] = 99 print(dict) #修改元素 dict = {'aaa':100,'bbb':90,'ccc':80} print("查找的人:{}".format(dict['aaa'])) #查找元素 dict = {'aaa':100,'bbb':90,'ccc':80} for bl in dict: print("{}:{}".format(bl,dict[bl])) #遍历字典
显示结果:
(4)集合
s = set(['aaa','bbb','ccc']) s.add('123456') print(s) #添加元素 s = set(['aaa','bbb','ccc']) s.remove('aaa') print(s) #删除元素 s = set(['aaa','bbb','ccc']) s = list(s) s[0] = 'ddd' s = set(s) print(s) #修改元素 s = set(['aaa','bbb','ccc']) s.clear() print(s) s = set(['aaa','bbb','ccc']) for bl in s: print(bl) #遍历
显示结果:
2.总结列表,元组,字典,集合的联系与区别。参考以下几个方面:
下列以列表,元组,字典,集合为默认顺序:
- 括号 ------ (1)列表:[ ] (2)元组:( ) (3)字典:{ } (4) 集合:( )
- 有序无序------(1)有序 (2)有序 (3)无序 (4)无序
- 可变不可变-----(1)可变 (2)可变 (3)不可变,元组中的元素不可修改、不可删除(4)可变
- 重复不可重复-----(1)可以重复(2)可以重复(3)可以重复(4)不可以重复
- 存储与查找方式------(1)① 找出某个值第一个匹配项的索引位置,如:list.index(‘a’)② 使用下标索引,如:list[1] (2)使用下标索引,如:tuple[1](3)通过使用相应的键来查找,如:dict[‘a’] (4)通过判断元素是否在集合内,如:1 in dict
3.词频统计
f = open(r'D:pc软件xiangmuzz.txt',encoding='utf8') #打开文件 stop={'a','the','and','i','you','in','but','not','with','by','its','for','of','an','to','my','myself','we','our','ours','ourelves','about','no','nor'} def gettext(): sep = "~`*()!<>?,./;':[]{}-=_+" text = f.read().lower() for s in sep: text=text.replace(s,'') return text #读取文件 textList = gettext().split() print(textList) #分解提取单词 textSet = set(textList) stop = set(stop) textSet = textSet - stop print(textSet) #排除语法词 textDict = {} for word in textSet: textDict[word] = textList.count(word) print(textDict) print(textDict.items()) word = list(textDict.items()) #单词计数 word.sort(key=lambda x:x[1],reverse=True) print(word) #排序 for q in range(20): print(word[q]) #次数为前20的单词 import pandas as pd pd.DataFrame(data=word).to_csv("text.csv",encoding='utf-8')
显示结果:
词云可视化: