Querying latitude and longitude in MySQL to determine the coordinate range

Code first, explanation to follow:

1. Fetch the records from MySQL and print the valid latitude/longitude values:

import json
import MySQLdb

# lines = c.fetchall()  # all records, as one tuple
# one = c.fetchone()

def gen_row():
    db = MySQLdb.connect(host='192.168.1.205', user='root', passwd='123456', db='kaqu')
    c = db.cursor()
    c.execute("select params from t2")
    row = c.fetchone()
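    while row is not None:
        try:
            params = json.loads(row[0])
            # float() also filters out records whose latitude/longitude is empty or missing
            latitude = float(params['latitude'])
            longitude = float(params['longitude'])
            if not (latitude == 5e-324 or latitude == 0):    # bogus coordinates that some clients send
                print(latitude, longitude)
        except (KeyError, TypeError, ValueError):
            pass    # no usable coordinates in this record
        row = c.fetchone()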

if __name__ == "__main__":
    gen_row()

It feels like try is being abused here (>_<): it doubles as the filter for malformed records.
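
One way to make the try less overloaded is to move the parsing into a small helper that can be tested without a database. This is only a minimal sketch: `parse_point` and `BOGUS` are names I made up for illustration, and it keeps the same rule as the script above (only the latitude is checked against 0 and 5e-324).

```python
import json

# Sentinel that some clients send instead of a real coordinate.
BOGUS = 5e-324

def parse_point(params_json):
    """Return (latitude, longitude), or None if the record has no usable coordinates."""
    try:
        params = json.loads(params_json)
        latitude = float(params['latitude'])
        longitude = float(params['longitude'])
    except (KeyError, TypeError, ValueError):
        # Missing key, NULL column, empty string, or malformed JSON.
        return None
    if latitude in (0.0, BOGUS):
        return None
    return latitude, longitude

print(parse_point('{"latitude": "31.23", "longitude": "121.47"}'))
print(parse_point('{"latitude": "", "longitude": ""}'))
```

With this helper, the loop body shrinks to calling `parse_point(row[0])` and skipping the `None` results.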

2. Count the 10 most frequent latitude/longitude pairs:

import json
import MySQLdb

# lines = c.fetchall()  # all records, as one tuple
# one = c.fetchone()

def gen_row():
    db = MySQLdb.connect(host='192.168.1.205', user='root', passwd='123456', db='kaqu')
    c = db.cursor()
    c.execute("select params from t1")
    row = c.fetchone()
    points = []
    while row is not None:
        try:
            params = json.loads(row[0])
            latitude = float(params['latitude'])
            longitude = float(params['longitude'])
            if not (latitude == 5e-324 or latitude == 0.0):
                points.append((latitude, longitude))
        except (KeyError, TypeError, ValueError):
            pass    # no usable coordinates in this record
        row = c.fetchone()
    return points

def gen_count(points):
    from collections import Counter
    counts = Counter(points)
    top = counts.most_common(10)    # named "top" to avoid shadowing the built-in max()
    print(top)

if __name__ == "__main__":
    points = gen_row()
    gen_count(points)
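
The goal in the title, determining the coordinate range, can be sketched from the same list of (latitude, longitude) tuples that gen_row() returns. This is a rough sketch under my own assumptions: `bounding_box` is a hypothetical helper, the sample points are made up rather than taken from the table, and a plain min/max will be stretched by any outlier that slips through the filter.

```python
def bounding_box(points):
    """Return ((min_lat, min_lon), (max_lat, max_lon)) for (latitude, longitude) pairs."""
    if not points:
        raise ValueError("no points to bound")
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons)), (max(lats), max(lons))

# Made-up sample points, not data from the table.
sample = [(31.23, 121.47), (39.90, 116.40), (22.54, 114.06)]
print(bounding_box(sample))
```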

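The table holds roughly 650,000 rows. The script uses about 300 MB of memory while running, and releases it as soon as it finishes.

I have looked at basic pandas usage before; pandas would probably do a better job at this kind of counting, and I will study it properly when I have time.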
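
Part of that memory is presumably the intermediate points list. A possible variation, sketched below, feeds each parsed point straight into a Counter so that only the distinct points are kept. `count_points` is a hypothetical helper and the sample rows are invented; I have not measured whether this actually lowers the 300 MB figure.

```python
import json
from collections import Counter

def count_points(params_iter):
    """Count (latitude, longitude) pairs from an iterable of JSON strings."""
    counts = Counter()
    for params_json in params_iter:
        try:
            params = json.loads(params_json)
            point = (float(params['latitude']), float(params['longitude']))
        except (KeyError, TypeError, ValueError):
            continue  # record without usable coordinates
        if point[0] not in (0.0, 5e-324):
            counts[point] += 1
    return counts

# Invented sample rows standing in for the params column.
rows = ['{"latitude": "31.23", "longitude": "121.47"}',
        '{"latitude": "31.23", "longitude": "121.47"}',
        '{"latitude": "0", "longitude": "0"}',
        '{}']
print(count_points(rows).most_common(10))
```

Against the real table this would be called as `count_points(row[0] for row in c)` after `c.execute(...)`.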

Appendix: two ways to iterate over MySQL results with fetchone()

# Using a while loop
cursor.execute("SELECT * FROM employees")
row = cursor.fetchone()
while row is not None:
  print(row)
  row = cursor.fetchone()

# Using the cursor as iterator 
cursor.execute("SELECT * FROM employees")
for row in cursor:
  print(row)

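In my tests, fetchone() and fetchall() used almost the same amount of memory. Both seem to be backed by an in-memory list internally, which matches how MySQLdb's default cursor buffers the whole result set on the client when execute() runs. So calling fetchall() directly looks like the simpler option.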
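
A third option between fetchone() and fetchall() is the DB-API fetchmany() call, which pulls rows in batches. The sketch below wraps it in a generator; `iter_rows` and `FakeCursor` are hypothetical names, and the fake cursor exists only so the example runs without a database. Any memory saving would presumably require a server-side cursor such as MySQLdb.cursors.SSCursor, which I have not tested here.

```python
def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time while fetching them from the cursor in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row

class FakeCursor:
    """Hypothetical stand-in for a DB-API cursor, used only for this demo."""
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        # Hand back the next `size` rows and keep the rest for later calls.
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

print(list(iter_rows(FakeCursor([(1,), (2,), (3,)]), batch_size=2)))
```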