Today's post is the eighth installment of 【月小水长】's pandas Thirty-Six Stratagems series (pandas 三十六计): a small tool that converts a JSON file into a CSV file.
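A file's format is its outer layer and its content is the substance. As long as the substance stays the same, the outer layer can be swapped as freely as a change of clothes, much as MySQL lets you import and export SQL, CSV, JSON, and other file types at will.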
Suppose we have a JSON file like this:
    {
        "4893424946515214": {
            "mid": "4893424946515214",
            "weibo_link": "https://weibo.com/2803301701/MDcporkU6",
            "text": "据悉,全城月季花已逐渐进入盛花期。",
            "publish_time": "2023-04-22 20:34:45",
            "user_link": "https://weibo.com/u/2803301701",
            "user_name": "人民日报",
            "reposts_count": 55,
            "comments_count": 92,
            "attitudes_count": 298
        },
        "4893416880346795": {
            "mid": "4893416880346795",
            "weibo_link": "https://weibo.com/2803301701/MDcco1sdt",
            "text": "4月22日,陕西西安。游客发视频... ",
            "publish_time": "2023-04-22 20:02:42",
            "user_link": "https://weibo.com/u/2803301701",
            "user_name": "人民日报",
            "reposts_count": 119,
            "comments_count": 249,
            "attitudes_count": 785
        },
        "4893410513127118": {
            "mid": "4893410513127118",
            "weibo_link": "https://weibo.com/2803301701/MDc27d7vo",
            "text": "第54个世界地球日,江豚回家路还有多远...",
            "publish_time": "2023-04-22 19:37:24",
            "user_link": "https://weibo.com/u/2803301701",
            "user_name": "人民日报",
            "reposts_count": 119,
            "comments_count": 145,
            "attitudes_count": 463
        }
    }
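The goal is to convert it into a CSV with one row per post and one column per field. All it takes is running the following code: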
    import json

    import pandas as pd


    def convert_json_to_csv(input_json_path, output_csv_path):
        # utf-8-sig also accepts input files that start with a BOM
        with open(input_json_path, mode='r', encoding='utf-8-sig') as f:
            input_json = json.loads(f.read())

        data_list = []
        # Take the column names from the keys of the first record
        data_cols = list(input_json[list(input_json.keys())[0]].keys())
        for a_weibo in input_json.values():
            # One row per record, with values in the record's key order
            data_list.append(list(a_weibo.values()))
        df = pd.DataFrame(data_list, columns=data_cols)

        # utf-8-sig writes a BOM, which helps tools such as Excel detect UTF-8
        df.to_csv(output_csv_path, index=False, encoding='utf-8-sig')


    convert_json_to_csv('./data/2803301701.json', './data/2803301701.csv')
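The code does not hard-code any CSV column names. They are read automatically from the keys of the first record in the JSON file, so the script is fairly general, provided every record has the same keys in the same order.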
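If the records do not all share the same keys in the same order, matching values to columns by position can misalign them. The following is a minimal sketch of one possible variant, not the original script: it passes the record dicts straight to `pd.DataFrame`, which aligns values by key name. The function name `convert_json_to_csv_by_key` is hypothetical and introduced only for illustration.

```python
import json

import pandas as pd


def convert_json_to_csv_by_key(input_json_path, output_csv_path):
    """Hypothetical variant: align fields by key name rather than by position."""
    with open(input_json_path, mode='r', encoding='utf-8-sig') as f:
        records = json.load(f)

    # Each value of the top-level object is one record (a dict).
    # Given a list of dicts, pandas builds the columns from the dict keys
    # and leaves a missing value wherever a record lacks a key.
    df = pd.DataFrame(list(records.values()))

    df.to_csv(output_csv_path, index=False, encoding='utf-8-sig')
    return df
```

It is called the same way as the original, for example `convert_json_to_csv_by_key('./data/2803301701.json', './data/2803301701.csv')`. A field that is missing from a record ends up as an empty cell in the CSV.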