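import json

import pandas as pd

# Columns whose cells hold JSON-encoded strings.
json_columns = ['device', 'geoNetwork', 'totals', 'trafficSource']


def load_data_frame(path):
    # Parse the JSON columns into dicts while reading, and keep fullVisitorId as a string.
    df = pd.read_csv(path, converters={column: json.loads for column in json_columns}, dtype={'fullVisitorId': 'str'})
    # c = pd.read_csv("test.csv", converters={column: json.loads for column in json_columns}, dtype={'fullVisitorId': 'str'})
    for column in json_columns:
        # Flatten the dicts of this column into a DataFrame with one column per key.
        # pd.json_normalize needs pandas >= 1.0; older versions import it from pandas.io.json.
        column_as_df = pd.json_normalize(df[column].tolist())
        # Prefix each new column with the source column name to avoid name clashes.
        column_as_df.columns = [f"{column}_{sub_column}" for sub_column in column_as_df.columns]
        # Replace the original JSON column with its flattened columns, aligned on the row index.
        df = df.drop(column, axis=1).merge(column_as_df, right_index=True, left_index=True)
    return df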
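
To see what the loop in load_data_frame does to a single column, here is a minimal sketch on a small in-memory frame. The sample rows and the `flat` and `result` names are made up for illustration and are not part of the original post; it assumes pandas >= 1.0 and that the cells are already parsed dicts, as they are after `json.loads` runs in the converter.

```python
import pandas as pd

# Hypothetical sample data: the "device" cells are already parsed dicts.
df = pd.DataFrame({
    "fullVisitorId": ["001", "002"],
    "device": [
        {"browser": "Chrome", "isMobile": False},
        {"browser": "Safari", "isMobile": True},
    ],
})

# Flatten the dicts into one column per key.
flat = pd.json_normalize(df["device"].tolist())
# Prefix the new columns with the source column name.
flat.columns = [f"device_{name}" for name in flat.columns]
# Swap the dict column for its flattened columns, aligned on the row index.
result = df.drop("device", axis=1).merge(flat, left_index=True, right_index=True)

print(result)
```

The merge on the index relies on `df` having the default 0..n-1 index, which is what `pd.read_csv` produces when no index column is given.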
002 pandas json_normalize