import pandas as pd

train_data = pd.read_csv("train.csv")
train_data.head()
train_data.describe()
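For numeric data, the index of the result includes the count, mean, standard deviation, minimum, and maximum, as well as the lower, 50th, and upper percentiles. By default, the lower percentile is 25 and the upper percentile is 75. The 50th percentile is the same as the median.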
| | date | visitId | visitNumber | visitStartTime |
---|---|---|---|---|
count | 9.036530e+05 | 9.036530e+05 | 903653.000000 | 9.036530e+05 |
mean | 2.016589e+07 | 1.485007e+09 | 2.264897 | 1.485007e+09 |
std | 4.697698e+03 | 9.022124e+06 | 9.283735 | 9.022124e+06 |
min | 2.016080e+07 | 1.470035e+09 | 1.000000 | 1.470035e+09 |
25% | 2.016103e+07 | 1.477561e+09 | 1.000000 | 1.477561e+09 |
50% | 2.017011e+07 | 1.483949e+09 | 1.000000 | 1.483949e+09 |
75% | 2.017042e+07 | 1.492759e+09 | 1.000000 | 1.492759e+09 |
max | 2.017080e+07 | 1.501657e+09 | 395.000000 | 1.501657e+09 |
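
The rows of the summary above can be reproduced on a small scale. The following is a minimal sketch, assuming a made-up toy DataFrame in place of `train.csv` (the values are hypothetical and only the column name `visitNumber` comes from the table). It shows the default index of `describe()`, how the `percentiles` argument changes the percentile rows, and that the `50%` row equals the median.

```python
import pandas as pd

# Hypothetical toy data standing in for train.csv; the values are made up.
df = pd.DataFrame({"visitNumber": [1, 1, 1, 2, 5, 9]})

# Default summary: count, mean, std, min, 25%, 50%, 75%, max.
default_summary = df.describe()

# Request the 10th, 50th and 90th percentiles instead of the defaults.
custom_summary = df.describe(percentiles=[0.1, 0.5, 0.9])

# The 50% row is the median of the column.
median_matches = (
    default_summary.loc["50%", "visitNumber"] == df["visitNumber"].median()
)

print(default_summary.index.tolist())
print(custom_summary.index.tolist())
print(median_matches)
```

Passing `0.5` explicitly keeps the median row in the output regardless of how a given pandas version treats a custom `percentiles` list.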