def getCoding(strInput):
'''
获取编码格式
'''
if isinstance(strInput, unicode):
return "unicode"
try:
strInput.decode("utf8")
return 'utf8'
except:
pass
try:
strInput.decode("gbk")
return 'gbk'
except:
pass
def tran2UTF8(strInput):
'''
转化为utf8格式
'''
strCodingFmt = getCoding(strInput)
if strCodingFmt == "utf8":
return strInput
elif strCodingFmt == "unicode":
return strInput.encode("utf8")
elif strCodingFmt == "gbk":
return strInput.decode("gbk").encode("utf8")
def tran2GBK(strInput):
'''
转化为gbk格式
'''
strCodingFmt = getCoding(strInput)
if strCodingFmt == "gbk":
return strInput
elif strCodingFmt == "unicode":
return strInput.encode("gbk")
elif strCodingFmt == "utf8":
return strInput.decode("utf8").encode("gbk")
编码判断和编码转换
标签:
  更新于:
2018/04/25
阅读:868
最近热门
- 论文 | DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale Click-Through Rate Prediction
- Chinchilla law
- 近线召回
- MMDetection:一个基于PyTorch的开源目标检测工具箱
- Deep Retrieval(DR召回)
- 快手召回新范式 KuaiFormer: Transformer-Based Retrieval at Kuaishou
- java HashMap 按value排序
- 2.1 机器学习常见算法分类梳理
- NVIDIA L20和NVIDIA A30
- 英伟达H20与L20的参数对比
最常浏览
- 016 推荐系统 | 排序学习(LTR - Learning To Rank)
- 偏微分符号
- i.i.d(又称IID)
- 利普希茨连续条件(Lipschitz continuity)
- (error) MOVED 原因和解决方案
- TextCNN详解
- 找不到com.google.protobuf.GeneratedMessageV3的类文件
- Deployment failed: repository element was not specified in the POM inside distributionManagement
- cannot access com.google.protobuf.GeneratedMessageV3 解决方案
- CLUSTERDOWN Hash slot not served 问题原因和解决办法
×