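import tensorflow as tf  # TensorFlow 1.x API (tf.contrib.data, tf.string_split)

def decode_libsvm(line):
    # Split "label id:val id:val ..." on spaces; the first token is the label
    columns = tf.string_split([line], ' ')
    labels = tf.string_to_number(columns.values[0], out_type=tf.float32)
    # Split each remaining "id:val" token on ':' and reshape to [num_features, 2]
    splits = tf.string_split(columns.values[1:], ':')
    id_vals = tf.reshape(splits.values, splits.dense_shape)
    feat_ids, feat_vals = tf.split(id_vals, num_or_size_splits=2, axis=1)
    feat_ids = tf.string_to_number(feat_ids, out_type=tf.int32)
    feat_vals = tf.string_to_number(feat_vals, out_type=tf.float32)
    return {"feat_ids": feat_ids, "feat_vals": feat_vals}, labels

# Extract lines from the input files with the Dataset API; `filenames` may be a single filename or a list of filenames
# (In TensorFlow 1.4+, the equivalents are tf.data.TextLineDataset and map(..., num_parallel_calls=10).)
dataset = tf.contrib.data.TextLineDataset(filenames).map(decode_libsvm, num_threads=10).prefetch(1000)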
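To make the parsing logic easier to follow without a TensorFlow session, here is a minimal pure-Python sketch of the same steps. The function name `parse_libsvm_line` and the sample line are made up for illustration and are not part of the original snippet. Note one difference: it returns flat Python lists, whereas the TensorFlow version returns tensors of shape `[num_features, 1]`.

```python
def parse_libsvm_line(line):
    """Parse one 'label id:val id:val ...' line into (features, label)."""
    tokens = line.strip().split(' ')
    label = float(tokens[0])          # first token is the label
    feat_ids, feat_vals = [], []
    for token in tokens[1:]:
        if not token:                 # skip empty tokens from repeated spaces
            continue
        fid, fval = token.split(':')  # each feature is "id:value"
        feat_ids.append(int(fid))
        feat_vals.append(float(fval))
    return {"feat_ids": feat_ids, "feat_vals": feat_vals}, label


# Hypothetical sample line: label 1, three sparse features
features, label = parse_libsvm_line("1 3:0.5 7:1.0 12:0.25")
print(label)     # 1.0
print(features)  # {'feat_ids': [3, 7, 12], 'feat_vals': [0.5, 1.0, 0.25]}
```

A plain-Python version like this can also serve as a reference when checking that the TensorFlow pipeline yields the expected ids and values for a few sample lines.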
017 tensorflow | decode libsvm: parsing libsvm-format files