Blog

ByteDance LONGER: Scaling Up Long Sequence Modeling in Industrial Recommenders

To the Globe (TTG): Towards Language-Driven Guaranteed Travel Planning

Paper, Travel Planning

Efficient Streaming Language Models with Attention Sinks

Asynchronous Stochastic Gradient Descent with Delay Compensation

Paper, Optimizer

Paper: Perceiver - General Perception with Iterative Attention

Paper, Transformer, Google DeepMind

AdaF2M2: Comprehensive Learning and Responsive Leveraging Features in Recommendation System

CLS, COMPOSITE SLICE TRANSFORMER: AN EFFICIENT TRANSFORMER WITH COMPOSITION OF MULTI-SCALE MULTI-RANGE ATTENTIONS

Paper, Transformer

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Paper

TIGER: Recommender Systems with Generative Retrieval (generative recall)

Paper, Recommender Systems, Recall

Soft MoE: "From Sparse to Soft Mixtures of Experts"

Paper

Fast Inference from Transformers via Speculative Decoding

Paper

Google MLP-Mixer: An all-MLP Architecture for Vision

Google DiLoCo: Distributed Low-Communication Training of Language Models

Kuaishou Cold Start POSO: Personalized Cold Start Modules for Large-scale Recommender Systems

Paper, Cold Start, Kuaishou

Kuaishou MARM: Unlocking the Future of Recommendation Systems through Memory Augmentation and Scalable Complexity

Paper | RoFormer: Enhanced Transformer with Rotary Position Embedding, RoPE Encoding

Paper: Improving Item Cold-start Recommendation via Model-agnostic Conditional Variational Autoencoder

Paper, Cold Start

Rethinking the Role of Pre-ranking in Large-scale E-Commerce Searching System

ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

Paper, LLM, Recommender Systems

Alibaba MIMN Model: Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction
