Apple Machine Learning Research at ICLR 2026
Apple is advancing AI and ML with fundamental research, much of which is shared through publications and engagement at conferences in order to accelerate progress in this important field and support the broader community. This week, the Fourteenth International Conference on Learning Representations (ICLR) will be held in Rio de Janeiro, Brazil, and Apple is proud to again participate in this important event for the research community and to support it with sponsorship.

At the main conference and associated workshops, Apple researchers will present new research across a variety of topics, including work unlocking large-scale training for Recurrent Neural Networks, a technique for improving State Space Models, a new approach to unifying image understanding and generation, a method for generating 3D scenes from a single photo, and a new approach to protein folding.

During exhibition hours, attendees will be able to experience demonstrations of Apple's ML research in our booth #204, including local LLM inference on Apple silicon with MLX and Sharp Monocular View Synthesis in Less Than a Second. Apple is also sponsoring and participating in a number of affinity group-hosted events that support underrepresented groups in the ML community.

A comprehensive overview of Apple's participation in and contributions to ICLR 2026 can be found here, and a selection of highlights follows below.

Unlocking Large-Scale Training for RNNs

Recurrent Neural Networks (RNNs) are naturally suited to efficient inference, requiring far less memory and compute than attention-based architectures, but the sequential nature of their computation has historically made it impractical to scale up RNNs to billions of parameters. A new advancement from Apple researchers makes RNN training dramatically more efficient, enabling large-scale training for the first time and widening the set of architecture choices available to practitioners designing LLMs, particularly for resource-constrained deployment.

In ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models, a new paper accepted to ICLR 2026 as an Oral, Apple researchers share a new framework for parallelized RNN training that achieves a 665× speedup over the traditional sequential approach (see Figure 1). This efficiency gain enables the training of the first 7-billion-parameter classical RNNs that can achieve language modeling performance competitive with Transformers (see Figure 2). A simplified illustration of the general parallel-in-time idea appears at the end of this post.

To accelerate research in efficient sequence modeling and enable researchers and practitioners to explore new nonlinear RNN models at scale, the ParaRNN codebase has been released as an open-source framework for automatic training-parallelization of nonlinear RNNs. At ICLR, the paper's first author will also deliver an Expo Talk about this research.

Improving State Space Models

State Space Models (SSMs) like Mamba have become the leading alternative to Transformers for sequence modeling tasks. Their primary advantage is efficiency in long-context and long-form generation, enabled by fixed-size memory and linear scaling of computational complexity; a toy sketch of this recurrence also appears below.

To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models, a new Apple paper accepted as an Oral at ICLR, explores the capabilities and limitations of SSMs for long-form generation tasks. The paper shows that the efficiency of SSMs comes at a…
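To give intuition for the parallel-in-time idea behind work like ParaRNN, here is a minimal, self-contained sketch. It is illustrative only: the names and shapes are invented for this example, it uses plain fixed-point sweeps rather than the solver described in the paper, and it bears no relation to ParaRNN's actual API. What it shows is why a seemingly sequential nonlinear recurrence can nonetheless be computed with fully parallel updates:

```python
# Toy sketch of parallel-in-time evaluation of a nonlinear RNN.
# NOT ParaRNN's method or API; a generic fixed-point illustration.
import numpy as np

def step(h_prev, x, W, U):
    # One nonlinear RNN step: h_t = tanh(h_{t-1} W + x_t U).
    return np.tanh(h_prev @ W + x @ U)

def sequential_rnn(X, W, U):
    # Reference implementation: the classic O(T) sequential loop.
    T, d = X.shape[0], W.shape[0]
    H, h = np.zeros((T, d)), np.zeros(W.shape[0])
    for t in range(T):
        h = step(h, X[t], W, U)
        H[t] = h
    return H

def parallel_rnn(X, W, U):
    # Solve for all T hidden states at once by sweeping
    # H <- f(shift(H), X). Each sweep updates every timestep with one
    # vectorized call, so its wall-clock depth is O(1) in T on parallel
    # hardware. T sweeps always suffice, since exact states propagate
    # one step forward per sweep; contractive dynamics converge faster.
    T, d = X.shape[0], W.shape[0]
    H = np.zeros((T, d))                                # guess for all states
    for _ in range(T):
        H_prev = np.vstack([np.zeros((1, d)), H[:-1]])  # h_{t-1} for each t
        H = step(H_prev, X, W, U)                       # all timesteps at once
    return H

rng = np.random.default_rng(0)
d, T = 8, 64
W = 0.3 * rng.standard_normal((d, d))
U = rng.standard_normal((d, d))
X = rng.standard_normal((T, d))
assert np.allclose(sequential_rnn(X, W, U), parallel_rnn(X, W, U))
```

In this toy version, each sweep is embarrassingly parallel across timesteps, and the number of sweeps, rather than the sequence length, becomes the sequential bottleneck. Making that trade practical at billion-parameter scale is what the paper's framework addresses with far more efficient machinery.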

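For background on the efficiency claim above, the following is a minimal sketch of a generic diagonal linear SSM recurrence. It is textbook material rather than Mamba or the paper's model, and all shapes and parameter names are illustrative. It makes concrete why SSM generation needs only fixed-size memory and linear time:

```python
# Minimal diagonal linear SSM sketch (not Mamba); illustrative names only.
import numpy as np

def ssm_generate(x, a, B, C):
    # Recurrence: h_t = a * h_{t-1} + B @ x_t,  y_t = C @ h_t.
    # The entire memory is the length-d state vector h, so memory is O(d)
    # no matter how long the sequence grows (unlike attention's KV cache),
    # and each step costs the same, giving O(T) total time for T steps.
    h = np.zeros(a.shape[0])        # fixed-size state: the whole "memory"
    ys = []
    for x_t in x:                   # constant-cost update, independent of t
        h = a * h + B @ x_t         # elementwise decay plus input injection
        ys.append(C @ h)            # readout from the current state
    return np.array(ys)

rng = np.random.default_rng(0)
d_state, d_in, d_out, T = 16, 4, 3, 1000
a = 0.95 * rng.uniform(size=d_state)        # |a| < 1 keeps the state stable
B = rng.standard_normal((d_state, d_in))
C = rng.standard_normal((d_out, d_state))
x = rng.standard_normal((T, d_in))
y = ssm_generate(x, a, B, C)
print(y.shape)  # (1000, 3); state stayed O(d_state) throughout
```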
