by Yongqian Tan, Fanju Zeng

Achieving an ideal balance between style fidelity and content preservation remains a critical challenge in style transfer. In this work, we present UltraStyle, a new framework that reconciles these two objectives through a reformulation of the learning process.
Unlike prior LoRA-based methods that rely on noise prediction, UltraStyle adopts a reconstruction-centered optimization paradigm, allowing the diffusion model to better retain global structural features while faithfully reproducing stylistic patterns. We propose a dual-phase training method that first isolates content representations before specializing style learning, minimizing cross-interference.
To further refine detail preservation without sacrificing structure, we introduce a progressive loss transition strategy during training. Moreover, we develop a flexible inference control mechanism that enables smooth adjustment of content and style influences in the generation phase.
Experimental results demonstrate that UltraStyle consistently delivers stylized outputs with superior structural integrity and stylistic authenticity, significantly mitigating issues such as content drift and feature entanglement found in existing methods.
The approach reframes the learning objective around reconstruction rather than the noise prediction used in some prior methods.
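
To make the distinction between the two objectives concrete, the following is a minimal sketch and not the UltraStyle implementation. It assumes the standard DDPM forward process, x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps, and a model that outputs a noise estimate `eps_hat`. All function names and the toy values are hypothetical and chosen only for illustration. The sketch contrasts a loss measured on the noise with a loss measured on the clean sample recovered from that noise estimate.

```python
import math


def add_noise(x0, eps, alpha_bar):
    """Standard DDPM forward step: x_t = sqrt(abar)*x0 + sqrt(1-abar)*eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def predict_x0(x_t, eps_hat, alpha_bar):
    """Recover an estimate of the clean sample from a noise estimate."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps_hat)]


def mse(u, v):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


def noise_prediction_loss(eps_hat, eps):
    """Conventional objective: compare predicted noise with true noise."""
    return mse(eps_hat, eps)


def reconstruction_loss(x_t, eps_hat, x0, alpha_bar):
    """Reconstruction-style objective: compare the recovered sample with x0."""
    return mse(predict_x0(x_t, eps_hat, alpha_bar), x0)


if __name__ == "__main__":
    # Toy values (hypothetical): a 3-element "image", true noise, and an
    # imperfect noise estimate standing in for a model output.
    x0 = [0.5, -1.0, 2.0]
    eps = [0.1, 0.2, -0.3]
    eps_hat = [0.0, 0.25, -0.2]
    alpha_bar = 0.2  # assumed cumulative noise-schedule value at this step

    x_t = add_noise(x0, eps, alpha_bar)
    print("noise-prediction loss:", noise_prediction_loss(eps_hat, eps))
    print("reconstruction loss:  ", reconstruction_loss(x_t, eps_hat, x0, alpha_bar))
```

Under these assumptions the two losses differ only by the timestep-dependent factor (1 - abar) / abar, so a plain MSE on the recovered sample weights heavily noised steps (small abar) more strongly than a plain MSE on the noise does. Whether UltraStyle's reconstruction objective takes exactly this form is not stated in the abstract.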