MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment

Andor Vári-Kakas; Ji Won Park; Natasa Tagasovska

✨ TL;DR

MGDA-Decoupled is a geometry-based multi-objective optimization algorithm for aligning LLMs that balances conflicting objectives like helpfulness and truthfulness without requiring reinforcement learning or explicit reward models. It operates within the Direct Preference Optimization framework and achieves higher win rates than existing methods by accounting for each objective's convergence dynamics.

01 · Problem

Large language models require alignment with multiple, potentially conflicting human values such as helpfulness, truthfulness, and harmlessness. Most existing alignment pipelines combine these objectives through a fixed scalarization (typically a weighted sum with preset weights), which can introduce procedural unfairness by systematically under-weighting objectives that are harder to optimize or that represent minority preferences. The result is a multi-objective optimization problem in which standard approaches fail to achieve equitable trade-offs across all objectives.
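One way to see how a fixed weighting can leave an objective behind is a toy example with two conflicting gradients. The sketch below is illustrative only: the gradient values, the objective names, and the helper `scalarized_direction` are hypothetical and do not come from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def scalarized_direction(grads, weights):
    """Fixed scalarization: a weighted sum of per-objective gradients."""
    dim = len(grads[0])
    return [sum(w * g[k] for w, g in zip(weights, grads)) for k in range(dim)]


# Hypothetical per-objective gradients that point in nearly opposite directions.
g_helpful = [1.0, 0.0]
g_truthful = [-0.9, 0.2]

# Equal preset weights, chosen before training and never adapted.
d = scalarized_direction([g_helpful, g_truthful], [0.5, 0.5])

# To first order, a step of -lr * d changes loss i by -lr * (g_i . d).
# A negative inner product therefore means that objective's loss goes UP.
alignment = {
    "helpful": dot(g_helpful, d),
    "truthful": dot(g_truthful, d),
}
```

With these toy numbers the inner product for `helpful` is positive and the one for `truthful` is negative, so the equally weighted update improves one objective while making the other worse to first order.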

02 · Approach

MGDA-Decoupled introduces a geometry-aware multi-objective optimization algorithm that finds a shared descent direction while explicitly accounting for each objective's convergence dynamics. Unlike prior methods that depend on reinforcement learning (GAPO) or explicit reward models (MODPO), this approach operates entirely within the lightweight Direct Preference Optimization (DPO) paradigm, making it more computationally efficient while maintaining the ability to balance multiple objectives.
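For intuition about what a "shared descent direction" means, the sketch below implements the classic two-objective MGDA step: it picks the minimum-norm point on the segment between the two gradients. This is the standard MGDA closed form, not the paper's decoupled variant, whose handling of convergence dynamics is not described in this summary; the function name and the toy gradients are assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mgda_two_objectives(g1, g2):
    """Classic two-objective MGDA (not the paper's decoupled variant).

    Returns (alpha, d) where d = alpha * g1 + (1 - alpha) * g2 is the
    minimum-norm point on the segment between g1 and g2.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        # Identical gradients: every alpha yields the same direction.
        alpha = 0.5
    else:
        # Unconstrained minimiser of ||alpha*g1 + (1-alpha)*g2||^2,
        # then clipped to the valid range [0, 1].
        g2_minus_g1 = [b - a for a, b in zip(g1, g2)]
        alpha = min(1.0, max(0.0, dot(g2_minus_g1, g2) / denom))
    d = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, d


# Hypothetical conflicting gradients for two alignment objectives.
g_a = [1.0, 0.0]
g_b = [-0.9, 0.2]
alpha, d = mgda_two_objectives(g_a, g_b)
```

The minimum-norm direction has a non-negative inner product with every objective's gradient, so a small step along `-d` does not increase either loss to first order. That is the geometric property a fixed weighted sum does not guarantee.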

03 · Key insights

What the paper shows.

01 Geometry-based multi-objective optimization can promote more equitable trade-offs between conflicting alignment objectives than fixed scalarization approaches.

02 Accounting for each objective's convergence dynamics is crucial for preventing systematic under-weighting of harder-to-optimize objectives.

03 Multi-objective optimization for LLM alignment can be achieved without reinforcement learning or explicit reward models by working within the DPO framework.

04 Geometry-aware methods achieve the highest win rates both overall and when evaluated per individual objective.
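The DPO framework mentioned in these insights needs only log-probabilities from the policy and a frozen reference model, which is why no reward model or RL loop is involved. As a minimal sketch, the standard DPO loss for a single preference pair can be written as below; computing one such loss per objective is an assumption about how the objectives might be separated, and the argument names and the default `beta` are illustrative.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed sequence log-probabilities under the policy and
    under a frozen reference model. beta=0.1 is an illustrative default.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical usage: one loss per objective, each with its own preference pair.
losses = {
    "helpfulness": dpo_loss(-12.0, -15.0, -13.0, -14.0),
    "truthfulness": dpo_loss(-20.0, -19.0, -19.5, -19.5),
}
```

When the policy and reference agree on both responses the margin is zero and the loss equals log 2; it falls as the policy favours the chosen response more strongly than the reference does.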

04 · Results

Experiments on the UltraFeedback dataset show that MGDA-Decoupled achieves the highest win rates against golden responses among the methods compared, both overall and per individual objective. The geometry-aware approach balances multiple alignment objectives while retaining the computational efficiency of the DPO paradigm.
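As a rough illustration of the metric, a win rate against golden responses is the fraction of pairwise comparisons in which the model's response is judged better than the reference. The sketch below assumes string labels and counts ties as non-wins; the paper's actual judging protocol and tie handling are not specified in this summary, and the sample judgements are made up.

```python
def win_rate(judgements):
    """Fraction of comparisons labelled "win" (ties count as non-wins here)."""
    if not judgements:
        raise ValueError("no judgements to score")
    return sum(1 for j in judgements if j == "win") / len(judgements)


def per_objective_win_rates(judgements_by_objective):
    """Win rate for each objective, keyed by objective name."""
    return {name: win_rate(js) for name, js in judgements_by_objective.items()}


# Hypothetical judgements, for illustration only.
sample = {
    "helpfulness": ["win", "loss", "win", "tie"],
    "truthfulness": ["win", "win", "loss", "win"],
}
rates = per_objective_win_rates(sample)
```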

05 · Limitations

The paper evaluates the approach primarily on the UltraFeedback dataset, which may limit generalizability to other preference datasets or alignment benchmarks. The specific performance gains over competing methods and ablation studies on the geometry-aware components are not detailed in the abstract. Additionally, the scalability of the approach to larger models or more complex objective sets is not explicitly discussed.

✨ Generated by Claude · Apr 25, 2026 · Read the PDF for authoritative content.
