New Method Customises LLMs in Seconds, Beats Tuning: Research
Analytics India Magazine (Ankush Das)
A team of researchers from the National University of Singapore and collaborators at Oxford University, the University of Texas at Austin, and the University of St Gallen have proposed a method to customise large language models (LLMs) without the need for conventional training.
Referred to as Drag-and-Drop LLMs (DnD), the new approach generates task-specific LoRA adapters directly from prompts, producing results significantly faster, and often more accurately, than traditional fine-tuning.
The research paper outlines an architecture that combines a frozen text encoder with a hyper-convolutional decoder, which generates adapter weights directly from prompt embeddings.
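While the paper's exact layer configuration is not reproduced here, a minimal PyTorch sketch can make the idea concrete: a hyper-network that decodes prompt embeddings into the two low-rank matrices of a LoRA adapter. The dimensions, convolution stack, and pooling below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a prompt-to-weights generator, assuming PyTorch.
# Module names, dimensions, and the Conv1d stack are illustrative
# guesses, not the paper's exact architecture.
import torch
import torch.nn as nn

class PromptToLoRA(nn.Module):
    def __init__(self, embed_dim=768, hidden_dim=1024, rank=8, d_model=4096):
        super().__init__()
        self.rank, self.d_model = rank, d_model
        # Hyper-convolutional decoder: 1-D convolutions that expand
        # condensed prompt embeddings into richer per-token features.
        self.decoder = nn.Sequential(
            nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=3, padding=1),
            nn.GELU(),
        )
        # Project pooled features to the flattened LoRA A and B matrices.
        self.head = nn.Linear(hidden_dim, 2 * rank * d_model)

    def forward(self, prompt_embeddings):
        # prompt_embeddings: (batch, seq_len, embed_dim), produced by a
        # frozen text encoder that is not part of this module.
        x = self.decoder(prompt_embeddings.transpose(1, 2))  # (batch, hidden, seq)
        x = x.mean(dim=-1)                                   # pool over tokens
        flat = self.head(x)                                  # (batch, 2*rank*d_model)
        A, B = flat.chunk(2, dim=-1)
        return (A.view(-1, self.rank, self.d_model),
                B.view(-1, self.d_model, self.rank))
```

In practice, such a generator would emit adapters for every LoRA-equipped layer of the target model; the single-output version above is simplified for clarity.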
“By collapsing the classical ‘data→gradients→weights’ loop into a single forward step, DnD challenges the notion that gradient descent is indispensable for model specialisation and opens a new path where weights themselves become a new data modality and generative target conditioned on concise task descriptors,” researchers said.
Typically, adapting LLMs for new tasks requires time-consuming parameter-efficient fine-tuning (PEFT) using methods like LoRA. This process demands considerable GPU resources and separate training runs for each new dataset. In contrast, DnD skips this optimisation step altogether. Instead, it uses a trained generator that maps a batch of unlabelled task prompts to LoRA weight updates in a single forward pass.
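To see why this removes the per-task optimisation loop, consider a hedged usage sketch that reuses the hypothetical PromptToLoRA class above; encode_prompts is a placeholder stub, not a real library API.

```python
# Hedged usage sketch building on the PromptToLoRA class above.
import torch

def encode_prompts(prompts, seq_len=32, embed_dim=768):
    # Placeholder: a frozen text encoder would return real prompt
    # embeddings; random tensors of the right shape keep this runnable.
    return torch.randn(len(prompts), seq_len, embed_dim)

generator = PromptToLoRA()  # in the real system, already trained

# Conventional PEFT: a gradient-descent loop per dataset, e.g.
#   for batch in task_data: loss.backward(); optimizer.step()
# DnD-style adaptation: a single forward pass, no gradients at all.
with torch.no_grad():
    prompts = ["Answer yes/no science questions.",
               "Solve grade-school maths word problems."]
    A, B = generator(encode_prompts(prompts))
    # Each (A, B) pair parameterises a LoRA update W' = W + B @ A
    # that can be dropped into the corresponding layer of the base model.
```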
The researchers claim that DnD delivers task-specific parameters up to 12,000 times faster than standard fine-tuning, while achieving up to 30% performance gains over baseline LoRAs.
In common-sense reasoning, DnD improved accuracy on datasets such as ARC-e from 37.5% to 68.6% and on BoolQ from 13.5% to 44.9%. On HumanEval, it raised pass@1 from 17.6% to 32.7%; in maths, it lifted GSM8K accuracy from 42.9% to 66.3%; and on multimodal benchmarks such as Math-Vision, it edged out trained LoRAs by over one percentage point.
DnD also generalised across domains and model sizes, from 0.5B to 7B parameters. In cross-domain tests, accuracy on a science dataset improved from 35.6% to 45.3%, even though the generator had been trained only on reasoning data.