Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

Description of video

Date: 11/22/23
Speaker: Muntag Márton

Keywords

    Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

    Adrián Csiszárik, Melinda F. Kiss, Péter Kőrösi-Szabó, Márton Muntag, Gergely Papp, Dániel Varga

    As recently discovered (Ainsworth-Hayase-Srinivasa 2022 and others), two wide neural networks with identical network topology and trained on similar data can be permutation-aligned. That is, we can shuffle their neurons (channels) so that linearly interpolating between the two networks in parameter space becomes a meaningful operation (linear mode connectivity).
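
    The symmetry behind permutation alignment can be illustrated with a minimal sketch. It assumes a hypothetical two-layer ReLU network written in plain Python; the names mlp, permute_hidden and interpolate are illustrative and do not come from the paper. The sketch only shows that reordering hidden units leaves the computed function unchanged and how interpolation in parameter space is formed. It does not show how an aligning permutation between two independently trained networks is found.

```python
import random

def mlp(x, W1, b1, W2):
    """Two-layer ReLU network: y = W2 * relu(W1 x + b1)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, and the matching columns of W2."""
    return ([W1[p] for p in perm],
            [b1[p] for p in perm],
            [[row[p] for p in perm] for row in W2])

def interpolate(params_a, params_b, lam):
    """Entrywise linear interpolation (1 - lam) * A + lam * B."""
    def mix(a, b):
        if isinstance(a, list):
            return [mix(u, v) for u, v in zip(a, b)]
        return (1.0 - lam) * a + lam * b
    return tuple(mix(a, b) for a, b in zip(params_a, params_b))

# Hypothetical toy sizes and random weights, for illustration only.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 5, 2
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]

perm = list(range(n_hidden))
rng.shuffle(perm)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

x = [0.5, -1.0, 2.0]
y_original = mlp(x, W1, b1, W2)
y_permuted = mlp(x, W1p, b1p, W2p)  # same function, different parameter vector

# Midpoint in parameter space between the two parameterizations.
midpoint = interpolate((W1, b1, W2), (W1p, b1p, W2p), 0.5)
y_midpoint = mlp(x, *midpoint)
```

    Here y_original and y_permuted agree up to floating-point rounding, while y_midpoint generally differs from both, since the two parameter vectors are not aligned with each other. Permutation alignment searches for the reordering under which such interpolation between two trained networks remains meaningful.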

    We extend this notion by considering more general strategies to combine permutation-aligned networks. We investigate extensively which such strategies succeed and which ones fail. As an example, coordinate-wise randomly picking one of the two weights leads to a well-functioning combined network. This might suggest that the two networks are roughly identical functionally, and interpolation is vacuous. We demonstrate that this is not the case: there is actual interpolation in functional behavior.
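
    The coordinate-wise random picking strategy mentioned above can be sketched as follows. This is a minimal illustration under assumptions: both parameter sets are taken to be already permutation-aligned and stored as nested lists of equal shape, and the function name random_pick, the probability p, and the toy values are hypothetical. The exact procedure used in the paper may differ.

```python
import random

def random_pick(params_a, params_b, rng, p=0.5):
    """For every scalar weight, take it from B with probability p, else from A."""
    if isinstance(params_a, (list, tuple)):
        return [random_pick(a, b, rng, p) for a, b in zip(params_a, params_b)]
    return params_b if rng.random() < p else params_a

# Hypothetical toy weights standing in for two aligned networks.
rng = random.Random(0)
net_a = [[0.1, -0.4, 0.7], [1.2, 0.0, -0.3]]
net_b = [[0.2, -0.5, 0.6], [1.0, 0.1, -0.2]]

combined = random_pick(net_a, net_b, rng)
```

    Every entry of combined is copied unchanged from the corresponding entry of net_a or net_b, so the result is a vertex-like mixture of the two parameter sets and not an average of them.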

    https://arxiv.org/abs/2308.11511
