Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

DEEP LEARNING SZEMINÁRIUM

Adrián Csiszárik, Melinda F. Kiss, Péter Kőrösi-Szabó, Márton Muntag, Gergely Papp, Dániel Varga

As recently discovered (Ainsworth-Hayase-Srinivasa 2022 and others), two wide neural networks with identical network topology and trained on similar data can be permutation-aligned. That is, we can shuffle their neurons (channels) so that linearly interpolating between the two networks in parameter space becomes a meaningful operation (linear mode connectivity).
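The permutation symmetry behind this can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: a one-hidden-layer ReLU network, the hypothetical helpers `mlp`, `permute_hidden` and `interpolate`, and a second network that is simply a shuffled copy of the first, so the aligning permutation is known in advance. For independently trained networks the permutation has to be found by a matching procedure, which is not shown here.

```python
import numpy as np

def mlp(params, x):
    """One-hidden-layer ReLU network (toy stand-in for a real model)."""
    W1, b1, W2, b2 = params
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def permute_hidden(params, perm):
    """Reorder hidden units: columns of W1, entries of b1, rows of W2."""
    W1, b1, W2, b2 = params
    return W1[:, perm], b1[perm], W2[perm, :], b2

def interpolate(params_a, params_b, lam):
    """Linear interpolation in parameter space: (1 - lam) * a + lam * b."""
    return tuple((1.0 - lam) * a + lam * b for a, b in zip(params_a, params_b))

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 16, 3
params_a = (
    rng.normal(size=(d_in, d_hidden)),
    rng.normal(size=d_hidden),
    rng.normal(size=(d_hidden, d_out)),
    rng.normal(size=d_out),
)

# Network B: the same function as A, but with shuffled hidden units.
perm = rng.permutation(d_hidden)
params_b = permute_hidden(params_a, perm)

x = rng.normal(size=(8, d_in))
same_function = np.allclose(mlp(params_a, x), mlp(params_b, x))

# Naive midpoint: averages weights of unrelated hidden units.
naive_mid = interpolate(params_a, params_b, 0.5)
naive_gap = np.abs(mlp(naive_mid, x) - mlp(params_a, x)).max()

# Aligned midpoint: undo the shuffle first, then interpolate.
params_b_aligned = permute_hidden(params_b, np.argsort(perm))
aligned_mid = interpolate(params_a, params_b_aligned, 0.5)
aligned_gap = np.abs(mlp(aligned_mid, x) - mlp(params_a, x)).max()
```

In this toy setting the shuffled network computes exactly the same function as the original, yet averaging the two without alignment changes the output, while averaging after alignment does not.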

We extend this notion by considering more general strategies to combine permutation-aligned networks. We investigate extensively which of these strategies succeed and which fail. For example, picking one of the two networks' weights at random, independently for each coordinate, yields a well-functioning combined network. This might suggest that the two networks are roughly identical functionally and that interpolation is vacuous. We demonstrate that this is not the case: there is actual interpolation in functional behavior.
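The random-pick strategy can be read as a coordinate-wise convex combination whose coefficients are all 0 or 1. The following is a minimal sketch of that reading, assuming the two parameter lists are already permutation-aligned; the helper names `combine`, `uniform_coeffs` and `random_pick_coeffs`, the 50/50 pick probability, and the random stand-in weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def combine(params_a, params_b, coeffs):
    """Coordinate-wise convex combination: c * a + (1 - c) * b per entry."""
    return [c * a + (1.0 - c) * b for a, b, c in zip(params_a, params_b, coeffs)]

def uniform_coeffs(params, lam):
    """Same coefficient everywhere: ordinary linear interpolation."""
    return [np.full(p.shape, lam) for p in params]

def random_pick_coeffs(params, rng, prob_a=0.5):
    """Coefficient 1 (take A) or 0 (take B), drawn independently per entry."""
    return [(rng.random(p.shape) < prob_a).astype(float) for p in params]

rng = np.random.default_rng(0)
# Stand-ins for two aligned networks: one weight matrix and one bias each.
params_a = [rng.normal(size=(4, 8)), rng.normal(size=8)]
params_b = [rng.normal(size=(4, 8)), rng.normal(size=8)]

midpoint = combine(params_a, params_b, uniform_coeffs(params_a, 0.5))
picked = combine(params_a, params_b, random_pick_coeffs(params_a, rng))

# Every entry of the random-pick result is copied from A or from B.
every_entry_from_a_or_b = all(
    np.all((p == a) | (p == b)) for p, a, b in zip(picked, params_a, params_b)
)
```

Writing both strategies through the same `combine` function makes it easy to try other coefficient patterns, for example a different coefficient per layer.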

https://arxiv.org/abs/2308.11511

Date: 11/22/23
Speaker: Muntág Márton

Related videos

Gradient representations in ReLU networks as similarity functions, the tangent sensitivity matrix

DEEP LEARNING SZEMINÁRIUM

Why small data still matter?

DEEP LEARNING SZEMINÁRIUM

Geometric deep learning: What does the geometry of the brain surface reveal about us?

DEEP LEARNING SZEMINÁRIUM