
MeshLoom: Feed-Forward Non-Rigid Registration of Mesh Sequences

One feed-forward pass — from heterogeneous mesh frames to a topology-consistent sequence.


From heterogeneous inputs to a continuous registered sequence. MeshLoom takes a sparse set of observed meshes with varying vertex counts and connectivity (upper-left) and outputs a topology-consistent sequence in which every frame shares the anchor mesh's vertex set. Because the network learns a continuous motion field, it can also be queried at unseen intermediate timestamps (upper-right). The same model generalizes across categories, motions, and geometric variations (bottom). Colors encode anchor-mesh vertex positions, so corresponding vertices share a color across frames.

The Task

Non-rigid mesh registration: turn a set of meshes with different vertex counts and connectivity into a sequence that shares one consistent topology.

The Gap

Prior methods are slow (per-instance optimization), restricted to specific categories, limited to pairwise inputs, or able to output only intermediate quantities. No single method covers all four axes.

Our Answer

One feed-forward network that is fast, open-vocabulary, sequence-level, and directly outputs vertex deformations — plus interpolation and morphing for free.

Abstract

We present MeshLoom, a feed-forward registration network that directly reconstructs vertex deformations across mesh sequences. Our approach advances non-rigid registration beyond existing models, which are typically constrained by costly per-instance optimization, narrow object categories, pairwise-only inputs, or merely intermediate outputs. The network is simple and efficient, registering multiple meshes within seconds.

At its core lies a topology-aware encoder–decoder design. We first introduce a topology-aware point representation that encodes the anchor (reference) mesh's topology into its per-vertex features, disambiguating points that are Euclidean-close yet geodesically distant. We then propose a multi-modal encoder that fuses this anchor-mesh representation with complementary cues from each frame, such as shape latents and image features. These multi-source signals are compressed into a compact global motion embedding that captures dense inter-frame correspondence. A lightweight decoder then queries this global embedding with the anchor-mesh point representation, retrieving per-vertex deformations at target timestamps.
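To make the "Euclidean-close yet geodesically distant" case concrete, the sketch below builds a toy U-shaped triangle strip and compares the straight-line distance between its two tips with the shortest path along mesh edges. It only illustrates the ambiguity that a topology-aware representation is meant to resolve; it is not MeshLoom's encoder. The helper names `edge_graph` and `graph_geodesic` are hypothetical, and edge-path Dijkstra is used here as a simple stand-in for a true geodesic distance.

```python
import heapq
import math

def edge_graph(vertices, faces):
    """Build an adjacency map {vertex: {neighbor: edge_length}} from triangle faces."""
    graph = {i: {} for i in range(len(vertices))}
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            w = math.dist(vertices[u], vertices[v])
            graph[u][v] = w
            graph[v][u] = w
    return graph

def graph_geodesic(graph, source):
    """Dijkstra shortest-path distances along mesh edges from `source`."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy U-shaped strip: the centerline goes out along x, turns, and comes back
# 0.2 units away in y. Each centerline point yields two vertices (z=0 and z=1).
path = [(0, 0), (1, 0), (2, 0), (2, 0.2), (1, 0.2), (0, 0.2)]
vertices = [(x, y, z) for (x, y) in path for z in (0.0, 1.0)]
faces = []
for i in range(len(path) - 1):
    faces.append((2 * i, 2 * i + 2, 2 * i + 1))
    faces.append((2 * i + 1, 2 * i + 2, 2 * i + 3))

start, end = 0, 10  # the two tips of the U on the z=0 side
euclid = math.dist(vertices[start], vertices[end])
geo = graph_geodesic(edge_graph(vertices, faces), start)

print(f"Euclidean: {euclid:.2f}, along the surface: {geo[end]:.2f}")
```

The two tips sit about 0.2 apart in space but about 4.2 apart along the surface. Per-vertex features built from positions alone would treat them as neighbors, which is the failure mode the topology-aware representation targets.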

Through extensive experiments across diverse motions and object categories, we show that MeshLoom achieves state-of-the-art results on non-rigid registration. In addition, our global embedding-then-query paradigm naturally enables the network to generate deformations at intermediate timestamps, extending MeshLoom to motion interpolation and mesh morphing.

Take-Aways

The six ideas that define MeshLoom.

Fast & feed-forward

Registers a full mesh sequence in seconds — no per-instance optimization, no iterative refinement.

🔀

Sequence-level input

Ingests an entire variable-length sequence in one pass, instead of stitching together pairwise source–target registrations.

🎯

Direct vertex deformations

Outputs explicit per-vertex displacements ready to use — no external solver, no post-processing step.

🔗

Topology-aware

Bakes anchor-mesh connectivity into per-vertex features, so close-but-disjoint regions no longer move together.

⏱️

Continuous in time

Embed-then-query design lets the network deform the anchor at any timestamp — interpolation and morphing for free.

🌐

Open-vocabulary, SOTA

One model across humans, animals, and general objects — outperforming existing baselines.

Method Overview

Encode the anchor with its topology, fuse the sequence into one global motion embedding, then query that embedding per vertex per frame.


Workflow of MeshLoom. Given an input mesh sequence whose frames differ in vertex count and connectivity, our network proceeds in three steps. (1) Any frame (typically the first) is designated as the anchor mesh and embedded into a topology-aware representation H_a. (2) The remaining frames and their (optional) images are encoded into per-frame shape latents S_t and image features I_t, then fused with H_a by a transformer-based encoder to produce a global motion embedding Z. (3) A lightweight deformation decoder queries Z with H_a to predict per-vertex deformations of the anchor mesh at every frame, yielding an output sequence with a consistent vertex count and face connectivity.
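The query step (3) can be sketched in NumPy to show the tensor shapes involved. This is a toy illustration under stated assumptions, not the actual architecture: the weights are random, the decoder is reduced to a single cross-attention read, and the way time enters the query (a sinusoidal code projected by a hypothetical `W_t`) is an assumption, since the text above does not specify it. `H_a` and `Z` are random stand-ins for the anchor representation and the global motion embedding.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, d = 6, 4, 8  # anchor vertices, embedding tokens, feature width (toy sizes)

H_a = rng.normal(size=(N, d))          # stand-in for topology-aware anchor features
Z = rng.normal(size=(M, d))            # stand-in for the global motion embedding
W_t = rng.normal(size=(2, d))          # hypothetical projection of a time code
W_out = rng.normal(size=(d, 3)) * 0.1  # hypothetical head mapping features to xyz offsets

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decode(H_a, Z, t):
    """Query Z with time-conditioned anchor features; return (N, 3) displacements."""
    time_code = np.array([np.sin(np.pi * t), np.cos(np.pi * t)]) @ W_t  # (d,)
    q = H_a + time_code                   # one query per anchor vertex
    attn = softmax(q @ Z.T / np.sqrt(d))  # (N, M) weights over embedding tokens
    return (attn @ Z) @ W_out             # (N, 3)

anchor_vertices = rng.normal(size=(N, 3))
faces = np.array([[0, 1, 2], [2, 3, 4], [3, 4, 5]])  # connectivity shared by every frame

# 0.25 stands in for a timestamp that was not among the observed frames.
timestamps = [0.0, 0.5, 1.0, 0.25]
sequence = np.stack([anchor_vertices + decode(H_a, Z, t) for t in timestamps])
print(sequence.shape)  # (4, 6, 3)
```

Every frame in `sequence` has the anchor's vertex count and reuses `faces` unchanged, which is what makes the output topology-consistent. Because `Z` is computed once and `decode` accepts any `t`, querying an additional timestamp costs only one more decoder call.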

Interactive 3D Results

Explore our registered mesh sequences interactively — rotate, zoom, and scrub through frames to inspect correspondence at any timestamp.

Note on shape coloring: The coloring of each registered mesh is based on vertex positions on the anchor (reference) mesh. Since the registered meshes are deformations of the anchor, corresponding vertices carry the same color across frames.
Note on mesh resolution: For web display purposes, we decimate meshes shown here to reduce file size.
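The coloring note above can be illustrated with a short sketch: colors are computed once from the anchor's vertex positions and then reused for every registered frame, so a vertex keeps its color wherever it moves. The exact color mapping used on this page is not stated; per-axis min-max normalization to RGB is an assumption here, and `anchor_colors` is a hypothetical helper.

```python
import numpy as np

def anchor_colors(anchor_vertices):
    """Map anchor xyz positions to RGB in [0, 1] via per-axis min-max normalization."""
    v = np.asarray(anchor_vertices, dtype=float)
    lo, hi = v.min(axis=0), v.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on flat axes
    return (v - lo) / span

anchor = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 2.0, 4.0],
])
colors = anchor_colors(anchor)  # computed once, on the anchor only

# A later registered frame is the anchor plus per-vertex displacements;
# it reuses `colors` unchanged, row i still belongs to vertex i.
deformed = anchor + np.array([0.3, -0.1, 0.2])
print(colors)
```

Since registered frames share the anchor's vertex ordering, matching colors across frames indicate matching vertices, which makes drift or sliding easy to spot visually.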

1 Comparison with Other Methods

Side-by-side against prior registration baselines. All methods share the same anchor mesh (Frame 0).

[Interactive 3D viewer. Panels: Input Shapes, MeshLoom (Ours), Other Method. Frames 0 to 3.]

2 Applications

Beyond standard registration, the embed-then-query design extends to motion interpolation and mesh morphing.

Registration Across Geometric Variations

Generalizes across geometric variations — e.g., cross-species registration that closely conforms to the input shapes.

[Interactive 3D viewer. Panels: Input Shapes, Our Registered Output. Frames 0 to 3.]

Motion Interpolation

From sparse input motion states, synthesizes smooth intermediate frames that preserve object identity and local structure.

[Interactive 3D viewer. Panels: Input Shapes (sparse 4 motion states), Our Registered Output (dense 16 frames with interpolation). Frames 0 to 15.]
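As a small worked example of the 4-to-16 setting above, the sketch below lists which output frames would coincide with an observed motion state and which would be newly synthesized. It assumes that both the input states and the output frames are evenly spaced on a normalized time axis from 0 to 1; the page does not state the actual spacing, so treat the indices as illustrative.

```python
import numpy as np

num_inputs, num_outputs = 4, 16
input_times = np.linspace(0.0, 1.0, num_inputs)    # assumed: evenly spaced observations
query_times = np.linspace(0.0, 1.0, num_outputs)   # assumed: evenly spaced queries

# An output frame is "observed" if its time matches an input time;
# all remaining frames have to be interpolated by the network.
observed = np.isclose(query_times[:, None], input_times[None, :]).any(axis=1)
observed_frames = np.flatnonzero(observed)
interpolated_frames = np.flatnonzero(~observed)

print(observed_frames.tolist())   # [0, 5, 10, 15]
print(len(interpolated_frames))   # 12
```

Under this assumption, 12 of the 16 output frames have no corresponding input mesh and come entirely from querying the motion embedding at unseen timestamps.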

Mesh Morphing

Smooth, coherent transitions between two distinct shapes — evidence that the encoder learns continuous deformation, not just frame replay.

[Interactive 3D viewer. Panels: Start Shape, Our Registered Output (16 morphing transition steps), End Shape. Frames 0 to 15.]

3 Additional Visual Results

Our registration results across fifteen diverse animation sequences.

[Interactive 3D viewer. Panels: Input Shapes, Our Registered Output. Frames 0 to 3.]