We introduce TGAvatar, a framework for 3D head reconstruction and animation built on 3D Gaussian Splatting (3DGS). TGAvatar improves rendering quality by exploiting the properties of 3DGS to produce detailed, realistic representations of head geometry and texture. We apply linear blending to imitate 3D Morphable Model (3DMM) coefficients within 3DGS, enabling precise, dynamic modeling of facial features and expressions. In addition, a transformer-based tri-plane module infers the spherical harmonics and alpha parameters of each Gaussian. This module is central to the method, as it allows us to represent the visual characteristics of the Gaussians efficiently and precisely, tailored to the fine details of the head's components. Our extensive evaluations show that TGAvatar improves the fidelity and realism of 3D head reconstructions and surpasses existing methods in both rendering quality and computational efficiency.
The TGAvatar pipeline begins with the random initialization of a set of Gaussians with pose, rotation, and scale bases (\(P, Q, S\)) and bias terms (\(p_0, q_0, s_0\)). A transformer-based tri-plane module is then employed to ensure high-fidelity novel view synthesis. Specifically, a transformer-based tri-plane decoder first predicts tri-plane features. The tri-plane module then extracts hybrid features according to the pose of each Gaussian. Finally, these hybrid features are fed into an MLP to infer the opacity (\(\alpha\)) and spherical harmonics (SH) coefficients of each Gaussian.
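The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the text does not confirm: the "pose" of a Gaussian is read as its 3D position, the bases are blended with a coefficient vector `coeffs` as in a 3DMM, rotations are unit quaternions, the hybrid feature is the concatenation of the three plane samples, and the MLP has a single hidden layer. The tri-plane features are passed in as a plain array, standing in for the output of the transformer-based decoder. All function and parameter names are hypothetical.

```python
import numpy as np


def blend_gaussians(coeffs, P, Q, S, p0, q0, s0):
    """Linearly blend per-Gaussian bases with a coefficient vector.

    coeffs: (K,) blending coefficients (assumed 3DMM-style).
    P, S: (N, K, 3) position and scale bases; Q: (N, K, 4) rotation bases.
    p0, s0: (N, 3) and q0: (N, 4) bias terms.
    """
    pos = p0 + np.einsum("nkd,k->nd", P, coeffs)
    rot = q0 + np.einsum("nkd,k->nd", Q, coeffs)
    # Assumption: rotations are quaternions, so renormalize to unit length.
    norm = np.linalg.norm(rot, axis=1, keepdims=True)
    rot = rot / np.maximum(norm, 1e-8)
    scale = s0 + np.einsum("nkd,k->nd", S, coeffs)
    return pos, rot, scale


def sample_plane(plane, uv):
    """Bilinearly sample a (C, H, W) plane at uv in [-1, 1]^2. Returns (N, C)."""
    _, H, W = plane.shape
    x = np.clip((uv[:, 0] + 1.0) * 0.5 * (W - 1), 0, W - 1)
    y = np.clip((uv[:, 1] + 1.0) * 0.5 * (H - 1), 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    # plane[:, rows, cols] has shape (C, N); transpose to (N, C).
    top = plane[:, y0, x0].T * (1 - wx) + plane[:, y0, x1].T * wx
    bottom = plane[:, y1, x0].T * (1 - wx) + plane[:, y1, x1].T * wx
    return top * (1 - wy) + bottom * wy


def sample_triplane(planes, pos):
    """Project positions onto the xy, xz, and yz planes and concatenate features.

    planes: (3, C, H, W) tri-plane features; pos: (N, 3) in [-1, 1]^3.
    Returns (N, 3 * C). Concatenation is an assumption; summation is also common.
    """
    axes = [(0, 1), (0, 2), (1, 2)]
    feats = [sample_plane(planes[i], pos[:, list(a)]) for i, a in enumerate(axes)]
    return np.concatenate(feats, axis=1)


def decode_appearance(feat, W1, b1, W2, b2, n_sh):
    """Tiny MLP mapping hybrid features to opacity and SH coefficients.

    The output width must be 1 + 3 * n_sh: one opacity logit plus
    n_sh coefficients per RGB channel.
    """
    hidden = np.maximum(feat @ W1 + b1, 0.0)  # ReLU
    out = hidden @ W2 + b2
    alpha = 1.0 / (1.0 + np.exp(-out[:, :1]))  # sigmoid keeps opacity in (0, 1)
    sh = out[:, 1:].reshape(-1, n_sh, 3)
    return alpha, sh
```

In this sketch the geometry of each Gaussian depends only on the blended bases, while its appearance depends only on the features sampled at its blended position, which mirrors the two-branch structure the caption describes.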
Qualitative comparison of TGAvatar with INSTA, FlashAvatar, and GaussianBlendshapes. Each method is run under the configuration specified in its original work. For the INSTA dataset, INSTA and GaussianBlendshapes provide pretrained models, so their results are generated with those pretrained models. TGAvatar achieves better results, particularly in capturing details such as teeth, eyes, wrinkles, and reflections.