FAVAE: SEQUENCE DISENTANGLEMENT USING IN- FORMATION BOTTLENECK PRINCIPLE
A state-of-the-art generative model, a ”factorized action variational autoencoder (FAVAE),” is presented for learning disentangled and interpretable representations from sequential data via the information bottleneck without supervision. The purpose of disentangled representation learning is to obtain interpretable and transferable representations from data. We focused on the disentangled representation of sequential data because there is a wide range of potential applications if disentanglement representation is extended to sequential data such as video, speech, and stock price data. Sequential data is characterized by dynamic factors and static factors: dynamic factors are time-dependent, and static factors are independent of time. Previous works succeed in disentangling static factors and dynamic factors by explicitly modeling the priors of latent variables to distinguish between static and dynamic factors. However, this model can not disentangle representations between dynamic factors, such as disentangling ”picking” and ”throwing” in robotic tasks. In this paper, we propose new model that can disentangle multiple dynamic factors. Since our method does not require modeling priors, it is capable of disentangling ”between” dynamic factors. In experiments, we show that FAVAE can extract the disentangled dynamic factors.