A key element of any modern video codec is the efficient exploitation of temporal redundancy via motion-compensated prediction. This book describes a novel paradigm for representing and employing motion information in a video compression system, one that has several advantages over existing approaches. Traditionally, motion is estimated, modelled, and coded as a vector field at the target frame it predicts. While this “prediction-centric” approach is convenient, the fact that the motion is “attached” to a specific target frame means that it cannot easily be re-purposed to predict or synthesize other frames, which severely hampers temporal scalability. In light of this, the book explores the possibility of anchoring motion at reference frames instead. Key to the success of the proposed “reference-based” anchoring schemes is high-quality motion inference, which is enabled by a more “physical” motion representation than the traditionally employed “block” motion fields. The resulting compression system can support computationally efficient, high-quality temporal motion inference, and it requires only half as many coded motion fields as conventional codecs. Furthermore, “features” beyond compressibility, including high scalability, accessibility, and “intrinsic” framerate upsampling, can be seamlessly supported. These features are becoming ever more relevant as video consumption continues to shift from the traditional broadcast scenario to interactive browsing of video content over heterogeneous networks. This book is of interest to researchers and professionals working in multimedia signal processing, in particular those interested in next-generation video compression. Two comprehensive background chapters on scalable video compression and temporal frame interpolation make the book accessible to students and newcomers to the field.