What Are Vision Language Models?