Toward an infrastructure for data-driven multimodal communication research