(Full disclosure: I am an employee who has worked with this team internally before, but I was not associated with this paper.)
The author mentions CLIP, but it looks like he made his own variant. I wish I knew what it was, since it seems he is now AWOL.