I use Spark every day and have a good grip on the basics, so this question isn't for myself. But it came up, and I wanted to see what other Spark users would say, and I don't want to influence your answer. And SO is weird about polls. The question is:
"Which one do you feel is accurate: Dataset is a subset of DataFrame, or DataFrame is a subset of Dataset?"