I think this is the same thing we already discussed extensively on your JIRA.
The types of the key/value class arguments to newAPIHadoopFile are not your custom class, but the Writable types describing the encoding of keys and values in the file. I think that's where the problem starts. This is how all Hadoop-related APIs work, because Hadoop uses Writables for encoding.
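For concreteness, here's a minimal sketch of what correct usage looks like, assuming a SequenceFile whose keys are Text and values are IntWritable; the path is just a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("example"));

    // The key/value Class arguments name the Writables that encode the
    // data in the file, not any domain class you plan to map them to.
    JavaPairRDD<Text, IntWritable> rdd = sc.newAPIHadoopFile(
        "hdfs:///path/to/data",
        SequenceFileInputFormat.class,
        Text.class,
        IntWritable.class,
        new Configuration());

    // Convert the Writables to plain Java types before anything else.
    JavaPairRDD<String, Integer> plain = rdd.mapToPair(
        t -> new Tuple2<>(t._1().toString(), t._2().get()));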
You're asking again why this isn't caught at compile time, and that stems from two basic causes. First is the way the underlying Hadoop API works, needing Class parameters because of its Java roots. Second is the Scala/Java difference: the Scala API will accept, for instance, non-Writable arguments if you can supply an implicit conversion to Writable (if I recall correctly). That isn't available in Java, leaving its API expressing flexibility that isn't there. Neither is the exact issue here, though. It's that you're using raw class literals in Java, which have no generic types -- they are Class<?>. The InputFormat arg expresses nothing about the key/value types, so there is nothing to 'contradict' your declaration, which doesn't represent the actual types correctly. (You can cast class literals to (Class<...>) to express this if you want; it's a little messy in Java.) That's why it compiles, just as any Java code with an invalid cast compiles but fails at runtime.
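To make the failure mode concrete, continuing the sketch above (MyRecord is a hypothetical custom class that the file does not actually contain; NullWritable is org.apache.hadoop.io.NullWritable): this compiles, with at most an unchecked warning, and fails only at runtime.

    // The raw class literals carry no generic information, so the
    // compiler has nothing to check this declaration against.
    JavaPairRDD<MyRecord, NullWritable> wrong = sc.newAPIHadoopFile(
        "hdfs:///path/to/data",
        SequenceFileInputFormat.class,   // says nothing about key/value types
        MyRecord.class,                  // claims the keys are MyRecord
        NullWritable.class,
        new Configuration());

    // The mismatch surfaces only when an action forces the cast,
    // typically as a ClassCastException:
    // wrong.first();

    // Casting the literal at least documents intent, though it adds no
    // runtime check and draws an unchecked warning unless suppressed:
    Class<SequenceFileInputFormat<Text, IntWritable>> fmt =
        (Class<SequenceFileInputFormat<Text, IntWritable>>) (Class<?>)
            SequenceFileInputFormat.class;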
It is a bit weird if you're not familiar with the Hadoop APIs, Writables, or how Class arguments shake out in the context of generics. It does take the kind of research you did, and it does work, as you found. The reason you were steered several times to the DataFrame API is that it can hide a lot of this from you, including the details of Avro and Writables. Here you're directly accessing Hadoop APIs that are foreign to you.
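For reference, a sketch of that route, assuming the spark-avro module is on the classpath (built in as of Spark 2.4; earlier versions used the external com.databricks.spark.avro package); again the path is a placeholder:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    SparkSession spark =
        SparkSession.builder().appName("example").getOrCreate();

    // No InputFormats, Writables, or Class arguments involved; the Avro
    // schema is read from the file and exposed as the DataFrame schema.
    Dataset<Row> df = spark.read().format("avro")
        .load("hdfs:///path/to/data.avro");
    df.printSchema();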
This and the JIRA do not describe a bug.