Spark's Column.isin function does not take List




I am trying to filter rows in my Spark DataFrame.


val sequence = Seq(1,2,3,4,5)
df.filter(df("column").isin(sequence))



Unfortunately, I get an unsupported literal type error:


java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(1,2,3,4,5)



According to the documentation, it takes a scala.collection.Seq list.



I guess I don't want a literal? Then what can I pass in, some sort of wrapper class?




2 Answers



@JustinPihony's answer is correct but incomplete. The isin function takes a repeated parameter (varargs) as its argument, so you need to expand the sequence with the : _* type ascription when you pass it:




scala> val df = sc.parallelize(Seq(1,2,3,4,5,6,7,8,9)).toDF("column")
// df: org.apache.spark.sql.DataFrame = [column: int]

scala> val sequence = Seq(1,2,3,4,5)
// sequence: Seq[Int] = List(1, 2, 3, 4, 5)

scala> val result = df.filter(df("column").isin(sequence : _*))
// result: org.apache.spark.sql.DataFrame = [column: int]

scala> result.show
// +------+
// |column|
// +------+
// |     1|
// |     2|
// |     3|
// |     4|
// |     5|
// +------+





This also helped me understand what was going on: stackoverflow.com/questions/6051302/…
– Jake Fund
Apr 12 '16 at 12:37





It's all in the Scala Language Specification. :)
– eliasah
Apr 12 '16 at 12:54



This happens because the underlying Scala implementation uses varargs, so the Java documentation is not quite accurate. The method is annotated with @varargs, so from Java you can just pass in an array.
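To see why the original call fails, here is a minimal sketch in plain Scala with no Spark involved. The object VarargsSketch and the method received are hypothetical stand-ins for a varargs method shaped like isin(list: Any*); they are not part of the Spark API. The sketch only shows how a Seq arrives at such a method with and without : _*.

```scala
// Hypothetical stand-in for a varargs method such as Column.isin(list: Any*).
// Nothing here is Spark code; it only illustrates how arguments arrive.
object VarargsSketch {
  // Returns the arguments exactly as the method receives them.
  def received(values: Any*): Seq[Any] = values.toList

  val sequence = Seq(1, 2, 3, 4, 5)

  // Passed as-is, the whole Seq becomes ONE argument (a List inside a Seq).
  val asOneArgument: Seq[Any] = received(sequence)

  // With `: _*`, every element of the Seq becomes its own argument.
  val expanded: Seq[Any] = received(sequence: _*)
}
```

In the first call the method receives a single value that is itself a List, which would be consistent with Spark complaining about an unsupported literal of type List(1,2,3,4,5). In the second call it receives five separate Int values.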







