What is the idiomatic way to split a List by a delimiter in Scala?
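If I have a List of type String,

    scala> val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
    items: List[java.lang.String] = List(Apple, Banana, Orange, Tomato, Grapes, BREAK, Salt, Pepper, BREAK, Fish, Chicken, Beef)

how can I split it into n separate lists based on a certain string/pattern (in this case "BREAK")?

I have thought about using indexOf to find the positions of "BREAK" and splitting the list that way, or using a similar approach with takeWhile (i => i != "BREAK"), but I am wondering whether there is a better way.

If it helps, I know there will only ever be 3 groups of items in the items list (and therefore 2 "BREAK" markers).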
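For comparison, the indexOf idea mentioned in the question can be sketched roughly as follows. This is only an illustration that assumes exactly two "BREAK" markers are present (indexOf returns -1 when a marker is missing, which this sketch does not handle); the shortened list and the names first, second and group1 to group3 are made up for the example.

```scala
val items = List("Apple", "Banana", "BREAK", "Salt", "BREAK", "Fish")

// Position of the first marker (assumed to exist).
val first = items.indexOf("BREAK")
// splitAt leaves the marker at the head of the second half, so drop it.
val (group1, rest1) = items.splitAt(first)
val afterFirst = rest1.drop(1)

// Repeat once more for the second marker.
val second = afterFirst.indexOf("BREAK")
val (group2, rest2) = afterFirst.splitAt(second)
val group3 = rest2.drop(1)
```

Every extra marker needs another copy of the same steps, which is why a general helper tends to be preferable.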
    def splitBySeparator[T](l: List[T], sep: T): List[List[T]] = {
      // span splits the list at the first element equal to sep.
      l.span(_ != sep) match {
        case (hd, _ :: tl) => hd :: splitBySeparator(tl, sep) // skip the separator and recurse
        case (hd, _)       => List(hd)                        // no separator left
      }
    }

    val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
    splitBySeparator(items, "BREAK")

Result:

    res1: List[List[String]] = List(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))
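Update: the version above, while concise and effective, has two problems: it does not handle edge cases well (such as List("BREAK") or List("BREAK", "Apple", "BREAK")), and it is not tail recursive. So here is another (imperative) version that fixes this:

    import scala.collection.mutable.ListBuffer

    def splitBySeparator[T](l: Seq[T], sep: T): Seq[Seq[T]] = {
      // Start with one empty group; b.last is always the group being filled.
      val b = ListBuffer(ListBuffer[T]())
      l foreach { e =>
        if (e == sep) {
          // Open a new group only if the current one is non-empty, so leading
          // or repeated separators do not create extra empty groups.
          if (b.last.nonEmpty) b += ListBuffer[T]()
        } else b.last += e
      }
      // Convert to immutable lists (also needed for the Seq return type on Scala 2.13+).
      b.map(_.toList).toList
    }

It uses a ListBuffer internally, much like the implementation of span that I used in the first version of splitBySeparator.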
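If a purely functional variant is preferred, the tail-recursion concern can also be addressed with an accumulator. The following is a minimal sketch, not the answer author's code; the names splitTailRec and loop are hypothetical, and unlike the imperative version it keeps empty groups (for example around a leading separator).

```scala
import scala.annotation.tailrec

def splitTailRec[T](l: List[T], sep: T): List[List[T]] = {
  // current: the group being built, in reverse order
  // done:    the finished groups, in reverse order
  @tailrec
  def loop(rest: List[T], current: List[T], done: List[List[T]]): List[List[T]] =
    rest match {
      case Nil                         => (current.reverse :: done).reverse
      case head :: tail if head == sep => loop(tail, Nil, current.reverse :: done)
      case head :: tail                => loop(tail, head :: current, done)
    }
  loop(l, Nil, Nil)
}
```

Because @tailrec makes the compiler verify the tail call, this should not overflow the stack on long inputs. Empty groups can be dropped afterwards with filter(_.nonEmpty) if they are not wanted.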
Another option:

    val l = Seq(1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5, 9, 1, 2, 3, 4, 5)

    l.foldLeft(Seq(Seq.empty[Int])) { (acc, i) =>
      if (i == 9) acc :+ Seq.empty       // separator: start a new, empty group
      else acc.init :+ (acc.last :+ i)   // otherwise append to the last group
    }
    // produces: List(List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5), List(1, 2, 3, 4, 5))
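One thing to keep in mind with the fold above: Seq(...) builds a List by default, and init, last and :+ each walk or copy a list, so every step does more work than a simple prepend. A possible variation, sketched here with a made-up input and the hypothetical name groups, folds from the right so that each step only prepends:

```scala
val l = List(1, 2, 3, 9, 4, 5, 9, 6)

// The accumulator always holds at least one group, so head/tail are safe.
val groups = l.foldRight(List(List.empty[Int])) { (i, acc) =>
  if (i == 9) Nil :: acc            // separator: open a new group at the front
  else (i :: acc.head) :: acc.tail  // otherwise prepend to the current group
}
// groups == List(List(1, 2, 3), List(4, 5), List(6))
```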
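How about this: use scan to work out which section each element of the list belongs to.

    val l = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")

    // Running count of separators seen so far, i.e. the section index of each element.
    val count = l.scanLeft(0) { (n, s) => if (s == "BREAK") n + 1 else n }.drop(1)
    val paired = l.zip(count)

    // For each section index, collect its elements, leaving out the separators.
    (0 to count.last).map { sec =>
      paired.flatMap { case (x, c) => if (c == sec && x != "BREAK") Some(x) else None }
    }
    // Vector(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))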
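This is not tail recursive either, but it works for the edge cases:

    def splitsies[T](l: List[T], sep: T): List[List[T]] = l match {
      case head :: tail =>
        if (head != sep)
          splitsies(tail, sep) match {
            case h :: t => (head :: h) :: t      // prepend to the first group of the rest
            case Nil    => List(List(head))      // last element: start the final group
          }
        else List() :: splitsies(tail, sep)      // separator: start a new group
      case Nil => List()
    }

The only annoying thing:

    scala> splitsies(List("BREAK","Tiger"),"BREAK")
    res6: List[List[String]] = List(List(), List(Tiger))

If you want to handle a leading separator more gracefully, take a look at the span-based approach in Martin's answer (to a slightly different question).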
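One way to avoid the empty leading group, sketched here under the assumption that empty groups are never wanted, is to skip separators before taking each group. The name splitNonEmpty is hypothetical, and like splitsies this sketch is not tail recursive.

```scala
def splitNonEmpty[T](l: List[T], sep: T): List[List[T]] =
  // Skip any leading (or repeated) separators first.
  l.dropWhile(_ == sep) match {
    case Nil => Nil
    case rest =>
      // rest starts with a non-separator, so group is never empty.
      val (group, tail) = rest.span(_ != sep)
      group :: splitNonEmpty(tail, sep)
  }
```

With this, splitNonEmpty(List("BREAK", "Tiger"), "BREAK") gives List(List(Tiger)).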
    val q = items.mkString(",").split("BREAK").map("(^,|,$)".r.replaceAllIn(_, "")).map(_.split(","))

Here "," is a unique separator that does not appear in any of the strings in the items list. We can choose a different separator if needed.

    items.mkString(",")                        // combines everything into a single string
      .split("BREAK")                          // which we then split using "BREAK" as the delimiter to get an array
      .map("(^,|,$)".r.replaceAllIn(_, ""))    // removes the leading/trailing comma of each element from the previous step
      .map(_.split(","))                       // splits each element using comma as the separator to give an array of arrays

    scala> val q = items.mkString(",").split("BREAK").map("(^,|,$)".r.replaceAllIn(_, "")).map(_.split(","))
    q: Array[Array[String]] = Array(Array(Apple, Banana, Orange, Tomato, Grapes), Array(Salt, Pepper), Array(Fish, Chicken, Beef))

    scala> q(0)
    res21: Array[String] = Array(Apple, Banana, Orange, Tomato, Grapes)

    scala> q(1)
    res22: Array[String] = Array(Salt, Pepper)

    scala> q(2)
    res23: Array[String] = Array(Fish, Chicken, Beef)
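A related caveat with the string-based approach: split matches "BREAK" anywhere in the joined string, so an item that merely contains that text is cut apart as well. The small example below uses a made-up list (risky) to show the effect.

```scala
val risky = List("Eggs", "BREAKFAST", "BREAK", "Tea")

// The joined string is "Eggs,BREAKFAST,BREAK,Tea".
val joined = risky.mkString(",")
// Splitting on "BREAK" also cuts inside "BREAKFAST".
val pieces = joined.split("BREAK").toList
// Same cleanup as in the answer: strip a leading/trailing comma, then split on commas.
val parts = pieces
  .map(s => "(^,|,$)".r.replaceAllIn(s, ""))
  .map(_.split(",").toList)
// parts == List(List("Eggs"), List("FAST"), List("Tea"))
```

The element-wise approaches shown earlier in the thread compare whole items, so they do not have this problem.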