Scalaz (13) - Monad: Writer

Source: Internet  Date: 2015-12-05

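  From the earlier discussions in this series we learned that F[T] is how FP represents a computation (representation of computation). Here F[] is more than a higher-kinded type: it also stands for a computation protocol, or better, a computation model, such as IO[T] or Option[T]. The computation model governs how the value T is computed. Monad is a special FP computation model M[A]: a mode of sequential computation in which flatMap is the link that joins one computation to the next. Chaining several flatMap calls forms a program. Inside the flatMap implementation we can add extra effects, such as maintaining a state value or keeping a log.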
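As a small illustration of flatMap acting as the link between computations, here is a minimal sketch in plain Scala using Option as the computation model. The helpers parse and divide are hypothetical names introduced only for this example.

```scala
object FlatMapChain {
  // Hypothetical helper: parse a string into an Int, yielding None on bad input.
  def parse(s: String): Option[Int] =
    try Some(s.trim.toInt) catch { case _: NumberFormatException => None }

  // Hypothetical helper: integer division that yields None for a zero divisor.
  def divide(a: Int, b: Int): Option[Int] =
    if (b == 0) None else Some(a / b)

  // Each flatMap links one computation to the next; a None anywhere ends the chain.
  def compute(x: String, y: String): Option[Int] =
    parse(x).flatMap(a => parse(y).flatMap(b => divide(a, b)))

  // The same chain written as a for-comprehension.
  def computeFor(x: String, y: String): Option[Int] =
    for { a <- parse(x); b <- parse(y); c <- divide(a, b) } yield c
}
```

Here the "extra effect" carried by the chain is possible failure; the Logger developed in this post carries a log instead.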

  In the previous discussion we used a Logger example to show how to add such an effect, logging, inside flatMap: we added a String value to the F[T] structure to serve as the log. In this discussion we first generalize the log type of that Logger and design a new Logger structure:

import scalaz._, Scalaz._

case class Logger[LOG, A](log: LOG, value: A) {
  def map[B](f: A => B): Logger[LOG,B] = Logger(log, f(value))
  // Run the next step, then append its log to the current one with |+|.
  def flatMap[B](f: A => Logger[LOG,B])(implicit M: Monoid[LOG]): Logger[LOG,B] = {
    val nxLogger = f(value)
    Logger(log |+| nxLogger.log, nxLogger.value)
  }
}

This Logger is generic in LOG: any type with a Monoid instance will do, because all it needs is the Monoid |+| operator, as the implementation of flatMap shows.
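To make the point about LOG concrete, the following is a minimal sketch in plain Scala that swaps the log type between String and Int. SimpleMonoid is a hypothetical stand-in for scalaz.Monoid so that the example runs without the library, and the Logger here is a local copy using that stand-in.

```scala
object MonoidLogSketch {
  // Hypothetical stand-in for scalaz.Monoid: an identity plus an associative append.
  trait SimpleMonoid[A] {
    def zero: A
    def append(x: A, y: A): A
  }

  implicit val stringMonoid: SimpleMonoid[String] = new SimpleMonoid[String] {
    def zero = ""
    def append(x: String, y: String) = x + y
  }

  // An Int log under addition, e.g. counting steps instead of recording text.
  implicit val intSumMonoid: SimpleMonoid[Int] = new SimpleMonoid[Int] {
    def zero = 0
    def append(x: Int, y: Int) = x + y
  }

  case class Logger[LOG, A](log: LOG, value: A) {
    def map[B](f: A => B): Logger[LOG, B] = Logger(log, f(value))
    def flatMap[B](f: A => Logger[LOG, B])(implicit M: SimpleMonoid[LOG]): Logger[LOG, B] = {
      val next = f(value)
      Logger(M.append(log, next.log), next.value)
    }
  }

  // Same computation, two different log types.
  val counted: Logger[Int, Int] =
    for { a <- Logger(1, 3); b <- Logger(1, 4) } yield a * b

  val traced: Logger[String, Int] =
    for { a <- Logger("a", 3); b <- Logger("b", 4) } yield a * b
}
```

With the Int monoid the log ends up as a step count; with the String monoid it is the concatenated text.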

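The for-comprehension itself only needs the map and flatMap methods defined above, but to use Logger with Scalaz's Monad operations we also need a Monad instance for it. Because Logger has two type parameters, Logger[LOG,A], we must use a type lambda to fix LOG so that the Monad operates on the A value only:

object Logger {
  // The type lambda fixes LOG and leaves a one-parameter type L[x] = Logger[LOG,x].
  implicit def toLogger[LOG](implicit M: Monoid[LOG]) = new Monad[({type L[x] = Logger[LOG,x]})#L] {
    def point[A](a: => A) = Logger(M.zero, a)
    def bind[A,B](la: Logger[LOG,A])(f: A => Logger[LOG,B]): Logger[LOG,B] = la flatMap f
  }
}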
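The type-lambda trick is not specific to Logger. As a minimal sketch in plain Scala 2, the same pattern gives a functor instance for Either by fixing its left type parameter. The Functor trait below is a hypothetical, cut-down type class written for this example, not the Scalaz one.

```scala
import scala.language.higherKinds

object TypeLambdaSketch {
  // Hypothetical minimal type class over a one-parameter type constructor.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  // Either takes two type parameters; the type lambda fixes the left one to E,
  // leaving the one-parameter constructor R[x] = Either[E, x].
  implicit def eitherFunctor[E]: Functor[({ type R[x] = Either[E, x] })#R] =
    new Functor[({ type R[x] = Either[E, x] })#R] {
      def map[A, B](fa: Either[E, A])(f: A => B): Either[E, B] = fa match {
        case Right(a) => Right(f(a))
        case Left(e)  => Left(e)
      }
    }

  val ok: Either[String, Int]  = eitherFunctor[String].map(Right(20): Either[String, Int])(_ + 1)
  val bad: Either[String, Int] = eitherFunctor[String].map(Left("boom"): Either[String, Int])(_ + 1)
}
```

Only the right-hand value is mapped; the fixed left-hand type passes through untouched, just as LOG does in the Logger instance.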

With this in place we can write a for-comprehension:

def enterInt(x: Int): Logger[String, Int] = Logger("Entered Int:" + x, x)
//> enterInt: (x: Int)Exercises.logger.Logger[String,Int]
def enterStr(x: String): Logger[String, String] = Logger("Entered String:" + x, x)
//> enterStr: (x: String)Exercises.logger.Logger[String,String]

for {
  a <- enterInt(3)
  b <- enterInt(4)
  c <- enterStr("Result:")
} yield c + (a * b).shows
//> res0: Exercises.logger.Logger[String,String] = Logger(Entered Int:3Entered Int:4Entered String:Result:,Result:12)

However, having to define a constructor function for each type is inconvenient. Instead, we can inject a method into any type:

final class LoggerOps[A](a: A) {
  def applyLog[LOG](log: LOG): Logger[LOG,A] = Logger(log, a)
}
implicit def toLoggerOps[A](a: A) = new LoggerOps[A](a)
//> toLoggerOps: [A](a: A)Exercises.logger.LoggerOps[A]

This injects an applyLog method into any type A:

3.applyLog("Int three")   //> res1: Exercises.logger.Logger[String,Int] = Logger(Int three,3)
"hi" applyLog "say hi"    //> res2: Exercises.logger.Logger[String,String] = Logger(say hi,hi)
for {
  a <- 3 applyLog "Entered Int 3"
  b <- 4 applyLog "Entered Int 4"
  c <- "Result:" applyLog "Entered String 'Result'"
} yield c + (a * b).shows
//> res3: Exercises.logger.Logger[String,String] = Logger(Entered Int 3Entered Int 4Entered String 'Result',Result:12)

applyLog makes this much more convenient. Because LOG can be any type with a Monoid instance, besides String we can also use a collection type such as Vector or List:

for {
  a <- 3 applyLog Vector("Entered Int 3")
  b <- 4 applyLog Vector("Entered Int 4")
  c <- "Result:" applyLog Vector("Entered String 'Result'")
} yield c + (a * b).shows
//> res4: Exercises.logger.Logger[scala.collection.immutable.Vector[String],String] = Logger(Vector(Entered Int 3, Entered Int 4, Entered String 'Result'),Result:12)

Generally speaking, Vector is the more efficient choice for the log; we check this later in the post.

Since A can be any type, what about a type such as Option[T]?

for {
  oa <- 3.some applyLog Vector("Entered Some(3)")
  ob <- 4.some applyLog Vector("Entered Some(4)")
} yield ^(oa,ob){_ * _}
//> res0: Exercises.logger.Logger[scala.collection.immutable.Vector[String],Option[Int]] = Logger(Vector(Entered Some(3), Entered Some(4)),Some(12))

It works just the same. Note that oa and ob are Options, so we have to combine them with ^(oa,ob){...}.
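For readers unfamiliar with ^, the following plain-Scala sketch shows the behaviour it has for two Option values: the function is applied only when both are present. The name map2 is a hypothetical helper for this example and is not part of Scalaz.

```scala
object CombineOptions {
  // Hypothetical helper mirroring what ^(oa, ob)(f) does for Option:
  // apply f only when both values are defined, otherwise yield None.
  def map2[A, B, C](oa: Option[A], ob: Option[B])(f: (A, B) => C): Option[C] =
    for { a <- oa; b <- ob } yield f(a, b)
}
```

For example, CombineOptions.map2(Some(3), Some(4))(_ * _) gives Some(12), while replacing either argument with an empty Option gives None.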

Now for a typical use of Logger, a gcd (greatest common divisor) example:

def gcd(x: Int, y: Int): Logger[Vector[String], Int] = {
  if (y == 0) for {
    _ <- x applyLog Vector("Finished at " + x)
  } yield x
  else
    x applyLog Vector(x.shows + " mod " + y.shows + " = " + (x % y).shows) >>= {_ => gcd(y, x % y) }
}                  //> gcd: (x: Int, y: Int)Exercises.logger.Logger[Vector[String],Int]
gcd(18,6)          //> res5: Exercises.logger.Logger[Vector[String],Int] = Logger(Vector(18 mod 6 = 0, Finished at 6),6)
gcd(8,3)           //> res6: Exercises.logger.Logger[Vector[String],Int] = Logger(Vector(8 mod 3 = 2, 3 mod 2 = 1, 2 mod 1 = 0, Finished at 1),1)

Note the use of >>=, which relies on Logger's Monad instance.

In fact Scalaz already provides a Writer data structure; it is a special case of the WriterT type:

type Writer[+W, +A] = WriterT[Id, W, A]

Now look at WriterT, in scalaz/WriterT.scala:

final case class WriterT[F[_], W, A](run: F[(W, A)]) { self =>
...

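Besides the computed value A, WriterT carries a state value W, forming a paired value. This is a typical FP pattern for maintaining state. Note, however, that WriterT's (W,A) pair lives inside the computation model F[]. This allows a higher level of generalization: the state-maintaining computation is subject to one more layer of computation protocol, F[]. Writer is the special case of WriterT that computes its value directly, with no F[] effect, so Writer uses Id for F[], because Id[A] = A. Let's see how WriterT maintains the state through flatMap, in scalaz/WriterT.scala: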

def flatMap[B](f: A => WriterT[F, W, B])(implicit F: Bind[F], s: Semigroup[W]): WriterT[F, W, B] =
  flatMapF(f.andThen(_.run))

def flatMapF[B](f: A => F[(W, B)])(implicit F: Bind[F], s: Semigroup[W]): WriterT[F, W, B] =
  writerT(F.bind(run){wa =>
    val z = f(wa._2)
    F.map(z)(wb => (s.append(wa._1, wb._1), wb._2))
  })

Inside flatMapF, the W components of the two (W,A) pairs are combined with the Semigroup append operation.
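To see what the extra F[] layer contributes, here is a minimal sketch that picks Option for F, assuming Scalaz 7.x is on the classpath. The alias LogOpt and the functions divide and calc are hypothetical names made up for this example.

```scala
import scalaz._, Scalaz._

object WriterTOptionSketch {
  // WriterT with F = Option: the (log, value) pair may be absent altogether.
  type LogOpt[A] = WriterT[Option, Vector[String], A]

  // Hypothetical helper: a logged division that fails (None) on a zero divisor.
  def divide(a: Int, b: Int): LogOpt[Int] =
    if (b == 0) WriterT[Option, Vector[String], Int](None)
    else WriterT[Option, Vector[String], Int](Some((Vector(s"$a / $b"), a / b)))

  // flatMap uses Bind[Option] for sequencing and Semigroup[Vector[String]] for the log.
  def calc(a: Int, b: Int, c: Int): Option[(Vector[String], Int)] =
    (for { x <- divide(a, b); y <- divide(x, c) } yield y).run
}
```

With these definitions calc(12, 3, 2) yields Some((Vector("12 / 3", "4 / 2"), 2)), while calc(12, 0, 2) yields None: the Option layer short-circuits the chain and the log accumulated so far is discarded with it.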

Writer can be seen as an add-on data structure: within the computation model F[A] it adds a state value W, giving the form F[(W,A)]. Once we provide injected methods that build this Writer structure from any type A, computations over any type can use Writer to add effects such as state maintenance or logging along the way. Let's look at scalaz/syntax/WriterOps.scala:

package scalaz
package syntax

final class WriterOps[A](self: A) {
  def set[W](w: W): Writer[W, A] = WriterT.writer(w -> self)
  def tell: Writer[A, Unit] = WriterT.tell(self)
}

trait ToWriterOps {
  implicit def ToWriterOps[A](a: A) = new WriterOps(a)
}

This is pure method injection. Now any type A can use set and tell to build a Writer:

3 set Vector("Entered Int 3")
//> res2: scalaz.Writer[scala.collection.immutable.Vector[String],Int] = WriterT((Vector(Entered Int 3),3))
"hi" set Vector("say hi")
//> res3: scalaz.Writer[scala.collection.immutable.Vector[String],String] = WriterT((Vector(say hi),hi))
List(1,2,3) set Vector("list 123")
//> res4: scalaz.Writer[scala.collection.immutable.Vector[String],List[Int]] = WriterT((Vector(list 123),List(1, 2, 3)))
3.some set List("some 3")
//> res5: scalaz.Writer[List[String],Option[Int]] = WriterT((List(some 3),Some(3)))
Vector("just say hi").tell
//> res6: scalaz.Writer[scala.collection.immutable.Vector[String],Unit] = WriterT((Vector(just say hi),()))
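Once a Writer has been built, the log and the value can be taken apart again. The following is a minimal sketch, assuming Scalaz 7.x, that uses run (visible in the WriterT definition quoted earlier) together with the written and value accessors on WriterT; the object and value names are made up for the example.

```scala
import scalaz._, Scalaz._

object WriterAccessSketch {
  // set attaches a log to a value; tell records a log entry with a Unit value.
  val w: Writer[Vector[String], Int] = for {
    a <- 3 set Vector("Entered Int 3")
    _ <- Vector("about to double").tell
  } yield a * 2

  // For Writer, F is Id, so run returns the plain (log, value) pair.
  val (log, result) = w.run

  // written and value project out one side of the pair.
  val onlyLog: Vector[String] = w.written
  val onlyValue: Int = w.value
}
```

Here log is Vector("Entered Int 3", "about to double") and result is 6.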

Here is the earlier Logger example, run with Writer:

for {
  a <- 3 set "Entered Int 3 "
  b <- 4 set "Entered Int 4 "
  c <- "Result:" set "Entered String 'Result'"
} yield c + (a * b).shows
//> res7: scalaz.WriterT[scalaz.Id.Id,String,String] = WriterT((Entered Int 3 Entered Int 4 Entered String 'Result',Result:12))

Does it still work if A is a type such as List[T]?

for {
  la <- List(1,2,3) set Vector("Entered List(1,2,3)")
  lb <- List(4,5) set Vector("Entered List(4,5)")
  lc <- List(6) set Vector("Entered List(6)")
} yield (la |@| lb |@| lc) {_ + _ + _}
//> res1: scalaz.WriterT[scalaz.Id.Id,scala.collection.immutable.Vector[String],List[Int]] = WriterT((Vector(Entered List(1,2,3), Entered List(4,5), Entered List(6)),List(11, 12, 12, 13, 13, 14)))

Indeed, it works without any problem.

The gcd example is quite representative, so let's compute and trace gcd with Writer:

def gcd(a: Int, b: Int): Writer[Vector[String],Int] =
  if (b == 0) for {
    _ <- Vector("Finished at " + a.shows).tell
  } yield a
  else
    Vector(a.shows + " mod " + b.shows + " = " + (a % b).shows).tell >>= {_ => gcd(b, a % b)}
//> gcd: (a: Int, b: Int)scalaz.Writer[Vector[String],Int]

gcd(8,3)    //> res8: scalaz.Writer[Vector[String],Int] = WriterT((Vector(8 mod 3 = 2, 3 mod 2 = 1, 2 mod 1 = 0, Finished at 1),1))
gcd(16,4)   //> res9: scalaz.Writer[Vector[String],Int] = WriterT((Vector(16 mod 4 = 0, Finished at 4),4))

When maintaining a log, Vector is more efficient than List. Let's check with a simple timing test:

def listLogCount(c: Int): Writer[List[String],Unit] = {
  @annotation.tailrec
  def countDown(c: Int, w: Writer[List[String],Unit]): Writer[List[String],Unit] = c match {
    case 0 => w >>= {_ => List("0").tell }
    case x => countDown(x-1, w >>= {_ => List(x.shows).tell })
  }
  val t0 = System.currentTimeMillis
  val r = countDown(c, List[String]().tell)
  val t1 = System.currentTimeMillis
  // Append the elapsed time as the last log entry.
  r >>= {_ => List((t1 - t0).shows + "msec").tell }
}           //> listLogCount: (c: Int)scalaz.Writer[List[String],Unit]
def vectorLogCount(c: Int): Writer[Vector[String],Unit] = {
  @annotation.tailrec
  def countDown(c: Int, w: Writer[Vector[String],Unit]): Writer[Vector[String],Unit] = c match {
    case 0 => w >>= {_ => Vector("0").tell }
    case x => countDown(x-1, w >>= {_ => Vector(x.shows).tell })
  }
  val t0 = System.currentTimeMillis
  val r = countDown(c, Vector[String]().tell)
  val t1 = System.currentTimeMillis
  // Append the elapsed time as the last log entry.
  r >>= {_ => Vector((t1 - t0).shows + "msec").tell }
}           //> vectorLogCount: (c: Int)scalaz.Writer[Vector[String],Unit]

(listLogCount(10000).run)._1.last       //> res10: String = 361msec
(vectorLogCount(10000).run)._1.last     //> res11: String = 49msec

In this run listLogCount(10000) took 361 msec while vectorLogCount(10000) took only 49 msec, roughly 7 times faster.
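One likely reason for the gap, offered as an assumption rather than something measured in this post, is the shape of the appends: every step adds a one-element log to the right of an ever-growing accumulated log. With an immutable List, `acc ::: List(x)` copies all of acc each time, whereas appending a single element to a Vector is effectively constant time. The plain-Scala sketch below reproduces that append pattern without Scalaz; all names are hypothetical and the timings it produces will vary by machine and JVM warm-up.

```scala
object AppendCostSketch {
  // Left-nested appends: the accumulated log is always the left operand.
  // List ::: copies the whole left operand on every step.
  def listLog(n: Int): List[String] =
    (1 to n).foldLeft(List.empty[String])((acc, i) => acc ::: List(i.toString))

  // The same pattern with Vector, where appending one element is cheap.
  def vectorLog(n: Int): Vector[String] =
    (1 to n).foldLeft(Vector.empty[String])((acc, i) => acc ++ Vector(i.toString))

  // Hypothetical timing helper: elapsed wall-clock milliseconds for one evaluation.
  def timeMillis[A](body: => A): Long = {
    val t0 = System.nanoTime
    body
    (System.nanoTime - t0) / 1000000
  }

  def main(args: Array[String]): Unit = {
    println("List:   " + timeMillis(listLog(10000)) + " msec")
    println("Vector: " + timeMillis(vectorLog(10000)) + " msec")
  }
}
```

Both functions build the same sequence of entries; only the cost of building it differs.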

 
