Implementing a builder: Combine

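In this post, we continue looking at how to return multiple values from a computation expression, this time using the Combine method.

The story so far

So far, our expression builder class looks like this: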

type TraceBuilder() =
    member this.Bind(m, f) = 
        match m with 
        | None -> 
            printfn "Binding with None. Exiting."
        | Some a -> 
            printfn "Binding with Some(%A). Continuing" a
        Option.bind f m

    member this.Return(x) = 
        printfn "Returning a unwrapped %A as an option" x
        Some x

    member this.ReturnFrom(m) = 
        printfn "Returning an option (%A) directly" m
        m

    member this.Zero() = 
        printfn "Zero"
        None

    member this.Yield(x) = 
        printfn "Yield an unwrapped %A as an option" x
        Some x

    member this.YieldFrom(m) = 
        printfn "Yield an option (%A) directly" m
        m
        
// make an instance of the workflow                
let trace = new TraceBuilder()

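This class has worked fine so far. But we are about to run into a problem.

A problem with two 'yields'

Previously, we saw that yield can be used to return values just like return.

Normally, yield is not used just once but several times, in order to return values at different stages of a process, such as an enumeration. So let's try that: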

trace { 
    yield 1
    yield 2
    } |> printfn "Result for yield then yield: %A"

But when we run this code, we get an error:

This control construct may only be used if the computation expression builder defines a 'Combine' method.

And if you use return instead of yield, you get the same error:

trace { 
    return 1
    return 2
    } |> printfn "Result for return then return: %A"

The same error occurs in other contexts too. For example, if we want to do something and then return a value, like this:

trace { 
    if true then printfn "hello" 
    return 1
    } |> printfn "Result for if then return: %A"

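We get the same error.

Understanding the problem

So what is going on here?

To help understand, let's go back to the behind-the-scenes view of the computation expression. There we can see that return and yield are really just the last step in a series of computations, like this: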

Bind(1,fun x -> 
   Bind(2,fun y -> 
     Bind(x + y,fun z ->

        Return(z)  // or Yield

You can think of return (or yield) as "resetting" the indentation. So when we return/yield and then return/yield again, we are generating code like this:

Bind(1,fun x -> 
   Bind(2,fun y -> 
     Bind(x + y,fun z -> 
        Yield(z)  
// start a new expression        
Bind(3,fun w -> 
   Bind(4,fun u -> 
     Bind(w + u,fun v -> 
        Yield(v)

But really this can be simplified to:

let value1 = some expression 
let value2 = some other expression

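In other words, we now have two values in our computation expression. And then the obvious question is: how should these two values be combined to give a single result for the computation expression as a whole?

This is a key point. Within a computation expression, return and yield do not cause an early return. Every part of the computation expression is always evaluated; there is no short-circuiting. If we want to short-circuit and return early, we have to write our own code to do that.

So, back to the question at hand. We have two expressions resulting in two values: how should multiple values be combined into one?

Introducing "Combine"

The answer is the Combine method, which takes two wrapped values and combines them to make another wrapped value.

In our example, we are dealing with int options, so one simple implementation is just to add the numbers together. Each parameter is an option, so we need to handle the four possible cases: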

type TraceBuilder() =
    // other members as before

    member this.Combine (a,b) = 
        match a,b with
        | Some a', Some b' ->
            printfn "combining %A and %A" a' b' 
            Some (a' + b')
        | Some a', None ->
            printfn "combining %A with None" a' 
            Some a'
        | None, Some b' ->
            printfn "combining None with %A" b' 
            Some b'
        | None, None ->
            printfn "combining None with None"
            None

// make a new instance        
let trace = new TraceBuilder()

Running the test code again:

trace { 
    yield 1
    yield 2
    } |> printfn "Result for yield then yield: %A"

But this time we get a different error:

This control construct may only be used if the computation expression builder defines a 'Delay' method

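The Delay method is a hook that lets you delay evaluation of a computation expression until its value is needed. We'll discuss the details shortly. For now, let's create a default implementation: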

type TraceBuilder() =
    // other members as before

    member this.Delay(f) = 
        printfn "Delay"
        f()

// make a new instance        
let trace = new TraceBuilder()

Running the test code once more:

trace { 
    yield 1
    yield 2
    } |> printfn "Result for yield then yield: %A"

And finally we get the following output:

Delay
Yield an unwrapped 1 as an option
Delay
Yield an unwrapped 2 as an option
combining 1 and 2
Result for yield then yield: Some 3

The result of the entire workflow is the sum of all the yields, namely 3 (wrapped as Some 3).
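The trace above suggests how the compiler rewrites two consecutive yields: the first value is passed to Combine together with a delayed version of the rest. The following is a minimal sketch of that rewriting, not the compiler's literal output. AddBuilder, add, viaExpression and byHand are hypothetical names introduced only for this illustration; the builder is a trace-free copy of the additive Combine above.

```fsharp
// Hypothetical, trace-free version of the additive builder (illustration only).
type AddBuilder() =
    member this.Yield(x) = Some x
    member this.Combine(a: int option, b: int option) =
        match a, b with
        | Some a', Some b' -> Some (a' + b')
        | Some a', None -> Some a'
        | None, Some b' -> Some b'
        | None, None -> None
    // Default Delay: run the delayed function immediately.
    member this.Delay(f) = f()

let add = AddBuilder()

// The computation expression form.
let viaExpression =
    add {
        yield 1
        yield 2
    }

// Roughly what the expression above is translated into:
// the second yield is wrapped in Delay and handed to Combine.
let byHand =
    add.Delay(fun () ->
        add.Combine(add.Yield(1), add.Delay(fun () -> add.Yield(2))))

printfn "%A %A" viaExpression byHand
```

Both values should be Some 3, which matches the order of the Delay, Yield and combining lines in the trace.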

If there is a "failure" in the workflow (e.g. a None), then the second yield never happens and the overall result is Some 1 instead.

trace { 
    yield 1
    let! x = None
    yield 2
    } |> printfn "Result for yield then None: %A"

We can have three yields rather than two:

trace { 
    yield 1
    yield 2
    yield 3
    } |> printfn "Result for yield x 3: %A"

The result is what you would expect, Some 6.

We can even mix yield and return. Other than the difference in syntax, the overall effect is the same.

trace { 
    yield 1
    return 2
    } |> printfn "Result for yield then return: %A" 

trace { 
    return 1
    return 2
    } |> printfn "Result for return then return: %A"

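Using Combine for sequence generation

Adding numbers up is not really the point of yield, although you could use a similar idea to concatenate strings, somewhat like StringBuilder.

More generally, yield is used for generating sequences, and now that we understand Combine, we can extend the "ListBuilder" workflow with Combine and Delay methods.

The Combine method is just list concatenation.
The Delay method uses the default implementation.

Here is the full builder class: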

type ListBuilder() =
    member this.Bind(m, f) = 
        m |> List.collect f

    member this.Zero() = 
        printfn "Zero"
        []
        
    member this.Yield(x) = 
        printfn "Yield an unwrapped %A as a list" x
        [x]

    member this.YieldFrom(m) = 
        printfn "Yield a list (%A) directly" m
        m

    member this.For(m,f) =
        printfn "For %A" m
        this.Bind(m,f)
        
    member this.Combine (a,b) = 
        printfn "combining %A and %A" a b 
        List.concat [a;b]

    member this.Delay(f) = 
        printfn "Delay"
        f()

// make an instance of the workflow                
let listbuilder = new ListBuilder()

And here it is in use:

listbuilder { 
    yield 1
    yield 2
    } |> printfn "Result for yield then yield: %A" 

listbuilder { 
    yield 1
    yield! [2;3]
    } |> printfn "Result for yield then yield! : %A"

And here is a more complicated example that uses a for loop and some yields:

listbuilder { 
    for i in ["red";"blue"] do
        yield i
        for j in ["hat";"tie"] do
            yield! [i + " " + j;"-"]
    } |> printfn "Result for for..in..do : %A"

And the result is:

["red"; "red hat"; "-"; "red tie"; "-"; "blue"; "blue hat"; "-"; "blue tie"; "-"]

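You can see that by combining for..in..do with yield, we are not far from the built-in seq expression syntax (except, of course, that this builder is not lazy the way seq is).

I would strongly urge you to play around with this until you are completely clear on what is going on behind the syntax. As the example above shows, you can use yield in creative ways to generate all sorts of irregular lists, not just simple ones.

Note: if you are wondering about while, we are going to hold off until after we have looked at Delay in the next post.

Order of processing for "Combine"

The Combine method only takes two parameters. So what happens when you combine more than two values? For example, here are four values to combine: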

listbuilder { 
    yield 1
    yield 2
    yield 3
    yield 4
    } |> printfn "Result for yield x 4: %A"

If you look at the output, you can see that the values are combined pairwise:

combining [3] and [4]
combining [2] and [3; 4]
combining [1] and [2; 3; 4]
Result for yield x 4: [1; 2; 3; 4]

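More subtly, they are combined "backwards", starting from the last value. First "3" is combined with "4", then the result is combined with "2", and so on.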
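The backwards order follows from the nesting: each Combine receives the first value plus a delayed "rest of the expression", and with the default Delay that rest is evaluated before the outer Combine can run. Here is a small sketch that records the order of the Combine calls. LoggingListBuilder, lb and combineLog are hypothetical names used only for this illustration, and the builder is restricted to int lists to keep the types simple.

```fsharp
// Records each Combine call so the order can be inspected afterwards.
let combineLog = ResizeArray<string>()

// Hypothetical minimal list builder that logs its Combine calls.
type LoggingListBuilder() =
    member this.Yield(x) = [x]
    member this.Combine(a: int list, b: int list) =
        combineLog.Add(sprintf "combining %A and %A" a b)
        a @ b
    // Default Delay: evaluate the rest of the expression immediately.
    member this.Delay(f) = f()

let lb = LoggingListBuilder()

let result =
    lb {
        yield 1
        yield 2
        yield 3
        yield 4
    }

// Conceptually the expression nests like this:
//   Combine([1], Delay(fun () ->
//     Combine([2], Delay(fun () ->
//       Combine([3], Delay(fun () -> [4]))))))
// so the innermost Combine is the first one to complete.
combineLog |> Seq.iter (printfn "%s")
printfn "Result: %A" result
```

Under these assumptions the first logged call is the one for [3] and [4], and the last is the one for [1] and [2; 3; 4].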

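Combine for non-sequences

In the second of our earlier problematic examples, we did not have a sequence at all; we just had two separate expressions in a row.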

trace { 
    if true then printfn "hello"  //expression 1
    return 1                      //expression 2
    } |> printfn "Result for combine: %A"

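How should these expressions be combined?

There are a number of common ways of doing this, depending on what the workflow is meant to achieve.

Implementing combine for workflows with "success" or "failure"

If the workflow has some concept of "success" or "failure", then a standard approach is:

If the first expression "succeeds", use its value.
Otherwise, use the value of the second expression.

In this case, we also generally use the "failure" value for Zero.

This approach is useful for chaining together a series of "or else" expressions, where the first success wins and becomes the overall result.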

if (do first expression)
or else (do second expression)
or else (do third expression)

For example, in the maybe workflow, it is common to return the first expression if it is Some, and the second expression otherwise, like this:

type TraceBuilder() =
    // other members as before
    
    member this.Zero() = 
        printfn "Zero"
        None  // failure
    
    member this.Combine (a,b) = 
        printfn "Combining %A with %A" a b
        match a with
        | Some _ -> a  // a succeeds -- use it
        | None -> b    // a fails -- use b instead
        
// make a new instance        
let trace = new TraceBuilder()

Example: Parsing

Let's try a parsing example with this implementation:

type IntOrBool = I of int | B of bool

let parseInt s = 
    match System.Int32.TryParse(s) with
    | true,i -> Some (I i)
    | false,_ -> None

let parseBool s = 
    match System.Boolean.TryParse(s) with
    | true,i -> Some (B i)
    | false,_ -> None

trace { 
    return! parseBool "42"  // fails
    return! parseInt "42"
    } |> printfn "Result for parsing: %A"

We get the following result:

Result for parsing: Some (I 42)

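You can see that the first return! expression is None and is ignored, so the overall result is the second expression, Some (I 42).

Example: Dictionary lookup

In this example, we look up the same key in a number of dictionaries and return when we find a value: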

let map1 = [ ("1","One"); ("2","Two") ] |> Map.ofList
let map2 = [ ("A","Alice"); ("B","Bob") ] |> Map.ofList

trace { 
    return! map1.TryFind "A"
    return! map2.TryFind "A"
    } |> printfn "Result for map lookup: %A"

We get the following result:

Result for map lookup: Some "Alice"

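You can see that the first lookup is None and is ignored, so the overall result is the second lookup.

As you can see, this technique is very convenient when parsing or when evaluating a series of operations that may not succeed.

Implementing combine for workflows with sequential steps

If the workflow consists of sequential steps, then the overall result is just the value of the last step, and all the earlier steps are evaluated only for their side effects (such as changing the value of some variable).

In normal F#, sequential steps would be written like this: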

do some expression
do some other expression 
final expression

Or, using the semicolon syntax:

some expression; some other expression; final expression

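In normal F#, every expression other than the last one evaluates to the unit value.

The equivalent approach for a computation expression is to treat each expression (other than the last one) as a wrapped unit value and pass it into the next expression, and so on, until you reach the last expression.

This is very much what bind does, so the easiest implementation is just to reuse the Bind method itself. For this approach to work, Zero must be the wrapped unit value.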

type TraceBuilder() =
    // other members as before

    member this.Zero() = 
        printfn "Zero"
        this.Return ()  // unit not None

    member this.Combine (a,b) = 
        printfn "Combining %A with %A" a b
        this.Bind( a, fun ()-> b )
        
// make a new instance        
let trace = new TraceBuilder()

The difference from a normal bind is that the continuation takes a unit parameter and evaluates to b. This in turn forces a to be of type WrapperType<unit> in general, or unit option in our case.
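To make the types concrete, here is a standalone sketch of the same idea written as a plain function rather than a builder method. The name combineSeq is hypothetical and used only for this illustration; it assumes the wrapper type is option.

```fsharp
// Sequential combine for options: run `a` only for its effect (it must be a
// unit option), then continue with `b`. A None in `a` stops the sequence.
let combineSeq (a: unit option) (b: 'T option) : 'T option =
    a |> Option.bind (fun () -> b)

let bothSteps = combineSeq (Some ()) (Some 1)   // the last value is the result
let firstFailed = combineSeq None (Some 1)      // the failure propagates

printfn "%A %A" bothSteps firstFailed
```

Because the first argument is constrained to unit option, passing something like Some 1 as the first step would not type-check, which is the restriction described above.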

Here is an example of sequential processing that works with this implementation of Combine:

trace { 
    if true then printfn "hello......."
    if false then printfn ".......world"
    return 1
    } |> printfn "Result for sequential combine: %A"

The output is:

hello.......
Zero
Returning a unwrapped <null> as an option
Zero
Returning a unwrapped <null> as an option
Returning a unwrapped 1 as an option
Combining Some null with Some 1
Combining Some null with Some 1
Result for sequential combine: Some 1

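Note that the result of the whole workflow is the value of the last expression, as expected.

Implementing combine for workflows that build data structures

Finally, another common pattern is a workflow that builds a data structure. In this case, Combine should merge the two data structures, and the Zero method should create an empty data structure, if one is needed (and if that is even possible).

That is exactly the approach we used in the "list builder" example above. Combine was list concatenation and Zero was the empty list.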

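Guidelines for mixing "Combine" and "Zero"

We have looked at two different Combine implementations for option types.

The first used options as "success/failure" indicators, where the first success became the final result. In this case, Zero was defined as None.
The second was for sequential steps. In this case, Zero was defined as Some ().

Both cases worked nicely, but was that just luck? Are there any guidelines for implementing Combine and Zero correctly?

First, note that Combine does not have to give the same result when its parameters are swapped. That is, Combine(a,b) need not be the same as Combine(b,a). The list builder is a good example of this.

On the other hand, there is a useful rule that connects Zero and Combine.

Rule: Combine(a,Zero) should be the same as Combine(Zero,a), which should be the same as just a.

To use an analogy from arithmetic, you can think of Combine as addition (which is not a bad analogy, since it really is "adding" two values). And Zero is just the number zero, of course. So the rule above can be expressed as:

Rule: a + 0 is the same as 0 + a, which is the same as just a, where + means Combine and 0 means Zero.

If you look at the first Combine implementation for option types ("success/failure"), you will see that it does indeed comply with this rule, as does the second implementation ("bind" with Some ()).

On the other hand, if we had used the "bind" implementation of Combine but defined Zero as None, it would not obey this rule, which would be a clue that we had got something wrong.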
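The rule can be spot-checked with a few lines of code. The sketch below restates the two option combines as standalone functions and tests each candidate Zero against a handful of sample values; it is a quick check on samples, not a proof. The names orElse, andThen and obeysRule are hypothetical, and the sequential combine is restricted to unit options on both sides so that the rule can be stated at a single type.

```fsharp
// "Success/failure" combine: the first Some wins.
let orElse (a: int option) (b: int option) =
    match a with
    | Some _ -> a
    | None -> b

// Sequential combine: bind with a unit continuation.
let andThen (a: unit option) (b: unit option) =
    a |> Option.bind (fun () -> b)

// Checks Combine(a,Zero) = a and Combine(Zero,a) = a for every sample value.
let obeysRule combine zero samples =
    samples |> List.forall (fun a -> combine a zero = a && combine zero a = a)

let orElseWithNone = obeysRule orElse None [Some 1; None]
let andThenWithSomeUnit = obeysRule andThen (Some ()) [Some (); None]
let andThenWithNone = obeysRule andThen None [Some (); None]

printfn "%b %b %b" orElseWithNone andThenWithSomeUnit andThenWithNone
```

The first two checks pass, while the third fails because combining Some () with a Zero of None gives None rather than Some ().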

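"Combine" without bind

As with all the builder methods, if you don't need them, you don't need to implement them. So for a strictly sequential workflow, you could simply create a builder class with Combine, Zero, and Yield, say, without having to implement Bind and Return at all.

Here is a minimal implementation that works: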

type TraceBuilder() =

    member this.ReturnFrom(x) = x

    member this.Zero() = Some ()

    member this.Combine (a,b) = 
        a |> Option.bind (fun ()-> b )

    member this.Delay(f) = f()

// make an instance of the workflow                
let trace = new TraceBuilder()

And here it is in use:

trace { 
    if true then printfn "hello......."
    if false then printfn ".......world"
    return! Some 1
    } |> printfn "Result for minimal combine: %A"

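Similarly, if you have a data-structure-oriented workflow, you could implement just Combine and a few helper methods. For example, here is a minimal implementation of the list builder class: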

type ListBuilder() =

    member this.Yield(x) = [x]

    member this.For(m,f) =
        m |> List.collect f

    member this.Combine (a,b) = 
        List.concat [a;b]

    member this.Delay(f) = f()

// make an instance of the workflow                
let listbuilder = new ListBuilder()

Even with this minimal implementation, we can still write code like this:

listbuilder { 
    yield 1
    yield 2
    } |> printfn "Result: %A" 

listbuilder { 
    for i in [1..5] do yield i + 2
    yield 42
    } |> printfn "Result: %A"

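A standalone "Combine" function

In a previous post, we saw that the "bind" function is often used as a standalone function and is normally given the operator >>=.

The Combine function, too, is often used as a standalone function. Unlike bind, it has no standard symbol; the symbol varies depending on how the combine function works.

A symmetric combination operation is often written as ++ or <+>. The "left-biased" combination that we used earlier for options (that is, only evaluate the second expression if the first one fails) is sometimes written as <++.

Here is an example of a standalone left-biased combination of options, similar to the dictionary lookup example above.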

module StandaloneCombine = 

    let combine a b = 
        match a with
        | Some _ -> a  // a succeeds -- use it
        | None -> b    // a fails -- use b instead

    // create an infix version
    let ( <++ ) = combine

    let map1 = [ ("1","One"); ("2","Two") ] |> Map.ofList
    let map2 = [ ("A","Alice"); ("B","Bob") ] |> Map.ofList

    let result = 
        (map1.TryFind "A") 
        <++ (map1.TryFind "B")
        <++ (map2.TryFind "A")
        <++ (map2.TryFind "B")
        |> printfn "Result of adding options is: %A"

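Summary

What have we learned about Combine in this post?

You need to implement Combine (and Delay) if you need to combine or "add" more than one wrapped value in a computation expression.
Combine combines values pairwise, from last to first.
There is no universal implementation of Combine that works in all cases; it needs to be customized according to the particular needs of the workflow.
There is a sensible rule that relates Combine to Zero.
Combine does not require Bind to be implemented.
Combine can be exposed as a standalone function.

In the next post, we will add logic to control exactly when the internal expressions get evaluated, and introduce proper short-circuiting and lazy evaluation.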