HearthBuddy AI debugging in practice 1: the AI ends its turn without summoning a totem

The goal is to understand the card-playing logic by debugging the AI.

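55 is Flamewreathed Faceless
63 is Dire Wolf Alpha
34 is Windfury
36 is Whirling Zap-o-matic
13 is Voidcaller, card text ("LocStringZhCn", translated): "<b>Deathrattle:</b> Put a random Demon from your hand into the battlefield."
64 is the enemy hero, a Warlock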

The current output is as follows (the totem summon is missing):

value of best board 66
Best actions as following:
Action1:
attacker: 63 enemy: 64
Action2:
play id 34 target 55 pos 1
Action3:
attacker: 55 enemy: 64
Action4:
attacker: 55 enemy: 64

ailoop summary

During the AI loop, the possible actions are computed at each depth:
deep 1 len 8 dones 0
cut to len 8

deep 2 len 41 dones 0
cut to len 29

deep 3 len 90 dones 0
cut to len 49

deep 4 len 85 dones 0
cut to len 40

deep 5 len 31 dones 0
cut to len 13

deep 6 len 0 dones 0
cut to len 0

ailoop1

deep 0 len 8 dones 0
cut to len 8

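The AI computed 8 single actions. (According to feedback, Windfury can actually also be cast on enemy minions; HearthBuddy excludes that option outright.)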

start print 8 actions startIndex = 0,endIndex = 1
a.print(); start
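play id 34 target 55 pos 1   action 1: cast Windfury on Flamewreathed Faceless
a.print(); end

a.print(); start
play id 34 target 63 pos 1   action 2: cast Windfury on Dire Wolf Alpha
a.print(); end

a.print(); start
play id 36 pos 3   action 3: play the card Whirling Zap-o-matic
a.print(); end

a.print(); start
attacker: 55 enemy: 13   action 4: Flamewreathed Faceless attacks Voidcaller
a.print(); end

a.print(); start
attacker: 55 enemy: 64   action 5: Flamewreathed Faceless attacks the Warlock
a.print(); end

a.print(); start
attacker: 63 enemy: 13   action 6: Dire Wolf Alpha attacks Voidcaller
a.print(); end

a.print(); start
attacker: 63 enemy: 64   action 7: Dire Wolf Alpha attacks the Warlock
a.print(); end

a.print(); start
useability   action 8: use the hero power (summon a totem)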
a.print(); end

end print 8 actions startIndex = 0,endIndex = 1,

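ailoop2: building on ailoop1, compute the follow-up single actions

start print 5 actions startIndex = 0,endIndex = 8   so this loops 8 times; the 8 here is the 8 actions from ailoop1

For each of the 8 single actions from ailoop1, the numbers of possible follow-up single actions are:

5 + 7 + 5 + 5 + 5 + 5 + 5 + 4 = 41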

deep 1 len 8 dones 0
cut to len 8

deep 2 len 41 dones 0
cut to len 29

Of these 41 action sequences, 29 remain after cuttingposibilities(isLethalCheck); has run.
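The count of 41 can be reproduced with a small sketch. The class and method names below are hypothetical and are not part of Silverfish; the sketch only sums the per-parent follow-up counts taken from the log and makes no attempt to model what cuttingposibilities does to get from 41 down to 29.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of HearthBuddy or Silverfish.
public static class ExpansionCount
{
    // Follow-up action counts for the 8 depth-1 actions, copied from the log.
    public static readonly int[] FollowUps = { 5, 7, 5, 5, 5, 5, 5, 4 };

    // Number of candidates generated at the next depth, before any cut.
    public static int CandidatesAtNextDepth(IEnumerable<int> followUpsPerParent)
    {
        return followUpsPerParent.Sum();
    }
}
```

Calling ExpansionCount.CandidatesAtNextDepth(ExpansionCount.FollowUps) gives the len value that the log reports for deep 2.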

deep 3 len 90 dones 0
cut to len 49

deep 4 len 85 dones 0
cut to len 40

deep 5 len 31 dones 0
cut to len 13

deep 6 len 0 dones 0
cut to len 0

 

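startEnemyTurnSimThread1   corresponds to action 1 in ailoop1: Windfury on Flamewreathed Faceless, leaving 1 mana

a1.print(); start
attacker: 55 enemy: 13   action 1: Flamewreathed Faceless attacks Voidcaller
a1.print(); end

a2.print(); start
attacker: 55 enemy: 64   action 2: Flamewreathed Faceless attacks the Warlock
a2.print(); end

a3.print(); start
attacker: 63 enemy: 13   action 3: Dire Wolf Alpha attacks Voidcaller
a3.print(); end

a4.print(); start
attacker: 63 enemy: 64   action 4: Dire Wolf Alpha attacks the Warlock
a4.print(); end

a5.print(); start
useability   action 5: use the hero power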
a5.print(); end

end print 5 actions startIndex = 0,endIndex = 8,

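Action 4 here does not survive the cut: it is a duplicate and is replaced by a later entry. Casting Windfury and then attacking with 63 leads to the same board as attacking with 63 first and then casting Windfury (see itemPlayfield15 below).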

itemPlayfield9 chuck deep==2
play id 34 target 55 pos 1
attacker: 55 enemy: 64

itemPlayfield12 chuck deep==2
play id 34 target 55 pos 1
useability

itemPlayfield23 chuck deep==2
play id 34 target 55 pos 1
attacker: 63 enemy: 13


itemPlayfield24 chuck deep==2
play id 34 target 55 pos 1
attacker: 55 enemy: 13

itemPlayfield15 chuck deep==2
attacker: 63 enemy: 64   Dire Wolf Alpha attacks the Warlock
play id 34 target 55 pos 1   Windfury on Flamewreathed Faceless
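The duplicate removal can be sketched as follows. This is a minimal illustration, not the actual cuttingposibilities code: the Candidate type and its BoardHash field are hypothetical stand-ins for whatever the real Playfield uses to recognise equal boards, and the rule that a later entry overwrites an earlier one is taken only from the observation above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Playfield plus the actions that produced it.
public sealed class Candidate
{
    public string[] Actions;  // action log lines in play order
    public long BoardHash;    // assumed: equal boards have equal hashes
}

public static class DuplicateCut
{
    // Keep one candidate per board hash; a later entry replaces an earlier one.
    public static List<Candidate> RemoveDuplicates(IEnumerable<Candidate> candidates)
    {
        var byHash = new Dictionary<long, Candidate>();
        foreach (var c in candidates)
        {
            byHash[c.BoardHash] = c; // overwrite: the later duplicate wins
        }
        return byHash.Values.ToList();
    }
}
```

With two candidates that share a hash (Windfury then attack, attack then Windfury) and one unrelated candidate, the result holds two entries, and the surviving duplicate is the attack-first sequence, which matches itemPlayfield15.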

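Findings after debugging

In round 5 of the loop (deep == 5) the AI did compute the optimal line. (In round 6, however, after the enemy's turn was simulated, its score ended up lower than that of a line from round 4.)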

 itemPlayfield1 chuck deep==5 boardvalue==118
play id 34 target 55 pos 1
useability 
attacker: 55 enemy: 64
attacker: 55 enemy: 64
attacker: 63 enemy: 64 

In round 6 the AI simulates the enemy's turn: it applies enemy actions to the round-5 playfields above and then re-scores them.

itemPlayfield1 chuck deep2==5 boardvalue==118
itemPlayfield2 chuck deep2==5 boardvalue==97
itemPlayfield3 chuck deep2==5 boardvalue==88
itemPlayfield4 chuck deep2==5 boardvalue==71
itemPlayfield5 chuck deep2==5 boardvalue==58
itemPlayfield6 chuck deep2==5 boardvalue==53
itemPlayfield7 chuck deep2==5 boardvalue==50
itemPlayfield8 chuck deep2==5 boardvalue==50
itemPlayfield9 chuck deep2==5 boardvalue==46
itemPlayfield10 chuck deep2==5 boardvalue==43
itemPlayfield11 chuck deep2==5 boardvalue==37
itemPlayfield12 chuck deep2==5 boardvalue==29
itemPlayfield13 chuck deep2==5 boardvalue==23

After simulating the enemy's turn:

itemPlayfield1 chuck deep1==6 boardvalue==63   the score of 118 above has dropped to 63
itemPlayfield2 chuck deep1==6 boardvalue==38
itemPlayfield3 chuck deep1==6 boardvalue==32
itemPlayfield4 chuck deep1==6 boardvalue==22
itemPlayfield5 chuck deep1==6 boardvalue==8
itemPlayfield6 chuck deep1==6 boardvalue==-2
itemPlayfield7 chuck deep1==6 boardvalue==1
itemPlayfield8 chuck deep1==6 boardvalue==1
itemPlayfield9 chuck deep1==6 boardvalue==-16
itemPlayfield10 chuck deep1==6 boardvalue==-9
itemPlayfield11 chuck deep1==6 boardvalue==-15
itemPlayfield12 chuck deep1==6 boardvalue==-23
itemPlayfield13 chuck deep1==6 boardvalue==-29

The best score after simulating the enemy's turn is: itemPlayfield2 chuck deep1==5 boardvalue==66

The corresponding entry before the enemy simulation is: itemPlayfield2 chuck deep2==4 boardvalue==111

The concrete actions are:

itemPlayfield2 chuck deep2==4 boardvalue==111
attacker: 63 enemy: 64

play id 34 target 55 pos 1

attacker: 55 enemy: 64

attacker: 55 enemy: 64
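The comparison that decides the outcome can be sketched as below. It assumes that the final choice is made on the value after the enemy simulation, which is consistent with the reported "value of best board 66" but is not confirmed here by the Silverfish source; ScoredLine and BestLine are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical record of one candidate line and its two scores.
public sealed class ScoredLine
{
    public string Label;
    public int ValueBeforeEnemySim;
    public int ValueAfterEnemySim;
}

public static class BestLine
{
    // Assumption: the line with the highest post-simulation value wins.
    public static ScoredLine PickBest(IEnumerable<ScoredLine> lines)
    {
        ScoredLine best = null;
        foreach (var line in lines)
        {
            if (best == null || line.ValueAfterEnemySim > best.ValueAfterEnemySim)
            {
                best = line;
            }
        }
        return best;
    }
}
```

Fed with the two entries from the log (118 before and 63 after for the five-action line with useability, 111 before and 66 after for the four-action line), PickBest returns the four-action line, even though the five-action line scored higher before the enemy simulation.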

Round 6 runs the enemy simulation on the round-5 results

Here temp contains 13 elements.

startEnemyTurnSimThread(temp, 0, temp.Count);

The following code is then executed:

  Ai.Instance.enemyTurnSim[threadnumber].simulateEnemysTurn(p, this.simulateSecondTurn, playaround, false, playaroundprob, playaroundprob2);

Computing botBase.getPlayfieldValue(p) afterwards shows that the value has dropped from 118 to 63.

Routines\DefaultRoutine\Silverfish\ai\EnemyTurnSimulator.cs

public void simulateEnemysTurn(Playfield rootfield, bool simulateTwoTurns, bool playaround, bool print, int pprob, int pprob2)

Printing the board:

foreach (var itemPlayfield in temp)
{
    chuck2++;
    var boardValue = botBase.getPlayfieldValue(itemPlayfield);
    Helpfunctions.Instance.logg($"itemPlayfield{chuck2} chuck deep1=={deep} boardvalue=={boardValue}");
    // dump the full board for entry 2 at deep 5
    if (deep == 5 && chuck2 == 2)
    {
        itemPlayfield.printBoard();
    }

    // dump the full board for entry 1 at deep 6
    if (deep == 6 && chuck2 == 1)
    {
        itemPlayfield.printBoard();
    }
}

itemPlayfield1 chuck deep1==6 boardvalue==63

+++++++ printBoard start +++++++++
board/hash/turn: 63 / 5571393420105 / 1 ++++++++++++++++++++++
pen 3
mana 0/5
cardsplayed: 1 handsize: 4 enemyhand: 8
ownhero:
ownherohp: 30 + 0
ownheroattac: 0
ownheroweapon: 0 0 unknown None 0 0
ownherostatus: frozenFalse
enemyherohp: 4 + 0
play id 34 target 55 pos 1
useability
attacker: 55 enemy: 64
attacker: 55 enemy: 64
attacker: 63 enemy: 64
OWN MINIONS################ 2
deckpos, name,ang, hp: 1, flamewreathedfaceless, 8, 7 55
deckpos, name,ang, hp: 2, direwolfalpha, 2, 2 63
ENEMY MINIONS############ 1
deckpos, name,ang, hp: 1, voidcaller, 3, 4 13
Own Handcards:
pos 1 thelichking 8 entity 56 ICC_314 0 0 0
pos 2 jadelightning 4 entity 46 CFM_707 0 0 0
pos 3 whirlingzapomatic 2 entity 36 GVG_037 0 0 0
pos 4 genngreymane 6 entity 45 GIL_692 0 0 0
+++++++ printBoard end +++++++++

itemPlayfield2 chuck deep1==5 boardvalue==66

+++++++ printBoard start +++++++++
board/hash/turn: 66 / 4571393421105 / 1 ++++++++++++++++++++++
pen 0
mana 1/5
cardsplayed: 1 handsize: 4 enemyhand: 8
ownhero:
ownherohp: 30 + 0
ownheroattac: 0
ownheroweapon: 0 0 unknown None 0 0
ownherostatus: frozenFalse
enemyherohp: 4 + 0
attacker: 63 enemy: 64
play id 34 target 55 pos 1
attacker: 55 enemy: 64
attacker: 55 enemy: 64
OWN MINIONS################ 2
deckpos, name,ang, hp: 1, flamewreathedfaceless, 8, 7 55
deckpos, name,ang, hp: 2, direwolfalpha, 2, 2 63
ENEMY MINIONS############ 1
deckpos, name,ang, hp: 1, voidcaller, 3, 4 13
Own Handcards:
pos 1 thelichking 8 entity 56 ICC_314 0 0 0
pos 2 jadelightning 4 entity 46 CFM_707 0 0 0
pos 3 whirlingzapomatic 2 entity 36 GVG_037 0 0 0
pos 4 genngreymane 6 entity 45 GIL_692 0 0 0
+++++++ printBoard end +++++++++

Comparing the two printouts, the better line (the one that includes useability) shows pen 3. Is that a penalty value?

Helpfunctions.Instance.logg("pen " + this.evaluatePenality);

Setting a breakpoint reveals where the penalty is applied:

// its using the hero power--------------------------------
if (a.actionType == actionEnum.useHeroPower)
{
    playHeroPower(a.target, a.penalty, this.isOwnTurn, a.druidchoice);
}

The penalty is attached during move generation:

if (cardPlayPenalty <= 499)
{
    Action a = new Action(actionEnum.playcard, hc, null, bestPlace, targetMinion, cardPlayPenalty, choice);
    ret.Add(a);
}

 List<Action> actions = movegen.GetMoveList(p, usePenalityManager, useCutingTargets, true);

 cardPlayPenalty = pen.getPlayCardPenality(p.ownHeroAblility.card, targetMinion, p);

// use own ability
            if (own)
            {
                if (p.ownAbilityReady 
                    && p.mana >= p.ownHeroAblility.card.getManaCost(p, p.ownHeroAblility.manacost)) // if ready and enough mana
                {
                    CardDB.Card c = p.ownHeroAblility.card;
                    int isChoice = (c.choice) ? 1 : 0;
                    for (int choice = 0 + 1 * isChoice; choice < 1 + 2 * isChoice; choice++)
                    {
                        if (isChoice == 1)
                        {
                            c = pen.getChooseCard(p.ownHeroAblility.card, choice); // do all choice
                        }

                        int cardPlayPenalty = 0;
                        int bestPlace = p.ownMinions.Count + 1; //we can not manage it
                        targetMinions = p.ownHeroAblility.card.getTargetsForHeroPower(p, true);
                        foreach (Minion targetMinion in targetMinions)
                        {
                            if (usePenalityManager)
                            {
                                cardPlayPenalty = pen.getPlayCardPenality(p.ownHeroAblility.card, targetMinion, p);
                            }

                            if (cardPlayPenalty <= 499)
                            {
                                Action a = new Action(actionEnum.useHeroPower, p.ownHeroAblility, null, bestPlace, targetMinion,
                                    cardPlayPenalty, choice);
                                ret.Add(a);
                            }
                        }
                    }
                }
            }
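Two details of the move-generation code above are easy to misread: the bounds of the choice loop and the penalty threshold. The sketch below isolates both. ChoiceLoop is a hypothetical helper written for illustration; only the loop bounds and the 499 threshold are copied from the snippet.

```csharp
using System.Collections.Generic;

// Hypothetical helper that isolates two expressions from the snippet above.
public static class ChoiceLoop
{
    // Same bounds as the original loop: one pass with choice 0 for a
    // card without a choice, passes 1 and 2 for a choose-one card.
    public static List<int> ChoiceIndices(bool cardHasChoice)
    {
        int isChoice = cardHasChoice ? 1 : 0;
        var result = new List<int>();
        for (int choice = 0 + 1 * isChoice; choice < 1 + 2 * isChoice; choice++)
        {
            result.Add(choice);
        }
        return result;
    }

    // An action is only added to the move list when its penalty is at most 499.
    public static bool IsKept(int cardPlayPenalty)
    {
        return cardPlayPenalty <= 499;
    }
}
```

So a hero power without a choice is generated once with choice 0, and a penalty of 3, as seen in the board printout, is far below the cutoff: the action stays in the move list and the penalty only lowers its score.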

 retval += getRandomPenaltiy(card, p, target);

Original article: https://www.cnblogs.com/chucklu/p/11434356.html