MOTION-MATCHING IN UBISOFT’S FOR HONOR (Translation)

http://www.gameanim.com/2016/05/03/motion-matching-ubisofts-honor/

Introducing For Honor with a video, the team want it to be "The Call of Duty of Melee Combat", citing the Street Fighter series as another key influence for their desire to create multiplayer combat with precise controls. The game is now skill-based to the point that Simon has no chance playing against testers who have mastered the game over the course of development.

To showcase the technology Simon followed with a video of where they are now, highlighting the unique movement of a female knight sharing absolutely zero animations with the male samurai. Both characters had a uniqueness rarely seen in animation-intensive games requiring similar move-sets, immediately highlighting one benefit of the technology. Finishing with an example of the build-up to a fight, Simon described the phase of 'dancing' around opponents with stance switching before combat as 'where the magic happens.'

Video: Motion Matching 01

https://www.youtube.com/watch?time_continue=12&v=rIu90Mq0G8A

In The Beginning

Giving context, the presentation took us through a brief history of animation systems:

1. Play Anim – An animation is played. The original no-frills approach.
2. State Machines – Animations change as the character changes state.
3. Blend/Decision Trees – Blends come into play driven by state or input parameters.
4. Bone Masks – Upper-body only etc. Anims played only partially on characters.
5. Blend Trees per state – More complex version of stage 3.
6. Parametric Blends – Slope/strafe angle/speed etc. Blends must look similar.
7. Lists of Anims – highlighting the management required of these systems.

The Future

A decade ago I attended a lecture in San Francisco by (then) academic Lucas Kovar, whose papers have influenced my and others' ideas on the various forms of animation blending we've been using in the ensuing years. One of the ideas presented at that time was the unstructured list of animations, (massive sets of unedited mocap data automatically marked up to best find transitions and blends), which was simply infeasible at the time for video games given the memory requirements. As we transitioned into the current generation of consoles Simon recognised it as one potential avenue for next-gen animation systems.

His first task was to figure out how to choose the next anim, with his instincts drawing him towards machine learning. Due to the requirement for many connecting motions between the data to afford the best control, a dense graph of interconnected transitions was sought after.

Much work has already been done in academia towards this (several papers are linked in the Academic Papers section of this site). Citing the paper Motion Fields for Interactive Character Animation, Simon quoted the following paragraph:

"Our representation organizes samples of motion data into a high-dimensional generalization of a vector field which we call a motion field. Our run-time motion synthesis mechanism freely flows through the motion field in response to user commands."

Roughly translated into English, this essentially means that the main problems with motion-fields are, in Simon's words, that 'the equations involved are scary'. He side-stepped this complexity by allowing his system to simply 'jump to any frame whenever we want.'

With that covered, the next problem was how to choose the correct start-frame. He posited the following criteria:

Pose matching?
Velocity matching?
Precise end-position matching?
A mix of all of the above?

Ultimately, the solution was described as 'a ridiculously brute-force approach to animation selection', evaluating the best match for:

The character's current situation.
The motion that takes us where we want.
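
The brute-force idea above can be sketched roughly as follows. This is a minimal illustration and not Simon's implementation: the `Candidate` fields, the squared-distance measure and the unweighted sum of the two terms are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Candidate:
    """One mocap frame the system could jump to (hypothetical layout)."""
    pose: Tuple[float, ...]      # features describing the pose at this frame
    future: Tuple[float, float]  # where playing on from this frame would lead


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_best_frame(candidates: Sequence[Candidate],
                    current_pose: Sequence[float],
                    desired_future: Sequence[float]) -> int:
    """Check every candidate and return the index of the cheapest one."""
    best_index, best_cost = -1, float("inf")
    for index, candidate in enumerate(candidates):
        # Term 1: how well the frame matches the character's current situation.
        # Term 2: how well it takes the character where the game wants to go.
        cost = (sq_dist(candidate.pose, current_pose)
                + sq_dist(candidate.future, desired_future))
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index
```

Calling `find_best_frame` with the character's current pose features and the desired future offset returns the frame to jump to; nothing is cached or pruned, which is what makes the approach brute-force.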

The initial results of parsing and automatically deciding which sections of mocap to play on the in-game character were shown in a video with animation-blending disabled, highlighting the frequency of mocap segment selection. As the character ran around the prototype area he was always switching the desired animation segment to rapidly keep up with the player's controller input. Two coloured paths run out in front of the character, with a blue (currently dictated by the chosen animation) trajectory trying to match a red (desired input) trajectory.

As mentioned earlier, multiple characters don't share animations. This is possible due to the large anim budget as manual work to edit the animations is now less of a consideration. Simon said the mocap used for a move-set is now limited only by the stuntman's energy on the day of the shoot, (to which end he suggested capturing fatigued movements towards the end of the shoot).

Dance Cards

As used elsewhere in the Toronto team's own explorations, a mocap 'Dance-Card' lists all the actions required for a successful mocap shoot. Stuntmen must provide the following actions to cover basic movement:

1. Walks and Runs.
2. Small repositions.
3. Starts and Stops.
4. Circles (Turns).
5. Plant and turn (foot down to change direction) 45, 90, 135 and 180 degrees.
6. Strafe in a square (forward, left, back, right – contains 90 degree plants).
7. Strafe plants (foot down though not turning) for 180 degree direction shifts.

Video: Motion Matching 02

https://www.youtube.com/watch?time_continue=29&v=Y1OhkAVhCos

From there, Simon applies a cost-function to evaluate which section of mocap should be chosen for the desired action, essentially applying a value to each movement and looking for the cheapest solution to get from A (current pose) to B (desired pose), much like AI decision-making.

Mocap Candidacy

The 'tricks' he learned for comparing the current and candidate poses are:

Match only a few bones.
Match the local velocity.
Match feet positions and velocities.
Match weapon positions etc.
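
A pose comparison along these lines might look like the sketch below. The bone names, the pose layout (position and velocity per bone, in character space) and the `velocity_weight` parameter are illustrative assumptions; the talk does not specify how the roughly 10 factors are weighted.

```python
from typing import Dict, Iterable, Tuple

Vec3 = Tuple[float, float, float]
# Hypothetical layout: bone name -> (position, velocity), in character space.
Pose = Dict[str, Tuple[Vec3, Vec3]]


def sq_dist(a: Vec3, b: Vec3) -> float:
    """Squared Euclidean distance between two 3D vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pose_cost(current: Pose, candidate: Pose, bones: Iterable[str],
              velocity_weight: float = 1.0) -> float:
    """Sum position and velocity mismatch over the selected bones only."""
    total = 0.0
    for bone in bones:
        cur_pos, cur_vel = current[bone]
        cand_pos, cand_vel = candidate[bone]
        total += sq_dist(cur_pos, cand_pos)
        total += velocity_weight * sq_dist(cur_vel, cand_vel)
    return total
```

Because only the listed bones are visited, the set can change per move: passing just the weapon bone, for instance, leaves the feet out of the comparison entirely.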

He uses around 10 factors in total, but stressed that you don't need to find the exact same phase for the foot placements, as current systems do with regular walk/run blending. For example, a turn-on-the-spot has lots of foot-shuffling, so the feet are ignored for those movements. All decision-making for optimum blending is done offline, pre-computing this metadata for speed at run-time.

(Translator's note: What metadata? What is optimum blending?)

With the pose-matching portion of decision-making done, the next task was how to maintain a desired trajectory. To calculate this, one must check where an anim brings you if you play it, using the following determining factors:

Desired stance.
Trajectory matching.
Future Position.
Future Orientation.
Future velocity/acceleration.
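
The trajectory factors above could be compared along the lines of the sketch below. It covers only future position and orientation on a flat plane; the `FuturePoint` type, the sampling of a few future points per clip and the `facing_weight` parameter are assumptions for illustration, and stance and velocity terms are left out for brevity.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FuturePoint:
    """Predicted state some time ahead, relative to the character (hypothetical)."""
    x: float       # metres
    y: float       # metres
    facing: float  # radians


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)


def trajectory_cost(candidate: Sequence[FuturePoint],
                    desired: Sequence[FuturePoint],
                    facing_weight: float = 1.0) -> float:
    """Compare where a clip would take the character against the desired path."""
    total = 0.0
    for cand, want in zip(candidate, desired):
        total += math.hypot(cand.x - want.x, cand.y - want.y)
        total += facing_weight * angle_diff(cand.facing, want.facing)
    return total
```

A clip whose future points coincide with the desired ones costs nothing, and the cost grows as the clip drifts away from the desired path or faces the wrong way.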

Each factor from the lists above that is not matched adds to the computed cost, making that candidate less viable. This allows the system to home in on the best sections of mocap to jump to for optimum results. But optimum doesn't just mean smoothest transitions where video games are concerned, because fidelity and response are competing factors. Extended delays before jumping to the desired action will naturally result in control lag.

Simon solves this by implementing a simple slider between realism and comfort, where adjusting this will change the decision as to where to select mocap transitions, prioritising fidelity vs immediacy:

Realism VS Comfort
<--------|---------------------->

The slider is easy to tweak if motions are captured correctly via the dance card, and he assured us that a desirable balance can always be achieved, leaving both animators and gameplay designers happy.
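
One plausible way to wire up such a slider is a single weight that trades the pose term off against the trajectory term. The linear blend and the `responsiveness` name below are assumptions; the talk only describes the slider's effect, not its formula.

```python
from typing import List, Tuple


def total_cost(pose_cost: float, trajectory_cost: float,
               responsiveness: float) -> float:
    """Blend the two costs; responsiveness in [0, 1] is the slider position.

    0 favours smooth pose matches (realism); 1 favours reaching the
    desired trajectory quickly (comfort).
    """
    if not 0.0 <= responsiveness <= 1.0:
        raise ValueError("responsiveness must be between 0 and 1")
    return (1.0 - responsiveness) * pose_cost + responsiveness * trajectory_cost


def pick(candidates: List[Tuple[str, float, float]], responsiveness: float) -> str:
    """candidates holds (name, pose_cost, trajectory_cost); return the cheapest name."""
    best = min(candidates, key=lambda c: total_cost(c[1], c[2], responsiveness))
    return best[0]
```

With two candidates, one that blends smoothly but reaches the target late and one that is abrupt but on target, moving the slider from 0.2 to 0.8 flips which one is selected.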

To aid the selection process, animators mark up the long mocap takes with specific events like stance-changes. In the case that no perfect match for a transition can be found, he just blends between the anims. With this method, Simon joked that even the day before ship the team could capture more transitions if required and feed them into the automated system.

Regarding optimisation, one of the first questions any sane developer will raise is the memory requirement of such a mocap-intensive system. Simon declined to go into detail because that subject is big enough for a future talk of its own, but did state that aggressive LODing (Level of Detail) of characters avoids updating distant characters every frame, resulting in the system running efficiently with over 100 characters on screen simultaneously.
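
Since no details were given, the following is only a guess at what distance-based update throttling could look like. The distance thresholds, the intervals and the staggering by character id are all invented for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical tiers: (maximum distance in metres, update every N frames).
DEFAULT_TIERS: Tuple[Tuple[float, int], ...] = ((10.0, 1), (30.0, 2), (60.0, 4))


def update_interval(distance_m: float,
                    tiers: Sequence[Tuple[float, int]] = DEFAULT_TIERS,
                    far_interval: int = 8) -> int:
    """Return how many frames may pass between updates at this distance."""
    for max_distance, interval in tiers:
        if distance_m <= max_distance:
            return interval
    return far_interval


def should_update(frame_index: int, character_id: int, distance_m: float) -> bool:
    """Decide whether a character is updated on this frame.

    Offsetting by character_id staggers distant characters so they do not
    all update on the same frame.
    """
    return (frame_index + character_id) % update_interval(distance_m) == 0
```

Nearby characters still update every frame, while a character 100 metres away would update on only one frame in eight under these made-up tiers.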

On the subject of trajectory choice, he explained that every engine moves a character in the world by taking the displacement either from the animation itself or from a simulation that drives the animation. In For Honor, the latter approach is used, with code deciding the desired trajectory: animation is 'a cosmetic detail on top'. As such, the animations are chosen so that their trajectories match a simulated point in the world, essentially a spring-damper on the desired velocity. While a delay is present (around 1 metre for stopping and a ramp-up from start), the consistency of control is what makes the delay acceptable to the player, something I too have come to learn over the years.
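
A spring-damper on the desired velocity can be sketched in one dimension as below. The stiffness value, the choice of critical damping and the simple Euler-style integration are assumptions for the example, not values from the talk.

```python
import math
from typing import Tuple


def step_velocity_spring(position: float, velocity: float, acceleration: float,
                         desired_velocity: float, dt: float,
                         stiffness: float = 40.0) -> Tuple[float, float, float]:
    """Advance a 1D simulated point whose velocity chases desired_velocity.

    The spring pulls the velocity towards the target and the damper acts on
    the acceleration, so speed ramps up and down smoothly instead of snapping.
    """
    damping = 2.0 * math.sqrt(stiffness)  # critically damped in the continuous case
    jerk = stiffness * (desired_velocity - velocity) - damping * acceleration
    acceleration += jerk * dt
    velocity += acceleration * dt
    position += velocity * dt
    return position, velocity, acceleration
```

Stepping this once per frame with the stick input as `desired_velocity` gives a point that lags the input in a consistent way; the matching system then picks mocap whose trajectory follows that point.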

"The goal is not to be as responsive as possible, the goal is to be as predictable as possible."

Simon clamps the entity, (character's centre-of-mass), to 15cm around the simulated point. One offshoot of this approach is that it allows the character to predict obstacles such as walls and ledges, causing them to react to obstacles and select mocap that might avoid them.
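
The clamp described above amounts to keeping the entity inside a small circle around the simulated point. A minimal 2D sketch, assuming positions are (x, y) tuples in metres on the ground plane:

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def clamp_to_simulated_point(entity: Vec2, simulated: Vec2,
                             max_distance: float = 0.15) -> Vec2:
    """Keep the entity within max_distance metres of the simulated point."""
    dx = entity[0] - simulated[0]
    dy = entity[1] - simulated[1]
    distance = math.hypot(dx, dy)
    if distance <= max_distance:
        return entity  # already close enough, leave the animation's result alone
    scale = max_distance / distance
    # Pull the entity back along the same direction onto the circle's edge.
    return (simulated[0] + dx * scale, simulated[1] + dy * scale)
```

Inside the 15cm radius the animation's own displacement is kept untouched; only when the chosen mocap drifts further is the entity pulled back.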

Pipeline

The animator's job consists of an initial tweak of the mocap, importing into the engine and marking it up with events. The game-logic, (rather than the animation), still works as a state machine, with clips representing the logic of the game such as a branching combo in combat. The video below shows the process of adjusting the timing of attacks in the state machine.

Video: Motion Matching 03

https://www.youtube.com/watch?v=riQSIn9QnFI

In order for this to be possible the animator must first place events on the animations. Importantly, they don't chop anims or create cycles, (the biggest time-sink of traditional game animation pipelines), instead marking only points where they 'want this event to happen exactly at this moment'. Not at the beginning or the end of a move, but at the most significant parts such as the sword connecting with the target.

The system doesn't have just a big flat (unstructured) list of animations. They are instead grouped in similar sets with tags such as Tired or Heavy Attack. Other variables that an animator might need to specify in order to make them easy to parse are Stance, Pose, Range or Type of attack as well as outcomes such as Block, Miss, Parry or Hit wall.

When a designer sets the speed of the character it will blend from the walk to run mocap, and when adjusting the range of an attack it can switch which anim is played in a combo. While this thought sounds scary, animators are comforted by completely owning complex two-character moves such as throws and finishing kills that will have fewer requirements on timing. In this case the animations drive the displacement of characters to keep them in sync.

Video: Motion Matching 04

https://www.youtube.com/watch?v=Qp1YPufjGV0

Corrections

Due to chosen mocap trajectories rarely being an exact match to the desired position, a rotation correction is applied over time to get the exact orientation. Minor procedural upper-body adjustment is also applied when changing combat targets while strafing. To adjust for speed Simon suggests that designers shouldn't timescale anims, (the bane of game animators everywhere), more than 10% faster or 20% slower. Instead, slide the characters and use foot IK to relieve foot-sliding.
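
The speed guideline above can be expressed as a clamp on the playback rate, with whatever speed is left over handled by sliding the character. The function name and the idea of returning the leftover as a slide speed are assumptions for illustration; only the 10% and 20% limits come from the talk.

```python
from typing import Tuple


def split_speed_adjustment(anim_speed: float, desired_speed: float,
                           max_rate: float = 1.10,
                           min_rate: float = 0.80) -> Tuple[float, float]:
    """Return (playback_rate, slide_speed) for reaching desired_speed.

    The playback rate stays within 10% faster / 20% slower of the clip;
    the remaining speed difference is left to sliding (plus foot IK).
    """
    if anim_speed <= 0.0:
        raise ValueError("anim_speed must be positive")
    rate = desired_speed / anim_speed
    rate = min(max(rate, min_rate), max_rate)
    slide_speed = desired_speed - anim_speed * rate
    return rate, slide_speed
```

For a clip moving at 4 m/s and a desired 5 m/s, the rate is capped at 1.10 and about 0.6 m/s is left over as slide; a desired 4.2 m/s fits within the limits and needs no slide.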

Switching animations rapidly always causes foot-sliding regardless, so Simon fixes this by locking the toe at run-time with a socket on the ground matching the toe's position in the animation. IK was also implemented for slopes, pulling the hips down to connect to the lowest foot on slopes and stairs. Importantly, dropping the hips must never break the pose or hyper-extend the knees, preserving the animator-created silhouettes.

To avoid dealing with sword IK to match swords on slopes, he used the same solution as AC3 by simply pitching the characters' spines, keeping whole upper-bodies in sync to solve the height differences. The final results below are impressive:

Video: Motion Matching 05

https://www.youtube.com/watch?v=3uYzA_hKb3c

Simon's future thoughts on corrections and pose-matching involve better automated mocap selection via 'interaction partners' such as bones or surfaces, giving the examples of selecting mocap in a sports game based on the distance between the position of a football player, (or even his leg bone), to the football or a hockey player to the puck.

Conclusion

Motion-matching is not a technology but instead a 'simple idea that helps us reason about movement description and control.' Animation data declares events in the mocap, gameplay declares what it wants, and in the middle the matching system finds the best animations to match the desired goals, with three main advantages:

High quality: will preserve details from the mocap stage.
Controllable responsiveness: responsive and tweakable for gameplay.
Minimal manual work: an unintentional side-effect, (not the goal), but teams can spend their money elsewhere. Essentially reducing the time it takes to get from the mocap stage into the game.

For Honor looks set to be the first commercial production to use this new and exciting animation technology so the final proof will be in the near future. In closing, Simon left us with this thought:

"Let's just mark-up the mocap with the inputs that should trigger moves… and generate the game automatically."
