asp.net core处理中文的指南

参考资料:https://docs.microsoft.com/en-us/aspnet/core/security/cross-site-scripting

Customizing the Encoders

By default encoders use a safe list limited to the Basic Latin Unicode range and encode all characters outside of that range as their character code equivalents. This behavior also affects Razor TagHelper and HtmlHelper rendering as it will use the encoders to output your strings.

The reasoning behind this is to protect against unknown or future browser bugs (previous browser bugs have tripped up parsing based on the processing of non-English characters). If your web site makes heavy use of non-Latin characters, such as Chinese, Cyrillic or others this is probably not the behavior you want.

You can customize the encoder safe lists to include Unicode ranges appropriate to your application during startup, in ConfigureServices().

For example, using the default configuration you might use a Razor HtmlHelper like so;

Copy
html
<p>This link text is in Chinese: @Html.ActionLink("汉语/漢語", "Index")</p>

When you view the source of the web page you will see it has been rendered as follows, with the Chinese text encoded;

Copy
html
<p>This link text is in Chinese: <a href="/">&#x6C49;&#x8BED;/&#x6F22;&#x8A9E;</a></p>

To widen the characters treated as safe by the encoder you would insert the following line into the ConfigureServices()method in startup.cs;

Copy
C#
services.AddSingleton<HtmlEncoder>(
     HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin,
                                               UnicodeRanges.CjkUnifiedIdeographs }));

This example widens the safe list to include the Unicode Range CjkUnifiedIdeographs. The rendered output would now become

Copy
html
<p>This link text is in Chinese: <a href="/">汉语/漢語</a></p>

Safe list ranges are specified as Unicode code charts, not languages. The Unicode standard has a list of code charts you can use to find the chart containing your characters. Each encoder, Html, JavaScript and Url, must be configured separately.

Note

Customization of the safe list only affects encoders sourced via DI. If you directly access an encoder via System.Text.Encodings.Web.*Encoder.Default then the default, Basic Latin only safelist will be used.

原文地址:https://www.cnblogs.com/a14907/p/6293151.html