Reading About KaTeX and Reviewing Security Implications
I need to review the security implications of letting users input their own KaTeX equations. While looking over the function reference, I saw features such as "raw HTML" that are potentially dangerous but might also be useful to offer to users. I also want to learn more about and review LaTeX markup syntax.
Reference
Notes
- KaTeX describes itself as the fastest math typesetting library on the web
- It has no dependencies.
- Renders its math synchronously - doesn't need to reflow the page.
- It is based on Donald Knuth's TeX
- KaTeX produces the same output regardless of browser or environment, so you can pre-render expressions using Node.js and send them as plain HTML
Installation
Node.js
- I install KaTeX on Node.js with npm:
npm install katex
- If you render an HTML string on the server, you still need to include the CSS files and the fonts in the browser
Browser
- I self-host KaTeX CSS and fonts, and I lazy-load the KaTeX JavaScript
- You can try to use the Font Loading API or Web Font Loader to prevent FOUT (Flash of Unstyled Text) or FOIT (Flash of Invisible Text), but I don't think that is too important for me right now. The example below is from the KaTeX website. You can also try preloading the fonts.
window.WebFontConfig = {
  custom: {
    families: ['KaTeX_AMS', 'KaTeX_Caligraphic:n4,n7', 'KaTeX_Fraktur:n4,n7',
      'KaTeX_Main:n4,n7,i4,i7', 'KaTeX_Math:i4,i7', 'KaTeX_Script',
      'KaTeX_SansSerif:n4,n7,i4', 'KaTeX_Size1', 'KaTeX_Size2', 'KaTeX_Size3',
      'KaTeX_Size4', 'KaTeX_Typewriter'],
  },
};
- If I ever change the way that MathNodes (the Lexical nodes that I use to render math expressions) are rendered, then I may want to detect whether an article contains a MathNode and use the method above to load the fonts more efficiently.
  - How would I change the way that MathNodes are rendered?
    - I would probably have to add an attribute to the MathNode that can contain the HTML string of the output mathml expression.
    - Then, when calling exportDOM, check whether this property exists on the MathNode and render the HTML differently in the case that it exists.
    - This would also affect the importDOM behavior.
  - I will at least try to implement this in the future, since the current method of rendering produces significant Cumulative Layout Shifts on pages with a lot of rendered math expressions.
  - This might change how the node interacts with the rest of the editor though.
Usage
- I mostly use the katex.renderToString function to generate an HTML string on the server and client:
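// Assumes katex is already loaded (npm package on the server, lazy-loaded script in the browser).
// Macros object defined outside the render call so it can persist between calls.
const macros = {};
const html = katex.renderToString("c = \\pm\\sqrt{a^2 + b^2}", {
  throwOnError: false, // render invalid input in red instead of throwing
  macros,
});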
- I do this (rather than use katex.render) because sometimes I don't have access to the element that I want to render the expression inside, and I think it is just easier for now.
- Note: The throwOnError: false option means that invalid inputs will render the TeX source code in red (the color is set by the errorColor option), with an error message as hover text. Without this option, invalid LaTeX will cause a katex.ParseError exception to be thrown.
- You can enable persistent macros by having a macros variable defined outside the render function and passing the object to the render or renderToString function as an option. In this case KaTeX will add macros that the expression defines globally (for example with \gdef) to the macros object. See the example above.
  - You probably don't want to do this with user-generated TeX code. It is a security issue, since macros can redefine standard commands and so change the behavior of KaTeX for later renders.
  - Another option, if you only want to enable some macros that you define yourself, is to pass in a fresh object each time, with that fresh object containing the macros that you want to include.
- There are CLI and auto-render extension options for rendering TeX to HTML.
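The "fresh object each time" idea for macros can be sketched as below. This is a minimal sketch rather than a tested setup: BASE_MACROS and freshMacros are hypothetical names, and the two macros are only illustrative. No KaTeX call is made in the block itself; the intended call is shown in a comment.

```javascript
// Hypothetical allow-list of macros that I define myself (illustrative only).
const BASE_MACROS = Object.freeze({
  "\\RR": "\\mathbb{R}",
  "\\NN": "\\mathbb{N}",
});

// Returns a new object on every call, so anything KaTeX writes into it
// during one render is thrown away before the next render.
function freshMacros() {
  return { ...BASE_MACROS };
}

// Intended use (assumes katex is already loaded):
// const html = katex.renderToString(userTex, {
//   throwOnError: false,
//   macros: freshMacros(),
// });
```

Because each render gets its own copy, a macro defined inside one user's expression should not leak into anyone else's.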
Configuration
Options
- You can provide an object of options as the last argument to katex.render and katex.renderToString. Options are:
displayMode: boolean (default false)
- If true, the math will be rendered in display mode. If false, the math will be rendered in inline mode.
  - Display mode starts in \displaystyle, so \int and \sum are large, for example; while inline mode starts in \textstyle, where subscripts and superscripts usually don't stack on operators like \sum. You can always manually switch between \displaystyle and \textstyle using those commands.
  - Display mode centers math on its own line and disables automatic line breaking.
  - In inline mode, KaTeX allows line breaks after outermost relations (like = or <) or binary operators (like + or \times), the same as TeX.
output: string
- Determines the markup language of the output. The valid choices are html, mathml, and htmlAndMathml (outputs HTML for visual rendering and includes MathML for accessibility).
leqno: boolean
- If true, display math has \tags rendered on the left instead of the right, like \usepackage[leqno]{amsmath} in LaTeX.
fleqn: boolean
- If true, display math is rendered flush left with a 2em left padding, like \documentclass[fleqn] in LaTeX with the amsmath package.
throwOnError: boolean (default true)
- If true, KaTeX will throw a ParseError when it encounters an unsupported command or invalid LaTeX. If false, KaTeX will render unsupported commands as text, and render invalid LaTeX as its source code with hover text giving the error, in the color given by errorColor.
errorColor: string (default #cc0000)
- A color string given in the format "#XXX" or "#XXXXXX". This option determines the color that unsupported commands and invalid LaTeX are rendered in when throwOnError is set to false.
macros: object
- Each macro is a key-value pair where the key is a new command name and the value is the expansion of the macro.
- Each property of macros can have a name that starts with a backslash like "\\foo" (defining command \foo) or is a single character (defining the equivalent TeX active character), and a value that is one of the following:
  - A string with the LaTeX expansion of the macro.
  - A function that accepts an instance of MacroExpander as first argument and returns the expansion as a string.
  - An expansion object matching an internal MacroExpansion specification, which is what results from global \def or \let.
- This object can be modified by the render function (see above).
minRuleThickness: number
- Specifies a minimum thickness, in ems, for fraction lines, \sqrt top lines, {array} vertical lines, \hline, \hdashline, \underline, \overline, and the borders of \fbox, \boxed, and \fcolorbox. The usual value for these items is 0.04, so for minRuleThickness to be effective it should probably take a value slightly above 0.04, like 0.05.
colorIsTextColor: boolean
- Leave as false for current LaTeX behavior.
maxSize: number
- All user-specified sizes (like \rule{500em}{500em}) will be capped to maxSize ems.
maxExpand: number
- Limits the number of macro expansions to the specified number, to prevent infinite macro loops. Default is 1000.
strict: boolean or string or function (default "warn")
- If false or "ignore", allow features that make writing LaTeX convenient but are not actually supported by (TeX) LaTeX.
- If true or "error" (LaTeX faithfulness mode), throw an error for any such transgressions. If "warn" (the default), warn about such behavior via console.warn.
- Provide a custom function handler(errorCode, errorMsg, token) to customize behavior depending on the type of transgression (summarized by the string code errorCode and detailed in errorMsg); this function can also return "ignore", "error", or "warn" to use a built-in behavior. The KaTeX options documentation lists such features and their errorCodes.
trust: boolean or function (default false)
- If false (do not trust input), prevent any command like \includegraphics that could enable adverse behavior, rendering them instead in errorColor. If true, allow all such commands. Provide a custom function handler(context) to customize behavior depending on the context.
- The KaTeX options documentation lists the possible contexts and gives sample trust settings.
globalGroup: boolean (default false)
- Run KaTeX code in the global group. As a consequence, macros defined at the top level by \def and \newcommand are added to the macros argument and can be used in subsequent render calls. In LaTeX, constructs such as \begin{equation} and $$ create a local group and prevent definitions other than \gdef from becoming visible outside of those blocks, so KaTeX's default behavior (globalGroup: false) matches this.
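A custom strict handler of the kind described above could look like the sketch below. It assumes that "htmlExtension" is the errorCode KaTeX reports for the HTML extension commands (taken from the KaTeX options documentation, not verified here), and quietStrict is a hypothetical name.

```javascript
// Sketch of a strict handler: reject the HTML extension commands and
// silently accept every other non-standard convenience, so nothing is
// written to the console.
// "htmlExtension" is assumed to be the code KaTeX reports for \htmlClass etc.
function quietStrict(errorCode, errorMsg, token) {
  if (errorCode === "htmlExtension") {
    return "error";
  }
  return "ignore";
}

// Intended use (assumes katex is already loaded):
// katex.renderToString(userTex, { throwOnError: false, strict: quietStrict });
```

Returning one of the built-in strings keeps the handler small; it never logs, which avoids the console.warn noise of the default "warn" setting.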
Summary:
- displayMode should be true when rendering the MathNode in block style. It should be false when rendering the MathNode in inline style.
- output should be htmlAndMathml for accessibility.
  - On accessibility, you need to look into exportDOM for the MathNode to render a more representative MathNode on output. This affects search.
- leqno and fleqn should be customizable by the user (default undefined).
- throwOnError should be false when allowing the user to input TeX expressions.
- errorColor should be whatever color you use to display errors on a page.
- macros
  - Need to look into this.
  - It probably wouldn't be a bad idea to look up commonly used macros and pass a constant object into the render function for the macros parameter, so that users cannot define their own persistent macros.
- minRuleThickness: is probably fine to leave as the default value unless you start to notice too much compressed styling.
- maxSize: need to define a maxSize so that the user doesn't create something too confusing for other users to look at. The target is probably about 380px, but maxSize is measured in ems, so that width has to be converted.
- maxExpand: Either make it less than 1000, or create a timeout for rendering expressions on the client and the server so that expressions can't take too much time.
  - Need to run tests on this.
- strict: Probably best to set this to false. The default is "warn", which writes to the logs via console.warn, so you should customize this behavior.
- trust: You probably want to define a function at some point so that you can allow users to input their own custom HTML. Until then, leave as false.
- globalGroup: leave as false.
Security
Any HTML generated by KaTeX should be safe from <script> or other code injection attacks.
Of course, it is always a good idea to sanitize the HTML, though you will need a rather generous whitelist (including some of SVG and MathML) to support all of KaTeX.
- Look at the options for more security options.
Handling Errors
- If KaTeX encounters an error (invalid or unsupported LaTeX) and throwOnError hasn't been set to false, then katex.render and katex.renderToString will throw an exception of type katex.ParseError. The message in this error includes some of the LaTeX source code, so it needs to be escaped if you want to render it to HTML.
Font
- You can change several properties of how KaTeX uses fonts.
- By default, KaTeX math is rendered in a 1.21x larger font than the surrounding context, making super and subscripts easier to read. You can control this with CSS:
.katex { font-size: 1.1rem; }
- KaTeX provides fonts in three different formats: ttf, woff, and woff2
Miscellaneous
Supported Functions
- List of KaTeX supported functions
HTML
The following "raw HTML" features are potentially dangerous for untrusted inputs, so they are disabled by default, and attempting to use them produces the command names in red. To fully trust your LaTeX input, you need to pass an option of trust: true; you can also enable just some of the commands or for just some URLs via the trust option.
- I don't have the trust option set to true right now, but the commands for inserting custom HTML include:
- Inserting links with \href{linkUrl}{text}
<a href="linkUrl">
<span class="mord text">
<span class="mord texttt">
text
</span>
</span>
</a>
- Inserting Images:
\includegraphics[height=0.8rem,totalheight=0.9em,width=0.9em,alt=Logo]{LINK_TO_IMAGE}
<span class="katex-html" aria-hidden="true">
<span class="base">
<span class="strut" style="height:0.9em;vertical-align:-0.1em;"></span>
<img
src="LINK_TO_IMAGE"
alt="Logo"
style="height:0.9em;width:0.9em;vertical-align:-0.1em;"
/>
</span>
</span>
- Inserting HTML ID:
\htmlId{bar}{text}
<span class="enclosing" id="bar">text</span>
- Inserting Class:
\htmlClass{foo}{text}
<span class="enclosing foo">text</span>
- Inserting Style:
\htmlStyle{color: red}{text}
<span style="color: red;" class="enclosing">text</span>
- Inserting Data attributes with
\htmlData{foo=a, bar=b}{text}
<span data-foo="a" data-bar="b" class="enclosing">text</span>
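If I later enable some of these commands, a trust handler might look like the sketch below. It assumes the context object carries command, protocol, and class fields as described in the KaTeX options documentation; trustHandler and the "user-" class prefix are hypothetical choices, not anything KaTeX prescribes.

```javascript
// Only https links are allowed in this sketch.
const ALLOWED_PROTOCOLS = new Set(["https"]);

// Sketch of a trust handler: allow https links and namespaced classes,
// refuse every other raw HTML command (images, ids, styles, data attributes).
function trustHandler(context) {
  switch (context.command) {
    case "\\href":
    case "\\url":
      return ALLOWED_PROTOCOLS.has(context.protocol);
    case "\\htmlClass":
      // Hypothetical rule: only classes with a "user-" prefix are accepted.
      return typeof context.class === "string" && context.class.startsWith("user-");
    default:
      return false;
  }
}

// Intended use (assumes katex is already loaded):
// katex.renderToString(userTex, { throwOnError: false, trust: trustHandler });
```

Defaulting to false for unknown commands means any command added to KaTeX later stays disabled until it is reviewed.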
Macros
- There are multiple ways of defining macros.
- Macros can be defined in the KaTeX rendering options, as opposed to inline.
Style, Color, Size, and Font
- Note that \color acts like a switch. Other color functions expect the content to be a function argument.
Units
- In KaTeX, units are proportioned as they are in TeX. KaTeX units are different from CSS units.
KaTeX Unit | Value |
---|---|
em | CSS em |
ex | CSS ex |
mu | 1/18 CSS em |
pt | 1/72.27 inch × F × G |
mm | 1mm × F × G |
cm | 1cm × F × G |
in | 1in × F × G |
- where
- F = (font size of surrounding HTML text)/10pt
- G = 1.21 by default, because KaTeX font-size is normally 1.21 x the surrounding font size.
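As a worked example of the table, the sketch below converts the absolute KaTeX units to CSS pixels. It assumes the standard CSS definition of 96px per inch and the default G = 1.21; katexAbsoluteUnitToPx is a hypothetical helper, not part of KaTeX.

```javascript
const CSS_PX_PER_INCH = 96;   // CSS defines 1in as 96px
const KATEX_FONT_SCALE = 1.21; // G in the table above (default value)

// Inches per KaTeX unit, before the F and G factors are applied.
const INCHES_PER_UNIT = { pt: 1 / 72.27, mm: 1 / 25.4, cm: 1 / 2.54, in: 1 };

// Converts a length in an absolute KaTeX unit to CSS pixels.
// surroundingFontPt is the font size of the surrounding HTML text in pt.
function katexAbsoluteUnitToPx(value, unit, surroundingFontPt) {
  if (!Object.prototype.hasOwnProperty.call(INCHES_PER_UNIT, unit)) {
    throw new RangeError("unsupported unit: " + unit);
  }
  const F = surroundingFontPt / 10; // F in the table above
  return value * INCHES_PER_UNIT[unit] * F * KATEX_FONT_SCALE * CSS_PX_PER_INCH;
}
```

For 10pt surrounding text, 1in in KaTeX comes out at about 116px instead of the 96px that CSS would give, which is the difference the F and G factors describe.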
Common Issues
- You need to include <!DOCTYPE html> at the top of the HTML file, as otherwise your browser will render in quirks mode, which can cause KaTeX to sometimes render incorrectly.
  - I have seen this be an issue when rendering KaTeX in Node.js
- Be sure to remember that you can specify the spacing between lines: \\[0.1em]
- Equivalents of MathJax \class, \cssId, and \style are \htmlClass, \htmlId, and \htmlStyle, respectively, to avoid ambiguity.