Reading About KaTeX and Reviewing Security Implications

I need to review the security implications of letting users input their own KaTeX equations. While looking over the KaTeX function reference, I saw features such as "raw HTML" that are potentially dangerous for untrusted input but that might also be useful to offer to users. I also want to learn more about / review LaTeX markup syntax.

Reference



Notes


  • KaTeX describes itself as the fastest math typesetting library for the web
  • It has no dependencies.
  • Renders its math synchronously - doesn't need to reflow the page.
  • It is based on Donald Knuth's TeX
  • KaTeX produces the same output regardless of browser or environment, so you can pre-render expressions using Node.js and send them as plain HTML


Installation


Node.js

  • I install KaTeX on Node.js with npm:
npm install katex
  • If you render an HTML string on the server, you still need to include the CSS files and the fonts in the browser


Browser

  • I self-host KaTeX CSS and fonts, and I lazy-load the KaTeX JavaScript
  • You can try to use the Font Loading API or Web Font Loader to prevent FOUT (Flash of Unstyled Text) or FOIT (Flash of Invisible Text), but I don't think that is too important for me right now. The example below is from the KaTeX website. You can also try preloading the fonts.
window.WebFontConfig = {
  custom: {
    families: ['KaTeX_AMS', 'KaTeX_Caligraphic:n4,n7', 'KaTeX_Fraktur:n4,n7',
      'KaTeX_Main:n4,n7,i4,i7', 'KaTeX_Math:i4,i7', 'KaTeX_Script',
      'KaTeX_SansSerif:n4,n7,i4', 'KaTeX_Size1', 'KaTeX_Size2', 'KaTeX_Size3',
      'KaTeX_Size4', 'KaTeX_Typewriter'],
  },
};
    • If I ever change the way that MathNodes (the Lexical nodes that I use to render math expressions) are rendered, then I may want to detect whether an article contains a MathNode and use the method above to load the fonts more efficiently.
      • How would I change the way that MathNodes are rendered?
        • I would probably have to add an attribute to the MathNode that can contain the HTML string of the output mathml expression.
        • Then, when calling exportDOM check whether this property exists on the MathNode and render the HTML differently in the case that it exists.
        • This would also affect the importDOM behavior - I will at least try to implement this in the future since the current method of rendering produces significant Cumulative Layout Shifts on pages with a lot of rendered math expressions.
        • This might change how the node interacts with the rest of the editor though.


Usage


  • I mostly use the katex.renderToString function to generate an HTML string on the server and client:
const macros = {};
const html = katex.renderToString("c = \\pm\\sqrt{a^2 + b^2}", {
  throwOnError: false,
  macros
});
    • I do this (rather than use katex.render) because sometimes I don't have access to the element that I want to render the math inside, and I think it is just easier for now.
  • Note: The throwOnError: false option means that invalid inputs will render the TeX source code in red (which can probably be changed with CSS), with an error message as hover text. Without this option, invalid LaTeX will cause a katex.ParseError exception to be thrown.
  • You can enable persistent macros by having a macros variable defined outside the render function and passing the object to the render or renderToString function as an option. KaTeX will add the macros defined in the KaTeX expression to the macros object in this case. See the example above.
    • You probably don't want to do this with user generated TeX code (It's a security issue since macros can change the behavior of KaTeX - redefining standard commands).
    • Another option, if you only want to enable a fixed set of macros, is to pass in a fresh object on each call that contains just the macros you want to include.
  • There are CLI and auto-render extension options for rendering TeX to HTML.
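The "fresh object each time" idea for macros can be sketched as below. This is a minimal sketch, not something taken from the KaTeX docs: BASE_MACROS, its two entries, and freshMacros are hypothetical names chosen for illustration.

```javascript
// Hypothetical allowlist of macros; the names and expansions are only examples.
const BASE_MACROS = Object.freeze({
  "\\RR": "\\mathbb{R}",
  "\\norm": "\\left\\lVert #1 \\right\\rVert",
});

// Return a new object on every call, so anything KaTeX writes into it
// during one render is thrown away and never reaches the next render.
function freshMacros() {
  return { ...BASE_MACROS };
}

// Usage (assumes `katex` is already loaded and `userTex` is the user's input):
// const html = katex.renderToString(userTex, {
//   throwOnError: false,
//   macros: freshMacros(),
// });
```

Because the shared object is frozen and only copies are handed to KaTeX, a user-defined macro in one expression should not be able to redefine commands for other expressions.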


Configuration


Options

  • You can provide an object of options as the last argument to katex.render and katex.renderToString. Options are:
    • displayMode: boolean (default false)
      • If true, the math will be rendered in display mode. If false, the math will be rendered in inline mode.
        • Display mode starts in \displaystyle, so \int and \sum are large, for example; while inline mode starts in \textstyle, where subscripts and superscripts usually don't stack on operators like \sum. You can always manually switch between \displaystyle and \textstyle using those commands.
        • Display mode centers math on its own line and disables automatic line breaking.
        • In inline mode, KaTeX allows line breaks after outermost relations (like = or <) or binary operators (like + or \times), the same as TeX.
    • output: string
      • Determines the markup language of the output. The valid choices are html, mathml, and htmlAndMathml (outputs HTML for visual rendering and includes MathML for accessibility).
    • leqno: boolean
      • If true, display math has \tags rendered on the left instead of on the right, like \usepackage[leqno]{amsmath} in LaTeX
    • fleqn: boolean
      • If true, display math is rendered flush left with a 2em left padding, like \documentclass[fleqn] in LaTeX with the amsmath package
    • throwOnError: boolean (default true)
      • If true, KaTeX will throw a ParseError when it encounters an unsupported command or invalid LaTeX. If false, KaTeX will render unsupported commands as text, and render invalid LaTeX as its source code with hover text giving the error, in the color given by errorColor
    • errorColor: string
      • A color string in the format "#XXX" or "#XXXXXX". This option determines the color that unsupported commands and invalid LaTeX are rendered in when throwOnError is set to false (default #cc0000)
    • macros: object
      • Each macro is a key-value pair where the key is a new command name and the value is the expansion of the macro.
      • Each property of macros can have a name that starts with a backslash like "\\foo" (defining command \foo) or that is a single character (defining the equivalent of a TeX active character), and a value that is one of the following:
        • A string with the LaTeX expression of the macro.
        • A function that accepts an instance of MacroExpander as first argument and returns the expansion as a string.
        • An expansion object matching an internal MacroExpansion specification, which is what results from global \def or \let.
      • This object can be modified by the render function (see above).
    • minRuleThickness: number
      • Specifies a minimum thickness, in ems, for fraction lines, \sqrt top lines, {array} vertical lines, \hline, \hdashline, \underline, \overline, and the borders of \fbox, \boxed, and \fcolorbox. The usual value for these items is 0.04, so for minRuleThickness to be effective it should probably take a value slightly above 0.04, like 0.05.
    • colorIsTextColor: boolean
      • Leave as false for current LaTeX behavior.
    • maxSize: number
      • All user-specified sizes (like \rule{500em}{500em}) will be capped to maxSize ems.
    • maxExpand: number
      • Limits the number of macro expansions to the specified number to prevent infinite macro loops. Default is 1000.
    • strict: boolean or string or function (default "warn")
      • If false or "ignore", allow features that make writing LaTeX convenient but are not actually supported by (Xe)LaTeX.
      • If true or "error" (LaTeX faithfulness mode), throw an error for any such transgressions. If "warn" (the default), warn about such behavior via console.warn.
      • Provide a custom function handler(errorCode, errorMsg, token) to customize behavior depending on the type of transgression (summarized by the string code errorCode and detailed in errorMsg); this function can also return "ignore", "error", or "warn" to use a built-in behavior. A list of such features and their errorCodes:

KaTeX Strict Features and Error Codes
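A custom strict handler could look like the sketch below. It follows the handler(errorCode, errorMsg, token) shape described above. The policy itself is hypothetical, and I am assuming "htmlExtension" is the error code KaTeX reports for the \html... commands, which should be checked against the error code list.

```javascript
// Minimal sketch of a strict handler: stay quiet about convenience features
// (so nothing is written to the logs via console.warn), but treat the HTML
// extension as an error.
// Assumption: "htmlExtension" is the code used for \htmlClass, \htmlId, etc.
function strictHandler(errorCode, errorMsg, token) {
  if (errorCode === "htmlExtension") {
    return "error";
  }
  return "ignore";
}

// Usage (assumes `katex` is already loaded):
// katex.renderToString(userTex, { throwOnError: false, strict: strictHandler });
```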

    • trust: boolean
      • If false (do not trust input), prevent any command like \includegraphics that could enable adverse behavior, rendering them instead in errorColor. If true, allow all such commands. Provide a custom function handler(context) to customize behavior depending on the context.
        • A list of possible contexts:

A List of KaTeX Contexts for the Trust Option

        • Sample trust settings:

Sample Trust Settings
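As one possible trust setting, the sketch below allows only https links from \href and \url and rejects every other trusted command. The allowlists are hypothetical choices of mine. The handler relies on the context object carrying command and protocol fields, which is how I understand the contexts for link commands.

```javascript
// Hypothetical policy: only link commands, and only over https.
const ALLOWED_PROTOCOLS = new Set(["https"]);
const LINK_COMMANDS = new Set(["\\href", "\\url"]);

// Returns true to trust the command in this context, false to reject it
// (rejected commands are rendered in errorColor instead).
function trustHandler(context) {
  if (LINK_COMMANDS.has(context.command)) {
    return ALLOWED_PROTOCOLS.has(context.protocol);
  }
  // \includegraphics, \htmlClass, \htmlId, \htmlStyle, \htmlData, ...
  return false;
}

// Usage (assumes `katex` is already loaded):
// katex.renderToString(userTex, { throwOnError: false, trust: trustHandler });
```

Starting from "reject everything" and adding commands one at a time seems safer than starting from trust: true and trying to block the dangerous cases.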

    • globalGroup: boolean (default false)
      • Run KaTeX code in the global group. As a consequence, macros defined at the top level by \def and \newcommand are added to the macros argument and can be used in subsequent render calls. In LaTeX, constructs such as \begin{equation} and $$ create a local group and prevent definitions other than \gdef from becoming visible outside of those blocks, so this is KaTeX's default behavior.


Summary:
  • displayMode should be true when rendering the MathNode in block style. It should be false when rendering the MathNode in inline style.
  • output should be htmlAndMathml for accessibility.
    • On accessibility, you need to look into exportDOM for the MathNode to render a more representative MathNode on output - this affects search
  • leqno and fleqn should be customizable by the user (default undefined)
  • throwOnError should be false when allowing the user to input TeX expressions
  • errorColor should be whatever color you use to display errors on a page
  • macros
    • Need to look into this.
    • It probably wouldn't be a bad idea to look up commonly used macros and pass in a constant object (so that users can not define their own persistent macros) into the render function (for the macros parameter)
  • minRuleThickness: probably fine to leave as the default value unless you start to notice that rules (fraction lines, \sqrt lines, etc.) render too thin
  • maxSize: need to define a maxSize so that the user can't create something too large or confusing for other users to look at. The option is measured in ems, so a pixel budget (probably around 380px) has to be converted to ems first
  • maxExpand: Either make it less than 1000, or you need to create a timeout for rendering expressions on the client and the server so that expressions can't take too much time
    • Need to run tests on this
  • strict: Probably best to set this to false. The default is "warn", which calls console.warn and could clutter the logs, so this behavior should be customized.
  • trust: You probably want to define a function at some point so that you can allow users to input their own custom HTML. Until then, leave as false.
  • globalGroup: leave as false.
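
The summary above could be collected into one helper that builds the options object for each render. This is only a sketch of my current thinking: buildKatexOptions, MAX_SIZE_EM, and MAX_EXPAND are hypothetical names, and the numeric limits are placeholders that still need testing.

```javascript
// Hypothetical limits; both values are placeholders until tested.
const MAX_SIZE_EM = 20; // maxSize is measured in ems, not pixels
const MAX_EXPAND = 100; // lower than the default of 1000

// Build a fresh options object for rendering untrusted, user-entered TeX.
// `block` selects display mode; leqno / fleqn stay undefined unless the
// user sets them.
function buildKatexOptions({ block = false, leqno, fleqn } = {}) {
  return {
    displayMode: block,
    output: "htmlAndMathml",
    leqno,
    fleqn,
    throwOnError: false,
    errorColor: "#cc0000",
    macros: {}, // new object on every call, so no macros persist
    maxSize: MAX_SIZE_EM,
    maxExpand: MAX_EXPAND,
    strict: false,
    trust: false,
    globalGroup: false,
  };
}

// Usage (assumes `katex` is already loaded):
// const html = katex.renderToString(userTex, buildKatexOptions({ block: true }));
```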


Security

Any HTML generated by KaTeX should be safe from <script> or other code injection attacks.
Of course, it is always a good idea to sanitize the HTML, though you will need a rather generous whitelist (including some of SVG and MathML) to support all of KaTeX.
  • Look at the options for more security options.


Handling Errors

  • If KaTeX encounters an error (invalid or unsupported LaTeX) and throwOnError hasn't been set to false, then katex.render and katex.renderToString will throw an exception of type katex.ParseError. The message in this error includes some of the LaTeX source code, so it needs to be escaped if you want to render it to HTML.
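
A minimal sketch of escaping the ParseError message before putting it into HTML is shown below. escapeHtml, renderOrError, and the math-error class name are hypothetical. The katex object is passed in as an argument so that the helper does not assume how the library is loaded.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Render `tex`; on a ParseError, return the escaped error message wrapped
// in a span instead of letting the exception propagate.
function renderOrError(katex, tex, options = {}) {
  try {
    return katex.renderToString(tex, { ...options, throwOnError: true });
  } catch (err) {
    if (err instanceof katex.ParseError) {
      return `<span class="math-error">${escapeHtml(err.message)}</span>`;
    }
    throw err; // not a KaTeX parse problem, so don't hide it
  }
}
```

Errors that are not a katex.ParseError are rethrown on purpose, since they usually indicate a bug rather than bad user input.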


Font

  • You can change several properties of how KaTeX uses fonts.
  • By default, KaTeX math is rendered in a 1.21x larger font than the surrounding context, making super and subscripts easier to read. You can control this with CSS:
.katex { font-size: 1.1rem; }
  • KaTeX provides fonts in three different formats: ttf, woff, and woff2


Miscellaneous


Supported Functions

  • List of KaTeX supported functions
HTML
The following raw HTML features are potentially dangerous for untrusted inputs, so they are disabled by default, and attempting to use them produces the command names in red. To fully trust your LaTeX input, you need to pass an option of trust: true; you can also enable just some of the commands or for just some URLs via the trust option.
  • I don't have the trust option set to true right now, but the options for inserting custom HTML include:
    • Inserting links with \href{linkUrl}{text}
<a href="linkUrl">
  <span class="mord text">
    <span class="mord texttt">
      text
    </span>
  </span>
</a>
    • Inserting Images: \includegraphics[height=0.8em,totalheight=0.9em,width=0.9em,alt=Logo]{LINK_TO_IMAGE}
<span class="katex-html" aria-hidden="true">
  <span class="base">
    <span class="strut" style="height:0.9em;vertical-align:-0.1em;"></span>
    <img
      src="LINK_TO_IMAGE"
      alt="Logo"
      style="height:0.9em;width:0.9em;vertical-align:-0.1em;"
    />
  </span>
</span>
    • Inserting HTML ID: \htmlId{bar}{text}
      • <span class="enclosing" id="bar">text</span>
    • Inserting Class: \htmlClass{foo}{text}
      • <span class="enclosing foo">text</span>
    • Inserting Style: \htmlStyle{color: red}{text}
      • <span style="color: red;" class="enclosing">text</span>
    • Inserting Data attributes with \htmlData{foo=a, bar=b}{text}
      • <span data-foo="a" data-bar="b" class="enclosing">text</span>
Macros
  • There are multiple ways of defining macros.
  • Macros can be defined in the KaTeX rendering options, as opposed to inline.


Style, Color, Size, and Font
  • Note that \color acts like a switch. Other color functions expect the content to be a function argument.

\textcolor{blue}{F=ma}

\textcolor{#228B22}{F=ma}

\colorbox{aqua}{$F=ma$}

\fcolorbox{red}{aqua}{$F=ma$}


Units
  • In KaTeX, units are proportioned as they are in TeX. KaTeX units are different from CSS units.

KaTeX Unit    Value
em            CSS em
ex            CSS ex
mu            1/18 CSS em
pt            1/72.27 inch × F × G
mm            1 mm × F × G
cm            1 cm × F × G
in            1 inch × F × G

  • where
    • F = (font size of surrounding HTML text)/10pt
    • G = 1.21 by default, because KaTeX font-size is normally 1.21 x the surrounding font size.
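
Since maxSize is given in ems, a pixel budget has to be converted first. The sketch below assumes that one em inside KaTeX output equals the surrounding font size times G (1.21 by default, as noted above). pxToKatexEm, muToKatexEm, and the 16px default are hypothetical.

```javascript
const KATEX_FONT_SCALE = 1.21; // G from the notes above

// Convert a pixel length to KaTeX ems, given the font size (in px) of the
// text surrounding the math.
function pxToKatexEm(px, surroundingFontPx = 16, scale = KATEX_FONT_SCALE) {
  return px / (surroundingFontPx * scale);
}

// A mu is 1/18 of an em.
function muToKatexEm(mu) {
  return mu / 18;
}

// Example: a 380px budget next to 16px body text is roughly 19.6 KaTeX ems.
const maxSizeEm = pxToKatexEm(380, 16);
```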


Common Issues

  • You need to include <!DOCTYPE html> at the top of the HTML file, as otherwise your browser will render in quirks mode, which can cause KaTeX to sometimes render incorrectly.
    • I have seen this be an issue when rendering KaTeX in Node.js
  • Be sure to remember that you can specify the spacing between lines: \\[0.1em]
  • Equivalents of MathJax \class, \cssId, and \style are \htmlClass, \htmlId, and \htmlStyle, respectively, to avoid ambiguity.

