Theory Document_Preparation

(*:maxLineLen=78:*)

theory Document_Preparation
  imports Main Base
begin

chapter ‹Document preparation \label{ch:document-prep}›

text ‹
  Isabelle/Isar provides a simple document preparation system based on
  {PDF-\LaTeX}, with support for hyperlinks and bookmarks within that format.
  This allows to produce papers, books, theses etc.\ from Isabelle theory
  sources.

  {\LaTeX} output is generated while processing a ∗‹session› in batch mode, as
  explained in the ∗‹The Isabelle System Manual› \<^cite>‹"isabelle-system"›.
  The main Isabelle tools to get started with document preparation are
  @{tool_ref mkroot} and @{tool_ref build}.

  The classic Isabelle/HOL tutorial \<^cite>‹"isabelle-hol-book"› also explains
  some aspects of theory presentation.
›


section ‹Markup commands \label{sec:markup}›

text ‹
  \begin{matharray}{rcl}
    @{command_def "chapter"} & : & ‹any → any› \\
    @{command_def "section"} & : & ‹any → any› \\
    @{command_def "subsection"} & : & ‹any → any› \\
    @{command_def "subsubsection"} & : & ‹any → any› \\
    @{command_def "paragraph"} & : & ‹any → any› \\
    @{command_def "subparagraph"} & : & ‹any → any› \\
    @{command_def "text"} & : & ‹any → any› \\
    @{command_def "txt"} & : & ‹any → any› \\
    @{command_def "text_raw"} & : & ‹any → any› \\
  \end{matharray}

  Markup commands provide a structured way to insert text into the document
  generated from a theory. Each markup command takes a single @{syntax text}
  argument, which is passed as argument to a corresponding {\LaTeX} macro. The
  default macros provided by 🗏‹~~/lib/texinputs/isabelle.sty› can be
  redefined according to the needs of the underlying document and {\LaTeX}
  styles.

  Note that formal comments (\secref{sec:comments}) are similar to markup
  commands, but have a different status within Isabelle/Isar syntax.

  \<^rail>‹
    (@@{command chapter} | @@{command section} | @@{command subsection} |
      @@{command subsubsection} | @@{command paragraph} | @@{command subparagraph})
      @{syntax text} ';'? |
    (@@{command text} | @@{command txt} | @@{command text_raw}) @{syntax text}
  ›

    ➧ @{command chapter}, @{command section}, @{command subsection} etc.\ mark
    section headings within the theory source. This works in any context, even
    before the initial @{command theory} command. The corresponding {\LaTeX}
    macros are ▩‹\isamarkupchapter›, ▩‹\isamarkupsection›,
    ▩‹\isamarkupsubsection› etc.\

    ➧ @{command text} and @{command txt} specify paragraphs of plain text.
    This corresponds to a {\LaTeX} environment ▩‹\begin{isamarkuptext}› ‹…›
    ▩‹\end{isamarkuptext}› etc.

    ➧ @{command text_raw} is similar to @{command text}, but without any
    surrounding markup environment. This allows to inject arbitrary {\LaTeX}
    source into the generated document.

  All text passed to any of the above markup commands may refer to formal
  entities via ∗‹document antiquotations›, see also \secref{sec:antiq}. These
  are interpreted in the present theory or proof context.

  ┉
  The proof markup commands closely resemble those for theory specifications,
  but have a different formal status and produce different {\LaTeX} macros.
›


section ‹Document antiquotations \label{sec:antiq}›

text ‹
  \begin{matharray}{rcl}
    @{antiquotation_def "theory"} & : & ‹antiquotation› \\
    @{antiquotation_def "thm"} & : & ‹antiquotation› \\
    @{antiquotation_def "lemma"} & : & ‹antiquotation› \\
    @{antiquotation_def "prop"} & : & ‹antiquotation› \\
    @{antiquotation_def "term"} & : & ‹antiquotation› \\
    @{antiquotation_def term_type} & : & ‹antiquotation› \\
    @{antiquotation_def typeof} & : & ‹antiquotation› \\
    @{antiquotation_def const} & : & ‹antiquotation› \\
    @{antiquotation_def abbrev} & : & ‹antiquotation› \\
    @{antiquotation_def typ} & : & ‹antiquotation› \\
    @{antiquotation_def type} & : & ‹antiquotation› \\
    @{antiquotation_def class} & : & ‹antiquotation› \\
    @{antiquotation_def locale} & : & ‹antiquotation› \\
    @{antiquotation_def bundle} & : & ‹antiquotation› \\
    @{antiquotation_def "text"} & : & ‹antiquotation› \\
    @{antiquotation_def goals} & : & ‹antiquotation› \\
    @{antiquotation_def subgoals} & : & ‹antiquotation› \\
    @{antiquotation_def prf} & : & ‹antiquotation› \\
    @{antiquotation_def full_prf} & : & ‹antiquotation› \\
    @{antiquotation_def ML_text} & : & ‹antiquotation› \\
    @{antiquotation_def ML} & : & ‹antiquotation› \\
    @{antiquotation_def ML_def} & : & ‹antiquotation› \\
    @{antiquotation_def ML_ref} & : & ‹antiquotation› \\
    @{antiquotation_def ML_infix} & : & ‹antiquotation› \\
    @{antiquotation_def ML_infix_def} & : & ‹antiquotation› \\
    @{antiquotation_def ML_infix_ref} & : & ‹antiquotation› \\
    @{antiquotation_def ML_type} & : & ‹antiquotation› \\
    @{antiquotation_def ML_type_def} & : & ‹antiquotation› \\
    @{antiquotation_def ML_type_ref} & : & ‹antiquotation› \\
    @{antiquotation_def ML_structure} & : & ‹antiquotation› \\
    @{antiquotation_def ML_structure_def} & : & ‹antiquotation› \\
    @{antiquotation_def ML_structure_ref} & : & ‹antiquotation› \\
    @{antiquotation_def ML_functor} & : & ‹antiquotation› \\
    @{antiquotation_def ML_functor_def} & : & ‹antiquotation› \\
    @{antiquotation_def ML_functor_ref} & : & ‹antiquotation› \\
  \end{matharray}

  \begin{matharray}{rcl}
    @{antiquotation_def emph} & : & ‹antiquotation› \\
    @{antiquotation_def bold} & : & ‹antiquotation› \\
    @{antiquotation_def verbatim} & : & ‹antiquotation› \\
    @{antiquotation_def bash_function} & : & ‹antiquotation› \\
    @{antiquotation_def system_option} & : & ‹antiquotation› \\
    @{antiquotation_def session} & : & ‹antiquotation› \\
    @{antiquotation_def "file"} & : & ‹antiquotation› \\
    @{antiquotation_def "url"} & : & ‹antiquotation› \\
    @{antiquotation_def "cite"} & : & ‹antiquotation› \\
    @{antiquotation_def "nocite"} & : & ‹antiquotation› \\
    @{antiquotation_def "citet"} & : & ‹antiquotation› \\
    @{antiquotation_def "citep"} & : & ‹antiquotation› \\
    @{command_def "print_antiquotations"}‹⇧^*› & : & ‹context →› \\
  \end{matharray}

  The overall content of an Isabelle/Isar theory may alternate between formal
  and informal text. The main body consists of formal specification and proof
  commands, interspersed with markup commands (\secref{sec:markup}) or
  document comments (\secref{sec:comments}). The argument of markup commands
  quotes informal text to be printed in the resulting document, but may again
  refer to formal entities via ∗‹document antiquotations›.

  For example, embedding ▩‹@{term [show_types] "f x = a + x"}›
  within a text block makes
  \isa{{\isacharparenleft}f{\isasymColon}{\isacharprime}a\ {\isasymRightarrow}\ {\isacharprime}a{\isacharparenright}\ {\isacharparenleft}x{\isasymColon}{\isacharprime}a{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}a{\isasymColon}{\isacharprime}a{\isacharparenright}\ {\isacharplus}\ x} appear in the final {\LaTeX} document.

  Antiquotations usually spare the author tedious typing of logical entities
  in full detail. Even more importantly, some degree of consistency-checking
  between the main body of formal text and its informal explanation is
  achieved, since terms and types appearing in antiquotations are checked
  within the current theory or proof context.

  ┉
  Antiquotations are in general written as
  ▩‹@{›‹name›~▩‹[›‹options›▩‹]›~‹arguments›▩‹}›. The short form
  ▩‹\›▩‹<^›‹name›▩‹>›‹‹argument_content›› (without surrounding ▩‹@{›‹…›▩‹}›)
  works for a single argument that is a cartouche. A cartouche without special
  decoration is equivalent to ▩‹\<^cartouche>›‹‹argument_content››, which is
  equivalent to ▩‹@{cartouche›~‹‹argument_content››▩‹}›. The special name
  @{antiquotation_def cartouche} is defined in the context: Isabelle/Pure
  introduces that as an alias to @{antiquotation_ref text} (see below).
  Consequently, ‹‹foo_bar + baz ≤ bazar›› prints literal quasi-formal text
  (unchecked). A control symbol ▩‹\›▩‹<^›‹name›▩‹>› within the body text, but
  without a subsequent cartouche, is equivalent to ▩‹@{›‹name›▩‹}›.

  \begingroup
  \def\isasymcontrolstart{\isatt{\isacharbackslash\isacharless\isacharcircum}}
  \<^rail>‹
    @{syntax_def antiquotation}:
      '@{' antiquotation_body '}' |
      '\<controlstart>' @{syntax_ref name} '>' @{syntax_ref cartouche} |
      @{syntax_ref cartouche}
    ;
    options: '[' (option * ',') ']'
    ;
    option: @{syntax name} | @{syntax name} '=' @{syntax name}
    ;
  ›
  \endgroup

  Note that the syntax of antiquotations may ∗‹not› include source comments
  ▩‹(*›~‹…›~▩‹*)› nor verbatim text ▩‹{*›~‹…›~▩‹*}›.

  %% FIXME less monolithic presentation, move to individual sections!?
  \<^rail>‹
    @{syntax_def antiquotation_body}:
      (@@{antiquotation text} | @@{antiquotation cartouche} | @@{antiquotation theory_text})
        options @{syntax text} |
      @@{antiquotation theory} options @{syntax embedded} |
      @@{antiquotation thm} options styles @{syntax thms} |
      @@{antiquotation lemma} options @{syntax prop} @'by' @{syntax method} @{syntax method}? |
      @@{antiquotation prop} options styles @{syntax prop} |
      @@{antiquotation term} options styles @{syntax term} |
      @@{antiquotation (HOL) value} options styles @{syntax term} |
      @@{antiquotation term_type} options styles @{syntax term} |
      @@{antiquotation typeof} options styles @{syntax term} |
      @@{antiquotation const} options @{syntax term} |
      @@{antiquotation abbrev} options @{syntax term} |
      @@{antiquotation typ} options @{syntax type} |
      @@{antiquotation type} options @{syntax embedded} |
      @@{antiquotation class} options @{syntax embedded} |
      @@{antiquotation locale} options @{syntax embedded} |
      @@{antiquotation bundle} options @{syntax embedded} |
      (@@{antiquotation command} | @@{antiquotation method} | @@{antiquotation attribute})
        options @{syntax name}
    ;
    @{syntax antiquotation}:
      @@{antiquotation goals} options |
      @@{antiquotation subgoals} options |
      @@{antiquotation prf} options @{syntax thms} |
      @@{antiquotation full_prf} options @{syntax thms} |
      @@{antiquotation ML_text} options @{syntax text} |
      @@{antiquotation ML} options @{syntax text} |
      @@{antiquotation ML_infix} options @{syntax text} |
      @@{antiquotation ML_type} options @{syntax typeargs} @{syntax text} |
      @@{antiquotation ML_structure} options @{syntax text} |
      @@{antiquotation ML_functor} options @{syntax text} |
      @@{antiquotation emph} options @{syntax text} |
      @@{antiquotation bold} options @{syntax text} |
      @@{antiquotation verbatim} options @{syntax text} |
      @@{antiquotation bash_function} options @{syntax embedded} |
      @@{antiquotation system_option} options @{syntax embedded} |
      @@{antiquotation session} options @{syntax embedded} |
      @@{antiquotation path} options @{syntax embedded} |
      @@{antiquotation "file"} options @{syntax name} |
      @@{antiquotation dir} options @{syntax name} |
      @@{antiquotation url} options @{syntax embedded} |
      (@@{antiquotation cite} | @@{antiquotation nocite} |
       @@{antiquotation citet} | @@{antiquotation citep}) @{syntax embedded}
    ;
    styles: '(' (style + ',') ')'
    ;
    style: (@{syntax name} +)
    ;
    @@{command print_antiquotations} ('!'?)
  ›

  ➧ ‹@{text s}› prints uninterpreted source text ‹s›, i.e.\ inner syntax. This
  is particularly useful to print portions of text according to the Isabelle
  document style, without demanding well-formedness, e.g.\ small pieces of
  terms that should not be parsed or type-checked yet.

  It is also possible to write this in the short form ‹‹s›› without any
  further decoration.

  ➧ ‹@{theory_text s}› prints uninterpreted theory source text ‹s›, i.e.\
  outer syntax with command keywords and other tokens.

  ➧ ‹@{theory A}› prints the session-qualified theory name ‹A›, which is
  guaranteed to refer to a valid ancestor theory in the current context.

  ➧ ‹@{thm a⇩₁ … a⇩_n}› prints theorems ‹a⇩₁ … a⇩_n›. Full fact expressions are
  allowed here, including attributes (\secref{sec:syn-att}).

  ➧ ‹@{prop φ}› prints a well-typed proposition ‹φ›.

  ➧ ‹@{lemma φ by m}› proves a well-typed proposition ‹φ› by method ‹m› and
  prints the original ‹φ›.

  ➧ ‹@{term t}› prints a well-typed term ‹t›.
  
  ➧ ‹@{value t}› evaluates a term ‹t› and prints its result, see also
  @{command_ref (HOL) value}.

  ➧ ‹@{term_type t}› prints a well-typed term ‹t› annotated with its type.

  ➧ ‹@{typeof t}› prints the type of a well-typed term ‹t›.

  ➧ ‹@{const c}› prints a logical or syntactic constant ‹c›.
  
  ➧ ‹@{abbrev c x⇩₁ … x⇩_n}› prints a constant abbreviation ‹c x⇩₁ … x⇩_n ≡ rhs›
  as defined in the current context.

  ➧ ‹@{typ τ}› prints a well-formed type ‹τ›.

  ➧ ‹@{type κ}› prints a (logical or syntactic) type constructor ‹κ›.

  ➧ ‹@{class c}› prints a class ‹c›.

  ➧ ‹@{locale c}› prints a locale ‹c›.

  ➧ ‹@{bundle c}› prints a bundle ‹c›.

  ➧ ‹@{command name}›, ‹@{method name}›, ‹@{attribute name}› print checked
  entities of the Isar language.

  ➧ ‹@{goals}› prints the current ∗‹dynamic› goal state. This is mainly for
  support of tactic-emulation scripts within Isar. Presentation of goal states
  does not conform to the idea of human-readable proof documents!

  When explaining proofs in detail it is usually better to spell out the
  reasoning via proper Isar proof commands, instead of peeking at the internal
  machine configuration.
  
  ➧ ‹@{subgoals}› is similar to ‹@{goals}›, but does not print the main goal.
  
  ➧ ‹@{prf a⇩₁ … a⇩_n}› prints the (compact) proof terms corresponding to the
  theorems ‹a⇩₁ … a⇩_n›. Note that this requires proof terms to be switched on
  for the current logic session.
  
  ➧ ‹@{full_prf a⇩₁ … a⇩_n}› is like ‹@{prf a⇩₁ … a⇩_n}›, but prints the full
  proof terms, i.e.\ also displays information omitted in the compact proof
  term, which is denoted by ``‹_›'' placeholders there.

  ➧ ‹@{ML_text s}› prints ML text verbatim: only the token language is
  checked.

  ➧ ‹@{ML s}›, ‹@{ML_infix s}›, ‹@{ML_type s}›, ‹@{ML_structure s}›, and
  ‹@{ML_functor s}› check text ‹s› as ML value, infix operator, type,
  exception, structure, and functor respectively. The source is printed
  verbatim. The variants ‹@{ML_def s}› and ‹@{ML_ref s}› etc. maintain the
  document index: ``def'' means to make a bold entry, ``ref'' means to make a
  regular entry.

  There are two forms for type constructors, with or without separate type
  arguments: this impacts only the index entry. For example, ‹@{ML_type_ref
  ‹'a list›}› makes an entry literally for ``‹'a list›'' (sorted under the
  letter ``a''), but ‹@{ML_type_ref 'a ‹list›}› makes an entry for the
  constructor name ``‹list›''.

  ➧ ‹@{emph s}› prints document source recursively, with {\LaTeX} markup
  ▩‹\emph{›‹…›▩‹}›.

  ➧ ‹@{bold s}› prints document source recursively, with {\LaTeX} markup
  ▩‹\textbf{›‹…›▩‹}›.

  ➧ ‹@{verbatim s}› prints uninterpreted source text literally as ASCII
  characters, using some type-writer font style.

  ➧ ‹@{bash_function name}› prints the given GNU bash function verbatim. The
  name is checked wrt.\ the Isabelle system environment \<^cite>‹"isabelle-system"›.

  ➧ ‹@{system_option name}› prints the given system option verbatim. The name
  is checked wrt.\ cumulative ▩‹etc/options› of all Isabelle components,
  notably 🗏‹~~/etc/options›.

  ➧ ‹@{session name}› prints given session name verbatim. The name is checked
  wrt.\ the dependencies of the current session.

  ➧ ‹@{path name}› prints the file-system path name verbatim.

  ➧ ‹@{file name}› is like ‹@{path name}›, but ensures that ‹name› refers to a
  plain file.

  ➧ ‹@{dir name}› is like ‹@{path name}›, but ensures that ‹name› refers to a
  directory.

  ➧ ‹@{url name}› produces markup for the given URL, which results in an
  active hyperlink within the text.

  ➧ ‹@{cite arg}› produces the Bib{\TeX} citation macro ▩‹\cite[...]{...}›
  with its optional and mandatory argument. The analogous ▩‹\nocite›, and the
  ▩‹\citet› and ▩‹\citep› variants from the ▩‹natbib›
  package⁋‹🌐‹https://ctan.org/pkg/natbib›› are supported as well.

  The argument syntax is uniform for all variants and is usually presented in
  control-symbol-cartouche form: ‹\<^cite>‹arg››. The formal syntax of the
  nested argument language is defined as follows: \<^rail>‹arg: (embedded
  @'in')? (name + @'and') ⏎ (@'using' name)?›

  Here the embedded text is free-form {\LaTeX}, which becomes the optional
  argument of the ▩‹\cite› macro. The named items are Bib{\TeX} database
  entries and become the mandatory argument (separated by comma). The optional
  part ``⬚‹using name›'' specifies an alternative {\LaTeX} macro name.

  ➧ @{command "print_antiquotations"} prints all document antiquotations that
  are defined in the current context; the ``‹!›'' option indicates extra
  verbosity.
›


subsection ‹Styled antiquotations›

text ‹
  The antiquotations ‹thm›, ‹prop› and ‹term› admit an extra ∗‹style›
  specification to modify the printed result. A style is specified by a name
  with a possibly empty number of arguments; multiple styles can be sequenced
  with commas. The following standard styles are available:

  ➧ ‹lhs› extracts the first argument of any application form with at least
  two arguments --- typically meta-level or object-level equality, or any
  other binary relation.
  
  ➧ ‹rhs› is like ‹lhs›, but extracts the second argument.
  
  ➧ ‹concl› extracts the conclusion ‹C› from a rule in Horn-clause normal form
  ‹A⇩₁ ⟹ … A⇩_n ⟹ C›.
  
  ➧ ‹prem› ‹n› extract premise number ‹n› from from a rule in Horn-clause
  normal form ‹A⇩₁ ⟹ … A⇩_n ⟹ C›.
›


subsection ‹General options›

text ‹
  The following options are available to tune the printed output of
  antiquotations. Note that many of these coincide with system and
  configuration options of the same names.

    ➧ @{antiquotation_option_def show_types}~‹= bool› and
    @{antiquotation_option_def show_sorts}~‹= bool› control printing of
    explicit type and sort constraints.

    ➧ @{antiquotation_option_def show_structs}~‹= bool› controls printing of
    implicit structures.

    ➧ @{antiquotation_option_def show_abbrevs}~‹= bool› controls folding of
    abbreviations.

    ➧ @{antiquotation_option_def names_long}~‹= bool› forces names of types
    and constants etc.\ to be printed in their fully qualified internal form.

    ➧ @{antiquotation_option_def names_short}~‹= bool› forces names of types
    and constants etc.\ to be printed unqualified. Note that internalizing the
    output again in the current context may well yield a different result.

    ➧ @{antiquotation_option_def names_unique}~‹= bool› determines whether the
    printed version of qualified names should be made sufficiently long to
    avoid overlap with names declared further back. Set to ‹false› for more
    concise output.

    ➧ @{antiquotation_option_def eta_contract}~‹= bool› prints terms in
    ‹η›-contracted form.

    ➧ @{antiquotation_option_def display}~‹= bool› indicates if the text is to
    be output as multi-line ``display material'', rather than a small piece of
    text without line breaks (which is the default).

    In this mode the embedded entities are printed in the same style as the
    main theory text.

    ➧ @{antiquotation_option_def break}~‹= bool› controls line breaks in
    non-display material.

    ➧ @{antiquotation_option_def cartouche}~‹= bool› indicates if the output
    should be delimited as cartouche.

    ➧ @{antiquotation_option_def quotes}~‹= bool› indicates if the output
    should be delimited via double quotes (option @{antiquotation_option
    cartouche} takes precedence). Note that the Isabelle {\LaTeX} styles may
    suppress quotes on their own account.

    ➧ @{antiquotation_option_def mode}~‹= name› adds ‹name› to the print mode
    to be used for presentation. Note that the standard setup for {\LaTeX}
    output is already present by default, with mode ``‹latex›''.

    ➧ @{antiquotation_option_def margin}~‹= nat› and
    @{antiquotation_option_def indent}~‹= nat› change the margin or
    indentation for pretty printing of display material.

    ➧ @{antiquotation_option_def goals_limit}~‹= nat› determines the maximum
    number of subgoals to be printed (for goal-based antiquotation).

    ➧ @{antiquotation_option_def source}~‹= bool› prints the original source
    text of the antiquotation arguments, rather than its internal
    representation. Note that formal checking of @{antiquotation "thm"},
    @{antiquotation "term"}, etc. is still enabled; use the @{antiquotation
    "text"} antiquotation for unchecked output.

    Regular ‹term› and ‹typ› antiquotations with ‹source = false› involve a
    full round-trip from the original source to an internalized logical entity
    back to a source form, according to the syntax of the current context.
    Thus the printed output is not under direct control of the author, it may
    even fluctuate a bit as the underlying theory is changed later on.

    In contrast, @{antiquotation_option source}~‹= true› admits direct
    printing of the given source text, with the desirable well-formedness
    check in the background, but without modification of the printed text.

    Cartouche delimiters of the argument are stripped for antiquotations that
    are internally categorized as ``embedded''.

    ➧ @{antiquotation_option_def source_cartouche} is like
    @{antiquotation_option source}, but cartouche delimiters are always
    preserved in the output.

  For Boolean flags, ``‹name = true›'' may be abbreviated as ``‹name›''. All
  of the above flags are disabled by default, unless changed specifically for
  a logic session in the corresponding ▩‹ROOT› file.
›


section ‹Markdown-like text structure›

text ‹
  The markup commands @{command_ref text}, @{command_ref txt}, @{command_ref
  text_raw} (\secref{sec:markup}) consist of plain text. Its internal
  structure consists of paragraphs and (nested) lists, using special Isabelle
  symbols and some rules for indentation and blank lines. This quasi-visual
  format resembles ∗‹Markdown›⁋‹🌐‹http://commonmark.org››, but the full
  complexity of that notation is avoided.

  This is a summary of the main principles of minimal Markdown in Isabelle:

    ▪ List items start with the following markers
      ➧[itemize:] ▩‹▪›
      ➧[enumerate:] ▩‹▸›
      ➧[description:] ▩‹➧›

    ▪ Adjacent list items with same indentation and same marker are grouped
    into a single list.

    ▪ Singleton blank lines separate paragraphs.

    ▪ Multiple blank lines escape from the current list hierarchy.

  Notable differences to official Markdown:

    ▪ Indentation of list items needs to match exactly.

    ▪ Indentation is unlimited (official Markdown interprets four spaces as
    block quote).

    ▪ List items always consist of paragraphs --- there is no notion of
    ``tight'' list.

    ▪ Section headings are expressed via Isar document markup commands
    (\secref{sec:markup}).

    ▪ URLs, font styles, other special content is expressed via antiquotations
    (\secref{sec:antiq}), usually with proper nesting of sub-languages via
    text cartouches.
›


section ‹Document markers and command tags \label{sec:document-markers}›

text ‹
  \emph{Document markers} are formal comments of the form ‹✐‹marker_body››
  (using the control symbol ▩‹✐›) and may occur anywhere within the
  outer syntax of a command: the inner syntax of a marker body resembles that
  for attributes (\secref{sec:syn-att}). In contrast, \emph{Command tags} may
  only occur after a command keyword and are treated as special markers as
  explained below.

  \<^rail>‹
    @{syntax_def marker}: '✐' @{syntax cartouche}
    ;
    @{syntax_def marker_body}: (@{syntax name} @{syntax args} * ',')
    ;
    @{syntax_def tags}: tag*
    ;
    tag: '%' (@{syntax short_ident} | @{syntax string})
  ›

  Document markers are stripped from the document output, but surrounding
  white-space is preserved: e.g.\ a marker at the end of a line does not
  affect the subsequent line break. Markers operate within the semantic
  presentation context of a command, and may modify it to change the overall
  appearance of a command span (e.g.\ by adding tags).

  Each document marker has its own syntax defined in the theory context; the
  following markers are predefined in Isabelle/Pure:

  \<^rail>‹
    (@@{document_marker_def title} |
     @@{document_marker_def creator} |
     @@{document_marker_def contributor} |
     @@{document_marker_def date} |
     @@{document_marker_def license} |
     @@{document_marker_def description}) @{syntax embedded}
    ;
    @@{document_marker_def tag} (scope?) @{syntax name}
    ;
    scope: '(' ('proof' | 'command') ')'
  ›

    ➧ ‹✐‹title arg››, ‹✐‹creator arg››, ‹✐‹contributor arg››, ‹✐‹date arg››,
    ‹✐‹license arg››, and ‹✐‹description arg›› produce markup in the PIDE
    document, without any immediate effect on typesetting. This vocabulary is
    taken from the Dublin Core Metadata
    Initiative⁋‹🌐‹https://www.dublincore.org/specifications/dublin-core/dcmi-terms››.
    The argument is an uninterpreted string, except for @{document_marker
    description}, which consists of words that are subject to spell-checking.

    ➧ ‹✐‹tag name›› updates the list of command tags in the presentation
    context: later declarations take precedence, so ‹✐‹tag a, tag b, tag c››
    produces a reversed list. The default tags are given by the original
    ⬚‹keywords› declaration of a command, and the system option
    @{system_option_ref document_tags}.

    The optional ‹scope› tells how far the tagging is applied to subsequent
    proof structure: ``⬚‹("proof")›'' means it applies to the following proof
    text, and ``⬚‹(command)›'' means it only applies to the current command.
    The default within a proof body is ``⬚‹("proof")›'', but for toplevel goal
    statements it is ``⬚‹(command)›''. Thus a ‹tag› marker for ⬚‹theorem›,
    ⬚‹lemma› etc. does \emph{not} affect its proof by default.

    An old-style command tag ▩‹%›‹name› is treated like a document marker
    ‹✐‹tag (proof) name››: the list of command tags precedes the list of
    document markers. The head of the resulting tags in the presentation
    context is turned into {\LaTeX} environments to modify the type-setting.
    The following tags are pre-declared for certain classes of commands, and
    serve as default markup for certain kinds of commands:

    ┉
    \begin{tabular}{ll}
      ‹document› & document markup commands \\
      ‹theory› & theory begin/end \\
      ‹proof› & all proof commands \\
      ‹ML› & all commands involving ML code \\
    \end{tabular}
    ┉

  The Isabelle document preparation system \<^cite>‹"isabelle-system"› allows
  tagged command regions to be presented specifically, e.g.\ to fold proof
  texts, or drop parts of the text completely.

  For example ``⬚‹by auto›~‹✐‹tag invisible››'' causes that piece of proof to
  be treated as ‹invisible› instead of ‹proof› (the default), which may be
  shown or hidden depending on the document setup. In contrast, ``⬚‹by
  auto›~‹✐‹tag visible››'' forces this text to be shown invariably.

  Explicit tag specifications within a proof apply to all subsequent commands
  of the same level of nesting. For example, ``⬚‹proof›~‹✐‹tag invisible›
  …›~⬚‹qed›'' forces the whole sub-proof to be typeset as ‹visible› (unless
  some of its parts are tagged differently).

  ┉
  Command tags merely produce certain markup environments for type-setting.
  The meaning of these is determined by {\LaTeX} macros, as defined in
  🗏‹~~/lib/texinputs/isabelle.sty› or by the document author. The Isabelle
  document preparation tools also provide some high-level options to specify
  the meaning of arbitrary tags to ``keep'', ``drop'', or ``fold'' the
  corresponding parts of the text. Logic sessions may also specify ``document
  versions'', where given tags are interpreted in some particular way. Again
  see \<^cite>‹"isabelle-system"› for further details.
›


section ‹Railroad diagrams›

text ‹
  \begin{matharray}{rcl}
    @{antiquotation_def "rail"} & : & ‹antiquotation› \\
  \end{matharray}

  \<^rail>‹
    'rail' @{syntax text}
  ›

  The @{antiquotation rail} antiquotation allows to include syntax diagrams
  into Isabelle documents. {\LaTeX} requires the style file
  🗏‹~~/lib/texinputs/railsetup.sty›, which can be used via
  ▩‹\usepackage{railsetup}› in ▩‹root.tex›, for example.

  The rail specification language is quoted here as Isabelle @{syntax string}
  or text @{syntax "cartouche"}; it has its own grammar given below.

  \begingroup
  \def\isasymnewline{\isatt{\isacharbackslash\isacharless newline\isachargreater}}
  \<^rail>‹
  rule? + ';'
  ;
  rule: ((identifier | @{syntax antiquotation}) ':')? body
  ;
  body: concatenation + '|'
  ;
  concatenation: ((atom '?'?) +) (('*' | '+') atom?)?
  ;
  atom: '(' body? ')' | identifier |
    '@'? (string | @{syntax antiquotation}) |
    '⏎'
  ›
  \endgroup

  The lexical syntax of ‹identifier› coincides with that of @{syntax
  short_ident} in regular Isabelle syntax, but ‹string› uses single quotes
  instead of double quotes of the standard @{syntax string} category.

  Each ‹rule› defines a formal language (with optional name), using a notation
  that is similar to EBNF or regular expressions with recursion. The meaning
  and visual appearance of these rail language elements is illustrated by the
  following representative examples.

  ▪ Empty ▩‹()›

  \<^rail>‹()›

  ▪ Nonterminal ▩‹A›

  \<^rail>‹A›

  ▪ Nonterminal via Isabelle antiquotation ▩‹@{syntax method}›

  \<^rail>‹@{syntax method}›

  ▪ Terminal ▩‹'xyz'›

  \<^rail>‹'xyz'›

  ▪ Terminal in keyword style ▩‹@'xyz'›

  \<^rail>‹@'xyz'›

  ▪ Terminal via Isabelle antiquotation ▩‹@@{method rule}›

  \<^rail>‹@@{method rule}›

  ▪ Concatenation ▩‹A B C›

  \<^rail>‹A B C›

  ▪ Newline inside concatenation ▩‹A B C ⏎ D E F›

  \<^rail>‹A B C ⏎ D E F›

  ▪ Variants ▩‹A | B | C›

  \<^rail>‹A | B | C›

  ▪ Option ▩‹A ?›

  \<^rail>‹A ?›

  ▪ Repetition ▩‹A *›

  \<^rail>‹A *›

  ▪ Repetition with separator ▩‹A * sep›

  \<^rail>‹A * sep›

  ▪ Strict repetition ▩‹A +›

  \<^rail>‹A +›

  ▪ Strict repetition with separator ▩‹A + sep›

  \<^rail>‹A + sep›
›

end