How Sphinx Index Works

Sphinx domains support custom indices through subclasses of the abstract class sphinx.domains.Index. Its documentation reads:

An Index is the description for a domain-specific index. To add an index to a domain, subclass Index, overriding the three name attributes:

  • name is an identifier used for generating file names. It is also used for a hyperlink target for the index. Therefore, users can refer the index page using ref role and a string which is combined domain name and name attribute (ex. :ref:`py-modindex`).

  • localname is the section title for the index.

  • shortname is a short name for the index, for use in the relation bar in HTML output. Can be empty to disable entries in the relation bar.

and providing a sphinx.domains.Index.generate() method. Then, add the index class to your domain's indices list. Extensions can add indices to existing domains using add_index_to_domain().
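To make the naming rule concrete, here is a minimal sketch of how the domain name and the name attribute combine into the identifier used for the file name and the ref target. The helper index_target is hypothetical and exists only for illustration; it is not part of Sphinx.

```python
def index_target(domain_name: str, index_name: str) -> str:
    """Join a domain name and an Index.name into the combined identifier.

    Hypothetical helper: it only mirrors the "<domain>-<name>" rule
    described in the documentation quoted above.
    """
    return f"{domain_name}-{index_name}"


# The Python module index lives in the "py" domain under the name "modindex".
print(index_target("py", "modindex"))
# An index named "metavar" added to the "std" domain.
print(index_target("std", "metavar"))
```

The second identifier matches the page name std-metavar.html that appears in the example URL later in this article.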

Index Entry Generation#

Every Index subclass must implement the generate() method, which returns groups of sphinx.domains.IndexEntry:

@abstractmethod
def generate(self, docnames: Iterable[str] | None = None,
             ) -> tuple[list[tuple[str, list[IndexEntry]]], bool]:

The function is well documented:

Get entries for the index.

If docnames is given, restrict to entries referring to these docnames.

The return value is a tuple of (content, collapse):

collapse

A boolean that determines if sub-entries should start collapsed (for output formats that support collapsing sub-entries).

content

A sequence of (letter, entries) tuples, where letter is the "heading" for the given entries, usually the starting letter, and entries is a sequence of single entries. Each entry is a sequence [name, subtype, docname, anchor, extra, qualifier, descr].
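The (content, collapse) shape described above can be sketched without Sphinx installed. This is a minimal sketch under assumptions: IndexEntry here is a local stand-in that only mirrors the seven documented fields, and build_content together with its sample data is hypothetical. It groups entries under the upper-cased first letter of their (non-empty) name.

```python
from collections import defaultdict
from typing import NamedTuple


class IndexEntry(NamedTuple):
    """Local stand-in mirroring the fields of sphinx.domains.IndexEntry."""
    name: str
    subtype: int
    docname: str
    anchor: str
    extra: str
    qualifier: str
    descr: str


def build_content(objects):
    """Group (name, docname, anchor) triples under their first letter.

    Hypothetical helper; returns the same (content, collapse) tuple
    shape that generate() is documented to return.
    """
    groups = defaultdict(list)
    for name, docname, anchor in sorted(objects):
        # subtype 0 marks a normal entry without sub-entries.
        groups[name[0].upper()].append(
            IndexEntry(name, 0, docname, anchor, "", "", "")
        )
    content = sorted(groups.items())  # list of (letter, entries) tuples
    collapse = False                  # do not start sub-entries collapsed
    return content, collapse


content, collapse = build_content([
    ("beta", "usage", "var-beta"),
    ("alpha", "intro", "var-alpha"),
    ("apple", "intro", "var-apple"),
])
for letter, entries in content:
    print(letter, [entry.name for entry in entries])
```

In a real extension the same grouping logic would sit inside the generate() method of an Index subclass.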

See sphinx.domains.IndexEntry for more details. The description alone may not be intuitive, so here is an example. The following code implements an Index (the code that adds the Index to a Domain is omitted):

from typing import Iterable
from sphinx.domains import Index, IndexEntry

class MyIndex(Index):
    name = 'metavar'
    localname = 'Meta Variable Reference Index'
    shortname = 'references'

    def generate(
        self, docnames: Iterable[str] | None = None
    ) -> tuple[list[tuple[str, list[IndexEntry]]], bool]:
        # subtype: 0 = normal entry, 1 = entry with sub-entries, 2 = sub-entry
        idx1 = IndexEntry('foo', 0, 'docname', 'anchor', 'extra', 'qualifier', 'desc')
        idx2 = IndexEntry('bar', 0, 'docname', 'anchor', 'extra', 'qualifier', 'desc')
        idx3 = IndexEntry('baz', 1, 'docname', 'anchor', 'extra', 'qualifier', 'desc')
        idx4 = IndexEntry('qux', 2, 'docname', 'anchor', 'extra', 'qualifier', 'desc')
        idx5 = IndexEntry('quux', 1, 'docname', 'anchor', 'extra', 'qualifier', 'desc')

        return (
            # entry list
            [
                ("letter1", [idx1, idx2]),
                ("letter2", [idx3, idx4, idx5]),
            ],
            # collapse
            False,
        )

The following index page is then generated:

[Screenshot: the generated index page]

http://silverrainz.me/sphinx-index-example/std-metavar.html#

  1. The name of each entry links to the document whose docname is "docname".

  2. IndexEntry objects are grouped by letter, that is, by category.

    Tip

    In the General Index (genindex), the category is usually the first letter of the entry, which is why the category is called "letter" here.

  3. An entry with sub-entries (subtype=1) is collapsible (baz, quux).

  4. A sub-entry (subtype=2) follows its parent entry (subtype=1) in the list[IndexEntry] (baz has one sub-entry, qux, while quux has none).
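The ordering rule in the last two points can be illustrated with a small sketch that rebuilds the parent/child structure from the flat list. It rests on assumptions: IndexEntry is a local stand-in with the documented fields, and nest and entry are hypothetical helpers, not how the Sphinx HTML templates actually render the page.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    """Local stand-in mirroring the fields of sphinx.domains.IndexEntry."""
    name: str
    subtype: int
    docname: str
    anchor: str
    extra: str
    qualifier: str
    descr: str


def entry(name, subtype):
    """Build an entry with the same placeholder values as the example."""
    return IndexEntry(name, subtype, "docname", "anchor", "extra", "qualifier", "desc")


def nest(entries):
    """Attach each subtype=2 entry to the closest preceding subtype=1 entry."""
    tree = []
    parent_children = None  # children list of the most recent subtype=1 entry
    for item in entries:
        if item.subtype == 2 and parent_children is not None:
            parent_children.append(item.name)
            continue
        children = []
        tree.append((item.name, children))
        # Only subtype=1 entries may receive the sub-entries that follow.
        parent_children = children if item.subtype == 1 else None
    return tree


letter2 = [entry("baz", 1), entry("qux", 2), entry("quux", 1)]
print(nest(letter2))
```

Here qux ends up under baz, while quux is collapsible in principle but has no children, matching the observations above.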

General Index (genindex)#

Sphinx also provides the index role and the index directive for creating index entries from reStructuredText. I find them hard to understand :'( and personally don't use them.

The index generated by this markup is special: it does not logically belong to any Domain (such as py, std, and so on), so it is called the General Index (genindex) and can be referenced with :ref:`genindex`.

This markup generates sphinx.addnodes.index nodes, each of which carries a list of 5-tuples:

Node for index entries.

This node is created by the index directive and has one attribute, entries. Its value is a list of 5-tuples of (entrytype, entryname, target, ignored, key).

entrytype is one of "single", "pair", "double", "triple".

key is categorization characters (usually a single character) for general index page. For the details of this, please see also: glossary and issue sphinx-doc/sphinx#2302
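To give a feel for what entrytype means, the sketch below expands an (entrytype, entryname) pair into the (main entry, sub-entry) lines it stands for, following the shortcuts described in the Sphinx documentation of the index directive. It is a simplified illustration, not Sphinx's own implementation: expand is a hypothetical helper, and it only handles "single", "pair", and "triple".

```python
def expand(entrytype: str, entryname: str) -> list:
    """Return (main entry, sub-entry) pairs for one index tuple.

    Hypothetical helper; entryname uses ";" to separate its parts.
    """
    parts = [part.strip() for part in entryname.split(";")]
    if entrytype == "single":
        # "a; b" is entry "a" with sub-entry "b"; a bare "a" has no sub-entry.
        return [(parts[0], parts[1] if len(parts) > 1 else "")]
    if entrytype == "pair":
        first, second = parts
        # A pair is listed under both of its parts.
        return [(first, second), (second, first)]
    if entrytype == "triple":
        first, second, third = parts
        # A triple is listed once under each of its three parts.
        return [
            (first, f"{second} {third}"),
            (second, f"{third}, {first}"),
            (third, f"{first} {second}"),
        ]
    raise ValueError(f"entrytype not covered by this sketch: {entrytype!r}")


print(expand("single", "execution; context"))
print(expand("pair", "loop; statement"))
print(expand("triple", "module; search; path"))
```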

The nodes are collected by sphinx.domains.index.IndexDomain.process_doc() (yes, genindex actually belongs to a domain called "index") and stored in the domain's data storage, sphinx.domains.Domain.data, so builders can access them through sphinx.environment.BuildEnvironment.domains['index'].
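A rough sketch of reading those stored tuples back is shown below. It assumes that the index domain exposes an entries mapping from docname to a list of 5-tuples; check this against the Sphinx version you have installed. The function iter_general_index and the fake environment are hypothetical and only allow a dry run outside a real build.

```python
from types import SimpleNamespace


def iter_general_index(env):
    """Yield (docname, entrytype, entryname, target) for every stored tuple.

    Assumes env.domains['index'].entries maps each docname to a list of
    (entrytype, entryname, target, ignored, key) tuples.
    """
    index_domain = env.domains["index"]
    for docname, entries in index_domain.entries.items():
        for entrytype, entryname, target, _ignored, _key in entries:
            yield docname, entrytype, entryname, target


# Fake environment with made-up data, standing in for a BuildEnvironment.
fake_env = SimpleNamespace(
    domains={
        "index": SimpleNamespace(
            entries={"intro": [("single", "foo", "index-0", "", None)]}
        )
    }
)
print(list(iter_general_index(fake_env)))
```

Inside an extension, the same function could be called with the real build environment, for example from an event handler that receives it.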

For sphinx.builders.html.StandaloneHTMLBuilder, the write_genindex method loads the list of 5-tuples from the build environment. It is quite complex and I don't fully understand it.

Note

There is no call to sphinx.domains.Index.generate() anywhere in this path, which shows that genindex is a separate, independent implementation. I don't know why it was done this way; maybe for historical reasons.

By the way, the non-genindex (domain) indices are written by write_domain_indices.
