24. Changing CPython’s Grammar
24.1. Abstract
There’s more to changing Python’s grammar than editing
Grammar/Grammar. This document aims to be a
checklist of places that must also be fixed.
It is probably incomplete. If you see omissions, submit a bug or patch.
This document is not intended to be an instruction manual on Python grammar hacking, for several reasons.
24.2. Rationale
People are getting this wrong all the time; it took well over a year
before someone noticed that adding the floor division operator (//)
broke the parser module.
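As an illustration of the kind of regression check that catches such an omission early, here is a minimal sketch that confirms both the pure-Python tokenizer and the AST builder recognize the floor division operator. It uses the standard tokenize and ast modules rather than the parser module, and the helper names exact_token_names and binary_operator_name are hypothetical, chosen only for this example.

```python
import ast
import io
import token
import tokenize


def exact_token_names(source):
    """Return the exact token names that Lib/tokenize.py yields for source."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.exact_type]
            for tok in tokenize.generate_tokens(readline)]


def binary_operator_name(expression):
    """Parse one binary expression and return the class name of its AST operator."""
    tree = ast.parse(expression, mode="eval")
    return type(tree.body.op).__name__


# Hypothetical smoke check for the floor division operator.
names = exact_token_names("a // b\n")
print(names)
print(binary_operator_name("a // b"))
```

A check like this belongs in the test suite next to the syntax change, so that each layer touched by the checklist is exercised at least once.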
24.3. Checklist
Note: sometimes things mysteriously don’t work. Before giving up, try make clean.
- Grammar/Grammar: OK, you’d probably worked this one out. :-) After changing it, run make regen-grammar to regenerate Include/graminit.h and Python/graminit.c. (This runs Python’s parser generator, Python/pgen.)
- Grammar/Tokens is a place for adding new token types. After changing it, run make regen-token to regenerate Include/token.h, Parser/token.c, Lib/token.py and Doc/library/token-list.inc. If you change both Grammar and Tokens, run make regen-token before make regen-grammar.
- Parser/Python.asdl may need changes to match the Grammar. Then run make regen-ast to regenerate Include/Python-ast.h and Python/Python-ast.c.
- Parser/tokenizer.c contains the tokenization code. This is where you would add a new type of comment or string literal, for example.
- Python/ast.c will need changes to create the AST objects involved with the Grammar change.
- The Design of CPython’s Compiler has its own page.
- The parser module. Add some of your new syntax to test_parser, bang on Modules/parsermodule.c until it passes.
- Add some usage of your new syntax to test_grammar.py.
- Certain changes may require tweaks to the library module pyclbr.
- Lib/tokenize.py needs changes to match changes to the tokenizer.
- Lib/lib2to3/Grammar.txt may need changes to match the Grammar.
- Documentation must be written!
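The regeneration steps in the checklist can be summarized as a small lookup from edited file to make target. The sketch below is only an illustration: the names REGEN_TARGET_FOR, REGEN_ORDER and regen_targets are hypothetical, and placing regen-ast last is an assumption, since the checklist only states that the token step must precede the grammar step.

```python
# Source file -> make target that regenerates its derived files,
# as listed in the checklist above.
REGEN_TARGET_FOR = {
    "Grammar/Tokens": "regen-token",
    "Grammar/Grammar": "regen-grammar",
    "Parser/Python.asdl": "regen-ast",
}

# Order in which targets are run. The checklist requires the token step
# before the grammar step; the position of regen-ast is an assumption.
REGEN_ORDER = ["regen-token", "regen-grammar", "regen-ast"]


def regen_targets(changed_files):
    """Return the make targets to run, in order, for the given edited files."""
    needed = {REGEN_TARGET_FOR[path]
              for path in changed_files
              if path in REGEN_TARGET_FOR}
    return [target for target in REGEN_ORDER if target in needed]


# Editing both grammar files: the token step is listed first.
print(regen_targets(["Grammar/Grammar", "Grammar/Tokens"]))
```

Files without a regeneration step, such as Parser/tokenizer.c, are simply ignored by the lookup; they still need the hand edits described in the checklist.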
