Commit 72669ab

docs(man): write about the way to use regular expressions to choose a parser
Signed-off-by: Masatake YAMATO <[email protected]>
1 parent d27c6d3 commit 72669ab

4 files changed: +152 -42 lines changed

docs/man/ctags-optlib.7.rst

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ readers should read :ref:`ctags(1) <ctags(1)>` of Universal Ctags first.
 Following options are for defining (or customizing) a parser:

 * ``--langdef=<name>``
-* ``--map-<LANG>=[+|-]<extension>|<pattern>``
+* ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 * ``--kinddef-<LANG>=<letter>,<name>,<description>``
 * ``--regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/[<flags>]``
 * ``--mline-regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/{mgroup=<N>}[<flags>]``
@@ -103,7 +103,7 @@ Overview for defining a parser

 3. Give a file pattern or file extension for activating the parser

-Use ``--map-<LANG>=[+|-]<extension>|<pattern>``.
+Use ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``.

 4. Define kinds

docs/man/ctags.1.rst

Lines changed: 74 additions & 19 deletions
@@ -499,26 +499,68 @@ Language Selection and Mapping Options
 Exuberant Ctags. See :ref:`ctags-incompatibilities(7) <ctags-incompatibilities(7)>` for the background of
 this incompatible change.

-``--map-<LANG>=[+|-]<extension>|<pattern>``
+``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 This option provides the way to control mapping(s) of file names to
 languages in a more fine-grained way than ``--langmap`` option.

 In ctags, more than one language can map to a
-file name *<pattern>* or file *<extension>* (*N:1 map*). Alternatively,
-``--langmap`` option handle only *1:1 map*, only one language
-mapping to one file name *<pattern>* or file *<extension>*. A typical N:1
-map is seen in C++ and ObjectiveC language; both languages have
-a map to ``.h`` as a file extension.
-
-A file extension is specified by preceding the extension with a period (e.g. ``.c``).
-A file name pattern is specified by enclosing the pattern in parentheses (e.g.
-``([Mm]akefile)``). A prefixed plus ('``+``') sign is for adding, and
+relative-path regular expression (*<rexpr>*), file name *<pattern>*, or
+file *<extension>* (*N:1 map*). Alternatively, ``--langmap``
+option handle only *1:1 map*, only one language mapping to one
+file name *<pattern>* or file *<extension>*. A typical N:1 map is
+seen in C++ and ObjectiveC language; both languages have a map to
+``.h`` as a file extension.
+
+A file extension is specified by preceding the extension with a period
+(e.g. ``.c``). A file name pattern is specified by enclosing the pattern in
+parentheses (e.g. ``([Mm]akefile)``). A relative-path regular expression is
+specified by enclosing the expressions in percent signs '``%``'
+(e.g. ``%include/.*\.h%``). To include a literal percent sign
+inside the regular expression, escape it as ``\%``.
+
+A prefixed plus ('``+``') sign is for adding, and
 minus ('``-``') is for removing. No prefix means replacing the map of *<LANG>*.

-Unlike ``--langmap``, *<extension>* (or *<pattern>*) is not a list.
-``--map-<LANG>`` takes one extension (or pattern). However,
-the option can be specified with different arguments multiple times
-in a command line.
+Unlike ``--langmap``, ``--map-<LANG>`` does not take a list; ``--map-<LANG>``
+takes one extension, one pattern, or one regular expression. However, the
+option can be specified with different arguments multiple times in a command
+line.
+
+For file extensions and file name patterns, the match is performed
+with a base file name, a file without any directory components.
+For relative-path regular expressions, the match is performed with
+a relative-path incorporating the directory components. A
+relative-path is relative to the directory where ctags launches.
+
+Assume your shell is in ``/project/x`` directory and you have the following
+source tree under the directory.
+
+.. code-block::
+
+src
+└── lib
+    ├── data.c
+    └── logic.c
+
+If you run ctags with ``ctags -R src``,
+the match is performed with ``src/lib/data.c`` and ``src/lib/logic.c`` If you
+give ``--map-YourParser='%src/lib/.*\.c%'``, ctags
+chooses ``YourParser`` parser for processing ``data.c`` and ``logic.c`` in the
+tree.
+
+If your shell is in ``/project/x/src`` and you run
+``ctags -R lib``, ctags may not choose
+``YourParser`` because the match is performed with ``lib/data.c`` and
+``lib/logic.c``.
+
+A relative-path regular expression can take a flag controlling its testing.
+The flag comes after the last percent sign. Currently only one available flag:
+
+``{icase}`` (one-letter form '``i``')
+The regular expression is to be applied in a case-insensitive
+manner. (e.g. ``%include/.*\.h%i`` or ``%include/.*\.h%{icase}``
+
+The relative-path regular expression is available since version 6.3.0.

 .. _option_tags_file_contents:

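Reviewer's note: the matching rules described in the hunk above can be illustrated with a small Python sketch. It is not the ctags implementation. The function names `parse_map_spec` and `matches` are hypothetical, Python's `re` and `fnmatch` modules stand in for whatever regex and glob engines ctags uses, and the unanchored search and the exact (case-sensitive) extension comparison are assumptions that are merely consistent with the `src/lib` example in the text.

```python
import fnmatch
import posixpath
import re


def parse_map_spec(spec):
    """Classify one --map-<LANG> argument (given without the +/- prefix).

    Returns (kind, body, icase). Hypothetical helper, for illustration only.
    """
    if spec.startswith("%"):
        # Look for the closing percent sign, skipping escaped characters
        # such as \% and \. along the way.
        i = 1
        while i < len(spec):
            if spec[i] == "\\" and i + 1 < len(spec):
                i += 2
                continue
            if spec[i] == "%":
                break
            i += 1
        else:
            raise ValueError("unterminated regular expression: " + spec)
        flags = spec[i + 1:]
        if flags not in ("", "i", "{icase}"):
            raise ValueError("unknown flag: " + flags)
        # Python's re accepts \% as a literal percent sign, so the body
        # is kept as written.
        return ("rexpr", spec[1:i], flags != "")
    if spec.startswith("(") and spec.endswith(")"):
        return ("pattern", spec[1:-1], False)
    if spec.startswith("."):
        return ("extension", spec[1:], False)
    raise ValueError("unrecognized map: " + spec)


def matches(spec, relative_path):
    """Test a path (relative to the launch directory) against one map."""
    kind, body, icase = parse_map_spec(spec)
    base = posixpath.basename(relative_path)
    if kind == "extension":
        # Extensions are compared with the base file name only.
        return posixpath.splitext(base)[1] == "." + body
    if kind == "pattern":
        # File name patterns are also compared with the base file name.
        return fnmatch.fnmatchcase(base, body)
    # Relative-path regular expressions see the directory components.
    # Assumption: the expression is searched for, not anchored.
    return re.search(body, relative_path, re.IGNORECASE if icase else 0) is not None


print(matches("%src/lib/.*\\.c%", "src/lib/data.c"))  # launched from /project/x
print(matches("%src/lib/.*\\.c%", "lib/data.c"))      # launched from /project/x/src
```

Under these assumptions the first call reports a match and the second does not, which mirrors the difference between running `ctags -R src` in `/project/x` and `ctags -R lib` in `/project/x/src`.
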
@@ -1243,14 +1285,24 @@ Listing Options
 languages, and then exits.
 ``all`` is used as default value if the option argument is omitted.

-``--list-maps[=(<language>|all)]``
-Lists file name patterns and the file extensions which associate a file
+``--list-map-rexprs[=(<language>|all)]``
+Lists the relative-path regular expressions which associate a file
 name with a language for either the specified *<language>* or ``all``
 languages, and then exits.
 ``all`` is used as default value if the option argument is omitted.

-To list the file extensions or file name patterns individually, use
-``--list-map-extensions`` or ``--list-map-patterns`` option.
+(since version 6.3.0)
+
+``--list-maps[=(<language>|all)]``
+Lists the file name patterns, the file extensions, and the relative-path
+regular extensions which associate a file name with a language for either the
+specified *<language>* or ``all`` languages, and then exits.
+``all`` is used as default value if the option argument is omitted.
+
+To list the file extensions, file name patterns, or relative-path regular
+expressions individually, use ``--list-map-extensions``,
+``--list-map-patterns``, or ``--list-map-rexprs`` option.
+
 See the ``--langmap`` option, and "`Determining file language`_", above.

 This option does not work with ``--machinable`` nor
@@ -1507,10 +1559,13 @@ are mapped to C++, C and ObjectiveC. These mappings can cause
 issues. ctags tries to select the proper parser
 for the source file by applying heuristics to its content, however
 it is not perfect. In case of issues one can use ``--language-force=<language>``,
-``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>``
+``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 options. (Some of the heuristics are applied whether ``--guess-language-eagerly``
 is given or not.)

+The order of testing is relative-path regular expressions (specified with
+``--map-<LANG>=<rexpr>``), file name patterns, then file extensions.
+
 .. TODO: all heuristics??? To be confirmed.

 Heuristically guessing
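
Reviewer's note: the testing order added in the last hunk (relative-path regular expressions, then file name patterns, then file extensions) might be pictured with the following Python sketch. The map table, the language entries, and the function `candidate_languages` are all hypothetical. The sketch also assumes that the first stage producing any candidate ends the search; the text only states the order, and how ctags narrows several candidates afterwards (the content heuristics) is not modelled here.

```python
import fnmatch
import posixpath
import re

# Hypothetical map table; the entries are illustrative, not ctags defaults.
MAPS = {
    "YourParser": {"rexprs": [r"src/lib/.*\.c"], "patterns": [], "extensions": []},
    "Make": {"rexprs": [], "patterns": ["[Mm]akefile"], "extensions": ["mk"]},
    "C": {"rexprs": [], "patterns": [], "extensions": ["c", "h"]},
    "C++": {"rexprs": [], "patterns": [], "extensions": ["h", "cpp"]},
}


def candidate_languages(relative_path, maps=MAPS):
    """Return the languages whose maps match, testing map kinds in order."""
    base = posixpath.basename(relative_path)
    ext = posixpath.splitext(base)[1].lstrip(".")
    stages = (
        # 1. Relative-path regular expressions see the whole relative path.
        ("rexprs", lambda rx: re.search(rx, relative_path) is not None),
        # 2. File name patterns see only the base file name.
        ("patterns", lambda pat: fnmatch.fnmatchcase(base, pat)),
        # 3. File extensions see only the base file name's extension.
        ("extensions", lambda e: e == ext),
    )
    for key, test in stages:
        hits = [lang for lang, m in maps.items() if any(test(x) for x in m[key])]
        if hits:
            # Assumption: an earlier stage shadows the later ones.
            return hits
    return []


print(candidate_languages("src/lib/data.c"))
print(candidate_languages("include/x.h"))
```

With this table, `src/lib/data.c` is claimed by the regular expression before the `.c` extension is ever consulted, while `include/x.h` falls through to the extension stage and yields two candidates, the N:1 case that the heuristics are meant to resolve.
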

man/ctags-optlib.7.rst.in

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ readers should read ctags(1) of Universal Ctags first.
 Following options are for defining (or customizing) a parser:

 * ``--langdef=<name>``
-* ``--map-<LANG>=[+|-]<extension>|<pattern>``
+* ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 * ``--kinddef-<LANG>=<letter>,<name>,<description>``
 * ``--regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/[<flags>]``
 * ``--mline-regex-<LANG>=/<line_pattern>/<name_pattern>/<kind-spec>/{mgroup=<N>}[<flags>]``
@@ -103,7 +103,7 @@ Overview for defining a parser

 3. Give a file pattern or file extension for activating the parser

-Use ``--map-<LANG>=[+|-]<extension>|<pattern>``.
+Use ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``.

 4. Define kinds

man/ctags.1.rst.in

Lines changed: 74 additions & 19 deletions
@@ -499,26 +499,68 @@ Language Selection and Mapping Options
 Exuberant Ctags. See ctags-incompatibilities(7) for the background of
 this incompatible change.

-``--map-<LANG>=[+|-]<extension>|<pattern>``
+``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 This option provides the way to control mapping(s) of file names to
 languages in a more fine-grained way than ``--langmap`` option.

 In @CTAGS_NAME_EXECUTABLE@, more than one language can map to a
-file name *<pattern>* or file *<extension>* (*N:1 map*). Alternatively,
-``--langmap`` option handle only *1:1 map*, only one language
-mapping to one file name *<pattern>* or file *<extension>*. A typical N:1
-map is seen in C++ and ObjectiveC language; both languages have
-a map to ``.h`` as a file extension.
-
-A file extension is specified by preceding the extension with a period (e.g. ``.c``).
-A file name pattern is specified by enclosing the pattern in parentheses (e.g.
-``([Mm]akefile)``). A prefixed plus ('``+``') sign is for adding, and
+relative-path regular expression (*<rexpr>*), file name *<pattern>*, or
+file *<extension>* (*N:1 map*). Alternatively, ``--langmap``
+option handle only *1:1 map*, only one language mapping to one
+file name *<pattern>* or file *<extension>*. A typical N:1 map is
+seen in C++ and ObjectiveC language; both languages have a map to
+``.h`` as a file extension.
+
+A file extension is specified by preceding the extension with a period
+(e.g. ``.c``). A file name pattern is specified by enclosing the pattern in
+parentheses (e.g. ``([Mm]akefile)``). A relative-path regular expression is
+specified by enclosing the expressions in percent signs '``%``'
+(e.g. ``%include/.*\.h%``). To include a literal percent sign
+inside the regular expression, escape it as ``\%``.
+
+A prefixed plus ('``+``') sign is for adding, and
 minus ('``-``') is for removing. No prefix means replacing the map of *<LANG>*.

-Unlike ``--langmap``, *<extension>* (or *<pattern>*) is not a list.
-``--map-<LANG>`` takes one extension (or pattern). However,
-the option can be specified with different arguments multiple times
-in a command line.
+Unlike ``--langmap``, ``--map-<LANG>`` does not take a list; ``--map-<LANG>``
+takes one extension, one pattern, or one regular expression. However, the
+option can be specified with different arguments multiple times in a command
+line.
+
+For file extensions and file name patterns, the match is performed
+with a base file name, a file without any directory components.
+For relative-path regular expressions, the match is performed with
+a relative-path incorporating the directory components. A
+relative-path is relative to the directory where ctags launches.
+
+Assume your shell is in ``/project/x`` directory and you have the following
+source tree under the directory.
+
+.. code-block::
+
+src
+└── lib
+    ├── data.c
+    └── logic.c
+
+If you run @CTAGS_NAME_EXECUTABLE@ with ``@CTAGS_NAME_EXECUTABLE@ -R src``,
+the match is performed with ``src/lib/data.c`` and ``src/lib/logic.c`` If you
+give ``--map-YourParser='%src/lib/.*\.c%'``, @CTAGS_NAME_EXECUTABLE@
+chooses ``YourParser`` parser for processing ``data.c`` and ``logic.c`` in the
+tree.
+
+If your shell is in ``/project/x/src`` and you run
+``@CTAGS_NAME_EXECUTABLE@ -R lib``, @CTAGS_NAME_EXECUTABLE@ may not choose
+``YourParser`` because the match is performed with ``lib/data.c`` and
+``lib/logic.c``.
+
+A relative-path regular expression can take a flag controlling its testing.
+The flag comes after the last percent sign. Currently only one available flag:
+
+``{icase}`` (one-letter form '``i``')
+The regular expression is to be applied in a case-insensitive
+manner. (e.g. ``%include/.*\.h%i`` or ``%include/.*\.h%{icase}``
+
+The relative-path regular expression is available since version 6.3.0.

 .. _option_tags_file_contents:

@@ -1243,14 +1285,24 @@ Listing Options
 languages, and then exits.
 ``all`` is used as default value if the option argument is omitted.

-``--list-maps[=(<language>|all)]``
-Lists file name patterns and the file extensions which associate a file
+``--list-map-rexprs[=(<language>|all)]``
+Lists the relative-path regular expressions which associate a file
 name with a language for either the specified *<language>* or ``all``
 languages, and then exits.
 ``all`` is used as default value if the option argument is omitted.

-To list the file extensions or file name patterns individually, use
-``--list-map-extensions`` or ``--list-map-patterns`` option.
+(since version 6.3.0)
+
+``--list-maps[=(<language>|all)]``
+Lists the file name patterns, the file extensions, and the relative-path
+regular extensions which associate a file name with a language for either the
+specified *<language>* or ``all`` languages, and then exits.
+``all`` is used as default value if the option argument is omitted.
+
+To list the file extensions, file name patterns, or relative-path regular
+expressions individually, use ``--list-map-extensions``,
+``--list-map-patterns``, or ``--list-map-rexprs`` option.
+
 See the ``--langmap`` option, and "`Determining file language`_", above.

 This option does not work with ``--machinable`` nor
@@ -1507,10 +1559,13 @@ are mapped to C++, C and ObjectiveC. These mappings can cause
 issues. @CTAGS_NAME_EXECUTABLE@ tries to select the proper parser
 for the source file by applying heuristics to its content, however
 it is not perfect. In case of issues one can use ``--language-force=<language>``,
-``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>``
+``--langmap=<map>[,<map>[...]]``, or the ``--map-<LANG>=[+|-]<extension>|<pattern>|<rexpr>``
 options. (Some of the heuristics are applied whether ``--guess-language-eagerly``
 is given or not.)

+The order of testing is relative-path regular expressions (specified with
+``--map-<LANG>=<rexpr>``), file name patterns, then file extensions.
+
 .. TODO: all heuristics??? To be confirmed.

 Heuristically guessing
