| author | Tyge Løvset <[email protected]> | 2023-02-08 16:16:49 +0100 |
|---|---|---|
| committer | Tyge Løvset <[email protected]> | 2023-02-08 17:18:24 +0100 |
| commit | c4441f5fc665194fbd7a894a67a64a08c3beac42 (patch) | |
| tree | 82f231b6e8fcb75625166f98aa785baaa265a3d6 /docs/cregex_api.md | |
| parent | 673dd5319a488d4b702b94dd9aeda4e497ae4fbc (diff) | |
Changed to use lowercase flow-control macros in examples (uppercase will still be supported). Improved many examples to use c_make() to init containers.
Diffstat (limited to 'docs/cregex_api.md')
| -rw-r--r-- | docs/cregex_api.md | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/docs/cregex_api.md b/docs/cregex_api.md
index a115b4af..8cabb6fc 100644
--- a/docs/cregex_api.md
+++ b/docs/cregex_api.md
@@ -130,19 +130,19 @@ if (cregex_find_pattern(pattern, input, match, CREG_DEFAULT))
 To compile, use: `gcc first_match.c src/cregex.c src/utf8code.c`.
 In order to use a callback function in the replace call, see `examples/regex_replace.c`.
 
-### Iterate through regex matches, *c_FORMATCH*
+### Iterate through regex matches, *c_formatch*
 
 To iterate multiple matches in an input string, you may use
 ```c
 csview match[5] = {0};
 while (cregex_find(&re, input, match, CREG_M_NEXT) == CREG_OK)
-    c_FORRANGE (k, cregex_captures(&re))
+    c_forrange (k, cregex_captures(&re))
         printf("submatch %lld: %.*s\n", k, c_SVARG(match[k]));
 ```
 There is also a safe macro which simplifies this:
 ```c
-c_FORMATCH (it, &re, input)
-    c_FORRANGE (k, cregex_captures(&re))
+c_formatch (it, &re, input)
+    c_forrange (k, cregex_captures(&re))
         printf("submatch %lld: %.*s\n", k, c_SVARG(it.match[k]));
 ```
 
@@ -223,7 +223,7 @@ For reference, **cregex** uses the following files:
 ## Limitations
 
 The main goal of **cregex** is to be small and fast with limited but useful unicode support. In order to reach these goals, **cregex** currently does not support the following features (non-exhaustive list):
-- In order to limit table sizes, most general UTF8 character classes are missing, like \p{L}, \p{S}, and all specific scripts like \p{Greek} etc. Some/all of these may be added in the future as an alternative source file with unicode tables to link with.
+- In order to limit table sizes, most general UTF8 character classes are missing, like \p{L}, \p{S}, and most specific scripts like \p{Tibetan}. Some/all of these may be added in the future as an alternative source file with unicode tables to link with. Currently, only characters from from the Basic Multilingual Plane (BMP) are supported, which contains most commonly used characters (i.e. none of the "supplementary planes").
 - {n, m} syntax for repeating previous token min-max times.
 - Non-capturing groups
 - Lookaround and backreferences (cannot be implemented efficiently).
