author: Tyge Løvset <[email protected]> 2023-02-08 16:16:49 +0100
committer: Tyge Løvset <[email protected]> 2023-02-08 17:18:24 +0100
commit: c4441f5fc665194fbd7a894a67a64a08c3beac42 (patch)
tree: 82f231b6e8fcb75625166f98aa785baaa265a3d6 /docs/cregex_api.md
parent: 673dd5319a488d4b702b94dd9aeda4e497ae4fbc (diff)
Changed to use lowercase flow-control macros in examples (uppercase will still be supported). Improved many examples to use c_make() to init containers.
Diffstat (limited to 'docs/cregex_api.md')
-rw-r--r-- docs/cregex_api.md 10
1 files changed, 5 insertions, 5 deletions
diff --git a/docs/cregex_api.md b/docs/cregex_api.md
index a115b4af..8cabb6fc 100644
--- a/docs/cregex_api.md
+++ b/docs/cregex_api.md
@@ -130,19 +130,19 @@ if (cregex_find_pattern(pattern, input, match, CREG_DEFAULT))
To compile, use: `gcc first_match.c src/cregex.c src/utf8code.c`.
In order to use a callback function in the replace call, see `examples/regex_replace.c`.
-### Iterate through regex matches, *c_FORMATCH*
+### Iterate through regex matches, *c_formatch*
To iterate multiple matches in an input string, you may use
```c
csview match[5] = {0};
while (cregex_find(&re, input, match, CREG_M_NEXT) == CREG_OK)
-    c_FORRANGE (k, cregex_captures(&re))
+    c_forrange (k, cregex_captures(&re))
        printf("submatch %lld: %.*s\n", k, c_SVARG(match[k]));
```
There is also a safe macro which simplifies this:
```c
-c_FORMATCH (it, &re, input)
-    c_FORRANGE (k, cregex_captures(&re))
+c_formatch (it, &re, input)
+    c_forrange (k, cregex_captures(&re))
        printf("submatch %lld: %.*s\n", k, c_SVARG(it.match[k]));
```
@@ -223,7 +223,7 @@ For reference, **cregex** uses the following files:
## Limitations
The main goal of **cregex** is to be small and fast with limited but useful unicode support. In order to reach these goals, **cregex** currently does not support the following features (non-exhaustive list):
-- In order to limit table sizes, most general UTF8 character classes are missing, like \p{L}, \p{S}, and all specific scripts like \p{Greek} etc. Some/all of these may be added in the future as an alternative source file with unicode tables to link with.
+- In order to limit table sizes, most general UTF8 character classes are missing, like \p{L}, \p{S}, and most specific scripts like \p{Tibetan}. Some/all of these may be added in the future as an alternative source file with unicode tables to link with. Currently, only characters from the Basic Multilingual Plane (BMP), which contains the most commonly used characters, are supported (i.e. none of the "supplementary planes").
- {n, m} syntax for repeating previous token min-max times.
- Non-capturing groups
- Lookaround and backreferences (cannot be implemented efficiently).
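
The BMP restriction in the first limitation can be checked up front, before text is handed to **cregex**. The following is only a minimal sketch: `utf8_is_bmp_only` is a hypothetical helper, not part of **cregex** or STC, and it assumes the input is valid, NUL-terminated UTF-8. In valid UTF-8, code points above U+FFFF are always encoded as 4-byte sequences whose lead byte is in the range 0xF0..0xF4, so scanning for such a byte is enough.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of cregex/STC): returns true if the
 * NUL-terminated string contains only code points from the Basic
 * Multilingual Plane (U+0000..U+FFFF).
 * Assumption: the input is valid UTF-8. Under that assumption, any byte
 * >= 0xF0 is the lead byte of a 4-byte sequence, i.e. a code point from
 * one of the supplementary planes. */
bool utf8_is_bmp_only(const char *s)
{
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p >= 0xF0)
            return false;   /* supplementary-plane character found */
    }
    return true;            /* only 1-, 2- and 3-byte sequences seen */
}
```

A caller might use this to reject or pre-filter input containing, for example, emoji, rather than relying on how the regex engine handles characters outside the BMP.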