summaryrefslogtreecommitdiffhomepage
diff options
context:
space:
mode:
authorTyge Løvset <[email protected]>2020-06-21 20:57:01 +0200
committerGitHub <[email protected]>2020-06-21 20:57:01 +0200
commit0f6c0308983e63d8cb9fb4ce11945e00bd6945a7 (patch)
tree95cc05b039f701d7d565643f46538b0f75178699
parentd659ccdc85769a681f3981456659cb778b56fcf1 (diff)
downloadSTC-modified-0f6c0308983e63d8cb9fb4ce11945e00bd6945a7.tar.gz
STC-modified-0f6c0308983e63d8cb9fb4ce11945e00bd6945a7.zip
Update README.md
-rw-r--r--README.md65
1 files changed, 40 insertions, 25 deletions
diff --git a/README.md b/README.md
index d5ab26b1..0de7b921 100644
--- a/README.md
+++ b/README.md
@@ -7,13 +7,13 @@ Introduction
An elegant, modern, generic, customizable, typesafe, consistent, user-friendly, and very efficient standard container library for C99. This is a small headers only library with the most used container components, and a few algorithms:
- **cstring.h** - Compact and powerful **string** class.
- **cvector.h** - Dynamic generic **vector** class.
-- **chash.h** - Unordered **map** and **set**.
+- **chash.h** - Unordered **map** and **set**. Highly customizable and fast.
- **carray.h** - Multi-dimensional dynamic **array**
-- **clist.h** - A circular singly linked **list**, suited to be used as **queue**. Supports *pushBack, pushFront, and popFront*, as well as *splice* functions and (merge) *sorting*.
-- **coption.h** - Header-only implementation of **getopt_long**-like function, to parse command line arguments.
-- **crandom.h** - Header-only collection of efficent modern random number generators **xoroshiro128ss**, **sfc32/64** and Mersenne Twister **mt19937**. It also implements the crypto-strong **siphash** algorithm.
+- **clist.h** - A circular singly linked **list**, may be used as a **queue** (supports *pushBack, pushFront, and popFront*). Also contains various *splice* functions and (merge) *sorting*.
+- **coption.h** - Implementation of *getopt_long*-"like" function, *coption_get*, to parse command line arguments.
+- **crandom.h** - Collection of some efficent modern random number generators *xoroshiro128ss*, *sfc32/64* and Mersenne Twister *mt19937*. It also implements the crypto-strong *siphash* algorithm.
-The usage of containers is similar to c++ standard containers, so it should be easy for those who are familiar with that.
+The usage of the containers is similar to the C++ standard containers, so it should be easier for those who are familiar with them.
All containers mentioned above, except for CString are generic (similar to templates in C++). A simple example:
```
@@ -26,10 +26,23 @@ int main(void) {
cvector_i_destroy(&vec);
}
```
+Motivation
+----------
+
+The goal of this project was to finally have a library that fulfills the requirements for a "standard container library" for the C language. I believe that the following list covers the most imortant requirements, and that earlier attempts to create such a library has failed because they did not meet several of those:
+- Easy to use, intuitive naming and consistency across the library.
+- Type safe. Minimized usage of casting and void* pointers.
+- Highly efficient. Both in speed and memory usage.
+- Customizable without loosing efficiency. E.g. inline replacable compare, hash, allocation functions per container type instantiation.
+- Relative small code base size and easy to install, deploy and maintain.
+- Avoid bloat. Should not try to cover all thinkable functions, but limit itself to the most useful and commonly used.
+
+That said, this library is far from complete or free of possible bugs, but I believe it is a good foundation.
+
Installation
------------
-Because it is headers only, files can simply be included in your program. The functions will be inlined by default. If containers are extensively used accross many files with the same instantiated type, it is recommended to build as a library to minimize executable size. In this case, specify **-DSTC_HEADER** to the compiler, and put all the instantiations of the containers used in one C file, e.g.
+Because it is headers only, files can simply be included in your program. The functions will be inlined by default. If containers are extensively used accross many tranlation units with common instantiated container types, it is recommended to build as a library, to minimize executable size. In this case, specify **-DSTC_HEADER** as compiler argument, and put all the instantiations of the containers used in one C file, e.g.
```
#define STC_IMPLEMENTATION
#include <stc/cvector.h>
@@ -43,13 +56,13 @@ declare_CHash(64, set, int64_t);
Performance
-----------
-The library is very efficent. The containers have templated "intrusive"/in-place elements. Possibly the most speed critical is the **CHash map / CHash set** implementation. This is among the fastest of C and C++ map implementations: benchmark.c compiled with g++ v9.2.0 -O3 on windows (results are similar with Visual Studio or g++ on linux):
+This library is very efficent. Containers have templated intrusive elements. One of the most performance critical containers is the **CHash map / CHash set**. Thankfully, is it among the fastest C / C++ map implementations: **examples/benchmark.c** compiled with g++ v9.2.0 -O3 on windows (the results are similar with VC++ and g++ on linux):
**CMAP=this**, KMAP=khash, UMAP=std::unordered_map, BMAP=ska::bytell_hash_map, FMAP=ska::flat_hash_map, RMAP=robin_hood::unordered_map
```
Random keys are in range [0, 2^20):
map<uint64_t, uint64_t>: 7000000 repeats of Insert random key + (try to) remove a different random key:
-CMAP(ii): sz: 523938, bucks: 1013337, time: 0.39, sum: 24500003500000, erase: 3237392 (fastest)
+CMAP(ii): sz: 523938, bucks: 1013337, time: 0.39, sum: 24500003500000, erase: 3237392 --> fastest
KMAP(ii): sz: 523938, bucks: 2097152, time: 0.46, sum: 24500003500000, erase: 3237392
UMAP(ii): sz: 523938, bucks: 1056323, time: 2.21, sum: 24500003500000, erase: 3237392
BMAP(ii): sz: 523938, bucks: 1048576, time: 0.46, sum: 24500003500000, erase: 3237392
@@ -57,7 +70,7 @@ FMAP(ii): sz: 523938, bucks: 1048576, time: 0.43, sum: 24500003500000, erase: 32
RMAP(ii): sz: 523938, bucks: 838860, time: 0.82, sum: 24500003500000, erase: 3237392
map<uint64_t, uint64_t>: Insert 10000000 sequensial keys, then remove them in same order:
-CMAP(ii): sz: 0, bucks: 17001171, time: 0.75, erase 10000000 (second)
+CMAP(ii): sz: 0, bucks: 17001171, time: 0.75, erase 10000000 --> second fastest
KMAP(ii): sz: 0, bucks: 16777216, time: 0.48, erase 10000000
UMAP(ii): sz: 0, bucks: 17961079, time: 1.04, erase 10000000
BMAP(ii): sz: 0, bucks: 16777216, time: 1.04, erase 10000000
@@ -65,7 +78,7 @@ FMAP(ii): sz: 0, bucks: 16777216, time: 0.94, erase 10000000
RMAP(ii): sz: 0, bucks: 13421772, time: 0.84, erase 10000000
map<uint64_t, uint64_t>: Insert 10000000 random keys, then remove them in same order:
-CMAP(ii): sz: 0, bucks: 1621347, time: 0.41, erase 1048490 (fastest)
+CMAP(ii): sz: 0, bucks: 1621347, time: 0.41, erase 1048490 --> fastest
KMAP(ii): sz: 0, bucks: 2097152, time: 0.77, erase 1048490
UMAP(ii): sz: 0, bucks: 2144977, time: 1.67, erase 1048490
BMAP(ii): sz: 0, bucks: 2097152, time: 0.52, erase 1048490
@@ -75,11 +88,11 @@ RMAP(ii): sz: 0, bucks: 1677721, time: 0.65, erase 1048490
Memory efficiency
-----------------
-Near optimal memory usage for all the containers. The circular list is intrusive so only one allocation is needed for each node, however a custom allocator and various techniques could improve the linked lists memory usage.
-- **CString**, **CVector**: one pointer for representation. Heap allocation holds size and capacity.
-- **CList**: one pointer for representation. Each node allocates block storing value and next pointer.
-- **CHash set**: Representation size of 4 pointers. One array of Key per bucket, one array of one byte per buckets.
-- **CHash map**: Representation size of 4 pointers. One array of (Key, Value) per bucket, one array of one byte per buckets.
+The containers are memory efficent. E.g. the circular list is intrusive so only one allocation is needed for each node, however a custom allocator and various techniques could improve the linked lists memory usage.
+- **CString**, **CVector**: Representaion: one pointer size. The size and capacity is stored as part of the heap allocation that also holds the vector elements.
+- **CList**: Representaion: one pointer size. Each node allocates block storing value and next pointer.
+- **CHash set**: Representation: 4 pointers size. The hash table stores a key per bucket, and one table of "used/hash-value", occupying only one byte per bucket.
+- **CHash map**: Same as CHash set, but each bucket in the array stores a (key, value) pair, not only the key.
Usage by examples
-----------------
@@ -89,25 +102,27 @@ Usage by examples
#include <stc/cstring.h>
int main() {
- CString cs = cstring_make("one-nine-three-seven-five");
+ CString s1 = cstring_makeFmt("%10.g", 123.0 / 13.0);
+ CString s2 = cstring_make("one-nine-three-seven-five");
printf("%s.\n", cs.str);
- cstring_insert(&cs, 3, "-two");
+ cstring_insert(&s2, 3, "-two");
printf("%s.\n", cs.str);
- cstring_erase(&cs, 7, 5); // -nine
+ cstring_erase(&s2, 7, 5); // -nine
printf("%s.\n", cs.str);
- cstring_replace(&cs, 0, "seven", "four");
- printf("%s.\n", cs.str);
- printf("find: %s\n", cs.str + cstring_find(cs, 0, "four"));
+ cstring_replace(&s2, 0, "seven", "four");
+ printf("%s.\n", s2.str);
+ printf("find: %s\n", s2.str + cstring_find(s2, 0, "four"));
// reassign:
- cstring_assign(&cs, "one two three four five six seven");
- cstring_append(&cs, " eight");
- printf("append: %s\n", cs.str);
+ cstring_assign(&s2, "one two three four five six seven");
+ cstring_append(&s2, " eight");
+ printf("append: %s\n", s2.str);
- cstring_destroy(&cs);
+ cstring_destroy(&s1);
+ cstring_destroy(&s2);
}
```
**CVector** of *int64_t*