path: root/src
AgeCommit message (Collapse)Author
2020-11-10Reduce memory usage of Hash objectKOBAYASHI Shuji
## Implementation Summary

* Change entry list from segmented list to flat array.
* Change value of hash bucket from pointer to entry to index of entry list, and represent it by variable length bits according to capacity of hash buckets.
* Store management information about entry list and hash table to `struct RHash` as much as possible.

## Benchmark Summary

Only the results of typical situations on 64-bit Word-boxing are present here. For more detailed information, including consideration, see below (although most of the body is written in Japanese).

* https://shuujii.github.io/mruby-hash-benchmark

### Memory Usage

Lower value is better.

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |          344B |          256B |   0.74419x |
|        40 |        1,464B |          840B |   0.57377x |
|       200 |        8,056B |        3,784B |   0.46971x |
|       500 |       17,169B |        9,944B |   0.57949x |

### Performance

Higher value is better.

#### `mrb_hash_set`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |  1.41847M i/s |  1.36004M i/s |   0.95881x |
|        40 |  0.39224M i/s |  0.31888M i/s |   0.81296x |
|       200 |  0.03780M i/s |  0.04290M i/s |   1.13494x |
|       500 |  0.01225M i/s |  0.01314M i/s |   1.07275x |

#### `mrb_hash_get`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 | 26.05920M i/s | 30.19543M i/s |   1.15872x |
|        40 | 44.26420M i/s | 32.75781M i/s |   0.74005x |
|       200 | 44.55171M i/s | 31.56926M i/s |   0.70860x |
|       500 | 39.19250M i/s | 29.73806M i/s |   0.75877x |

#### `mrb_hash_each`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 | 25.11964M i/s | 30.34167M i/s |   1.20789x |
|        40 | 11.74253M i/s | 13.25539M i/s |   1.12884x |
|       200 |  2.01133M i/s |  2.97214M i/s |   1.47770x |
|       500 |  0.87411M i/s |  1.21178M i/s |   1.38631x |

#### `Hash#[]=`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |  0.50095M i/s |  0.56490M i/s |   1.12764x |
|        40 |  0.19132M i/s |  0.18392M i/s |   0.96129x |
|       200 |  0.03624M i/s |  0.03256M i/s |   0.89860x |
|       500 |  0.01527M i/s |  0.01236M i/s |   0.80935x |

#### `Hash#[]`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 | 11.53211M i/s | 12.78806M i/s |   1.10891x |
|        40 | 15.26920M i/s | 13.37529M i/s |   0.87596x |
|       200 | 15.28550M i/s | 13.36410M i/s |   0.87430x |
|       500 | 14.57695M i/s | 12.75388M i/s |   0.87494x |

#### `Hash#each`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |  0.30462M i/s |  0.27080M i/s |   0.88898x |
|        40 |  0.12912M i/s |  0.11704M i/s |   0.90642x |
|       200 |  0.02638M i/s |  0.02402M i/s |   0.91071x |
|       500 |  0.01066M i/s |  0.00959M i/s |   0.89953x |

#### `Hash#delete`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |  7.84167M i/s |  6.96419M i/s |   0.88810x |
|        40 |  6.91292M i/s |  7.41427M i/s |   1.07252x |
|       200 |  3.75952M i/s |  7.32080M i/s |   1.94727x |
|       500 |  2.10754M i/s |  7.05963M i/s |   3.34970x |

#### `Hash#shift`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 | 14.66444M i/s | 13.18876M i/s |   0.89937x |
|        40 | 11.95124M i/s | 11.10420M i/s |   0.92913x |
|       200 |  5.53681M i/s |  7.88155M i/s |   1.42348x |
|       500 |  2.96728M i/s |  5.40405M i/s |   1.82121x |

#### `Hash#dup`

| Hash Size |      Baseline |           New |     Factor |
|----------:|--------------:|--------------:|-----------:|
|        16 |  0.15063M i/s |  5.37889M i/s |  35.71024x |
|        40 |  0.06515M i/s |  3.38196M i/s |  51.91279x |
|       200 |  0.01359M i/s |  1.46538M i/s | 107.84056x |
|       500 |  0.00559M i/s |  0.75411M i/s | 134.88057x |

### Binary Size

Lower value is better.

| File       |      Baseline |           New |    Factor |
|:-----------|--------------:|--------------:|----------:|
| mruby      |      730,408B |      734,176B |  1.00519x |
| libmruby.a |    1,068,134B |    1,072,846B |  1.00441x |

## Other Fixes

The following issues have also been fixed in the parts where there was some change this time.

* [Heap use-after-free in `Hash#value?`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-heap-use-after-free-in-hash-value-md)
* [Heap use-after-free in `ht_hash_equal`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-heap-use-after-free-in-ht_hash_equal-md)
* [Heap use-after-free in `ht_hash_func`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-heap-use-after-free-in-ht_hash_func-md)
* [Heap use-after-free in `mrb_hash_merge`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-heap-use-after-free-in-mrb_hash_merge-md)
* [Self-replacement does not work for `Hash#replace`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-self-replacement-does-not-work-for-hash-replace-md)
* [Repeated deletes and inserts increase memory usage of `Hash`](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-repeated-deletes-and-inserts-increase-memory-usage-of-hash-md)
* [`Hash#rehash` does not reindex completely](https://gist.github.com/shuujii/30e4fcd5844a4112a0ecd4a5b3483101#file-hash-rehash-does-not-reindex-completely-md)
* `mrb_hash_delete_key` does not cause an error for frozen object
* `mrb_hash_new_capa` does not allocate required space first
* [`mrb_os_memsize_of_hash_table` result is incorrect](https://github.com/mruby/mruby/pull/5032#discussion_r457994075)
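The second item of the implementation summary (bucket values stored as entry indices, in a number of bits that depends on the bucket capacity) can be illustrated with a small bit-packed array. This is only a sketch of the general technique, not the code from this commit; every name below (`bits_for_capacity`, `packed_set`, `packed_get`, `packed_bytes`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to store any index in [0, capa).
   Hypothetical helper; the commit's own computation may differ. */
static unsigned bits_for_capacity(uint32_t capa)
{
  unsigned bits = 1;
  while (bits < 32 && ((uint32_t)1 << bits) < capa) bits++;
  return bits;
}

/* Bytes needed to hold `n` slots of `width` bits each. */
static size_t packed_bytes(uint32_t n, unsigned width)
{
  return (size_t)(((uint64_t)n * width + 7) / 8);
}

/* Store `value` in slot `i` of a bit-packed array with `width`-bit slots. */
static void packed_set(uint8_t *buf, unsigned width, uint32_t i, uint32_t value)
{
  uint64_t bitpos = (uint64_t)i * width;
  assert(width >= 1 && width <= 32);
  for (unsigned b = 0; b < width; b++, bitpos++) {
    uint8_t mask = (uint8_t)(1u << (bitpos % 8));
    if ((value >> b) & 1u) buf[bitpos / 8] |= mask;
    else                   buf[bitpos / 8] &= (uint8_t)~mask;
  }
}

/* Read back the value stored in slot `i`. */
static uint32_t packed_get(const uint8_t *buf, unsigned width, uint32_t i)
{
  uint64_t bitpos = (uint64_t)i * width;
  uint32_t value = 0;
  assert(width >= 1 && width <= 32);
  for (unsigned b = 0; b < width; b++, bitpos++) {
    if (buf[bitpos / 8] & (1u << (bitpos % 8))) value |= (uint32_t)1 << b;
  }
  return value;
}
```

With 16 buckets each index needs 4 bits, so the whole bucket table fits in 8 bytes, whereas sixteen 8-byte pointers would take 128 bytes.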
2020-11-06Skip too big left shift in `flo_shift()`.Yukihiro "Matz" Matsumoto
2020-11-06Avoid negating `MRB_INT_MIN` which is impossible.Yukihiro "Matz" Matsumoto
2020-11-06Fix wrong integer casting.Yukihiro "Matz" Matsumoto
2020-11-05Fix a bug with printing `(null)` local variable name for a register.Yukihiro "Matz" Matsumoto
2020-11-04Add a new instruction `OP_LOADI32`.Yukihiro "Matz" Matsumoto
It loads a 32-bit integer, bypassing pool access.
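To illustrate what "bypassing pool access" means, here is a minimal sketch of an instruction that carries its 32-bit operand directly in the bytecode instead of an index into a literal pool. The opcode number, the byte layout, and the function name `sketch_loadi32` are assumptions made for the example, not mruby's actual encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode number, for illustration only. */
enum { SKETCH_OP_LOADI32 = 0x01 };

/* Decode one "load 32-bit immediate" instruction starting at `pc`.
   Assumed layout: [opcode][register][imm, 4 bytes, big-endian].
   Returns the address of the next instruction. */
static const uint8_t *sketch_loadi32(const uint8_t *pc, int32_t *regs)
{
  uint8_t a;
  uint32_t imm;
  assert(pc[0] == SKETCH_OP_LOADI32);
  a = pc[1];
  imm = ((uint32_t)pc[2] << 24) | ((uint32_t)pc[3] << 16) |
        ((uint32_t)pc[4] << 8)  |  (uint32_t)pc[5];
  /* The value comes straight from the instruction stream; no pool lookup. */
  regs[a] = (int32_t)imm;
  return pc + 6;
}
```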
2020-11-03Add new instructions to handle symbols/literals >255; fix #5109Yukihiro "Matz" Matsumoto
New instructions:

* OP_LOADL16
* OP_LOADSYM16
* OP_STRING16

The sizes of the pool and symbol tables are `int16_t`, but offsets in the bytecode were represented in 8 bits. The size of the child `irep` array is `int16_t` too, but this change does not address it.
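The mismatch described above (16-bit table sizes, 8-bit operands) is why a wider instruction variant is needed once an index exceeds 255. The following sketch shows the idea with a hypothetical emitter; the opcode numbers and the name `sketch_emit_loadsym` are invented for the example and do not reflect mruby's code generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode numbers; the real values are not given in the log. */
enum { SKETCH_OP_LOADSYM = 0x10, SKETCH_OP_LOADSYM16 = 0x11 };

/* Emit "load symbol number idx into register a".
   Uses the 8-bit form when the index fits in one byte, and the
   16-bit form (big-endian operand) otherwise.
   Returns the number of bytes written to `out`. */
static size_t sketch_emit_loadsym(uint8_t *out, uint8_t a, uint16_t idx)
{
  assert(out != NULL);
  if (idx <= 0xFF) {
    out[0] = SKETCH_OP_LOADSYM;
    out[1] = a;
    out[2] = (uint8_t)idx;
    return 3;
  }
  out[0] = SKETCH_OP_LOADSYM16;
  out[1] = a;
  out[2] = (uint8_t)(idx >> 8);
  out[3] = (uint8_t)(idx & 0xFF);
  return 4;
}
```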
2020-11-02format '%p' expects argument of type 'void *'; #5107Yukihiro "Matz" Matsumoto
2020-11-02Make Ranges frozen as Ruby3.0.Yukihiro "Matz" Matsumoto
2020-10-29Merge pull request #5102 from dearblue/c++-excYukihiro "Matz" Matsumoto
Fixed build with `conf.enable_cxx_exception`
2020-10-28Fix `mrb_obj_id` to `Float`KOBAYASHI Shuji
2020-10-28Fixed build with `conf.enable_cxx_exception`dearblue
The problem was manifested by commit 5069fb15e41998dffef8e0ba566b3a82be369ba3.
2020-10-24Reorganize `env_new()` as `mrb_env_new()`dearblue
The `mrb_env_new()` function is a global function, but it is still treated as an internal function.
2020-10-23Merge pull request #5099 from dearblue/getargs-arrayYukihiro "Matz" Matsumoto
Prohibit array changes by "a"/"*" specifier of `mrb_get_args()`
2020-10-22Prohibit array changes by `mrb_get_argv()`dearblue
The `mrb_get_argv()` function will now return `const mrb_value *`. This is because it is difficult for the caller to check if it is a splat argument (array object) and to write-barrier if necessary.
2020-10-22Prohibit array changes by "a"/"*" specifier of `mrb_get_args()`dearblue
The "a"/"*" specifier of the `mrb_get_args()` function will now return `const mrb_value *`. This is because it is difficult for the caller to check if it is an array object and write-barrier if necessary. And it requires calling `mrb_ary_modify()` on the unmodified array object, which is also difficult (this is similar to #5087).
2020-10-16Add startless range (another part of #5085)Yukihiro "Matz" Matsumoto
Ref #5093; close #5085
2020-10-16Remove uninitialized local variable warning.Yukihiro "Matz" Matsumoto
Fix for #5093
2020-10-15Merge branch 'work_for_merge' of https://github.com/zubycz/mruby into zubycz-work_for_mergeYukihiro "Matz" Matsumoto
2020-10-15Fix out of bound access in `parse.y`.Yukihiro "Matz" Matsumoto
2020-10-14Add indent to `lv` in the C dump.Yukihiro "Matz" Matsumoto
2020-10-13Introduce endless range (a part of #5085)taiyoslime
Co-Authored-By: n4o847 <[email protected]>
Co-Authored-By: smallkirby <[email protected]>
2020-10-12Revert "Add a new function `mrb_exc_protect()`."Yukihiro "Matz" Matsumoto
This reverts commit 8746a6fe4e7bda8a0fbc0eaece9314ec51a0c255. We already have `mrb_protect()`, `mrb_ensure()` and `mrb_rescue()` functions. If you need to handle exceptions from C functions, use those functions above.
2020-10-12Add a new function `mrb_exc_protect()`.Yukihiro "Matz" Matsumoto
`mrb_exc_protect()` takes two C functions: `body`, which is executed first, and `resc`, which is executed when an error happens during `body` execution. Since `mrb_exc_protect()` should be compiled with the proper compiler, we will not see problems like #5088, which was caused by mixing `setjmp()` and `throw`.
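The `body`/`resc` pattern described here can be sketched in plain C with `setjmp()`/`longjmp()`. This is not the implementation of `mrb_exc_protect()`; the names `sketch_protect`, `sketch_raise`, and `sketch_fn`, as well as the callback signature, are hypothetical and chosen only to show the control flow.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* All names here are hypothetical; this is not mruby's code. */
typedef int (*sketch_fn)(void *data);

/* Innermost active handler, or NULL when none is installed. */
static jmp_buf *sketch_handler = NULL;

/* Signal an error: unwind to the innermost sketch_protect() call. */
static void sketch_raise(void)
{
  assert(sketch_handler != NULL);
  longjmp(*sketch_handler, 1);
}

/* Run body(data); if it raises, run resc(data) and return its result. */
static int sketch_protect(sketch_fn body, sketch_fn resc, void *data)
{
  jmp_buf buf;
  jmp_buf *prev = sketch_handler;  /* not modified after setjmp() */
  int result;

  sketch_handler = &buf;
  if (setjmp(buf) == 0) {
    result = body(data);           /* normal path */
    sketch_handler = prev;
    return result;
  }
  sketch_handler = prev;           /* error path: restore, then rescue */
  return resc(data);
}

/* Small callbacks used to exercise the sketch. */
static int ok_body(void *data)      { return *(int *)data; }
static int failing_body(void *data) { (void)data; sketch_raise(); return 0; }
static int rescue(void *data)       { (void)data; return -1; }
```

Calling `sketch_protect(ok_body, rescue, &x)` returns the value of `x`, while `sketch_protect(failing_body, rescue, &x)` returns `-1` from the rescue callback.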
2020-10-12No need to get the `irep` record size twice.Yukihiro "Matz" Matsumoto
2020-10-12Update `MRB_FLOAT_FMT` to always use double precision.Yukihiro "Matz" Matsumoto
2020-10-12Remove the length of `Float` pool from the binary dump.Yukihiro "Matz" Matsumoto
Also fixed the size calculation of the `irep` dump, which could cause memory corruption.
2020-10-12Remove `DEBUG_ONLY_EXPR()` from `CHECKPOINT_*` macros; ref #5060Yukihiro "Matz" Matsumoto
To allow C++ compilation. Fix suggested by @dearblue.
2020-10-12Unify `mrb_str_to_str` to `mrb_obj_as_string`.Yukihiro "Matz" Matsumoto
Redirect `mrb_str_to_str` to `mrb_obj_as_string` via C macro. Inspired by #5082
2020-10-12Dump/load 16 bits for `ilen` and `slen` in `irep`.Yukihiro "Matz" Matsumoto
Those types are `uint16_t` in definition. Also we no longer need padding for `iseq`.
2020-10-12Should use `PRId32` to dump `.i32`; ref #5084Yukihiro "Matz" Matsumoto
The fix was proposed by @dearblue
2020-10-12Use `NULL` instead of `0`; close #2467Yukihiro "Matz" Matsumoto
The PR was from @cubicdaiya.
2020-10-12Restore old function names for compatibility; ref #5070Yukihiro "Matz" Matsumoto
- `mrb_check_intern()` to return `mrb_value`
- `mrb_intern_check()` to return `mrb_sym` [NEW]

Other new functions:

- `mrb_intern_check_cstr()`
- `mrb_intern_check_str()`
2020-10-12Restore old function names for compatibility; fix #5070Yukihiro "Matz" Matsumoto
Rename new functions:

- `mrb_convert_type(mrb,val,type,tname,method)` => `mrb_type_convert(mrb,val,type,tname,method)`
- `mrb_check_convert_type(mrb,val,type,tname,method)` => `mrb_type_convert_check(mrb,val,type,tname,method)`

Old names are defined by macros (support `tname` drop and `char*` => `mrb_sym` conversion).
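A compatibility macro of the kind mentioned above can be sketched as follows: the old name keeps its old argument list, drops the unused type-name argument, and converts the C string to a symbol before calling the new function. All identifiers (`sketch_sym`, `sketch_intern`, `sketch_type_convert`, `sketch_convert_type`) and the toy interning function are hypothetical stand-ins, not the real mruby definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for a symbol type and an interning function. */
typedef unsigned sketch_sym;

static sketch_sym sketch_intern(const char *name)
{
  sketch_sym h = 0;
  assert(name != 0);
  while (*name) h = h * 31u + (unsigned char)*name++;  /* toy hash */
  return h;
}

/* New-style function: takes the conversion method as a symbol. */
static sketch_sym last_method;  /* records the most recently requested method */

static int sketch_type_convert(int val, sketch_sym method)
{
  last_method = method;
  return val;
}

/* Old name kept as a macro: `tname` is accepted but dropped, and the
   `char*` method name is converted to a symbol. */
#define sketch_convert_type(val, tname, method) \
  sketch_type_convert((val), sketch_intern(method))
```

Existing callers that still pass a type name and a C string keep compiling, while new code can call the symbol-based function directly.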
2020-10-12Fix warning from VC regarding implicit int conversion.Yukihiro "Matz" Matsumoto
2020-10-12Avoid `unsigned int`; Use `mrb_int` instead.Yukihiro "Matz" Matsumoto
2020-10-12Use `mrb_int` extensively instead of `int`.Yukihiro "Matz" Matsumoto
The mixture causes warnings on 64 bit Windows (VC).
2020-10-12Use `goto` to avoid problems with `DIRECT_THREADED`.Yukihiro "Matz" Matsumoto
You can now use `NEXT` within a `switch` statement, as in 7c087eb.
2020-10-12Extract `div` code in VM and make them shared by `div` methods.Yukihiro "Matz" Matsumoto
2020-10-12Don't use `NEXT` within `switch` statement.Yukihiro "Matz" Matsumoto
In environments that are not `gcc`-compatible, `NEXT` is translated to `break`.
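The pitfall is that a `break` inside a nested `switch` only leaves the inner `switch`, so the rest of the opcode handler still runs. The sketch below reproduces this with a toy one-instruction dispatch loop; the function names and the loop shape are hypothetical and much simpler than the real VM.

```c
#include <assert.h>

/* NEXT expressed as `break`: inside the nested switch it only exits the
   inner switch, so the tail of the handler still runs. */
static int run_with_break(int selector)
{
  int tail_runs = 0;
  int op = 0;                     /* toy opcode */
  for (int pc = 0; pc < 1; pc++) {
    switch (op) {
    case 0:
      switch (selector) {
      case 1:
        break;                    /* meant "next instruction", but is not */
      default:
        break;
      }
      tail_runs++;                /* reached even when selector == 1 */
      break;
    }
  }
  return tail_runs;
}

/* NEXT expressed as `goto`: leaves both switches at once. */
static int run_with_goto(int selector)
{
  int tail_runs = 0;
  int op = 0;
  for (int pc = 0; pc < 1; pc++) {
    switch (op) {
    case 0:
      switch (selector) {
      case 1:
        goto next;                /* skips the tail as intended */
      default:
        break;
      }
      tail_runs++;
      break;
    }
  next:;
  }
  return tail_runs;
}
```

With `selector == 1`, `run_with_break` still executes the tail once, while `run_with_goto` skips it.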
2020-10-12Better malloc_trim define nameRory OConnell
2020-10-12Add call to malloc_trim on a full GCRory OConnell
2020-10-12Remove obsolete `MRB_WITHOUT_FLOAT` macro from `numeric.c`.Yukihiro "Matz" Matsumoto
2020-10-12Change some `int` variables to `mrb_int`.Yukihiro "Matz" Matsumoto
To silence some warnings. This change cancels part of 7ef3604134.
2020-10-12Update `mrb_get_args()` keyword argument support [incompatible]Yukihiro "Matz" Matsumoto
* `mrb_kwargs` structure reordered (`values` and `rest` come last)
* take symbols instead of C `char*`
2020-10-12Separate jump destination check in `OP_R_RETURN`.Yukihiro "Matz" Matsumoto
In the past code, the current `callinfo (ci)` was modified, so it was possible to pop `ci` beyond the `cibase`, which could cause out-of-bounds memory access for code like the following:

```ruby
def m2
  lambda {
    Proc.new {
      return :return # return from the method
    }
  }.call.call
  :never_reached
end

p m2
```
2020-10-12Make the scope of `const struct RProc *dst` narrower.Yukihiro "Matz" Matsumoto
2020-10-12Redefine `CHECKPOINT_*` macros.Yukihiro "Matz" Matsumoto
By definition, `mrb_assert()` is called only when `MRB_DEBUG` is defined too. But I wanted to make clear that the local variable `current_checkpoint_tag` is only accessed when `MRB_DEBUG` is set, by wrapping it with `DEBUG_ONLY_EXPR()`.
2020-10-12Include `mruby/endian.h` only when `MRB_NO_FLOAT` is undefined.Yukihiro "Matz" Matsumoto
2020-10-12Abandon packing all lower case symbols with 6 characters.Yukihiro "Matz" Matsumoto
To make packed inline symbols fit within 31 bits, because the new method hash tables allow only 31 bits for symbols. They use the top 1 bit to mark unused slots.
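As a rough illustration of how a short name can be packed into 31 bits, the sketch below stores up to five characters at 6 bits each (30 bits) plus a 1-bit tag, leaving the top bit of a 32-bit word free. The alphabet, the bit layout, and the name `sketch_pack` are assumptions for the example; the actual packing scheme in mruby may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 63-character alphabet; code 0 is reserved for "no character". */
static const char sketch_alphabet[] =
  "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

/* Pack up to 5 characters at 6 bits each, plus a 1-bit "inline" tag in bit 0.
   Returns 0 if the name cannot be packed.
   The highest bit used is bit 30, so the result always fits in 31 bits. */
static uint32_t sketch_pack(const char *name)
{
  size_t len;
  uint32_t packed = 0;
  assert(name != NULL);
  len = strlen(name);
  if (len == 0 || len > 5) return 0;
  for (size_t i = 0; i < len; i++) {
    const char *p = strchr(sketch_alphabet, name[i]);
    uint32_t code;
    if (p == NULL) return 0;                       /* character not packable */
    code = (uint32_t)(p - sketch_alphabet) + 1;    /* 1..63, fits in 6 bits */
    packed |= code << (1 + 6 * i);
  }
  return packed | 1u;
}
```

Because bit 31 is never set by this layout, it stays available for another purpose, such as marking unused slots in a table.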