Discussion:
[erlang-patches] Improving ETS performances
unknown
2013-09-10 10:22:35 UTC
Permalink
G'day people,
I would like to share with you an "improvement" patch, in the hope
that some day it will be included in the official releases.
For this is going to be a long e-mail, I have split it into two section,
a brief one for those in a hurry and a detailed one (an EEPlight) for
those who want/need to know more.

NOTE: This e-mail contain some ASCII graphics, please read it using a
constant width font.

BRIEF SECTION

The patch targets ETS tables of the 'bag' and 'duplicate_bag' types,
improving performances of deletion by objects (i.e., ets:delete_object/2
not ets:delete/2) and insertion when working with many object sharing
the same key.

The patch is available at:

git fetch https://github.com/fltt/otp.git ets_nested_lht

https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht.patch

To get a feel of the performance gain you can run the following snippet
of code both before and after applying the patch:


-module(ets_hash_test).
-export([test2/1]).

write_loop(N, Tab)
when N > 0 ->
ok=mnesia:activity(sync_dirty,
fun() ->
mnesia:write(Tab, {Tab, N, erlang:integer_to_list(N), N rem 10}, write)
end,
mnesia_frag),
write_loop(N-1, Tab);
write_loop(_, _) ->
ok.

verify_loop(N, Tab)
when N > 0 ->
Expected={Tab, N, erlang:integer_to_list(N), N rem 10},
[Expected]=
mnesia:activity(sync_dirty,
fun() ->
mnesia:read(Tab, N, read)
end,
mnesia_frag),
verify_loop(N-1, Tab);
verify_loop(_, _) ->
ok.

test2(TabSize) ->
TabName=frag_test,
ThisNode=erlang:node(),
ok=mnesia:start(),
{atomic, ok}=
mnesia:create_table(TabName, [{type, set},
{attributes, [key, datum1, datum2]},
{frag_properties, [{n_fragments, 1},
{node_pool, [ThisNode]}]}]),
ok=write_loop(TabSize, TabName),
{Time, {atomic, ok}}=
timer:tc(fun() ->
mnesia:change_table_frag(TabName, {add_frag, [ThisNode]})
end),
ok=verify_loop(TabSize, TabName),
stopped=mnesia:stop(),
{ok, Time}.


On my own PC I obtain the following results:

Before:
test2(10000) -> {ok, 1691416}
test2(100000) -> {ok, 538015238}

After:
test2(10000) -> {ok, 55239}
test2(100000) -> {ok, 745721}

A last note: THIS IS EXPERIMENTAL CODE, USE IT AT YOUR OWN RISK.
In other words, this patch is still a work in progress and not
production ready.
For more details read on.

DETAILED SECTION (EPPlight)

Why this new feature?

First of all this is not a new feature -- from the point of view of the
developer there are no new modules, functions, language constructs nor
options.

It all started by the need to dynamically add fragments to a NON-EMPTY
fragmented table (using the mnesia_frag access module).
As you know, a fragment with as little as 100000 takes far more than
tollerable to be split.

Analyzing mnesia's source code, it came out that the culprit is the
temporary storage of all the write operations under the same key in an
ETS table of the bag type.
ETS of the set, bag and duplicate_bag types use linear hashing to store
objects, thus inserting objects _with different keys_ in an ETS has O(1)
time complexity.
However, if all the objects inserted have the same key then time
complexity changes to O(k), where k is the number of objects sharing the
same key.
This is so for, in the current implementation, objects with the same
keys are all put in a simply linked list and when looking for duplicates
the whole list must be traversed.

Risks/uncertain artifacts

None, as far I can tell _now_.
However this patch is still experimental and I have tested it only under
Linux (ArchLinux).
Moreover, it should be optimized to reduce memory consuption, and some
code clean-up wouldn't harm.

How did I solve it?

Nested linear hash tables. But first some internal data structures must
be reorganized.
Reorganization is achieved by use of "bidimensional lists", i.e.,
something like this:

NULL NULL NULL
^ kprevious ^ kprevious ^
kprevious | | |
----------+ +------+ next +------+ next +------+ next
bucket[i] |-->| Obj1 |------->| Obj3 |------->| Obj4 |----->NULL
----------+ +------+ +------+ +------+
^ | knext knext | ^ | knext
kprevious | V V kprevious | V
+------+ NULL +------+
| Obj2 | | Obj5 |
+------+ +------+
| knext | knext
V V
NULL NULL

All the ObjXs use the same data structure (a modified HashDbTerm).
Obj1 and Obj2 share the same key, as well as Obj4 and Obj5 do.
Obj2 and Obj5's next pointers are unused.
In the code Obj1, Obj3 and Obj4 are also referenced as "root nodes".

Why the doubly linked list? As the "next chain" is traversed
sequentially, adding and removing objects is easy thing.
Whereas, to achieve the O(1) complexity, the "kprevious/knext chain"
must be accessible randomly (we'll see how in a moment).
Also, the order of insertion must be preserved, thus we cannot throw
away the linked list and substitute it with a hash table.

Now we are ready to introduce the nested linear hash table.
When the kprevious/knext chain reach some minimum length
(DB_BAG_LHT_LEN, which I arbitrarily set at 128), an empty linear hash
table is created and a pointer to it is installed in the root node (Obj1
or Obj4, not Obj2 nor Obj5).
Then it is filled with references to the objects of the kprevious/knext
chain, keyed with the hash value of the whole objects (Obj1 and Obj2 in
Obj1's nested LHT or Obj4 and Obj5 in Obj4's nested LHT).

By now it should be clear how things work: whenever an object is looked
for, first the next chain is traversed to find the key, then, if a
nested LHT is installed in the root node found, a lookup is performed on
the hash value of the object.
If there's no nested LHT, a sequential traversal is performed.

Sorry, I exceeded the 200 words limit, but I thing that that is the
minimum required to understand the whys and hows of this patch.

There is more to be said, however the source code is more appropriate to
describe all the minute details of the implementation.
Also, note that the patch introduces some specially marked comments, like
this:

/* ??? some question ??? */

These are questions and doubts I couldn't answer by myself and that I
ask you to find an appropriate answer.

Comments are welcome, especially on how to optimize for speed and
memory consumption.
Please, don't hesitate to contact me for further details.

Have a nice day.
--
FRANCESCO LATTANZIO : SYSTEM & SOFTWARE
A-TONO TECHNOLOGY : VIA DEL CHIESINO, 10 - 56025 PONTEDERA (PI) : T +39 0587 59221 : F +39 0587 59221 : SKYPE franz.lattanzio
a-tono.com : twitter.com/ATonoOfficial : facebook.com/ATonoOfficial

Information in this email is confidential and may be privileged. It is intended for the addresses only.
If you have received it in error, please notify the sender immediately and delete it from your system.
You should not otherwise copy it, retransmit it or use or disclose its content to anyone. Thank you for your co-operation.
unknown
2013-09-11 07:24:37 UTC
Permalink
Post by unknown
G'day people,
I would like to share with you an "improvement" patch, in the hope
that some day it will be included in the official releases.
For this is going to be a long e-mail, I have split it into two section,
a brief one for those in a hurry and a detailed one (an EEPlight) for
those who want/need to know more.
NOTE: This e-mail contain some ASCII graphics, please read it using a
constant width font.
BRIEF SECTION
The patch targets ETS tables of the 'bag' and 'duplicate_bag' types,
improving performances of deletion by objects (i.e., ets:delete_object/2
not ets:delete/2) and insertion when working with many object sharing
the same key.
git fetch https://github.com/fltt/otp.git ets_nested_lht
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht.patch
To get a feel of the performance gain you can run the following snippet
-module(ets_hash_test).
-export([test2/1]).
write_loop(N, Tab)
when N> 0 ->
ok=mnesia:activity(sync_dirty,
fun() ->
mnesia:write(Tab, {Tab, N, erlang:integer_to_list(N), N rem 10}, write)
end,
mnesia_frag),
write_loop(N-1, Tab);
write_loop(_, _) ->
ok.
verify_loop(N, Tab)
when N> 0 ->
Expected={Tab, N, erlang:integer_to_list(N), N rem 10},
[Expected]=
mnesia:activity(sync_dirty,
fun() ->
mnesia:read(Tab, N, read)
end,
mnesia_frag),
verify_loop(N-1, Tab);
verify_loop(_, _) ->
ok.
test2(TabSize) ->
TabName=frag_test,
ThisNode=erlang:node(),
ok=mnesia:start(),
{atomic, ok}=
mnesia:create_table(TabName, [{type, set},
{attributes, [key, datum1, datum2]},
{frag_properties, [{n_fragments, 1},
{node_pool, [ThisNode]}]}]),
ok=write_loop(TabSize, TabName),
{Time, {atomic, ok}}=
timer:tc(fun() ->
mnesia:change_table_frag(TabName, {add_frag, [ThisNode]})
end),
ok=verify_loop(TabSize, TabName),
stopped=mnesia:stop(),
{ok, Time}.
test2(10000) -> {ok, 1691416}
test2(100000) -> {ok, 538015238}
test2(10000) -> {ok, 55239}
test2(100000) -> {ok, 745721}
A last note: THIS IS EXPERIMENTAL CODE, USE IT AT YOUR OWN RISK.
In other words, this patch is still a work in progress and not
production ready.
For more details read on.
DETAILED SECTION (EPPlight)
Why this new feature?
First of all this is not a new feature -- from the point of view of the
developer there are no new modules, functions, language constructs nor
options.
It all started by the need to dynamically add fragments to a NON-EMPTY
fragmented table (using the mnesia_frag access module).
As you know, a fragment with as little as 100000 takes far more than
tollerable to be split.
Analyzing mnesia's source code, it came out that the culprit is the
temporary storage of all the write operations under the same key in an
ETS table of the bag type.
ETS of the set, bag and duplicate_bag types use linear hashing to store
objects, thus inserting objects _with different keys_ in an ETS has O(1)
time complexity.
However, if all the objects inserted have the same key then time
complexity changes to O(k), where k is the number of objects sharing the
same key.
This is so for, in the current implementation, objects with the same
keys are all put in a simply linked list and when looking for duplicates
the whole list must be traversed.
Risks/uncertain artifacts
None, as far I can tell _now_.
However this patch is still experimental and I have tested it only under
Linux (ArchLinux).
Moreover, it should be optimized to reduce memory consuption, and some
code clean-up wouldn't harm.
How did I solve it?
Nested linear hash tables. But first some internal data structures must
be reorganized.
Reorganization is achieved by use of "bidimensional lists", i.e.,
NULL NULL NULL
^ kprevious ^ kprevious ^
kprevious | | |
----------+ +------+ next +------+ next +------+ next
bucket[i] |-->| Obj1 |------->| Obj3 |------->| Obj4 |----->NULL
----------+ +------+ +------+ +------+
^ | knext knext | ^ | knext
kprevious | V V kprevious | V
+------+ NULL +------+
| Obj2 | | Obj5 |
+------+ +------+
| knext | knext
V V
NULL NULL
All the ObjXs use the same data structure (a modified HashDbTerm).
Obj1 and Obj2 share the same key, as well as Obj4 and Obj5 do.
Obj2 and Obj5's next pointers are unused.
In the code Obj1, Obj3 and Obj4 are also referenced as "root nodes".
Why the doubly linked list? As the "next chain" is traversed
sequentially, adding and removing objects is easy thing.
Whereas, to achieve the O(1) complexity, the "kprevious/knext chain"
must be accessible randomly (we'll see how in a moment).
Also, the order of insertion must be preserved, thus we cannot throw
away the linked list and substitute it with a hash table.
Now we are ready to introduce the nested linear hash table.
When the kprevious/knext chain reach some minimum length
(DB_BAG_LHT_LEN, which I arbitrarily set at 128), an empty linear hash
table is created and a pointer to it is installed in the root node (Obj1
or Obj4, not Obj2 nor Obj5).
Then it is filled with references to the objects of the kprevious/knext
chain, keyed with the hash value of the whole objects (Obj1 and Obj2 in
Obj1's nested LHT or Obj4 and Obj5 in Obj4's nested LHT).
By now it should be clear how things work: whenever an object is looked
for, first the next chain is traversed to find the key, then, if a
nested LHT is installed in the root node found, a lookup is performed on
the hash value of the object.
If there's no nested LHT, a sequential traversal is performed.
Sorry, I exceeded the 200 words limit, but I thing that that is the
minimum required to understand the whys and hows of this patch.
There is more to be said, however the source code is more appropriate to
describe all the minute details of the implementation.
Also, note that the patch introduces some specially marked comments, like
/* ??? some question ??? */
These are questions and doubts I couldn't answer by myself and that I
ask you to find an appropriate answer.
Comments are welcome, especially on how to optimize for speed and
memory consumption.
Please, don't hesitate to contact me for further details.
Have a nice day.
Hello Francesco,
I've fetched your patch and assigned it to be reviewed by the
responsible team.
Thanks,
--
BR Fredrik Gustafsson
Erlang OTP Team
unknown
2013-11-29 15:22:30 UTC
Permalink
Hi

This patch has now been discussed by OTP technical board.

It's a desirable feature but the patch is not acceptable as-is,
mainly due to too large memory overhead per object.

We would need

* No extra overhead for table type 'set'.
* Minimized overhead per object for 'bag' and 'duplicate_bag',
especially when the nested hash feature is not used.

Another weakness is the lack of yielding when long key chains are
traversed. BIFs should not execute too long but rather yield by
returning back to the emulator loop to be continued later by the
scheduler (as is done by the ets:select operations).
The patch removes this problem for unrelated keys but it still arise
at lookup/delete on a lot of objects with same key.
If an otherwise accepted patch is supplied we can undertake the job
to fix the yielding. It can be good to have in mind though while doing
the design.



Some tips and tricks to minimize memory overhead:
(some of them are probably mutually exclusive)

* Use lower bits in pointers as flags.
* Combine next and kprevious as only one of them is used.
* Move nkitems and lht into a separate root struct that is allocated
when the first duplicate key is inserted.
* Move knext and kprevious to NestedDbTerm to only allocate them when
needed.
* Use really dirty trick to allocate a mutilated struct with an unused
prologue.


typedef struct {
HashDbTerm* previous; /* NOT ACCESSABLE FOR SET */
HashDbTerm* next;
HashValue hvalue;
DbTerm dbterm;
}HashDbTerm;

/* For bags */
ptr = malloc(sizeof(HashDbTerm));

/* For sets */
ptr = malloc(sizeof(HashDbTerm) - offsetof(HashDbTerm,next));
ptr = (char*)ptr - offsetof(HashDbTerm,next);

ptr->next = ...;
ptr->hvalue = ...;
ptr->dbterm = ...;

ptr->previous = NULL; /* MEMORY CORRUPTION!!! */

free((char*)ptr + offsetof(HashDbTerm,next));


Maybe there are more sensible ways to accomplish the same thing.


/Sverker
Post by unknown
G'day people,
I would like to share with you an "improvement" patch, in the hope
that some day it will be included in the official releases.
For this is going to be a long e-mail, I have split it into two section,
a brief one for those in a hurry and a detailed one (an EEPlight) for
those who want/need to know more.
NOTE: This e-mail contain some ASCII graphics, please read it using a
constant width font.
BRIEF SECTION
The patch targets ETS tables of the 'bag' and 'duplicate_bag' types,
improving performances of deletion by objects (i.e., ets:delete_object/2
not ets:delete/2) and insertion when working with many object sharing
the same key.
git fetch https://github.com/fltt/otp.git ets_nested_lht
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht.patch
To get a feel of the performance gain you can run the following snippet
-module(ets_hash_test).
-export([test2/1]).
write_loop(N, Tab)
when N > 0 ->
ok=mnesia:activity(sync_dirty,
fun() ->
mnesia:write(Tab, {Tab, N, erlang:integer_to_list(N), N rem 10}, write)
end,
mnesia_frag),
write_loop(N-1, Tab);
write_loop(_, _) ->
ok.
verify_loop(N, Tab)
when N > 0 ->
Expected={Tab, N, erlang:integer_to_list(N), N rem 10},
[Expected]=
mnesia:activity(sync_dirty,
fun() ->
mnesia:read(Tab, N, read)
end,
mnesia_frag),
verify_loop(N-1, Tab);
verify_loop(_, _) ->
ok.
test2(TabSize) ->
TabName=frag_test,
ThisNode=erlang:node(),
ok=mnesia:start(),
{atomic, ok}=
mnesia:create_table(TabName, [{type, set},
{attributes, [key, datum1, datum2]},
{frag_properties, [{n_fragments, 1},
{node_pool, [ThisNode]}]}]),
ok=write_loop(TabSize, TabName),
{Time, {atomic, ok}}=
timer:tc(fun() ->
mnesia:change_table_frag(TabName, {add_frag, [ThisNode]})
end),
ok=verify_loop(TabSize, TabName),
stopped=mnesia:stop(),
{ok, Time}.
test2(10000) -> {ok, 1691416}
test2(100000) -> {ok, 538015238}
test2(10000) -> {ok, 55239}
test2(100000) -> {ok, 745721}
A last note: THIS IS EXPERIMENTAL CODE, USE IT AT YOUR OWN RISK.
In other words, this patch is still a work in progress and not
production ready.
For more details read on.
DETAILED SECTION (EPPlight)
Why this new feature?
First of all this is not a new feature -- from the point of view of the
developer there are no new modules, functions, language constructs nor
options.
It all started by the need to dynamically add fragments to a NON-EMPTY
fragmented table (using the mnesia_frag access module).
As you know, a fragment with as little as 100000 takes far more than
tollerable to be split.
Analyzing mnesia's source code, it came out that the culprit is the
temporary storage of all the write operations under the same key in an
ETS table of the bag type.
ETS of the set, bag and duplicate_bag types use linear hashing to store
objects, thus inserting objects _with different keys_ in an ETS has O(1)
time complexity.
However, if all the objects inserted have the same key then time
complexity changes to O(k), where k is the number of objects sharing the
same key.
This is so for, in the current implementation, objects with the same
keys are all put in a simply linked list and when looking for duplicates
the whole list must be traversed.
Risks/uncertain artifacts
None, as far I can tell _now_.
However this patch is still experimental and I have tested it only under
Linux (ArchLinux).
Moreover, it should be optimized to reduce memory consuption, and some
code clean-up wouldn't harm.
How did I solve it?
Nested linear hash tables. But first some internal data structures must
be reorganized.
Reorganization is achieved by use of "bidimensional lists", i.e.,
NULL NULL NULL
^ kprevious ^ kprevious ^
kprevious | | |
----------+ +------+ next +------+ next +------+ next
bucket[i] |-->| Obj1 |------->| Obj3 |------->| Obj4 |----->NULL
----------+ +------+ +------+ +------+
^ | knext knext | ^ | knext
kprevious | V V kprevious | V
+------+ NULL +------+
| Obj2 | | Obj5 |
+------+ +------+
| knext | knext
V V
NULL NULL
All the ObjXs use the same data structure (a modified HashDbTerm).
Obj1 and Obj2 share the same key, as well as Obj4 and Obj5 do.
Obj2 and Obj5's next pointers are unused.
In the code Obj1, Obj3 and Obj4 are also referenced as "root nodes".
Why the doubly linked list? As the "next chain" is traversed
sequentially, adding and removing objects is easy thing.
Whereas, to achieve the O(1) complexity, the "kprevious/knext chain"
must be accessible randomly (we'll see how in a moment).
Also, the order of insertion must be preserved, thus we cannot throw
away the linked list and substitute it with a hash table.
Now we are ready to introduce the nested linear hash table.
When the kprevious/knext chain reach some minimum length
(DB_BAG_LHT_LEN, which I arbitrarily set at 128), an empty linear hash
table is created and a pointer to it is installed in the root node (Obj1
or Obj4, not Obj2 nor Obj5).
Then it is filled with references to the objects of the kprevious/knext
chain, keyed with the hash value of the whole objects (Obj1 and Obj2 in
Obj1's nested LHT or Obj4 and Obj5 in Obj4's nested LHT).
By now it should be clear how things work: whenever an object is looked
for, first the next chain is traversed to find the key, then, if a
nested LHT is installed in the root node found, a lookup is performed on
the hash value of the object.
If there's no nested LHT, a sequential traversal is performed.
Sorry, I exceeded the 200 words limit, but I thing that that is the
minimum required to understand the whys and hows of this patch.
There is more to be said, however the source code is more appropriate to
describe all the minute details of the implementation.
Also, note that the patch introduces some specially marked comments, like
/* ??? some question ??? */
These are questions and doubts I couldn't answer by myself and that I
ask you to find an appropriate answer.
Comments are welcome, especially on how to optimize for speed and
memory consumption.
Please, don't hesitate to contact me for further details.
Have a nice day.
unknown
2013-12-02 08:01:25 UTC
Permalink
Thank you for the reply and the tips.

This was really a proof-of-concept patch, made just to hear from you if
you were interested in it -- so optimising things was not a pressing
issue.

The code in the "erl_db_hash.[ch]" is getting too convoluted for me to
tollerate and adding more checks to minimize memory consumption would
make things worse.
Thus I was thinking to split those files in two: one for the 'set' type
and the other for the 'bag' and 'duplicate_bag' types.
I suppose that increasing code size is not an issue as it is increasing
memory consuption.

At the moment I'm working on something else and don't know when I can
start working on this patch again -- maybe a few weeks, maybe a couple
of months.

Regards.
Post by unknown
Hi
This patch has now been discussed by OTP technical board.
It's a desirable feature but the patch is not acceptable as-is,
mainly due to too large memory overhead per object.
We would need
* No extra overhead for table type 'set'.
* Minimized overhead per object for 'bag' and 'duplicate_bag',
especially when the nested hash feature is not used.
Another weakness is the lack of yielding when long key chains are
traversed. BIFs should not execute too long but rather yield by
returning back to the emulator loop to be continued later by the
scheduler (as is done by the ets:select operations).
The patch removes this problem for unrelated keys but it still arise
at lookup/delete on a lot of objects with same key.
If an otherwise accepted patch is supplied we can undertake the job
to fix the yielding. It can be good to have in mind though while doing
the design.
(some of them are probably mutually exclusive)
* Use lower bits in pointers as flags.
* Combine next and kprevious as only one of them is used.
* Move nkitems and lht into a separate root struct that is allocated
when the first duplicate key is inserted.
* Move knext and kprevious to NestedDbTerm to only allocate them when
needed.
* Use really dirty trick to allocate a mutilated struct with an unused
prologue.
typedef struct {
HashDbTerm* previous; /* NOT ACCESSABLE FOR SET */
HashDbTerm* next;
HashValue hvalue;
DbTerm dbterm;
}HashDbTerm;
/* For bags */
ptr = malloc(sizeof(HashDbTerm));
/* For sets */
ptr = malloc(sizeof(HashDbTerm) - offsetof(HashDbTerm,next));
ptr = (char*)ptr - offsetof(HashDbTerm,next);
ptr->next = ...;
ptr->hvalue = ...;
ptr->dbterm = ...;
ptr->previous = NULL; /* MEMORY CORRUPTION!!! */
free((char*)ptr + offsetof(HashDbTerm,next));
Maybe there are more sensible ways to accomplish the same thing.
--
FRANCESCO LATTANZIO : SYSTEM & SOFTWARE
A-TONO TECHNOLOGY : VIA DEL CHIESINO, 10 - 56025 PONTEDERA (PI) : T +39 02 32069100 : SKYPE franz.lattanzio
a-tono.com : twitter.com/ATonoOfficial : facebook.com/ATonoOfficial

Information in this email is confidential and may be privileged. It is intended for the addresses only.
If you have received it in error, please notify the sender immediately and delete it from your system.
You should not otherwise copy it, retransmit it or use or disclose its content to anyone. Thank you for your co-operation.
unknown
2014-06-25 19:38:06 UTC
Permalink
Hello again Sverker,
I've made some progress with this patch. Briefly:

* set tables and bag/duplicate_bag tables are now managed by dedicated
code -- erl_db_hash.[ch] and erl_db_nested_hash.[ch] respectively
* reduced memory overhead

You can compute the difference (in words) between the new and the old
implementations' memory footprints (when nested LHT is not active) of a
collection of 'n' objects sharing the same key as:

MemDiff = 4 - n

I used this formula to fix the ets suite's memory test case.

With regard to the yielding problem, I deciced not to fix it now for the
following reasons:

1) it is unrelated to the nested LHT, as the problem manifests itself
even in the current (non-nested LHT) implementation
2) you offered to fix it yourself

The patch is available at:

git fetch https://github.com/fltt/otp.git ets_nested_lht_3

https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3.patch

It is based on the maint branch.

I've tested it on Linux x86_64, latest kernel (all the test suites) and
FreeBSD/i386 9.2 (only the emulator, kernel and stdlib test suites).

Have another nice day.
Post by unknown
Hi
This patch has now been discussed by OTP technical board.
It's a desirable feature but the patch is not acceptable as-is,
mainly due to too large memory overhead per object.
We would need
* No extra overhead for table type 'set'.
* Minimized overhead per object for 'bag' and 'duplicate_bag',
especially when the nested hash feature is not used.
Another weakness is the lack of yielding when long key chains are
traversed. BIFs should not execute too long but rather yield by
returning back to the emulator loop to be continued later by the
scheduler (as is done by the ets:select operations).
The patch removes this problem for unrelated keys but it still arise
at lookup/delete on a lot of objects with same key.
If an otherwise accepted patch is supplied we can undertake the job
to fix the yielding. It can be good to have in mind though while doing
the design.
(some of them are probably mutually exclusive)
* Use lower bits in pointers as flags.
* Combine next and kprevious as only one of them is used.
* Move nkitems and lht into a separate root struct that is allocated
when the first duplicate key is inserted.
* Move knext and kprevious to NestedDbTerm to only allocate them when
needed.
* Use really dirty trick to allocate a mutilated struct with an unused
prologue.
typedef struct {
HashDbTerm* previous; /* NOT ACCESSABLE FOR SET */
HashDbTerm* next;
HashValue hvalue;
DbTerm dbterm;
}HashDbTerm;
/* For bags */
ptr = malloc(sizeof(HashDbTerm));
/* For sets */
ptr = malloc(sizeof(HashDbTerm) - offsetof(HashDbTerm,next));
ptr = (char*)ptr - offsetof(HashDbTerm,next);
ptr->next = ...;
ptr->hvalue = ...;
ptr->dbterm = ...;
ptr->previous = NULL; /* MEMORY CORRUPTION!!! */
free((char*)ptr + offsetof(HashDbTerm,next));
Maybe there are more sensible ways to accomplish the same thing.
--
Francesco Lattanzio
unknown
2014-06-27 08:42:16 UTC
Permalink
Removed dots from commit messages.
A new pull request has been submitted. Here is the new patch:

git fetch https://github.com/fltt/otp.git ets_nested_lht_4

https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_4
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_4.patch
--
Francesco Lattanzio
Post by unknown
Hello again Sverker,
* set tables and bag/duplicate_bag tables are now managed by dedicated
code -- erl_db_hash.[ch] and erl_db_nested_hash.[ch] respectively
* reduced memory overhead
You can compute the difference (in words) between the new and the old
implementations' memory footprints (when nested LHT is not active) of a
MemDiff = 4 - n
I used this formula to fix the ets suite's memory test case.
With regard to the yielding problem, I deciced not to fix it now for the
1) it is unrelated to the nested LHT, as the problem manifests itself
even in the current (non-nested LHT) implementation
2) you offered to fix it yourself
git fetch https://github.com/fltt/otp.git ets_nested_lht_3
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3.patch
It is based on the maint branch.
I've tested it on Linux x86_64, latest kernel (all the test suites) and
FreeBSD/i386 9.2 (only the emulator, kernel and stdlib test suites).
Have another nice day.
Данила Федящин
2014-10-31 15:29:47 UTC
Permalink
Hello,

Any news regarding this patch?
Post by unknown
Removed dots from commit messages.
git fetch https://github.com/fltt/otp.git ets_nested_lht_4
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_4
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_4.patch
--
Francesco Lattanzio
Post by unknown
Hello again Sverker,
* set tables and bag/duplicate_bag tables are now managed by dedicated
code -- erl_db_hash.[ch] and erl_db_nested_hash.[ch] respectively
* reduced memory overhead
You can compute the difference (in words) between the new and the old
implementations' memory footprints (when nested LHT is not active) of a
MemDiff = 4 - n
I used this formula to fix the ets suite's memory test case.
With regard to the yielding problem, I deciced not to fix it now for the
1) it is unrelated to the nested LHT, as the problem manifests itself
even in the current (non-nested LHT) implementation
2) you offered to fix it yourself
git fetch https://github.com/fltt/otp.git ets_nested_lht_3
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3
https://github.com/fltt/otp/compare/erlang:maint...ets_nested_lht_3.patch
Post by unknown
It is based on the maint branch.
I've tested it on Linux x86_64, latest kernel (all the test suites) and
FreeBSD/i386 9.2 (only the emulator, kernel and stdlib test suites).
Have another nice day.
_______________________________________________
erlang-patches mailing list
http://erlang.org/mailman/listinfo/erlang-patches
Loading...