Discussion:
[erlang-patches] [erlang-questions] How to Update List Elements A Lot
ruanbeihong
2015-07-21 07:45:30 UTC
Permalink
I think it's the last thing to think about when writing codes. You'd expect the compiler to do such 'update inline' optimization for you.

I'm not quite sure that the compiler do does that kind of optimization, but I will consider to use C port when performance is really critical problem when profile test proved it.


On Tuesday 21 July 2015 12:00:04 Hugo Wang wrote:


Hi all,


I'm writing an encrypt/decrypt stuff in Erlang, the input and output are both lists (or binary). In other words, I need to generate a list of bytes based on another list of bytes. During the generating there are a lot of band, bor, bxor, bsl, bsr, etc operations. Every element in the new list (the output) would be updated many times.


In other language, like C or Python, we can init an output list and then update its elements inline. In Erlang, what I currently do, would make a copy of the list every time when an element need to update. This looks not quite right.


Any comments/ideas on this? I want to use pure Erlang to implement it, rather than making a C port.


Thanks,
Hugo



James Ruan
Jesper Louis Andersen
2015-07-21 14:16:15 UTC
Permalink
Post by ruanbeihong
I think it's the last thing to think about when writing codes. You'd
expect the compiler to do such 'update inline' optimization for you.
It is a very true statement. But there are limits to what kind of
transformations you can expect of a compiler. If you have used lists with
lots of lists:keyreplace/4 for instance, then the compiler is not clever
enough to transform this to a list comprehension, which turns an O(n^2)
algorithm into a O(n) algorithm.

Also, there are limits to a functional programming language. In-place
updates requires the compiler to prove that access to the data is truly
linear. If you have lots of cross-module calls, the Erlang evaluation model
somewhat constrains you, because if a module is loaded while the
cross-module calls are being done, then you expect the code to jump to the
new version. And the new version may use the data in a non-linear fashion.

Programming languages which support linear access for updates usually
annotate access in a type/effect system (ATS, Rust comes to mind). And thus
they have an easier time optimizing since they can rely on the program
being well-typed or well-effected.

All of this said, functional languages commonly have GCs which are
extremely good at handling high allocation rates. Non-live data on the heap
has 0 reclamation cost in the Erlang BEAM VMs GC for instance. The primary
problem here is the memory bottleneck in modern CPUs: papers from the 90'es
show that usually, memory is less of an issue than first thought. But here,
20 years later, it is about time to redo those old findings.
--
J.
Loading...