[C/C++] Tips for OOP – module initialization.

OOP SW design is usually compose of one main control routine (henceforth MCR) and lots of sub modules.
But, MCR don’t need to (must not need to) know inside implementation of every sub modules.
Sometimes, MCR don’t need to know even existence of some sub modules.
The problem is, most sub modules require initialization.
How can sub modules whose existence is now known, be initialized.
Due to this issue, principle of information hiding is sometimes broken.
Let’s see below example.

FILE : main.c
-------------
int main(int argc, char* argv[]) {
        ...

}

FILE : moduleA.c
----------------
...

FILE : moduleB.c
----------------
...

Assume that, each module requires initialization and main.c don’t need to know existence of each module.
How can we resolve this issue?
Easiest way is calling initialization function of each module with damaging principle of information hiding little bit, like mentioned above.

FILE : main.c
-------------
extern void moduleA_init();
extern void moduleB_init();

int main(int argc, char* argv[]) {
        ...
        moduleA_init();
        moduleB_init();
        ...
}

FILE : moduleA.c
----------------
...
void moduleA_init() { ... }

FILE : moduleB.c
----------------
...
void moduleB_init() { ... }

At above code, main.c becomes to know existence of moduleA and moduleB.
That is, in terms of modules, principle of information hiding is damaged although it’s very little.
Additionally, global symbol space is dirtier.
Regarding maintenance, whenever new module is added, modifying main.c is unavoidable.
But, main.c doesn’t have any dependency on newly added module.
With this and that, this way is uncomfortable.
How can we clean up these?
Using constructor leads us to better way.

Functions in constructor are executed before main function.
So, it is very useful for this case.
Easiest way is setting every initialization function as constructor.
But, in this case, we cannot control the moment when module is initialized at.
Therefore, it is better that each module’s initialization function is registered to MCR, and MCR calls these registered function at right moment.
Following pseudo code is simple implementation of this concept.

FILE : main.c
-------------
void register_initfn(void (*fn)()) {
        list_add(initfn_list, fn);
}

int main(int argc, char* argv[]) {
        ...
        /* initialize modules */
        foreach(initfn_list, fn)
                (*fn)();
        ...
}

FILE : module.h
---------------
extern void register_initfn(void (*fn)());
#define MODULE_INITFN(fn)                               \
        static void __##fn##__() __attribute__ ((constructor)); \
        static void __##fn##__() { register_initfn(&fn); }

FILE : moduleA.c
----------------
...
#include "module.h"
...
static void _myinit() { ... }
MODULE_INITFN(_myinit)

FILE : moduleB.c
----------------
...
#include "module.h"
...
static void _myinit() { ... }
MODULE_INITFN(_myinit)

Now, MCR don’t need to know existence of each modules.
And, MCR can also control the moment of each module’s initialization.
In addition, adding new module doesn’t require any modification of MCR side.
It is closer to OOP’s concept, isn’t it?

We can improve this concept by customizing memory section.
Here is rough description of this.

* Declare special memory section for initializer function.
    - In gcc, ld script should be modified.

* Put initializer function into this section.
    - __attribute__ ((__section__("xxxx"))) can be used.

* MCR can read this section and call these functions at appropriate moment.

Actually, this way is better than using constructor in terms of SW design.
Linux kernel uses this concept in it’s driver model.
(For deeper analysis, kernel source code can be good reference.)
But, in many cases, using this concept may lead to over-engineering.
So, if there isn’t any other concern, using constructor is enough.

Advertisements

[Prog] Separating mechanism from policy (microscopical view) – function implementation.

Separating mechanism from policy is very important issue of SW design.
This is also true for microscopic area – implementing function.
Here is one of simplest example – function that calculates area of rectangle.

int rect_area(int left, int top, int right, int bottom) {
        return (right - left) * (bottom - top);
}

Simple, isn’t it?
But, this function doesn’t have any exception/error handling.
Let’s add some of them.

int rect_area(int left, int top, int right, int bottom) {
        if (left >= right)
                left = right;
        if (top >= bottom)
                top = bottom;
        return (right - left) * (bottom - top);
}

It’s seems good.
Soon after, another function that calculates area of rectangle is needed.
But, this function should return error value if input rectangle is invalid.
I think quick solution is

int rect_area2(int left, int top, int right, int bottom) {
        if (left > right || top > bottom)
                return -1;
        return (right - left) * (bottom - top);
}

But, in this solution, code for calculating area of rectangle is duplicated.
At above example, this is just one-line code. So, duplicating is not a big deal.
But, it’s not good in terms of code structure.
Why did this happen?
Calculating rectangle area is ‘Mechanism’.
But, exception/error handling is ‘Policy’ at this example.
So, above example should be implemented like below.

static inline int _rect_area(int left, int top, int right, int bottom) {
        return (right - left) * (bottom - top);
}

int rect_area(int left, int top, int right, int bottom) {
        if (left >= right)
                left = right;
        if (top >= bottom)
                top = bottom;
        return _rect_area(left, top, right, bottom);
}
int rect_area2(int left, int top, int right, int bottom) {
        if (left > right || top > bottom)
                return -1;        
        return _rect_area(left, top, right, bottom);
}

Can you know the difference?
‘_rect_area’ is implementation of pure ‘Mechanism’.
And policy is implemented at each interface function.
Even for simple function, developer should consider CONCEPT of separating Mechanism from Policy.

[Essay][Prog] Multi-Threading SW Programming 방법론에 대한 단상.

Multi-Threading(이하 MT)  SW Programming시 동기화는 무엇보다 중요한 이슈가 된다.
생각해보면, 동기화가 문제가 되는 이유는, Programming Language의 기본 전제가 “필요한 곳에 동기화를 한다.” 이기 때문이다.
이런 정책에 100% 동의한다. MT을 통해, HW 자원을 효율적으로 사용하고자 한다면 더더욱 그러하다.
그런데, 만약 상당히 많은 부분을 서로 공유하는 어떤 MT SW가 있다고 가정한다면 (그런 SW는 디자인 자체가 잘못된 것이라는 등의 이야기는 일단 접어두자.) 어떨까?
이런 경우 “필요한 곳에 동기화”란 개념으로 보면, 너무 많은 “필요한 곳”이 존재하게 되므로, 쉽지 않다.
이럴 경우, 차라리 개념을 “필요한 곳에서 rescheduling”이라는 걸로 개념을 바꾸어 보는건 어떻까?
일단 다른 측면들은 모두 차치하고서라도, Programming은 좀더 쉬워질 것 같지 않은가?
이런 식의 개념을 지원하도록 SW구조를 잡는 것은 그리 어려운 일은 아니다.
물론, HW 자원을 효과적으로 쓰도록 만들기는 앞의 개념에 비해서 상당히 더 어려울 것이다.
결국 아래와 같은 Trade-off가 발생한다.

“HW자원의 효율적인 사용 vs. Programming의 단순와”

너무 무식한 방법이라고 비판할 수도 있겠지만, Performance가 큰 문제가 되지 않고, Async한 동작자체가 중요한 경우라면, 충분히 한번 고려해 볼만하지 않은가? 일단 Bug는 많이 줄일 수 있으니…

[Linux][Prog] Understanding Standard IO

What is stdout, stdin and stderr?
Actually, I am using these without deep understanding.
Now, it’s time to deep-dive to it.

Let me focus on stdout. (other twos are similar.)
What type of file default stdout is? Is it normal file? No, definitely. It’s device file.
So, two different processes can write data to same stdout directly.
(Normal file doesn’t allow this! )
I can easily know which device is used as stdout by checking devices.
(By entering following command in console.

ls -al /dev | grep stdout
    -> stdout -> /proc/self/fd/1

Interesting isn’t it?
Let me move forward.

ls -al /proc/self/fd/1
    -> 1 -> /dev/pts/7

Right!.
Even if every process access standard output device with same name ‘stdout’, it branches to appropriate devices by symbolic link – ‘/proc/self’ is used.
Now I can guess what redirecting stdout is. Let me check it.

# redirecting stdout with sample program.
# something like this.
# main() { for(int i=0; i<10000000; i++) { sleep(1); printf("+"); fflush(stdout); }}
$ ./a.out > t &
$ ls -al /proc/<child pid>/fd/1
    -> 1 -> /xxxx/xxx.../t <= path of target file 't'

Done!

[Tips] Cyclic Queue

It’s very-well-known tip for cyclic queue!
Here is pseudo code!
(Even if this is very -well-known and not difficult to think of, it’s worth to remind.)

TYPE Q
    item[SZ] // queue array
    i        // current index
...

FUNC addQ (q, item)
// This is Naive way.
    item[q.i] = item
    if (q.i >= SZ) than q.i = 0

// This is well-known way
    item[q.i] = item
    q.i &= SZ-1 // For this, SZ should be (2^n (n>0))

    

[Prog] Object Oriented Programming

The most important thing of OOP is “Making objects that developer don’t need to look inside of it.”. Reliable objects are foundation of OOP. Then let’s think about this more.

* Object should respond at all cases.

That is, crashing inside object should not be happened! object should be in valid state at any case. If software is crashed inside object, developer should analyze internal code of object, and this is what we want to avoid.

* Reducing possibility of misuse interface should be considered when interface is design.

That is, usage of interfaces should be easy and intuitive. If not, developer need to investigate for object’s internal parts to know correct usage. And, usually, interdependent interfaces leads to misuse. For example, “Interface B should be used after interface A is called”, “Interface A should not be called after interface B is called” and so on. So, if possible, reduce dependency among interfaces of objects.

* Interface should be well-commented.

In the same context with above, developer don’t need to look into the object that is well-commented. In my opinion, comments of interface should include at least following things.
  + behavior of interface.
  + prerequisite to use the interface.
  + explanation about each parameters.
  + description about return value.
And, adding usage example is recommended.

* For object to starts supporting minimal set of interface is better.

Adding interface is easy. But deleting interface is difficult because it may affect to number of other objects that already use the interface. And, stabilizing object having small number of interface, is easier. Starting from minimal set can also help developer escape from temptation of over-engineering

 
Next important point is relations among objects.
Software created by OOP consists of lots of objects and relations. To understand software’s behavior, knowing relations and inter-operations among objects is significant.

* Software documentation.

Design concept, policy, block diagram, used patterns and so on is needed to be documented to help reader to understand overall shape of software.

 
There is also disadvantage of OOP.

* Performance drop.

In every object, there are lots of routines for checking and handling unexpected cases. And in general, software in OOP consists of lots of objects. So, inevitably, number of duplicated check and handling are unavoidable – all objects may check same thing. And sometimes this drops performance very much.

[Prog] Take care of errata when using float/double at all times!

It’s very famous issue. That’s why using floating point operation is difficult.
Especially, round-off can make up to 0.5f erratum.
In calculation for drawing on pixels, round-off is frequently used – Do not consider blending and anti-aliasing. And even sum of two errata can make 1 pixel erratum. So, we should always keep this in mind when implement pixel-relative-calculation.

Here is example.

Line L passes point P0 and P1.
Drawing two lines those are orthogonal to L, L-symmetric, passes point P2, P3 respectively and length is R.

As you know, there two lines should be parallel.
But, in this case, we should calculate two end points – the results may be float type. To draw this we should make those as an integer value (usually by using round-off).
For each line, up to 0.5f erratum can exist. So, for two lines, up to 1 pixel erratum can be raised. So, when these two lines are drawn on the canvas, they may not be parallel.