Distributed
Genetic Algorithms
Abdo Achkar
Fall 2005
Distributed
Systems
Introduction:
According
to Anaxagoras, the man is the most intelligent among animals because he has
hands. Alan Turing also, speaking about evolution, states that the form of a
system is irrelevant to its intelligence [1]. Which leads us to the question,
if humans’ hands are a result of evolution, can evolution be intelligent?
In order to answer this
question, one will have to define intelligence or, like Alan Turing did, assume
that every system that behaves in a human-like manner, is intelligent.
Since evolution’s concept
is hard to compare to human’s behavior, one might want to consider evolution’s
results: converging chromosomes, or bit strings, for a certain problem. The
problem is approached in a hill-climbing-like fashion: whenever a fitter
chromosome is encountered, it is placed in the generation. In order to evolve,
the population will have to reproduce, be mutilated, and crossed-over. For more
information about evolutionary computation, refer to [2].
The hill-climbing
algorithm always looks forward to maximizing (or minimizing) an outcome. The
algorithm might fall on a local maximum/minimum, a ridge and plateaux [3].

1- Project’s Use Case
Genetic algorithms are indeed hill-climbing algorithms. And,
even though their representation is completely different, they tend to maximize
or minimize some objective function, and have their own ways to escape false
convergence: Mutations and crossings-over. We usually use this approach when a
problem has no algorithmic solution, ie: NP-Hard problems or, the algorithmic
solution is not practical.
In addition, by making
the problem distributed, we make it less prone to false convergence (same
concept as simulated annealing. Refer to [4]). We also assume more computing
power in optimization problems leads to faster convergence. No tests were made,
even though the point of the project was to somehow prove this concept, because
of lack of time (Which is not a good reason but a valid one).
Accordingly, the program
considers the following input: A Population of Chromosomes that
have a “fitness function”. The output is the best Chromosome we can
reach after a specified amount of time. For instance, as an application for
this problem, we consider a theoretical animal that has a weight W and n number
of legs where n can be any number between 0 and a maximum number
LIMIT_LEGS_NUMBER. Each leg has an initial oriented angle a0 that it forms with the horizontal and a
final angle af.
We assume the leg reaches a maximum “velocity” (even though a rotational
velocity should have been considered) at the final angle. The leg has a weight
and a length that have upper limits too.
The distributed genetic
algorithms program simulates the evolution of this animal and returns the
fittest (fastest) animal after a certain number of generations. Image 1.1
Image 1.1: Use Case Diagram

2- The Object Oriented
Design
In order to make the system expandable, extensible and reusable,
we considered the following design for the distributed system. Image 2.1
Image 2.1: The Object Oriented Design

Whenever a NetNode is
initialized, its NODE_TYPE is a NODE_CLIENT. The program starts by calling NetNode’s
run() method which creates instances of
DatagramBroadcaster in order to check for a coordinator and a DatagramListener
to wait for a reply. The DatagramListener takes the portNumber we want to
listen to, the listen time (use -1 to make it listen endlessly), the type of
messages we’re waiting for (i.e: MsgClass.ANY_BODY_HERE) and a reference to the
parent NetNode. The DatagramListener will pass the references to the
DatagramListenerWorker (extends thread) whose job is to listen according to the
given parameters. Whenever a packet is received, it’s passed to the
PacketHandler whose job will be discussed next. So basically, almost all our
event handling (Packet reception) will be handled in the PacketHandler.
Image 2.2: The
PacketHandler’s job.

Any packet received by
the PacketHandler is passed to the MsgParser which returns a message that
describes the packet’s contents. The packet handler fires events accordingly.
For instance, if a packet containing the message “msg:anybody here” is received
at the PacketHandler level and passed to the MsgParser, the event
“MSG_ANYBODY_HERE” will be fired next by calling the method doCommand(msg,
DatagramPacket);
A problem is encountered
when no packets are received and events need to be fired. For example, a NetNode
is waiting for server’s data and a timeout has occurred. The ListenerWorker
will have to handle these timeouts. (We could have designed a TimeoutHandler,
but this was realized a little bit late).
3- A Closer Look to
the Flow
The
program’s loop starts when the NetNode’s run() method
is first called. A DgramListener will be created and ListenerWorker will be
assigned a listen job that is handled by the PacketHandler is a packet is
received within the allotted time or by the worker itself if a timeout occurs.
This determines the type of the NetNode: a NODE_SERVER or a NODE_CLIENT.
If a NetNode is a
NODE_CLIENT, it has to wait for server’s data and run the genetic algorithm in
a separate thread. If the NODE_CLIENT receives the data, they give it to the
Population that decides, according to the fitness function, whether to add it
to the population or not. (Note that the population has
serializing/deserializing methods that convert from and to the correct data
type). If the NODE_CLIENT does not receive the server’s data, it assumes that
the server is dead. (If the client is disconnected, the following will not
affect the algorithm in any way except for losing this client’s contribution)
Accordingly, the client
will broadcast a vote message and listen for the votes. When a broadcast
message is received, even by the server, the recipient NetNode will send its
vote (ie: “msg:vote127.0.0.1”) and wait for a reply.
The client that called for the vote will collect all the IPs in an IpList and
after the vote time ends, it will message the NetNode with the highest IP (who
is currently waiting for a message) that it (the waiting node) is the server.
When such a message is received, the node will broadcast that it’s the server,
and the recipients will update the server’s IP (for future use).
In the other hand, if a
NedNode is a NODE_SERVER, it has to wait for clients’ connections and work on
the genetic algorithm in a separate thread. The server will not respond to the
client explicitly. However, it’ll broadcast its data whenever a client asks for
a server. This will help the “late” generations (since we’re using an
unreliable protocol, UDP) to catch up by checking the newest version of the
“best Chromosomes”. The server has also to run a timer that, whenever it ticks,
the server has to broadcast again. If the client is sending data, the server
will update his Population accordingly.
Figure 3.1: The Flow

4- Tests (Ideas)
Since the NetNodes’s type
are dynamic that is, the client can become a server, we cannot test the program
on a single computer. For the listening port will be reused as a transmission
port hence failing to bind a socket. A NetNode, since we’re working with
broadcasts, should also ignore any packet whose IP is ours.
Accordingly, the program
was tested “by chunks”. Debugging distributed systems is not an easy job,
especially if the system is as dynamic as ours.
The system fires the
correct events when messages are received. However, we could not test whether
the voting algorithm is working or not.
We also wanted to test
the behavior of the whole system (convergence wise) when more nodes are added,
if the network is “dynamic”, that is, a lot of node keep joining or quitting
the network. Unfortunately, neither time nor resources were provided.
Another interesting test
would be to compare this distributed system with a client-server based system.
Since the degree of failure in a LAN is very low, we might be wasting resources
by waiting for a server’s failure all time.
5- Conclusion
If only one NetNode is running, the program still succeeds to
converge since the fitness function is easy to compute and the Chromosomes are
of short length (which makes it easier to decode them and therefore evaluate
their fitness faster). However, if we have NP-Hard problems for which no
algorithms currently exist or problems that require a big amount of
computational power, it’s obvious that we want to share the problem with many
other NetNodes. An interesting point that might have come to our minds but
haven’t discussed yet in the paper is consistency. Notice that we do not need
to check for consistency in distributed genetic algorithms since we’re working
on optimization problems. Of course, this project needs a lot of improvement,
and it was designed to be easily modified. We might have failed to demonstrate
the distributed system criterion of the project since no tangible tests were
made, but still the theoretical ideas and design exits and are challenges for
future development and improvement.
References
[1] Turing, A.M: Computing machinery and
intelligence (1950)
[2] Negivinski, M.: Artificial Intelligence
(2002)
[3] Russel, S. and Norvig, P.: Artificial Intelligence A Modern approach (2004)
[4] Tanenbaum, A.S and Van Steen, M:
Distributed Systems (2003)