Distributed Genetic Algorithms

Abdo Achkar

Fall 2005

Distributed Systems

 

 

 The Presentation

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Introduction:

 

     According to Anaxagoras, the man is the most intelligent among animals because he has hands. Alan Turing also, speaking about evolution, states that the form of a system is irrelevant to its intelligence [1]. Which leads us to the question, if humans’ hands are a result of evolution, can evolution be intelligent?

 

In order to answer this question, one will have to define intelligence or, like Alan Turing did, assume that every system that behaves in a human-like manner, is intelligent.

Since evolution’s concept is hard to compare to human’s behavior, one might want to consider evolution’s results: converging chromosomes, or bit strings, for a certain problem. The problem is approached in a hill-climbing-like fashion: whenever a fitter chromosome is encountered, it is placed in the generation. In order to evolve, the population will have to reproduce, be mutilated, and crossed-over. For more information about evolutionary computation, refer to [2].

 

The hill-climbing algorithm always looks forward to maximizing (or minimizing) an outcome. The algorithm might fall on a local maximum/minimum, a ridge and plateaux [3].

 

 

 

 

 

 

 

 

 

 

 

1- Project’s Use Case

 

     Genetic algorithms are indeed hill-climbing algorithms. And, even though their representation is completely different, they tend to maximize or minimize some objective function, and have their own ways to escape false convergence: Mutations and crossings-over. We usually use this approach when a problem has no algorithmic solution, ie: NP-Hard problems or, the algorithmic solution is not practical.

 

In addition, by making the problem distributed, we make it less prone to false convergence (same concept as simulated annealing. Refer to [4]). We also assume more computing power in optimization problems leads to faster convergence. No tests were made, even though the point of the project was to somehow prove this concept, because of lack of time (Which is not a good reason but a valid one).

 

Accordingly, the program considers the following input: A Population of Chromosomes that have a “fitness function”. The output is the best Chromosome we can reach after a specified amount of time. For instance, as an application for this problem, we consider a theoretical animal that has a weight W and n number of legs where n can be any number between 0 and a maximum number LIMIT_LEGS_NUMBER. Each leg has an initial oriented angle a0 that it forms with the horizontal and a final angle af. We assume the leg reaches a maximum “velocity” (even though a rotational velocity should have been considered) at the final angle. The leg has a weight and a length that have upper limits too.

The distributed genetic algorithms program simulates the evolution of this animal and returns the fittest (fastest) animal after a certain number of generations. Image 1.1

 

Image 1.1: Use Case Diagram

 

 

 

 

 

2- The Object Oriented Design

 

     In order to make the system expandable, extensible and reusable, we considered the following design for the distributed system. Image 2.1

 

 

Image 2.1: The Object Oriented Design

 

 

Whenever a NetNode is initialized, its NODE_TYPE is a NODE_CLIENT. The program starts by calling NetNode’s run() method which creates instances of DatagramBroadcaster in order to check for a coordinator and a DatagramListener to wait for a reply. The DatagramListener takes the portNumber we want to listen to, the listen time (use -1 to make it listen endlessly), the type of messages we’re waiting for (i.e: MsgClass.ANY_BODY_HERE) and a reference to the parent NetNode. The DatagramListener will pass the references to the DatagramListenerWorker (extends thread) whose job is to listen according to the given parameters. Whenever a packet is received, it’s passed to the PacketHandler whose job will be discussed next. So basically, almost all our event handling (Packet reception) will be handled in the PacketHandler.

 

 

Image 2.2: The PacketHandler’s job.

 

Any packet received by the PacketHandler is passed to the MsgParser which returns a message that describes the packet’s contents. The packet handler fires events accordingly. For instance, if a packet containing the message “msg:anybody here” is received at the PacketHandler level and passed to the MsgParser, the event “MSG_ANYBODY_HERE” will be fired next by calling the method doCommand(msg, DatagramPacket);

 

A problem is encountered when no packets are received and events need to be fired. For example, a NetNode is waiting for server’s data and a timeout has occurred. The ListenerWorker will have to handle these timeouts. (We could have designed a TimeoutHandler, but this was realized a little bit late).

 

 

3- A Closer Look to the Flow

 

The program’s loop starts when the NetNode’s run() method is first called. A DgramListener will be created and ListenerWorker will be assigned a listen job that is handled by the PacketHandler is a packet is received within the allotted time or by the worker itself if a timeout occurs. This determines the type of the NetNode: a NODE_SERVER or a NODE_CLIENT.

 

If a NetNode is a NODE_CLIENT, it has to wait for server’s data and run the genetic algorithm in a separate thread. If the NODE_CLIENT receives the data, they give it to the Population that decides, according to the fitness function, whether to add it to the population or not. (Note that the population has serializing/deserializing methods that convert from and to the correct data type). If the NODE_CLIENT does not receive the server’s data, it assumes that the server is dead. (If the client is disconnected, the following will not affect the algorithm in any way except for losing this client’s contribution)

Accordingly, the client will broadcast a vote message and listen for the votes. When a broadcast message is received, even by the server, the recipient NetNode will send its vote (ie: “msg:vote127.0.0.1”) and wait for a reply. The client that called for the vote will collect all the IPs in an IpList and after the vote time ends, it will message the NetNode with the highest IP (who is currently waiting for a message) that it (the waiting node) is the server. When such a message is received, the node will broadcast that it’s the server, and the recipients will update the server’s IP (for future use).

 

In the other hand, if a NedNode is a NODE_SERVER, it has to wait for clients’ connections and work on the genetic algorithm in a separate thread. The server will not respond to the client explicitly. However, it’ll broadcast its data whenever a client asks for a server. This will help the “late” generations (since we’re using an unreliable protocol, UDP) to catch up by checking the newest version of the “best Chromosomes”. The server has also to run a timer that, whenever it ticks, the server has to broadcast again. If the client is sending data, the server will update his Population accordingly.

 

 

Figure 3.1: The Flow

 

 

 

 

 

4- Tests (Ideas)

 

Since the NetNodes’s type are dynamic that is, the client can become a server, we cannot test the program on a single computer. For the listening port will be reused as a transmission port hence failing to bind a socket. A NetNode, since we’re working with broadcasts, should also ignore any packet whose IP is ours.

 

Accordingly, the program was tested “by chunks”. Debugging distributed systems is not an easy job, especially if the system is as dynamic as ours.

 

The system fires the correct events when messages are received. However, we could not test whether the voting algorithm is working or not.

 

We also wanted to test the behavior of the whole system (convergence wise) when more nodes are added, if the network is “dynamic”, that is, a lot of node keep joining or quitting the network. Unfortunately, neither time nor resources were provided.

 

Another interesting test would be to compare this distributed system with a client-server based system. Since the degree of failure in a LAN is very low, we might be wasting resources by waiting for a server’s failure all time.

 

5- Conclusion

 

     If only one NetNode is running, the program still succeeds to converge since the fitness function is easy to compute and the Chromosomes are of short length (which makes it easier to decode them and therefore evaluate their fitness faster). However, if we have NP-Hard problems for which no algorithms currently exist or problems that require a big amount of computational power, it’s obvious that we want to share the problem with many other NetNodes. An interesting point that might have come to our minds but haven’t discussed yet in the paper is consistency. Notice that we do not need to check for consistency in distributed genetic algorithms since we’re working on optimization problems. Of course, this project needs a lot of improvement, and it was designed to be easily modified. We might have failed to demonstrate the distributed system criterion of the project since no tangible tests were made, but still the theoretical ideas and design exits and are challenges for future development and improvement.

 

 

References

 

[1] Turing, A.M: Computing machinery and intelligence (1950)

[2] Negivinski, M.: Artificial Intelligence (2002)

[3] Russel, S. and Norvig, P.: Artificial Intelligence A Modern approach (2004)

[4] Tanenbaum, A.S and Van Steen, M: Distributed Systems (2003)