Kai Staats: writing

Smashing bugs

The past 3 days have been a fun dive back into my code. I discovered my code no longer worked with cos and sin, the matching fitness function (result = solution) failed with floats, and my minimisation function was selecting the final Tree, not the best. Ugh!

The float issue is one all programmers deal with: any variables which are compared against each other must be forced to the same number of positions to the right of the decimal point. FIXED!

The cos and sin were related to the float issue. FIXED!

The minimisation function was my own damned fault, as in my Tournament Selection I had copy/pasted a body of code with intent to then rework it from maximising to minimising, but got distracted, forgot (3-4 weeks ago?) and only tonight realised what was happening. FIXED!

Here are my first results from the working minimisation function. What remains is the automated selection of the best of the best, not just a list of the leaders in the final generation. Easily done.

Desired result:
a*b + c where 1*10 + 0.05 = 10.05

Trial 1:
a*b + c – c**3/b where 1*10 + 0.05 – (0.05^3/10) = 9.64

Trial 2:
a*b + c + c/b where 1*10 + 0.05 + 0.05/10 = 10.055
a*b where 1*10 = 10.00
a*b + c where 1*10 + 0.05 = 10.05 is CORRECT
a*b + c + c/a where 1*10 + 0.05 + (0.05/1) = 10.10

Trial 3:
a*b + c where 1*10 + 0.05 = 10.05 is CORRECT
a*b + 1/b where 1*10 + 1/10 = 10.10

The above test works perfectly. Now, I have only to test the Iris classification set and I will have all 3 fitness functions fully tested and working.

I am roughly 5 weeks behind schedule, but believe I can catch up as my code base is solid and designed for the rapid introduction of new fitness functions. In theory, this goes fairly smoothly after I prove the base works (why do I keep saying this?!)

Cool!

September 6th, 2015 | Ramblings of a Researcher

Floating points

Needing a break from working on the User Guide, I played with Karoo GP for 30 minutes and came back to testing cos and sin functions. They broke the Absolute Value function, while Classification and Matching still worked.

A day into the battle, a deeper issue has unfolded which keeps Karoo GP from working with any floats (this was tested a long time ago, but only with a very limited and controlled number of decimal places).

Now, I have learned even when [result = solution], it fails to match.

algo_sym a + b/c
result 0.44404973357
solution 0.4440497336

algo_sym a + b/c
result 0.690166666667
solution 0.6901666667

algo_sym a + b/c
result 2.70666517168
solution 2.7066651717

ARGH!!! Rounding errors!!!

I need to introduce careful control of the number of decimal places invoked for both ‘result’ and ‘solution’. For as much headache as this has caused, I am pleased I took that break and played around with my code again, else I would have run into this when I came back to KAT7 data.
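A minimal sketch of that control, assuming ten decimal places (the precision of the 'solution' values above). The function names matches and matches_tol are hypothetical, not taken from Karoo GP; the second is an alternative I am naming plainly, a tolerance test rather than rounding.

```python
import math

def matches(result, solution, places=10):
    """Compare two floats after rounding both to the same number of decimal places."""
    return round(result, places) == round(solution, places)

def matches_tol(result, solution, tol=1e-9):
    """Alternative: treat the values as equal when they differ by less than a tolerance."""
    return math.isclose(result, solution, rel_tol=0.0, abs_tol=tol)

# (result, solution) pairs copied from the output above
pairs = [
    (0.44404973357, 0.4440497336),
    (0.690166666667, 0.6901666667),
    (2.70666517168, 2.7066651717),
]

for result, solution in pairs:
    # raw comparison, rounded comparison, tolerance comparison
    print(result == solution, matches(result, solution), matches_tol(result, solution))
```

Each pair fails the raw == test but passes both of the others. Rounding can still split two values that sit right on a rounding boundary, which is why a tolerance test is often the safer of the two.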

What’s more, this might help Karoo GP solve the Kepler planet problem (which it still fails at, miserably).

September 5th, 2015 | Ramblings of a Researcher

A simple equation

Depth 10 GP Tree by Kai Staats

(sitting in the bookstore in Kalk Bay)

This is my first time working on Karoo GP since my daughter Lindah’s arrival in South Africa nearly two weeks ago. She departed just yesterday. Our time together was incredible. We both learned so much. I am so sad to see her go :(

My intent is to complete the User Guide by the close of the weekend. I was able to complete all in-line documentation for the main script. I am now in the process of completing the in-line documentation for the base_class.

In so doing, I derived the following simple but quite useful equation to determine the maximum number of nodes (assuming a Full tree) in any given GP tree:

nodes = 2^(depth + 1) - 1

For example:

Depth 1 = 3 nodes (1 function, 2 terminals)
Depth 2 = 7 nodes (3 functions, 4 terminals)
Depth 3 = 15 nodes (7 functions, 8 terminals)

Depth 10 = 2047 nodes (1023 functions, 1024 terminals) *

* that is one huge-ass (scientific lingo) polynomial!
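The equation is easy to check in a few lines of Python. This is a sketch which assumes every function takes exactly two arguments (which is what the equation implies); the names max_nodes and full_tree_counts are mine, for illustration only.

```python
def max_nodes(depth):
    """Maximum number of nodes in a Full binary tree of the given depth."""
    return 2 ** (depth + 1) - 1

def full_tree_counts(depth):
    """Split the total into (functions, terminals) for a Full binary tree."""
    functions = 2 ** depth - 1  # every node above the bottom level
    terminals = 2 ** depth      # the bottom level itself
    return functions, terminals

for depth in (1, 2, 3, 10):
    functions, terminals = full_tree_counts(depth)
    # the two parts must always add up to the total
    assert functions + terminals == max_nodes(depth)
    print(depth, max_nodes(depth), functions, terminals)
```

The reasoning: level d of a Full binary tree holds 2^d nodes, so summing levels 0 through depth gives 2^(depth + 1) - 1, and the bottom level alone (the terminals) is always one more than everything above it.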

September 4th, 2015 | Ramblings of a Researcher

Clouds over Sutherland

I stand in the cool breath of an amorphous white world.

Unable to see but a few meters to my front, rear, and sides.

The silver, white domes present themselves as subtle outlines,
shimmering into and out of view.

Yet, sometimes, it seems, they are solid

and it is I who disappears.

August 30th, 2015 | 2015, Out of Africa, The Written

Multi-core GP!

August 15-20

I was able to introduce multi-core functionality using the pprocess multi-core library (Thanks for the pointer Arun!). This effort required roughly four days of research and coding, as it was my very first go. In the end, it is relatively simple, depending upon the code in which it is introduced. I included a user-defined core quantity, and the ability to modify the number of cores engaged during runtime.

The key to multi-core functionality is getting your head wrapped around protected memory spaces. That is, each core assigned by pprocess (or any multi-core library) will reproduce a fully functional copy (instance) of the section of your code which you are spawning on each core.

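Each of these instances can read from any global variable, no problem. However, they cannot write back to global variables: each instance works on its own copy, so a change made on one core never reaches the original or the other instances. (If they did all share one variable, they would try to write back to it at the same time, known as a race condition. Bad things would happen.)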

It is imperative to keep in mind that what happens in each instance will not affect what happens in the other instances, on the other cores. Once spawned, they are all independent even though they work with the same variable names and are processed by copies of the same code.

If the returned values are to affect something, that something must be outside of the multi-core environment, post-collection. This may require some rearranging of your code.

For me, it meant that instead of a fitness = fitness + 1 inside the pprocess pipeline, I returned the single value of fitness for each instance on each core, and then conducted the sum function when I collect the results from pprocess.

Below I offer an example of the original for loop and the pprocess which replaces it, for multi-core support. I have left the single core for loop in place as a user invoked bypass of pprocess when the overhead of multi-core processing reduces performance (common on non-CPU intensive runs or on a limited number of cores).

In my code, each GP tree must evaluate all rows in the given data. I therefore employ a for loop to iterate through each row in a previously loaded .csv file (not shown in this example). The Python method (function) fitness_eval() is what conducts the evaluations.

As fitness_eval itself calls another two Python methods (3 methods in total), the key was to rewrite my code such that any global variables for which values are updated (written to) were made local instead, passed from method to method directly. In the end, this makes for better code, forcing me to rethink the way I had designed this and other sections.

import pprocess as pp # the 'pprocess' multi-core library

fitness = 0 # running total of the fitness score for this tree

if cores == 1: # employ only one CPU core and bypass 'pprocess'
   for row in range(0, data_rows): # increment through all rows in the data
      fitness = fitness + fitness_eval(row) # evaluate tree fitness

else: # employ multiple CPU cores using 'pprocess'
   results = pp.Map(limit = cores) # collects the values returned by each instance
   parallel_function = results.manage(pp.MakeParallel(fitness_eval)) # wrap fitness_eval
   for row in range(0, data_rows): # increment through all rows in data
      parallel_function(row) # evaluate tree fitness

   fitness = sum(results[:]) # 'pprocess' returns the fitness scores in a single dump

In this example, the pprocess method parallel_function() is a wrapper for my original method fitness_eval(), such that I pass the incrementing variable row through parallel_function(row) which in turn hands it off to multiple copies of fitness_eval(). Cool, huh?! :)

One benefit of pprocess, as compared to the Python multiprocessing library, is that pprocess can pass more than one variable through the called methods without using pickle (a module used to send variables to and from methods while in multi-core memory spaces). So, you can treat your methods just as you would on a single core, sending and returning variables using method(var_1, var_2, var_n) and subsequent return var_1, var_2, var_n.

But keep in mind what I shared above, about how those variables can only return an isolated value per instance, as changes to each of those variables on each core will not affect the other instances. Make sense?
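For readers without pprocess, the same return-then-sum pattern can be sketched with Python's standard multiprocessing library. To be clear, this is a swapped-in technique and not the pprocess code above; fitness_eval here is a hypothetical stand-in that scores one row, and tree_fitness is a name I made up for the example.

```python
from multiprocessing import Pool

def fitness_eval(row):
    """Hypothetical stand-in: score one row of data and return that score."""
    a, b, c, solution = row
    result = a * b + c  # the tree under test, hard-coded for this sketch
    return 1 if round(result, 10) == round(solution, 10) else 0

def tree_fitness(rows, cores=1):
    """Sum the per-row scores, on one core or spread across several."""
    if cores == 1:
        # single-core bypass, as in the for loop above
        return sum(fitness_eval(row) for row in rows)
    # each worker returns its own value; nothing is written to a shared variable
    with Pool(processes=cores) as pool:
        scores = pool.map(fitness_eval, rows)
    # the sum happens here, after collection, outside the multi-core environment
    return sum(scores)
```

A call such as tree_fitness(rows, cores=4) should sit under an if __name__ == "__main__": guard, which multiprocessing needs on platforms that start each worker as a fresh Python process.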

Many thanks to the Stackoverflow community and Paul Boddie, co-author of pprocess.

August 20th, 2015 | Ramblings of a Researcher

Premature convergence, part 2

(continued from GP update 20150813)

I see 4 ways to deal with the premature convergence:

Karoo GP, premature converge by Kai Staats

a) Take a pill.

b) If any given tree falls below the user-defined number of nodes (node count, not depth count), that tree is forced to mutate over and over again ’till it is at or above the prescribed node count. This feels convoluted, as this is not how it happens in the biological world.

c) Nudge the fitness function (higher or lower, depending upon max or min function) such that a tree whose node count is below the user-defined number is less likely to be selected in a Tournament.

d) Simply block any tree whose node count is lower than the user-defined number from entering a tournament. As all four of my mutation types are channelled through tournament selection, this is an easy, 2 line solution.
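Option (d) might look something like the sketch below. This is not Karoo GP's code: the dictionary layout, the function name tournament, and the fitness numbers are all made up for illustration, and a minimisation problem is assumed.

```python
import random

def tournament(population, tourn_size, min_nodes, rng=random):
    """Pick the fittest of a random draw, skipping trees that are too small."""
    # the blocking step: only trees at or above the node count may enter
    eligible = [tree for tree in population if tree["nodes"] >= min_nodes]
    if not eligible:
        raise ValueError("no tree meets the minimum node count")
    entrants = rng.sample(eligible, min(tourn_size, len(eligible)))
    # minimisation: the smallest fitness score wins
    return min(entrants, key=lambda tree: tree["fitness"])

# made-up population; node counts include both functions and terminals
population = [
    {"expr": "p/p", "nodes": 3, "fitness": 0.05},
    {"expr": "p*p/r", "nodes": 5, "fitness": 0.40},
    {"expr": "p*p/(r*r*r)", "nodes": 9, "fitness": 0.02},
    {"expr": "p + r/p", "nodes": 5, "fitness": 0.90},
]

winner = tournament(population, tourn_size=3, min_nodes=5)
print(winner["expr"])
```

With min_nodes set to 5 the 3-node p/p can never win, however good its score. If the whole population falls below the limit there is nothing left to select from, which the sketch reports as an error rather than quietly relaxing the rule.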

Is it real-world? Is it any different than applying a maximum depth?

Hmmmm …

(resolved in Kepler’s Law resolved by GP!)

August 14th, 2015 | Ramblings of a Researcher

GP update 2015 08/13

(email to my fellow researchers)

Subject: premature convergence

No, this email is not about some “teenage” problem :)

Per my conversation with Emmanuel today, the classic Kepler’s 3rd Law of Planetary Motion (called the “Law of Harmonies”) states that the square of the period divided by the cube of the mean radius from the center of the Sun to the center of the planet is the same for every planet.
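A quick way to see the law at work, as a sketch: with the period in Earth years and the mean radius in astronomical units, the ratio comes out close to 1 for each planet. The three rows below are approximate textbook values I am supplying for illustration; they are not the dataset used with Karoo GP, and harmonic_ratio is a hypothetical name.

```python
# (name, period in Earth years, mean orbital radius in astronomical units)
# approximate textbook values, supplied for illustration only
planets = [
    ("Earth", 1.000, 1.000),
    ("Mars", 1.881, 1.524),
    ("Jupiter", 11.862, 5.203),
]

def harmonic_ratio(period, radius):
    """Square of the period divided by the cube of the mean radius."""
    return period ** 2 / radius ** 3

for name, period, radius in planets:
    print(name, round(harmonic_ratio(period, radius), 3))
```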

My new “minimisation” function is working beautifully! However, maybe it works a little too well. Karoo GP is quickly converging on p/p (period divided by period) which of course equals 1, where 6 of the 9 planets are defined as 1.0 in the dataset I am using (the other 3 planets are 0.99 or 0.98).

So, I am going to introduce a new user defined “Min Nodes” which will set the minimum number of nodes (elements) in the GP tree (equation). I feel like this is cheating, but Emmanuel confirmed that most GP problems require some tweaking of the code.

There are 2 ways a large tree can very quickly get smaller: Grow Mutation or Cross-Over mutation (Reproduction, Point Mutation, and Full Mutation do not alter Tree size). I can intercept either of these and force them to evolve again and again until the new tree is above the min boundary; or simply invoke an artificial fitness boost in the right direction.

What would Darwin say? What would Gandhi do?

It seems that invoking evolutionary pressure is the one more like the “real-world”, no? To force something to evolve again and again until it satisfies certain criteria is a bit convoluted. But any more so than defining a maximum depth?

We’ll see how I feel in the morning :)

kai

(continued in Premature convergence)

August 13th, 2015 | Ramblings of a Researcher

Minimise, Maximise

(sitting at SKA)

Struggling to get back into my code.

Preparing to test Kepler’s 3rd Law of Planetary Motion. This is a minimisation problem, meaning the best overall fitness will be the smallest number. I will employ the Absolute Value Difference fitness function I recently developed. Simple in theory, but always a few hidden challenges in implementation.

(later)

Realised I need to re-think all my fitness functions, and embed both minimisation and maximisation into the tournament and ‘fitness gym’. Will move to define each fitness kernel as ‘min’ or ‘max’ at the opening.
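One way this might look, as a sketch only: each kernel declares its direction once, and everything that ranks trees looks it up instead of hard-coding min or max. The kernel names, the FITNESS_KERNELS table, and pick_best are hypothetical, and treating the matching and classification kernels as 'max' is my assumption.

```python
# each fitness kernel declares, at the opening, which direction is better
FITNESS_KERNELS = {
    "abs_diff": "min",  # sum of absolute differences: smaller is better
    "match": "max",     # count of matches (assumed): larger is better
    "classify": "max",  # count of correct classifications (assumed): larger is better
}

def pick_best(trees, kernel):
    """Return the best tree according to the direction declared for the kernel."""
    direction = FITNESS_KERNELS[kernel]
    chooser = min if direction == "min" else max
    return chooser(trees, key=lambda tree: tree["fitness"])

# made-up scores for two trees
trees = [{"id": 1, "fitness": 0.2}, {"id": 2, "fitness": 5.0}]
print(pick_best(trees, "abs_diff")["id"], pick_best(trees, "match")["id"])
```

Keeping the direction in one place means a tournament copied from a maximising kernel cannot silently select the worst tree for a minimising one.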

August 12th, 2015 | Ramblings of a Researcher

The Bottle and the Waves

Today was the first day I have seen this sea with any waves. It is usually quite flat. There were people surfing (which seldom happens here, at any time of the year). I found it difficult to stand in the waves, as they had tremendous power on-shore (remember, the beach in Barceloneta is entirely man-made, and drops off very quickly).

Two drunk guys on the beach today, here in Barceloneta, between 7:30 and 8:00 pm. One was completely wasted and trying to get back into the water. His friend, who had just opened another bottle of beer, was blocking him, to the best of his ability.

The drunk guy (without the bottle in his hand) made it past his friend and fell face first into a wave. He lay there, face down, not moving while the wave tossed him a half meter high and low. As it tumbled him, he tried to stand but couldn’t get back to his feet.

His friend, still holding the beer, walked out into the water and tried to guide him back to shore. The next wave knocked him over as well, his beer now a mixture of salty water and brew.

The first guy was being tossed about as if in a washing-machine, mostly with his face under water. It was clear he was not going to get out again and would likely drown.

Thousands of people on the beach, yet no one doing anything. I ran down to the shore and waited. I did not want to go out into the water, as he could pull me under. The next wave tossed him to my feet, the water a half meter deep. I grabbed his shirt, lifted him, wrapped my arms under his and dragged him up onto the beach. He attempted to stand, stumbled, and I dragged him further up, to dry land.

Again, no one else assisting, but everyone watching.

He struggled to his feet, coughing, and tried to walk back into the water. I placed my leg behind his, pushed hard on his chest, and took him down. I threw his arm over his head and pinned it to the sand, placing my knee on his upper arm, the full weight of my body on his chest.

I yelled at him, “Hombre! No mas! No pincha mas, ok?!”

He nodded, but was still catching his breath.

His friend came alongside, bottle still in hand. I pointed to the bottle and told him to empty it. He did, on command, and then thanked me for helping his friend.

I held him in position until he stopped struggling, and asked his friend to look after him (for what that was worth). I stood, they both shook my hand. Ten minutes later, they were both wobbling around the beach again.

By the public showers, I watched, waiting to see what would happen. Someone offered them two Cokes. The drunk fell down and lay still.

July 30th, 2015 | From the Road

To Swim a Mile

Barcelona, Spain – July 20
I was in the ocean twice today, at 7:30 am and again this evening. The water is so incredibly warm. I have not experienced anything like this since Hawaii. Amazing. I swam nearly 1km today, with one break on the beach. I am not an efficient swimmer, having had no lessons since I was six years of age.

This evening I will watch a few Youtube videos to see if I can improve my strokes.

I typically move from breast stroke to side stroke to back stroke to the other side and breast again, essentially rolling as I go to give muscle groups a break. All my days in the turbulent surf at Muizenberg, even if on a surf board, have given me greater confidence in the ocean, and more stamina.

Today, I recognised that I had hit a swimming “high”, the sensation that I could go on forever. As with running, it took about 30-40 minutes for me to get past that first plateau, and then the breathing and rhythm came easily.

My goal is to swim 1 mile, without a break, before I leave Spain.

Barceloneta Bay, Kai Staats swims a mile

July 24
I accomplished my goal! I swam 1.65 km (1 mile) without stopping. Damn! It took over an hour. Might have been faster to crawl on all fours (backward), but I made it.

The stretch between the man-made break (upper right) and the coastline near the W Hotel (lower right) was a bit scary for me as I had never swum that distance before, unable to see the bottom or return to something safe.

However, what the satellite image does not show are a half dozen buoys, anchored by long chains to the ocean floor. If need be, I could have clung to one of them, each about 100 meters apart.

When I completed the lap, I felt as though I had been run over by a bus, went back to Matt’s flat and slept for an hour.

July 24th, 2015 | From the Road