Kai Staats: writing

When the Moon Turns Red

Lunar Eclipse 2015 by Kai Staats

The photographs were taken between 3:15 and 4:20 am in Muizenberg, Cape Town, South Africa. The cloud cover came and went, at times totally blocking the view. Unfortunately, as the Moon neared totality, the mist was heavy (hence the soft image). The final shot of the Moon resting on the adjacent building was captured only seconds after the clouds dissipated one last time. Totality was missed from this vantage point, but the total experience was mesmerising.

Canon 60D
Nikkor 80-200mm lens (circa 1980) with Nikon/Canon adapter
ISO: 400 – 1000
Exposure: 1/200 – 2 seconds

Posted September 28th, 2015 | 2015, Looking up!, Out of Africa

GP update 2015 09/25

(email to my fellow researchers)

Today marked the first official day of Karoo GP processing KAT7 data.

My first run was with depth 5 trees against 10,000 lines of data with 5 features. The multi-core functionality saved my recursive ass as the first 10 generations of 100 trees took just over 5 hours to process.

At the end of this minimisation run, 3 trees were presented as having the best fitness, 2 of which shared the same polynomial expression. I think that is a good sign, but am not certain yet.

Precision was 86% for both; recall was quite a bit less.

I sent the first run back into another 20 generations (a new feature I added to Karoo GP this week which allows you to continue the evolution indefinitely), and started another run with the same settings, to see if it converges on anything close to the first set of equations.

Will find out on Monday …

kai

Posted September 25th, 2015 | Ramblings of a Researcher

Lost days

Sometimes I spend an entire day searching for a place to work.

I venture to two or three cafes.

The first is too noisy, with traffic, construction, and trains screaming by.

I order a drink and dessert at the next, only to learn the wi-fi is intermittent at best; twenty minutes are lost trying to connect. I do what I can off-line.

At the final stop I find a cozy, relatively quiet corner, only to realise there is no power outlet and my laptop has but 12% battery remaining.

I know. It sounds silly to complain about such things, for life is far more dynamic than this. But when what I do requires that I am on-line, days like today are days lost to a modern quagmire.

Posted September 23rd, 2015 | Critical Thinker, Humans & Technology

GP update 2015 09/13

(email to my fellow researchers)

My classification TEST suite is complete, producing Accuracy, Precision, and Recall scores on associated trees.
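For reference, the three scores relate to the confusion matrix as follows — a minimal sketch with made-up labels, using scikit-learn rather than Karoo GP's internal code:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)    # (TP + TN) / all predictions
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
print(acc, prec, rec)
```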

I have a very basic evaluation built for the Abs Value (minimization) function. Not really sure what one usually uses to test one of these, other than comparing the distance from the known solution to the produced result.
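For what it is worth, the usual choice for such a minimisation kernel is exactly that distance: the summed absolute error between each tree's output and the target — a sketch (the names are mine, not Karoo GP's):

```python
def abs_error_fitness(predictions, targets):
    '''Summed absolute error: 0.0 is a perfect fit, lower is better.'''
    return sum(abs(p - t) for p, t in zip(predictions, targets))

# A candidate tree's outputs vs the known solution
print(abs_error_fitness([1.0, 2.5, 3.0], [1.0, 2.0, 4.0]))  # 1.5
```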

Spent a few hours dealing with a ‘zoo’ (SymPy’s notation for complex infinity, the result of a divide-by-zero). It seems SymPy is willing to carry divide-by-zero expressions as long as you don’t attempt to process them as a float. Then it freaks out. So I had to intercept the polynomial processing with a str() test for ‘zoo’.
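The interception can be as small as a string test on the expression — a sketch of the idea, not the exact Karoo GP code:

```python
import sympy

expr = sympy.sympify('1/(x - x)')  # x - x collapses to 0, yielding zoo

# float(expr) would raise an error here, so test the string form first
if 'zoo' in str(expr):
    result = 0.0  # substitute a default / penalty value instead of crashing
else:
    result = float(expr)

print(result)
```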

Anyone ever tried a Google search for “python zoo”?

Finally, I need to apply the sklearn split function across my data. The framework is in place (already modified the way the data is passed through the entire script to accommodate both TRAINING and TEST).
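The split itself is a single call — a sketch with made-up data (today scikit-learn houses the function in sklearn.model_selection; in 2015 it lived in sklearn.cross_validation):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 100 rows, 5 features, binary class labels
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, 100)

# Hold out 30% of the rows as TEST, keep the rest for TRAINING
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print(X_train.shape, X_test.shape)  # (70, 5) (30, 5)
```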

Should be easy. (stupid last words)

kai

Posted September 13th, 2015 | Ramblings of a Researcher

Normalisation is abnormal

(sitting at AIMS)

From 10 am till 2 pm, this is what I built. Argh! Should not have been so hard.

def fx_normalize(self, array):
    norm = []
    array_min = np.min(array)
    array_max = np.max(array)

    for col in range(1, len(array) + 1):
        n = float((array[col - 1] - array_min) / (array_max - array_min))
        norm = np.append(norm, n)

    return norm
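With hindsight, NumPy broadcasting collapses the loop into a single expression — an equivalent sketch (a guard for a constant array would still be needed, since max == min divides by zero):

```python
import numpy as np

def fx_normalize_vectorized(array):
    '''Scale a 1D array into [0, 1] via broadcasting.'''
    array = np.asarray(array, dtype=float)
    return (array - array.min()) / (array.max() - array.min())

print(fx_normalize_vectorized([2, 4, 6, 10]))
```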

… but now, my function appears completely different, the curve of the line gone. Is it just the scale? Yes, further testing confirms this. Good.

Now I will return to my work with Accuracy, Precision, and Recall.

Posted September 10th, 2015 | Ramblings of a Researcher

Frustration in the easy things

(sitting at SAAO)

Per my work at SKA yesterday, I learned to use matplotlib to produce 3D plots of my functions in combination with a scatterplot of the Iris features. I was hoping to automate the solving for any given variable using SymPy, but have not found a means to that end. For now, I will manually rearrange each algebraic expression. Not ideal, but I need to move ahead; I have spent too much time on this already.
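A 3D scatter of two features against a function of them takes only a few lines of matplotlib — a generic sketch (the feature names and function are placeholders, not the actual SKA plotting code):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen, no display required
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: registers the 3d projection

np.random.seed(0)
a = np.random.rand(50)  # stand-in feature 1
b = np.random.rand(50)  # stand-in feature 2
z = a**2 + b            # stand-in for an evolved expression f(a, b)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(a, b, z, c=z)
ax.set_xlabel('feature a')
ax.set_ylabel('feature b')
ax.set_zlabel('f(a, b)')
fig.savefig('scatter3d.png')
```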

I recognise that plotting is core to any modern research. I feel far behind, but know I will come up to speed quickly. Between the older gnuplot, matplotlib, and SymPy's plot functions, there are myriad approaches (too many, in fact).

As I have many times experienced over the past year, every day I am humbled by the challenge of learning something new, and at the same time rewarded by the same. Each day feels incremental, but when I look back from my very first line of Python a little over a year ago to now, over 2500 lines of object-oriented, multi-core code with a home-built NumPy array management system, yeah, I've learned a ton.

Posted September 9th, 2015 | Ramblings of a Researcher

My brain barrier

I pushed too hard, tried to do too many things at once. I felt the barrier coming, but ignored it, trying to make one more breakthrough in my code.

Last night and today I am learning how to plot multi-dimensional data. In practice, this is very simple. But this has been a real struggle for me.

I promised myself that in this Masters I would not take any shortcuts, that what I learn would be fully integrated into my understanding. Due to my lack of a mathematical science foundation, some of the core principles of data reduction, statistical application, and multi-dimensional visualisation are totally new to me.

What’s more, my anxiety literally heats my brain to a point of dysfunction; the back of my neck feels like someone is pouring hot syrup down my spine (since 2011). When I feel stupid, when I can’t figure something out, and when the pressure is on … I spiral.

I feel as though I lost a full day (which I can’t afford to do).

On the walk to the train station I asked Arun a number of questions to clarify my understanding. The physical act of walking, gesturing with my arms and hands helped tremendously. Arun is always patient, and an excellent explainer. A few key concepts became clear and it fell into place for me.

Posted September 8th, 2015 | Ramblings of a Researcher

Return to Floating point hell

(sitting at SKA again)

No choice but to return to floating points again, as once in every several thousand trees, the stored fitness score has expanded beyond the set 4 decimal places, resulting in something like n.123456e (scientific notation).

I have searched on-line, dug deep into Stack Overflow, and asked my colleagues. No one has a means by which I can define a variable with a fixed number of decimal places (to the right of the decimal point) that stays that way. Ugh!

I therefore introduced a second round function in the Tournament method, to force the numbers back into spec. I have not since experienced the same error. This is not the ideal solution as the round function introduces its own issues, but for the sake of GP, it does the trick.
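The fix amounts to re-quantising each fitness score before comparison. Python's round() does this on binary floats; the decimal module makes the intent explicit — a sketch of the idea, not the actual Tournament code:

```python
from decimal import Decimal, ROUND_HALF_UP

def quantize_fitness(score):
    '''Clamp a fitness score to 4 decimal places, killing scientific notation.'''
    q = Decimal(str(score)).quantize(Decimal('0.0001'), rounding=ROUND_HALF_UP)
    return float(q)

print(quantize_fitness(0.123456789))  # 0.1235
print(quantize_fitness(1.23456e-05))  # 0.0
```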

I then approached the issue of plotting 2D and 3D data, but hit a major mental block. Like an old engine with leaky coolant, my brain overheated and I had a bad day. Sorry Arun, I was really grumpy.

Posted September 8th, 2015 | Ramblings of a Researcher

Flowers for Algebra

(sitting at SKA)

It is my intent to complete the classification test suite today, using the benchmark Iris dataset.

The Iris dataset offers 4 features (columns in a .csv) for each of 3 unique plant species, as labelled in the right-most (5th) column. As 2 of these species are not discernible from each other with a linear function, there are 2 ways to approach this problem:

  1. Compare only 2 species at a time (in a round-robin) such that we work with A/B, A/C, and finally B/C; or
  2. Seek a non-linear function with a kernel which supports more than 2 classes at a time
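The round-robin in option 1 can be generated mechanically — a sketch using itertools, with the species names from the Iris dataset:

```python
from itertools import combinations

species = ['setosa', 'versicolor', 'virginica']

# Every unordered pair: the A/B, A/C, and B/C comparisons
pairs = list(combinations(species, 2))
print(pairs)
# [('setosa', 'versicolor'), ('setosa', 'virginica'), ('versicolor', 'virginica')]
```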


As Karoo GP supports only 2 classes at this time, I am going to work with the first option, and come back to the second when I have time to build a new kernel.

If it is our intent to plot the results, we can engage only 3 dimensions of data at once (without forcing higher dimensions into lower orders). However, the Iris dataset is composed of 4 features (a, b, c and d). As whatever holds true in lower dimensions is retained in higher dimensions, it is safe to test a lower order of the featureset, as long as the features we select are decisive in the identification of the flower species. As such, we remove column d from all 3 comparisons.

This resulted in what appears to be a fully functional classification by Karoo GP, with both linear and non-linear functions that produce 100% (50/50) correct classifications.

Posted September 7th, 2015 | Ramblings of a Researcher

Kepler’s Law resolved with GP!

Kepler's Law resolved (Karoo GP, Kai Staats)

(Kepler’s 3rd Law of Planetary Motion table by the Physics Classroom)

(continued from Premature convergence)

Working from AIMS and my apartment these past two days, I was able to resolve a persistent floating point issue by employing a round function before the fitness evaluation.

I also fixed the minimisation function with the discovery of 2 copy/pasted lines of code I had apparently failed to come back to. It appears this has not been working for some time, as I have been focused on other aspects of the code.

Finally! Just like that! Karoo GP now resolves Kepler’s 3rd Law of Planetary Motion! YES!!!

I ran it with the default Depth 3 and minimum node count of 3 and again, it came up with t/t = 1. So I ran it again with Depth 5 [2^(d+1)-1 = 63 possible nodes] and minimum node count of 9.

While it struggled for the first 5-6 generations, converging on what appeared to be 1 again, some mutation gave it the correct answer and in just 3 generations the correct trees dominated! Coooool!

{1: t**2/r**3,
2: t**2/r**3,
3: t**2/r**3,

88: t**2/r**3,
92: t**2/r**3,
94: t**2/r**3}
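That evolved expression can be sanity-checked against real orbital data: with the period t in Earth years and the semi-major axis r in AU, Kepler's 3rd Law says t²/r³ should come out near 1 for every planet:

```python
# Orbital period t (Earth years) and semi-major axis r (AU)
planets = {
    'Earth':   (1.000, 1.000),
    'Mars':    (1.881, 1.524),
    'Jupiter': (11.862, 5.203),
}

for name, (t, r) in planets.items():
    print(name, t**2 / r**3)  # ~1.0 for each planet
```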

This proves 2 of the 3 desired functions: regression maximisation (match) and regression minimisation. Only classification remains to prove Karoo GP roadworthy.

Posted September 6th, 2015 | Ramblings of a Researcher