Kai Staats: writing

When the Moon Turns Red

Lunar Eclipse 2015 by Kai Staats

The photographs were taken between 3:15 and 4:20 am in Muizenberg, Cape Town, South Africa. The cloud cover came and went, at times totally blocking the view. Unfortunately, as the Moon neared totality, the mist was heavy (hence the soft image). The final shot of the Moon resting on the adjacent building was captured only seconds after the clouds dissipated one last time. Totality was missed from this vantage point, but the total experience was mesmerising.

Canon 60D
Nikkor 80-200mm lens (circa 1980) with Nikon/Canon adapter
ISO: 400 – 1000
Exposure: 1/200 – 2 seconds

Posted September 28th, 2015 | 2015, Looking up!, Out of Africa

GP update 2015 09/25

(email to my fellow researchers)

Today marked the first official day of Karoo GP processing KAT7 data.

My first run was with depth 5 trees against 10,000 lines of data with 5 features. The multi-core functionality saved my recursive ass as the first 10 generations of 100 trees took just over 5 hours to process.

At the end of this minimisation run, 3 trees were presented as having the best fitness, 2 of which shared the same polynomial expression. I think that is a good sign, but am not certain yet.

Precision was 86% for both; recall was quite a bit less.

I sent the first run back into another 20 generations (a new feature I added to Karoo GP this week which allows you to continue the evolution indefinitely), and started another run with the same settings, to see if it converges on anything close to the first set of equations.

Will find out on Monday …

kai

Posted September 25th, 2015 | Ramblings of a Researcher

Lost days

Sometimes I spend an entire day searching for a place to work.

I venture to two or three cafes.

The first is too noisy, with traffic, construction, and trains screaming by.

I order a drink and dessert at the next, only to learn the wi-fi is intermittent at best; twenty minutes are lost trying to connect. I do what I can off-line.

At the final stop I find a cozy, relatively quiet corner, only to realise there is no power outlet and my laptop has but 12% battery remaining.

I know. It sounds silly to complain about such things, for life is far more dynamic than this. But when what I do requires that I am on-line, days like today are days lost to a modern quagmire.

Posted September 23rd, 2015 | Critical Thinker, Humans & Technology

GP update 2015 09/13

(email to my fellow researchers)

My classification TEST suite is complete, producing Accuracy, Precision, and Recall scores on associated trees.
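For reference, the three scores relate to the confusion matrix as follows — a minimal sketch with made-up labels, using scikit-learn rather than Karoo GP's internal code:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)    # (TP + TN) / all predictions
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
print(acc, prec, rec)
```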

I have a very basic evaluation built for the Abs Value (minimization) function. Not really sure what one usually uses to test one of these, other than comparing the distance from the known solution to the produced result.
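For what it is worth, the usual choice for such a minimisation kernel is exactly that distance: the summed absolute error between each tree's output and the target — a sketch (the names are mine, not Karoo GP's):

```python
def abs_error_fitness(predictions, targets):
    '''Summed absolute error: 0.0 is a perfect fit, lower is better.'''
    return sum(abs(p - t) for p, t in zip(predictions, targets))

# A candidate tree's outputs vs the known solution
print(abs_error_fitness([1.0, 2.5, 3.0], [1.0, 2.0, 4.0]))  # 1.5
```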

Spent a few hours dealing with a ‘zoo’ (SymPy’s notation for complex infinity, the result of a divide-by-zero). It seems SymPy is willing to carry divide-by-zero expressions as long as you don’t attempt to process them as a float. Then it freaks out. So I had to intercept the polynomial processing with a str() test for ‘zoo’.
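The interception can be as small as a string test on the expression — a sketch of the idea, not the exact Karoo GP code:

```python
import sympy

expr = sympy.sympify('1/(x - x)')  # x - x collapses to 0, yielding zoo

# float(expr) would raise an error here, so test the string form first
if 'zoo' in str(expr):
    result = 0.0  # substitute a default / penalty value instead of crashing
else:
    result = float(expr)

print(result)
```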

Anyone ever tried a Google search for “python zoo”?

Finally, I need to apply the sklearn split function across my data. The framework is in place (already modified the way the data is passed through the entire script to accommodate both TRAINING and TEST).
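The split itself is a single call — a sketch with made-up data (today scikit-learn houses the function in sklearn.model_selection; in 2015 it lived in sklearn.cross_validation):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 100 rows, 5 features, binary class labels
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, 100)

# Hold out 30% of the rows as TEST, keep the rest for TRAINING
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print(X_train.shape, X_test.shape)  # (70, 5) (30, 5)
```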

Should be easy. (stupid last words)

kai

Posted September 13th, 2015 | Ramblings of a Researcher

Normalisation is abnormal

(sitting at AIMS)

From 10 am till 2 pm, this is what I built. Argh! Should not have been so hard.

def fx_normalize(self, array):
    norm = []
    array_min = np.min(array)
    array_max = np.max(array)

    for col in range(1, len(array) + 1):
        n = float((array[col - 1] - array_min) / (array_max - array_min))
        norm = np.append(norm, n)

    return norm
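With hindsight, NumPy broadcasting collapses the loop into a single expression — an equivalent sketch (a guard for a constant array would still be needed, since max == min divides by zero):

```python
import numpy as np

def fx_normalize_vectorized(array):
    '''Scale a 1D array into [0, 1] via broadcasting.'''
    array = np.asarray(array, dtype=float)
    return (array - array.min()) / (array.max() - array.min())

print(fx_normalize_vectorized([2, 4, 6, 10]))
```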

… but now, my function appears completely different, the curve of the line gone. Is it just the scale? Yes, further testing confirms this. Good.

Now I will return to my work with Accuracy, Precision, and Recall.

Posted September 10th, 2015 | Ramblings of a Researcher

Frustration in the easy things

(sitting at SAAO)

Per my work at SKA yesterday, I learned to use matplotlib to produce 3D plots of my functions in combination with a scatterplot of the Iris features. I was hoping to automate the solving for any given variable using SymPy, but have not found a means to that end. For now, I will manually rearrange each algebraic expression. Not ideal, but I need to move ahead; I have spent too much time on this already.
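A 3D scatter of two features against a function of them takes only a few lines of matplotlib — a generic sketch (the feature names and function are placeholders, not the actual SKA plotting code):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen, no display required
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: registers the 3d projection

np.random.seed(0)
a = np.random.rand(50)  # stand-in feature 1
b = np.random.rand(50)  # stand-in feature 2
z = a**2 + b            # stand-in for an evolved expression f(a, b)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(a, b, z, c=z)
ax.set_xlabel('feature a')
ax.set_ylabel('feature b')
ax.set_zlabel('f(a, b)')
fig.savefig('scatter3d.png')
```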

I recognise that plotting is core to any modern research. I feel far behind, but know I will come up to speed quickly. Between the older gnuplot, matplotlib, and SymPy's plot functions, there are myriad approaches (too many, in fact).

As I have many times experienced over the past year, every day I am humbled by the challenge of learning something new, and at the same time rewarded by the same. Each day feels incremental, but when I look back from my very first line of Python a little over a year ago to now, over 2500 lines of object-oriented, multi-core code with a home-built NumPy array management system, yeah, I've learned a ton.

Posted September 9th, 2015 | Ramblings of a Researcher

My brain barrier

I pushed too hard, tried to do too many things at once. I felt the barrier coming, but ignored it, trying to make one more breakthrough in my code.

Last night and today I am learning how to plot multi-dimensional data. In practice, this is very simple. But this has been a real struggle for me.

I promised myself that in this Masters I would not take any shortcuts, that what I learn would be fully integrated into my understanding. Due to my lack of a mathematical science foundation, some of the core principles of data reduction, statistical application, and multi-dimensional visualisation are totally new to me.

What’s more, my anxiety literally heats my brain to a point of dysfunction; the back of my neck feels like someone is pouring hot syrup down my spine (since 2011). When I feel stupid, when I can’t figure something out, and when the pressure is on … I spiral.

I feel as though I lost a full day (which I can’t afford to do).

On the walk to the train station I asked Arun a number of questions to clarify my understanding. The physical act of walking, gesturing with my arms and hands helped tremendously. Arun is always patient, and an excellent explainer. A few key concepts became clear and it fell into place for me.

Posted September 8th, 2015 | Ramblings of a Researcher

Return to Floating point hell

(sitting at SKA again)

No choice but to return to floating points again, as once in every several thousand trees, the stored fitness score has expanded beyond the set 4 decimal places, resulting in something like n.123456e (scientific notation).

I have searched on-line, dug deep into Stack Overflow, and asked my colleagues. No one has a means by which I can define a variable with a fixed number of decimal places (to the right of the decimal point) that stays that way. Ugh!

I therefore introduced a second round function in the Tournament method, to force the numbers back into spec. I have not since experienced the same error. This is not the ideal solution as the round function introduces its own issues, but for the sake of GP, it does the trick.
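The fix amounts to re-quantising each fitness score before comparison. Python's round() does this on binary floats; the decimal module makes the intent explicit — a sketch of the idea, not the actual Tournament code:

```python
from decimal import Decimal, ROUND_HALF_UP

def quantize_fitness(score):
    '''Clamp a fitness score to 4 decimal places, killing scientific notation.'''
    q = Decimal(str(score)).quantize(Decimal('0.0001'), rounding=ROUND_HALF_UP)
    return float(q)

print(quantize_fitness(0.123456789))  # 0.1235
print(quantize_fitness(1.23456e-05))  # 0.0
```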

I then approached the issue of plotting 2D and 3D data, but hit a major mental block. Like an old engine with leaky coolant, my brain overheated and I had a bad day. Sorry Arun, I was really grumpy.

Posted September 8th, 2015 | Ramblings of a Researcher

Flowers for Algebra

(sitting at SKA)

It is my intent to complete the classification test suite today, using the benchmark Iris dataset.

The Iris dataset offers 4 features (columns in a .csv) for each of 3 unique plant species, as labelled in the right-most (5th) column. As 2 of these species are not discernible from each other with a linear function, there are 2 ways to approach this problem:

  1. Compare only 2 species at a time (in a round-robin) such that we work with A/B, A/C, and finally B/C; or
  2. Seek a non-linear function with a kernel which supports more than 2 classes at a time
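The round-robin in option 1 can be generated mechanically — a sketch using itertools, with the species names from the Iris dataset:

```python
from itertools import combinations

species = ['setosa', 'versicolor', 'virginica']

# Every unordered pair: the A/B, A/C, and B/C comparisons
pairs = list(combinations(species, 2))
print(pairs)
# [('setosa', 'versicolor'), ('setosa', 'virginica'), ('versicolor', 'virginica')]
```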


As Karoo GP supports only 2 classes at this time, I am going to work with the first option, and come back to the second when I have time to build a new kernel.

If it is our intent to plot the results, we can engage only 3 dimensions of data at once (without forcing higher dimensions into lower orders). However, the Iris dataset is composed of 4 features (a, b, c and d). As whatever holds true in lower dimensions is retained in higher dimensions, it is safe to test a lower order of the featureset, as long as the features we select are decisive in the identification of the flower species. As such, we remove column d from all 3 comparisons.

This resulted in what appears to be a fully functional classification by Karoo GP, with both linear and non-linear functions that produce 100% (50/50) correct classifications.

Posted September 7th, 2015 | Ramblings of a Researcher

Kepler’s Law resolved with GP!

Kepler's Law resolved (Karoo GP, Kai Staats)

(Kepler’s 3rd Law of Planetary Motion table by the Physics Classroom)

(continued from Premature convergence)

Working from AIMS and my apartment these past two days, I was able to resolve a persistent floating point issue by employing a round function before the fitness evaluation.

I also fixed the minimisation function with the discovery of 2 copy/pasted lines of code I had apparently failed to come back to. It appears this has not been working for some time, as I have been focused on other aspects of the code.

Finally! Just like that! Karoo GP now resolves Kepler’s 3rd Law of Planetary Motion! YES!!!

I ran it with the default Depth 3 and minimum node count of 3 and again, it came up with t/t = 1. So I ran it again with Depth 5 [2^(d+1)-1 = 63 possible nodes] and minimum node count of 9.

While it struggled for the first 5-6 generations, converging on what appeared to be 1 again, some mutation gave it the correct answer and in just 3 generations the correct trees dominated! Coooool!

{1: t**2/r**3,
2: t**2/r**3,
3: t**2/r**3,

88: t**2/r**3,
92: t**2/r**3,
94: t**2/r**3}
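That evolved expression can be sanity-checked against real orbital data: with the period t in Earth years and the semi-major axis r in AU, Kepler's 3rd Law says t²/r³ should come out near 1 for every planet:

```python
# Orbital period t (Earth years) and semi-major axis r (AU)
planets = {
    'Earth':   (1.000, 1.000),
    'Mars':    (1.881, 1.524),
    'Jupiter': (11.862, 5.203),
}

for name, (t, r) in planets.items():
    print(name, t**2 / r**3)  # ~1.0 for each planet
```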

This proves 2 of the 3 desired functions: regression maximisation (match) and regression minimisation. Only classification remains to prove Karoo GP roadworthy.

Posted September 6th, 2015 | Ramblings of a Researcher