A Summer Wasting

Feeling blue, like Belle & Sebastian’s song on Summer Wasting

Disclaimer: If you’re not literate in mathematical/engineering colloquialism, skip to the last 2 paragraphs and just enjoy Belle & Sebastian.  

Although I most certainly do not regret participating in the excitement of the last few days during the unscheduled HOPE conference, it has come at a cost. This summer I’ve been working on submitting papers for SEAL later this year in India. The goal was to send at least 2-3 papers, so that it could be justifiable for the university to send me such a long way from home. I managed to finish a paper on surrogate ranking using ordinal support vector machines and a genetic algorithm, within the original deadline mind you.

Abstract: Surrogate ranking in evolutionary computation using ordinal regression is introduced. The fitness of individual points is indirectly estimated by modeling their rank. The aim is to reduce the number of costly fitness evaluations needed for evolution. The ordinal regression, or preference learning, implements a kernel-defined feature space and an optimization technique by which the margin between rank boundaries is maximized.  The technique is illustrated on a classical numerical optimization functions using an evolution strategy. The benefits of surrogate ranking, compared to surrogates that model the fitness function directly, are discussed. 

My contribution to the paper was on model improvement, e.g. exploring what points are best to omit from the training set; and checking how often the surrogate model needed to be updated, generationwise. The most interesting part of the improvement process was that only potential candidate points (mu best) needed to be correctly ranked, not the entire population (all lambda), saving considerable computation on the long run. Technically the paper was a proof of concept and the surrogate SVM-model calls were more expensive than the simple Rosenbrock function calls. But in theory, the model worked and outperformed what had been done before. Hence, happy Holy.

The other papers weren’t as straight forward. The idea was to find good feasible (preferably optimal) schedules for job shop scheduling problem (JSSP) and/or assembly line balancing problem (ALBP) using hyper-heuristics. There exist more than 100 “good” scheduling policies for online implementation for each, making the problems NP-hard. The trick is deciding which polices work well with a particular data distribution. How are ties broken? Are they a combination of policies? Are they constant with respect to time? If one has no underlying knowledge of the problem domain (like this little programmer) it is too exhaustive to check every single combination of policies using a trial-and-error technique.

I started in March working on complete feasible schedules, i.e. offline implementation. I worked with a sequence representation of the schedule. Starting with random sequences and a (1,1)-ES random hill climber, the method almost always found the optimal makespan within 1,000 generations (there were some tough data sets it couldn’t solve, note to self don’t work on data sets uniformly distribution between 0 and 100). These were promising findings, alas, there wasn’t any natural comparison available. There is only one case (that I know of) that works solely with complete schedules and iteratively repairing them (Dietterich and Zhang 1995), but they worked on non-feasible, by starting with a relaxed problem, i.e., each job was scheduled as early as its temporal partial order would permit. Meaning resource constraints on the machines were initially ignored. Then the schedules would be repaired so the resource constraints were satisfied in the minimum amount of iterations. Similar, yes, but not quite comparable.

Finding good features for complete schedules also proved hard. There were hardly any natural transformation from the online policies to offline. In the end the only features seemed to be the distribution on how the processing matrix was set on the machines and the distribution of slacks between jobs. But with no comparison, this experiment will have to wait while. So it was back to the drawing board.
The new idea was to work solely on online implementation, at least there could be some comparison to what’s generally done in the literature. The features were similar to ‘typical heuristic policies’, i.e., for each job represented in the sequence the features were its processing time; its starting time; the time job will be released; and the new time machine will be free again. The general heuristic rule was to form an ordinal support-vector regression and the choose the job that had the feature set corresponding to the highest rank. Several models were simulated, all slightly different. The appropriate model was chosen by heuristically guided genetic algorithm (HGA). In HGA, each gene in a chromosome represents  which model should be used at each time step. Coded such that {1,…,K} denotes using the K SVM-models, but 0 denoting using the original heuristic model as described above. Each data set in the training set (#100) was run 30 times and each having max 200 generations [Read: very time consuming simulation]. I’m currently running it parallel on 4 processors on my desktop computer at my office in Tæknigarður, but it’s been 2 days and it’s only done half of the work.  A brilliant solution I had, since I didn’t have much time, was to run the program parallel on 20 nodes at my favorite cluster, Sól. I made the huge mistake of inserting save model.mat after all the simulations had finished instead of on the fly after each data set. This wouldn’t have been a problem if there hadn’t been a segmentation fault in the very end. Never have the words

??? Attempted to access P(201,:); index out of bounds because size(P)=[200,36]

made me want to cry as much. Hours waiting for a simulation wasted! Not exactly the ideal situation on the last day of the extended deadline. So close, yet so far.

Now there is nothing left to do, except to face the fact I’m not going to India in December and start googling what conferences are left of this year that don’t have an expired submission deadline.

Steering away from academia, once I came to terms with these devastating news, I realized the day couldn’t possible get any worse so I decided to jump of the deep end and do something I wanted to, but wouldn’t generally dare. I asked someone out on a date. He said yes. We’re going out tomorrow. Hence, happy again.


Tags: , ,

About tungufoss

a PhD student that sews whilst her code compiles...

4 responses to “A Summer Wasting”

  1. tungufoss says :

    Here's hoping ISDA'10 will be the answer to my conference prayers this year.

  2. Birna says :

    Ég ætla ekki að kommenta á ensku, finnst það kjánó :) En hvað heldurðu að margir hafi lesið þetta blogg? Ég reyndar renndi yfir það allt, en ég stórefa að margir hafi gert slíkt hið sama :)

  3. tungufoss says :

    Ég gerði ráð fyrir að það væri því að það væri ekki vel tekið í tæknilegheitin. Þess vegna setti ég viðvörina í upphafi.

Trackbacks / Pingbacks

  1. Resurrection (Penguin Prison Remix) « tungufoss - September 15, 2011

Your two cents...

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: