My New Genetic Algorithm For Time Series


I developed a new algorithm for time series forecasting. It is basically an elimination algorithm that finds the points which best fit both the dataset as a whole and its final points. Then, based on that assumption, the gene with the lowest error on the last known points is selected for forecasting.

Main Idea

Here is the main idea: the algorithm looks for the gene whose first n values have the minimum error against the last n observed data points. Each gene holds 2*n randomly generated values, and the assumption is that if the first n values fit well, the remaining n values will also fit well.
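To make the idea concrete, here is a minimal conceptual sketch (with illustrative names that are not part of the article's code): a gene is scored only on its first n values against the last n observed points, and its second half becomes the forecast if it wins.

from sklearn.metrics import mean_absolute_error

def gene_score(gene, last_n_observed):
    # error of the first n gene values against the last n observed points
    n = len(last_n_observed)
    return mean_absolute_error(gene[:n], last_n_observed)

def gene_forecast(gene, n):
    # the second half of a 2*n-value gene is used as the forecast
    return gene[n:2*n]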

I will walk through the algorithm step by step.

First, you can find the dataset here:

These are the libraries I used:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import metrics
from scipy import stats
from scipy import spatial
import time
from sklearn.metrics import mean_absolute_error
import random
import math

I mainly use NumPy and pandas.

data = pd.read_csv('monthlyBeer.csv')

I load the data into the data variable.

First, I rename the columns so they are easier to work with.

data.columns = ['Month','beerProduction']

Next, I create the variables used to build the population. Before that, I fill any NA values:

data = data.bfill()

Then come the variables for the population:

beerProduction_mean = data.beerProduction.mean()
beerProduction_DiffMean = data.beerProduction.diff().mean()
# square root of the variance of the differences, i.e. their standard deviation
beerProduction_Diff_Var = (np.var(data.beerProduction.diff()))**(1/2)
nDay = 60

In particular, the nDay variable is used by every function in the genetic algorithm system.

Justify Function

This function adjusts the past data so it becomes similar to the last data points. I do this because the system learns from past data. You can change it to use only the last n points if you prefer.

def justice_data(dataFrame, Series, day_range):
    for i in range(dataFrame.shape[0] - day_range*4):
        # mean of the most recent 2*day_range points
        values_mean = Series[dataFrame.shape[0] - day_range*2:dataFrame.shape[0]].values.mean()
        # shift each earlier window so its mean matches the recent mean
        Series[i:i + day_range*2] = Series[i:i + day_range*2] + (values_mean - Series[i:i + day_range*2].mean())

    return dataFrame

Here I simply shift the other data points by the difference between their mean and the mean of the last points, as illustrated below.
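As a toy illustration (made-up numbers, not the article's data), this is what shifting one window toward the mean of the most recent window looks like:

import numpy as np

window = np.array([90.0, 100.0, 110.0])           # an older window, mean 100
recent_mean = 150.0                               # mean of the last points
shifted = window + (recent_mean - window.mean())  # shift so the mean becomes 150
print(shifted)                                    # [140. 150. 160.]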

Population Creation

This function creates genes whose values are drawn uniformly around the series mean, within a range set by the mean of the differences.

def createPopulation(adet, day_range, mean, diffmean):
    Population = []
    for i in range(adet):
        gen = []
        # each gene holds 2*day_range values drawn uniformly around the series mean
        for j in range(day_range*2):
            gen.append(random.randint(int(mean - diffmean) - 1, int(mean + diffmean) + 1))
        Population.append(gen)

    return Population

adet (Turkish for "count") is the population size; I use 2000 and hold the genes in a Python list.
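The pool itself is referenced later as genHavuzu, but its creation call is not shown in the article; presumably it looks something like this:

# Assumed call (not shown explicitly in the article): build the initial pool
# of 2000 genes, each holding 2*nDay values.
genHavuzu = createPopulation(2000, nDay, beerProduction_mean, beerProduction_DiffMean)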

Training the Population

Before any mutation or crossover, I select the genes that fit the data windows best according to the MAE score.

def train(dataFrame, Series, GenHavuzu, day_range):
    seçili_genler = []

    # slide a 2*day_range window over the data and record every gene that
    # improves on the running minimum MAE for that window
    for i in range(dataFrame.shape[0] - day_range*2):
        values = Series[i:i + day_range*2].values
        min_mae = mean_absolute_error(GenHavuzu[0], values)
        for gen in GenHavuzu:
            mae = mean_absolute_error(gen, values)
            if mae < min_mae:
                min_mae = mae
                seçili_genler.append([GenHavuzu.index(gen), i])

    return seçili_genler

After defining the functions, I adjust the data:

for i in range(100):
    data_train = justice_data(data, data.beerProduction, nDay)

Note that because justice_data modifies the series in place and returns the same DataFrame, data and data_train end up being the same object. You can see the result by plotting:

plt.plot(data_train.beerProduction,color = 'green')
plt.plot(data.beerProduction,color ='red')
plt.ylabel('simulation result of ratios')
plt.show()

I train a pool of 2000 genes for the later steps. Genes that pass the elimination are appended to the pool, so in the end I will select the genes whose index is above 2000.
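The selection result is used below under the name seçili_Genler; the call that produces it is not shown in the article, but presumably it is something like:

# Assumed call (not shown in the article): run the selection step over the pool.
seçili_Genler = train(data, data.beerProduction, genHavuzu, nDay)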

topList = []
value = 0
i = 0
j = 0
# for each window index i, keep the last (best) gene index recorded for it
while i <= len(seçili_Genler):
    try:
        if seçili_Genler[j][1] == i:
            value = seçili_Genler[j][0]
            j += 1
        else:
            topList.append(value)
            print(value)
            i += 1
    except IndexError:
        break

This step collects, for each data window, the index of the best-fitting gene into topList.

For the probability part, I store the values like this:

öncelikliListe, öncelikliListe_counts = np.unique(topList,return_counts=True)
modifiedGen = []
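To see how these counts turn into selection probabilities, here is a toy example (illustrative values, not the article's data):

import numpy as np

topList_example = [3, 3, 7, 3, 12, 7]         # winning gene indices
idx, counts = np.unique(topList_example, return_counts=True)
probs = counts / counts.sum()                 # weights used by np.random.choice
print(idx)    # [ 3  7 12]
print(probs)  # [0.5        0.33333333 0.16666667]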

Then the crossover function comes:

def crossover(Series, day_range):
    global data
    global öncelikliListe
    global öncelikliListe_counts
    global genHavuzu
    global topList
    global modifiedGen
    genHavuzu = np.array(genHavuzu)
    for i in range(data.shape[0] - day_range*4, data.shape[0] - day_range*2):
        öncelikliListe, öncelikliListe_counts = np.unique(topList, return_counts=True)
        values = Series[i:i + day_range*2].values
        run = True
        batch_threshold = 20
        batch = 0
        while run:
            if batch >= batch_threshold:
                run = False
            # pick four parent genes, weighted by how often each was selected
            genHavuzu_Selected = np.random.choice(öncelikliListe, 4, p=öncelikliListe_counts/sum(öncelikliListe_counts))
            oldGen_1_15 = np.random.choice(genHavuzu[genHavuzu_Selected[0]-1], int(day_range/2))
            oldGen_2_15 = np.random.choice(genHavuzu[genHavuzu_Selected[1]-1], int(day_range/2))
            oldGen_3_15 = np.random.choice(genHavuzu[genHavuzu_Selected[2]-1], int(day_range/2))
            oldGen_4_15 = np.random.choice(genHavuzu[genHavuzu_Selected[3]-1], int(day_range/2))
            # child gene: four randomly drawn pieces concatenated to full gene length
            modifiedGen = np.concatenate((oldGen_1_15, oldGen_2_15, oldGen_3_15, oldGen_4_15), axis=None)
            target = mean_absolute_error(modifiedGen, values)
            val_1 = genHavuzu[genHavuzu_Selected[0]-1]
            val_2 = genHavuzu[genHavuzu_Selected[1]-1]
            val_3 = genHavuzu[genHavuzu_Selected[2]-1]
            val_4 = genHavuzu[genHavuzu_Selected[3]-1]
            thr_1 = mean_absolute_error(val_1, values)
            thr_2 = mean_absolute_error(val_2, values)
            thr_3 = mean_absolute_error(val_3, values)
            thr_4 = mean_absolute_error(val_4, values)
            # keep the child only if it beats all four parents on this window
            if target < thr_1 and target < thr_2 and target < thr_3 and target < thr_4:
                print("Completed")
                genHavuzu = np.vstack((genHavuzu, modifiedGen))
                topList.append(len(genHavuzu))
            batch += 1

Here I select four genes from genHavuzu ("gene pool" in Turkish) using the probability weights and splice random pieces of them together; each piece holds day_range/2 values, so the four pieces together match the 2*day_range gene length. If the new gene beats all four parents, I add it to the gene pool.

crossover(data.beerProduction,nDay)

Lastly, the priority list is updated again:

öncelikliListe, öncelikliListe_counts = np.unique(topList,return_counts=True)
modifiedGen = []

Mutation Function

This function mutates a gene: it picks 10 of its values at random and replaces every position holding one of those values with a value drawn from the series.

def mutation_gen(gen, Series):
    # pick 10 values from the gene; every position holding one of them
    # is replaced by a random value drawn from the series
    x = np.random.choice(gen, 10)
    for i in range(len(gen)):
        if gen[i] in x:
            gen[i] = np.random.choice(Series.values)

    return gen

It selects randomly from the series values.
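For example, a single gene from the pool can be mutated like this (a copy is passed in, so the pool itself is not changed):

# Example usage: mutate a copy of the first gene in the pool.
mutated = mutation_gen(list(genHavuzu[0]), data.beerProduction)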

Modification Function

def modification(Series, day_range):
    global data
    global öncelikliListe
    global öncelikliListe_counts
    global genHavuzu
    global topList
    global modifiedGen
    mutated_kromozom = np.zeros(day_range*2)
    nonMutated_kromozom = np.zeros(day_range*2)
    genHavuzu = np.array(genHavuzu)
    for i in range(data.shape[0] - day_range*4, data.shape[0] - day_range*2):
        öncelikliListe, öncelikliListe_counts = np.unique(topList, return_counts=True)
        values = Series[i:i + day_range*2].values
        run = True
        batch_threshold = 100
        batch = 0
        while run:
            if batch >= batch_threshold:
                run = False
            # pick one gene, weighted by selection frequency, and mutate a copy of it
            genHavuzu_Selected = np.random.choice(öncelikliListe, 1, p=öncelikliListe_counts/sum(öncelikliListe_counts))
            mutated_kromozom = mutation_gen(list(genHavuzu[genHavuzu_Selected[0]-1]), Series)
            nonMutated_kromozom = genHavuzu[genHavuzu_Selected[0]-1]
            thr_1 = mean_absolute_error(mutated_kromozom, values)
            org_1 = mean_absolute_error(nonMutated_kromozom, values)
            batch += 1
            # keep the mutant only if it beats the original gene on this window
            if thr_1 < org_1:
                print("Completed")
                genHavuzu = np.vstack((genHavuzu, mutated_kromozom))
                topList.append(len(genHavuzu))

However, I could not properly update öncelikliListe with the new priorities here, so this step still tends to select the genes that survived the first elimination.

modification(data.beerProduction,nDay)

Then I take the genes added after the initial pool, i.e. everything beyond the first 2000:

seçilmişGenler = genHavuzu[2000:len(genHavuzu)]  # seçilmişGenler means "selected genes"

Maximum Fit Selection

This function selects the minimum-error gene against the last n data points, to be used for the final prediction.

def select_the_max(Series, day_range):
    global seçilmişGenler
    seçilmişGenler = list(seçilmişGenler)
    target = Series[data.shape[0] - day_range:data.shape[0]]
    # start with the first selected gene as the current best
    min_mae = mean_absolute_error(seçilmişGenler[0][0:day_range], target)
    lastGen = seçilmişGenler[0]
    for gen in seçilmişGenler:
        mae = mean_absolute_error(gen[0:day_range], target)
        if mae < min_mae:
            min_mae = mae
            lastGen = gen
    return lastGen, min_mae

I store the selected gene in the sonGen variable and its error in Hata ("error" in Turkish):

sonGen,Hata = select_the_max(data.beerProduction,nDay)

Last Value Modification

The selected gene is modified further whenever a mutation fits the data better. Again, all 2*n values of the gene are mutated, but only the first n are checked against the data.

def sonDegerModifikasyon(Series, day_range):
    global sonGen
    Last_gen = np.zeros(day_range*2)
    target = Series[data.shape[0] - day_range:data.shape[0]]
    for i in range(100000):
        # mutate a copy of the best gene; keep the best mutant found so far,
        # judged only on the first day_range values (those aligned with real data)
        mutated_kromozom = mutation_gen(list(sonGen), Series)
        val_1 = sonGen[0:day_range]
        org_1 = mean_absolute_error(val_1, target)
        thr_1 = mean_absolute_error(mutated_kromozom[0:day_range], target)
        if thr_1 < org_1:
            if mean_absolute_error(mutated_kromozom[0:day_range], target) < mean_absolute_error(Last_gen[0:day_range], target):
                Last_gen = mutated_kromozom
                print("Completed")

    return Last_gen

Last Eliminated Gen with Modification

The final gene is produced by this last modification step:

Son_elenmiş_gen = sonDegerModifikasyon(data.beerProduction,nDay)

Plot of the Fitted Last Values

plt.plot(Son_elenmiş_gen,color = 'green')
plt.plot(data['beerProduction'].values[data.shape[0]-nDay:data.shape[0]],color ='red')
plt.ylabel('simulation result of ratios')
plt.show()

The MAE over the last n data points is:

target = data.beerProduction[data.shape[0]-nDay:data.shape[0]]
val_1 = Son_elenmiş_gen[0:nDay]
org_1 = mean_absolute_error(val_1,target)
org_1
12.7552167839006

So the algorithm does a good job of fitting the past data to the last n points.

Last Words

There is actually one step further: testing the forecast against actual data, i.e. holding out the last n data points as a validation set and measuring the real error there. I leave this to you; a rough sketch is given below. Mail me or comment here if you find anything significant.
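As a sketch of what that validation could look like (the helper name run_pipeline is hypothetical and not defined in this article), one could hold out the last nDay points, rerun the whole pipeline on the rest, and score the forecast half of the resulting gene:

# Hedged sketch with assumed names: run_pipeline is a hypothetical function that
# repeats all the steps above on the shortened data and returns the final gene.
def validate_last_window(data, nDay, run_pipeline):
    holdout = data.beerProduction.values[-nDay:]        # the actual "future" values
    final_gene = run_pipeline(data.iloc[:-nDay], nDay)  # gene of length 2*nDay
    forecast = final_gene[nDay:2*nDay]                  # second half = forecast
    return mean_absolute_error(holdout, forecast)

Thanks for reading.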
