Tutorial
========

For an explanation of the algorithm and motivation behind **leabra7**, see
Prof. Randy O'Reilly's book `Computational Cognitive Neuroscience `_. This
tutorial assumes general familiarity with the concepts in that book, and
focuses on the mechanics of using the **leabra7** package.

**Contents:**

.. contents::
    :local:

Two neurons
-----------

The simplest **leabra7** network consists of two neurons, with one neuron
driving the other. First, import the necessary libraries:

.. code-block:: python

    import matplotlib.pyplot as plt
    import pandas as pd
    import leabra7 as lb
    from typing import *

Now, we can create an empty network:

.. code-block:: python

    net = lb.Net()

We can now add two layers, each with one neuron:

.. code-block:: python

    layer_spec = lb.LayerSpec(
        log_on_cycle=("unit_v_m", "unit_act", "unit_i_net", "unit_net",
                      "unit_gc_i", "unit_adapt", "unit_spike"))
    net.new_layer(name="input", size=1, spec=layer_spec)
    net.new_layer(name="output", size=1, spec=layer_spec)

Each layer gets a spec ("specification") object which contains simulation
parameters. In this case, the :code:`log_on_cycle` parameter tells the network
to record each of the following attributes every cycle:

- :code:`"unit_v_m"`: The membrane potential of each neuron
- :code:`"unit_act"`: The activation of each neuron
- :code:`"unit_i_net"`: The net current into each neuron
- :code:`"unit_net"`: The net input into each neuron
- :code:`"unit_gc_i"`: The inhibitory current in each neuron
- :code:`"unit_adapt"`: The adaptation current in each neuron
- :code:`"unit_spike"`: Whether the neuron is spiking

Spec objects can be used to specify many more parameters; see :doc:`specs` for
more information.

Connect the two layers with a projection:

.. code-block:: python

    net.new_projn(name="proj1", pre="input", post="output")

Before cycling the network, we will force the input layer to have an
activation of :code:`1`:

.. code-block:: python

    net.clamp_layer(name="input", acts=[1])

The :code:`acts` parameter is the sequence of activations that the input layer
will be clamped to. It does not have to specify a value for every unit, since
shorter patterns will be tiled. Now, we can cycle the network:

.. code-block:: python

    for _ in range(200):
        net.cycle()

Let's retrieve the network logs:

.. code-block:: python

    whole_log, part_log = net.logs(freq="cycle", name="output")

The returned logs are simply Pandas dataframes. Note that there are two kinds
of logs: whole logs and part logs. Whole logs contain attributes that pertain
to the whole object, for example, the average activation of a layer. Part logs
contain attributes that pertain to the parts of an object. Since units are
parts of a layer, and we logged only unit attributes, we will work with the
part log.

We can plot the Pandas dataframe:

.. code-block:: python

    part_log = part_log.drop(columns=["unit"])
    ax = part_log.plot(x="time")
    ax.set_xlabel("Cycle")

.. image:: img/two_neurons.png
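Both logs are ordinary Pandas dataframes, so you can also inspect them
directly with the usual tools. The sketch below is purely illustrative and not
part of the example above; exactly which columns appear depends on what you
asked :code:`log_on_cycle` to record, and the whole log will likely contain
little here, since we logged only unit attributes:

.. code-block:: python

    # Peek at the raw logs returned by net.logs(). Each row of the part
    # log describes one unit on one cycle; the columns hold the logged
    # attributes plus bookkeeping such as the cycle time.
    print(whole_log.head())
    print(part_log.head())
    print(part_log.columns.tolist())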
Pattern association
-------------------

Now, we will try a slightly more complicated task: training a network to
recognize a simple set of patterns. First, as usual, we will import the
necessary libraries. We will also configure a project name and set up Python's
logging library to timestamp all of our logs (this is useful for tracking the
progress of training):

.. code-block:: python

    import logging
    import sys
    from typing import *

    import numpy as np
    import pandas as pd

    import leabra7 as lb

    PROJ_NAME = "pat_assoc"

    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(message)s",
        handlers=(logging.FileHandler(
            "{0}_log.txt".format(PROJ_NAME), mode="w"),
                  logging.StreamHandler(sys.stdout)))

With this housekeeping out of the way, we can start to write the actual
network training code. We will break down the task of training a network into
a series of functions that can be modified for any general network training
task.

First, define a function to load and preprocess the input and output data:

.. code-block:: python

    def load_data() -> Tuple[np.ndarray, np.ndarray]:
        """Loads and preprocesses the data.

        Returns:
          An (X, Y) tuple containing the features and labels, respectively.

        """
        X = np.array([
            [1, 1, 1, 0],
            [0, 1, 1, 1],
            [0, 1, 0, 1],
            [0, 1, 0],
        ])
        Y = np.array([
            [1, 0],
            [1, 0],
            [0, 1],
            [0, 1]
        ])
        return (X, Y)

In this case, we are loading the following pattern associations:

.. image:: img/pat_assoc_patterns.png
    :align: center
    :width: 200px

We follow scikit-learn's convention of using the array shape
:code:`[n_samples, n_features]`.

Now, we will define a function to construct the network. We will use a simple
feedforward architecture with one input layer and one output layer.

.. code-block:: python

    def build_network() -> lb.Net:
        """Builds the classifier network.

        Returns:
          A leabra7 network for classification.

        """
        logging.info("Building network")
        net = lb.Net()

        # Layers
        layer_spec = lb.LayerSpec(
            gi=1.5,
            ff=1,
            fb=0.5,
            fb_dt=0.7,
            unit_spec=lb.UnitSpec(
                adapt_dt=0,
                vm_gain=0,
                spike_gain=0,
                ss_dt=1,
                s_dt=0.2,
                m_dt=0.15,
                l_dn_dt=0.4,
                l_up_inc=0.15,
                vm_dt=0.3,
                net_dt=0.7))
        net.new_layer("input", size=4, spec=layer_spec)
        net.new_layer("output", size=2, spec=layer_spec)

        # Projections
        spec = lb.ProjnSpec(
            lrate=0.02,
            dist=lb.Uniform(0.25, 0.75),
            thr_l_mix=0.01,
            cos_diff_lrate=False)
        net.new_projn("input_to_output", pre="input", post="output", spec=spec)

        return net

In this case, we have tweaked many of the parameters. This is not strictly
necessary, but it helps this specific network train faster.

Now, we will define a function to run a network *trial*, consisting of a minus
phase and a plus phase. In the minus phase, the input pattern will be clamped
to the network's input layer, and the network will be cycled. In the plus
phase, both the input and the output patterns will be clamped to their
respective layers, and the network will again be cycled. The difference
between these two phases (expectation and actual outcome) will drive learning.

.. code-block:: python

    def trial(network: lb.Net, input_pattern: Iterable[float],
              output_pattern: Iterable[float]) -> None:
        """Runs a trial.

        Args:
          network: The network to run the trial on.
          input_pattern: The pattern to clamp to the network's input layer.
          output_pattern: The pattern to clamp to the network's output layer.

        """
        network.clamp_layer("input", input_pattern)
        network.minus_phase_cycle(num_cycles=50)
        network.clamp_layer("output", output_pattern)
        network.plus_phase_cycle(num_cycles=25)
        network.unclamp_layer("input")
        network.unclamp_layer("output")
        network.learn()
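If you want to convince yourself that these pieces fit together before writing
the full training loop, you can run a single trial by hand. This is just an
illustrative one-off check, not part of the training script:

.. code-block:: python

    # Run one minus/plus phase pass on the first (input, output) pair.
    X, Y = load_data()
    net = build_network()
    trial(net, X[0], Y[0])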
The next function we will define runs an *epoch*, where one trial is run for
each pattern in the dataset:

.. code-block:: python

    def epoch(network: lb.Net, input_patterns: np.ndarray,
              output_patterns: np.ndarray) -> None:
        """Runs an epoch (one pass through the whole dataset).

        Args:
          network: The network to train.
          input_patterns: A numpy array with shape (n_samples, n_features).
          output_patterns: A numpy array with shape (n_samples, n_features).

        """
        for x, y in zip(input_patterns, output_patterns):
            trial(network, x, y)
        network.end_epoch()

With these building blocks in place, we can write the main training function:

.. code-block:: python

    def train(network: lb.Net,
              input_patterns: np.ndarray,
              output_patterns: np.ndarray,
              num_epochs: int = 3000) -> pd.DataFrame:
        """Trains the network.

        Args:
          input_patterns: A numpy array with shape (n_samples, n_features).
          output_patterns: A numpy array with shape (n_samples, n_features).
          num_epochs: The maximum number of epochs to run. Defaults to 3000.

        Returns:
          pd.DataFrame: A dataframe of metrics from the training run.

        """
        logging.info("Begin training")
        data: Dict[str, List[float]] = {
            "epoch": [],
            "train_loss": [],
        }
        perfect_epochs = 0
        for i in range(num_epochs):
            epoch(network, input_patterns, output_patterns)
            pred = predict(network, input_patterns)
            data["epoch"].append(i)
            data["train_loss"].append(mse_thresh(output_patterns, pred))
            logging.info("Epoch %d/%d. Train loss: %.4f", i, num_epochs,
                         data["train_loss"][-1])

            if data["train_loss"][-1] == 0:
                perfect_epochs += 1
            else:
                perfect_epochs = 0
            if perfect_epochs == 3:
                logging.info("Ending training after %d perfect epochs.",
                             perfect_epochs)
                break

        return pd.DataFrame(data)

This training function runs epochs until we get three perfect epochs in a row,
or we hit 3000 epochs. It also records the training loss for each epoch, and
returns the recorded metrics at the end of training.

The training function makes use of two housekeeping functions which we will
now define:

1. The thresholded mean-squared-error loss function, which measures the
   performance of the network.
2. The prediction function, which calculates predictions for an array of
   input patterns.

First, let's define the thresholded MSE loss function:

.. code-block:: python

    def mse_thresh(expected: np.ndarray, actual: np.ndarray) -> float:
        """Calculates the thresholded mean squared error.

        Absolute errors smaller than 0.5 are treated as 0 before squaring
        (i.e., an output unit only counts as wrong if it is off by 0.5 or
        more).

        Args:
          expected: The expected output patterns, with shape
            [n_samples, n_features].
          actual: The actual output patterns, with shape
            [n_samples, n_features].

        Returns:
          The thresholded mean squared error.

        """
        diff = np.abs(expected - actual)
        diff[diff < 0.5] = 0
        return np.mean(diff * diff)

And now, the prediction function:

.. code-block:: python

    def output(network: lb.Net, pattern: Iterable[float]) -> List[float]:
        """Calculates the network's output for a single input pattern.

        Args:
          network: The trained network.
          pattern: The input pattern.

        Returns:
          The activations of the output layer after clamping the input
          pattern to the input layer and settling for 50 cycles.

        """
        network.clamp_layer("input", pattern)
        for _ in range(50):
            network.cycle()
        network.unclamp_layer("input")
        out = network.observe("output", "unit_act")["act"].values
        return list(out)


    def predict(network: lb.Net, input_patterns: np.ndarray) -> np.ndarray:
        """Calculates predictions for an array of input patterns.

        Args:
          network: The trained network.
          input_patterns: An array of shape (n_samples, n_features)
            containing the input patterns for which to calculate predictions.

        Returns:
          np.ndarray: An array of shape (n_samples, n_features) containing
            the predictions for the input patterns.

        """
        outputs = []
        for item in input_patterns:
            outputs.append(output(network, item))
        return np.array(outputs)
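Before moving on, it may help to see how the thresholding in
:code:`mse_thresh` behaves on some made-up numbers (the values below are
arbitrary and purely illustrative):

.. code-block:: python

    expected = np.array([[1.0, 0.0]])

    # Both absolute errors (0.3 and 0.4) are below 0.5, so they are zeroed
    # out and the loss is 0 -- the prediction counts as correct.
    print(mse_thresh(expected, np.array([[0.7, 0.4]])))  # 0.0

    # Both errors (0.8 and 0.9) exceed the threshold, so they are squared
    # and averaged: (0.8**2 + 0.9**2) / 2 = 0.725.
    print(mse_thresh(expected, np.array([[0.2, 0.9]])))  # 0.725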
That was a lot of work, but we can now train the network with a few lines of
code!

.. code-block:: python

    X, Y = load_data()
    net = build_network()
    metrics = train(net, X, Y)

    # Save metrics and network for future analysis
    metrics.to_csv("{0}_metrics.csv".format(PROJ_NAME), index=False)
    net.save("{0}_network.pkl".format(PROJ_NAME))

The script's output should look something like this:

.. code-block:: shell

    2018-09-06 22:07:50,202 INFO Begin training pat_assoc
    2018-09-06 22:07:50,202 INFO Building network
    2018-09-06 22:07:50,204 INFO Begin training
    2018-09-06 22:07:50,460 INFO Epoch 0/3000. Train loss: 0.6138
    2018-09-06 22:07:50,709 INFO Epoch 1/3000. Train loss: 0.5818
    2018-09-06 22:07:50,963 INFO Epoch 2/3000. Train loss: 0.5814
    2018-09-06 22:07:51,215 INFO Epoch 3/3000. Train loss: 0.5811
    ...
    2018-09-06 22:09:27,980 INFO Epoch 392/3000. Train loss: 0.0319
    2018-09-06 22:09:28,227 INFO Epoch 393/3000. Train loss: 0.0315
    2018-09-06 22:09:28,476 INFO Epoch 394/3000. Train loss: 0.0000
    2018-09-06 22:09:28,722 INFO Epoch 395/3000. Train loss: 0.0000
    2018-09-06 22:09:28,970 INFO Epoch 396/3000. Train loss: 0.0000
    2018-09-06 22:09:28,971 INFO Ending training after 3 perfect epochs.

At this point, the network has learned to recognize the desired patterns
perfectly. That is to say, when we clamp an input pattern to the input layer
and cycle the network, the correct output pattern will be reproduced on the
output layer.

The file :code:`pat_assoc_metrics.csv` can be further analyzed as you like,
and the network can be easily loaded from the :code:`pat_assoc_network.pkl`
file in the future. See the :doc:`network` reference for more information.
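As a small example of what that further analysis might look like, here is a
sketch that re-loads the metrics CSV with pandas and plots the training loss.
It uses only pandas and matplotlib; the column names match the :code:`data`
dictionary built in :code:`train` above:

.. code-block:: python

    import matplotlib.pyplot as plt
    import pandas as pd

    # Re-load the metrics saved by the training script and plot the
    # thresholded MSE loss over epochs.
    metrics = pd.read_csv("pat_assoc_metrics.csv")
    ax = metrics.plot(x="epoch", y="train_loss")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Thresholded MSE")
    plt.show()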