Methods

Subjects

Digitised experimental data were available for two normal subjects (ages 20 and 29) and six autistic subjects (ages 15, 26, 27, 32, 36, and 42). All subjects were male. All autistic subjects met the diagnostic criteria for infantile autism of DSM-III and DSM-III-R, had no additional psychiatric or neurological diagnoses, and used no psychotropic medication.

Experimental Paradigm

Raw EEG data were obtained from the shift-attention experiment of Courchesne et al. [1994]. In this paradigm, common (75%) and rare (25%) stimuli were presented in two modalities, auditory and visual. The auditory stimuli were 1kHz and 2kHz tones presented binaurally through headphones, and the visual stimuli were red and green filled squares subtending approximately 1.2deg of visual angle, presented on a CRT. The subject's task was to attend to only one modality at a time, and to press a button as quickly as possible whenever a rare stimulus occurred in that modality. In addition, the occurrence of a rare stimulus was a cue to shift attention to the other modality. So, for example, on hearing a rare tone while attending to the auditory modality, a correctly responding subject would press the button, then begin ignoring all auditory stimuli and waiting for a rare visual stimulus. Complementarily, on seeing a rare colour while attending to the visual modality, he would press the button, then begin ignoring all visual stimuli and waiting for a rare auditory stimulus. No behavioural response or change in attention was required in the case of a rare stimulus in the unattended modality. Stimuli were randomly ordered and 50ms in duration. The period between the end of one stimulus and the onset of the next varied randomly from 450ms to 1450ms in ten equal intervals.

Electrode placement followed the international 10-20 system. Recording sites included FPz, Fz, Cz, Pz, Oz, and a lower-eye electrode, though only Pz was used as input to the ANNs of this experiment. All electrodes were referred to the right mastoid. Impedances at all electrodes were below 10kOhm. Data were amplified by a factor of 20000 (Grass Model 12 Neurodata Acquisition System) with half-amplitude bandpass settings of 0.01s^-1 and 100s^-1, and digitised at a rate of 195s^-1 (sampling interval 5.12ms) at a resolution limit of at most 0.122uV. (Resolutions varied slightly across channels and between recording sessions, due to variation in amplifier sensitivities.) Amplifiers were calibrated at the beginning of each recording session with a sequence of at least 40 square pulses, +20uV in amplitude and 100ms in duration, sampled at mid-pulse. The median of the digitised values of these calibration pulses was used to compute a normalisation factor for amplifier sensitivity. Subjects were instructed to maintain gaze on a fixation point at the centre of the CRT, and compliance was monitored by closed-circuit television.
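The normalisation step can be sketched as follows. `calibration_factor` is a hypothetical helper (the original acquisition software is not described), comparing the median digitised pulse value against the known +20uV amplitude:

```python
import numpy as np

def calibration_factor(pulse_samples_uv, true_amplitude_uv=20.0):
    """Estimate an amplifier-sensitivity normalisation factor from
    calibration pulses sampled at mid-pulse (hypothetical helper).

    Multiplying subsequent samples by the returned factor normalises
    away sensitivity differences between channels and sessions.
    """
    samples = np.asarray(pulse_samples_uv, dtype=float)
    if samples.size < 40:
        raise ValueError("expected at least 40 calibration pulses")
    return true_amplitude_uv / np.median(samples)
```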

Scoring

When stimuli follow one another rapidly, response latency is not known a priori, and all responses occur via the same signal, scoring is necessarily somewhat arbitrary. We have chosen to follow the scoring procedure of the original investigators. According to that procedure, a button-press is scored as a correct response if and only if it occurs from 200ms to 1400ms after the onset of the cue to shift. The subject may begin the experiment attending to either modality, so the initial modality is established retrospectively by counting the first possibly correct response as correct. So if the subject's first response is to a tone, we score that response as correct and make the current modality visual, and, complementarily, if the subject's first response is to a square, we score that response as correct and make the current modality auditory.

Suppose that the subject misses a cue; that is, a rare stimulus occurs in the currently attended modality, but there is no button-press during the following 200-1400ms. The focus of the subject's attention becomes uncertain in this case, because we do not know whether he completely missed the cue or was simply too slow to react. When this happens, the current modality must be redetermined by the same retrospective procedure as is used at the outset of the experiment.
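The scoring rules above can be summarised in code. All names here are illustrative (the original scoring program is not available), and presses that fall in no cue's window are simply ignored:

```python
WINDOW = (200.0, 1400.0)  # ms after cue onset for a correct press

def score(cues, presses):
    """Score a run following the retrospective procedure described above.

    cues    -- time-ordered list of (onset_ms, modality) for rare stimuli,
               with modality in {"auditory", "visual"}.
    presses -- sorted list of button-press times in ms.
    Returns (onset_ms, "hit"/"miss") for each attended-modality cue.
    """
    def press_in_window(onset):
        return any(onset + WINDOW[0] <= t <= onset + WINDOW[1] for t in presses)

    attended = None          # unknown at the outset (and after a miss)
    results = []
    for onset, modality in cues:
        if attended is None:
            # Retrospective rule: the first possibly correct response
            # is counted as correct and establishes the modality, which
            # then shifts to the other one.
            if press_in_window(onset):
                results.append((onset, "hit"))
                attended = "visual" if modality == "auditory" else "auditory"
            continue
        if modality != attended:
            continue         # rare stimulus in the unattended modality: ignored
        if press_in_window(onset):
            results.append((onset, "hit"))
            attended = "visual" if attended == "auditory" else "auditory"
        else:
            results.append((onset, "miss"))
            attended = None  # focus of attention now uncertain
    return results
```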

Artefact Correction

Given the limited amount of data available, and the frequent occurrence of eyeblinks in most of the autistic subjects, it was necessary to remove blink artefacts so that all the data could be used. While certainly not the most conservative approach, correction rather than rejection of eye artefacts is often the only practical approach when dealing with subjects who are unable to control blinks adequately and who cannot keep still for long recording sessions. (Using a rather liberal 125uV range threshold on the eye channels, a rejection algorithm discarded more than half the trials for some autistic subjects!)

Correction in the time domain proved impractical, since peak-finding and thresholding rules were often confused by noise in the autistic subjects. Time-domain correction also tended to overcompensate for blinks, actually introducing a negative artefact in place of the positive blink component. This overcompensation in the time domain has been reported previously and may be due to differences in transfer between blinks and saccades [Weerts & Lang 1973]. Saccade artefacts arise from changes in orientation of the retinocorneal dipole, while blink artefacts can be attributed to the alteration in conductance arising from contact of the eyelid with the cornea [Overton & Shagass 1969]. Transfer of blink artefact decreases rapidly with distance from the eye, while transfer of saccade artefact decreases more slowly, so that the effect of saccades on potentials at the vertex is about double that of blinks [Overton & Shagass 1969].

While a frequency-domain filter is acausal, the effect of a blink correction on EEG earlier in the epoch is negligible (figure 0), and this slight drawback is outweighed by the computational ease of frequency-domain methods, their allowance for frequency-dependent transfer, and their robustness in the presence of noise.

Since there was no upper-eye electrode, the difference between the voltage at FPz and the voltage below the eye was used as an estimate of the VEOG. VEOG transfer to the EEG was then estimated by transforming both the VEOG and the EEG to the frequency domain and computing a complex correlation coefficient for each frequency in the transform [Woestenburg et al. 1983]. The entire continuous record was blocked into sequential epochs of 1.3s (256 samples). Each of these epochs was tested for possible amplifier saturation, as indicated by the persistence of a constant value on any channel for 20.48ms (four samples) or longer. Every epoch that passed this saturation test was thresholded at 75uV. VEOG epochs whose range exceeded this threshold were presumed to contain substantial eye artefact, so they were included (after Fourier transformation) in the correlation computation. This computation yielded a complex coefficient for each frequency which represented the transfer of EOG to EEG at that frequency:

r[nu] = sum_j EEG_j[nu] EOG_j[nu]* / sum_j |EOG_j[nu]|^2,

where the sums run over the retained epochs j and * denotes complex conjugation.
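As a concrete sketch of this computation (hypothetical array names; a least-squares reading of the per-frequency transfer, which may differ in detail from the original implementation):

```python
import numpy as np

def transfer_coefficients(eeg_epochs, veog_epochs, threshold_uv=75.0):
    """Estimate a complex EOG-to-EEG transfer coefficient per frequency,
    in the spirit of Woestenburg et al. [1983].

    Both inputs are (n_epochs, 256) arrays in uV, already screened for
    amplifier saturation.  Only epochs whose VEOG range exceeds the
    threshold -- those presumed to contain eye artefact -- contribute.
    """
    eeg = np.asarray(eeg_epochs, float)
    veog = np.asarray(veog_epochs, float)
    high = (veog.max(axis=1) - veog.min(axis=1)) > threshold_uv
    E = np.fft.rfft(eeg[high], axis=1)    # EEG spectra of artefact epochs
    V = np.fft.rfft(veog[high], axis=1)   # VEOG spectra of the same epochs
    # Least-squares transfer at each frequency: cross-power over EOG power.
    return (E * V.conj()).sum(axis=0) / (V * V.conj()).sum(axis=0).real
```

If the EEG really were a fixed fraction of the VEOG at every frequency, the estimate would recover that fraction exactly.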

Though it is common to address only the contamination of the EEG by the EOG, in attempting to correct for this contamination one must also consider the reverse direction of propagation: the vertical potential difference across the eye is actually a superposition of signal generated by the eye and signal generated by the brain. In our case, the original experimenters' use of FPz as one pole of the VEOG aggravated this problem. The effect of this contamination on the computed values of the correction factors r[nu] is minimised by thresholding the VEOG and including in the correlation computation only those epochs that contain high-voltage activity on the VEOG. But since the frequency range of blinks overlaps with the frequency range of brain potentials, this precaution does not prevent a fraction of the EEG from being filtered out when the r[nu] are applied [Verleger et al. 1982].

This excessive zeal of the blink filter can be compensated for by Wiener filtering of the EOG. Each VEOG epoch whose range fell below the 75uV threshold was Fourier-transformed, and the average power spectrum of these epochs was computed. This average spectrum of `quiet' VEOG, Q[nu], was presumed to reflect the contribution of brain potentials to the VEOG. For each epoch to be corrected, the single-trial VEOG power spectrum P[nu] was computed, and the average quiet spectrum was proper-subtracted from it to form a Wiener filter coefficient

W[nu] = max(0, P[nu] - Q[nu]) / P[nu].

The final corrected EEG thus becomes

EEG'[nu] = EEG[nu] - r[nu] W[nu] EOG[nu].
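The correction of a single epoch might then look as follows. Clipping the subtraction at zero is one reading of `proper-subtracted', and all names here are illustrative:

```python
import numpy as np

def correct_epoch(eeg, veog, r, quiet_power):
    """Apply the blink correction to one 256-sample epoch (a sketch).

    r           -- per-frequency complex transfer coefficient.
    quiet_power -- average power spectrum of sub-threshold (`quiet')
                   VEOG epochs, presumed to be brain activity.
    The Wiener coefficient scales the subtraction down at frequencies
    where the VEOG is dominated by brain rather than ocular activity.
    """
    E = np.fft.rfft(np.asarray(eeg, float))
    V = np.fft.rfft(np.asarray(veog, float))
    P = (V * V.conj()).real                        # single-trial VEOG power
    W = np.clip(P - quiet_power, 0.0, None) / np.where(P > 0, P, 1.0)
    corrected = E - r * W * V                      # EEG'[nu]
    return np.fft.irfft(corrected, n=len(eeg))
```

With a zero quiet spectrum the filter reduces to plain per-frequency regression subtraction, so an epoch built as brain-plus-scaled-VEOG is restored to the brain signal alone.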

After computation of the correlation coefficients and the average quiet EOG power spectrum, epochs time-locked to stimulus presentation were created from the continuous data. These were again 1.3s (256 samples) in length. Epochs were rejected if any channel failed the amplifier saturation test, or if the range of any eye channel in the 200ms preceding a visual stimulus exceeded 125uV. The latter test was done to ensure that the subjects' eyes had actually registered the visual cues. For data that were being transformed back to the time domain after filtering, a baseline correction was applied by substituting the average of the 200ms presample for the zero-frequency component. Time-domain data were output for presentation to the networks as vectors of 256 consecutive samples. For frequency-domain data, the DC and Nyquist components were dropped, and the remaining data were output as vectors of 127 complex coefficients.
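A sketch of the output formatting, under the assumption that re-baselining amounts to subtracting the 200ms presample mean (the exact baseline convention of the original code is not stated):

```python
import numpy as np

FS = 195.3125                 # samples/s, from the 5.12 ms sampling interval
PRE = int(round(0.200 * FS))  # ~200 ms presample = 39 samples

def network_inputs(epoch):
    """Form the two input representations described above (a sketch).

    Time domain : 256 consecutive samples, re-baselined on the presample.
    Frequency   : rfft of 256 real samples gives 129 coefficients;
                  dropping DC and Nyquist leaves 127 complex values.
    """
    x = np.asarray(epoch, float)
    time_vec = x - x[:PRE].mean()   # one common reading of the baseline step
    spectrum = np.fft.rfft(x)       # 129 coefficients for n = 256
    freq_vec = spectrum[1:-1]       # drop DC and Nyquist -> 127
    return time_vec, freq_vec
```

The count works out as stated in the text: 256 real samples yield 129 rfft coefficients, and removing the DC and Nyquist bins leaves exactly 127.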

Network Simulations

Visual and auditory data in both the time domain and the frequency domain were presented to separate instances of completely-connected feed-forward networks, trained with back-propagation [Rumelhart & McClelland 1986]. Both two-layer (input and output) and three-layer (input, hidden, output) networks were tested using the activation function

h(x) = 1.7159 tanh(2x/3)

which gives h(-1) = -1, h(0) = 0, and h(1) = 1. In all cases the output layer consisted of a single node whose desired value was -1 for a missed target (i.e., failure to respond within the 200ms-1400ms window) and 1 for a hit. Due to a ceiling effect of the task, the ratios of misses to hits were in the neighbourhood of 0.1 to 0.2 (table 0). Misses were duplicated in the data set in order to bring this ratio to 1, thus preventing the network from scoring well simply by always predicting a hit. For each subject, trials to be duplicated were selected in a round-robin manner from the set of misses. The sizes of the data sets pooled from all subjects were 399 (autistic auditory), 365 (autistic visual), 96 (normal auditory), and 105 (normal visual). Each data set was split evenly into a training set and a testing set; the small amount of data precluded a three-way division into training, testing, and verification sets.
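The round-robin duplication might be sketched as follows (hypothetical function; the trial objects can be of any type):

```python
def oversample_round_robin(misses, target):
    """Return `target` miss trials, cycling through the original misses
    in order so that duplicates are spread as evenly as possible
    (a sketch of the round-robin selection described above)."""
    if not misses:
        raise ValueError("no miss trials to duplicate")
    return [misses[i % len(misses)] for i in range(target)]
```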

Table 0. Data set sizes and compositions, network performances, and training times for those performances.

Since these networks were learning a binary classification, error was defined as zero whenever the output had absolute value greater than 1 and the same sign as the desired value of -1 or 1. After each pass through the training set, the learning rate was set to 0.004 times the mean squared error, divided by the fan-in. For performance evaluation, output less than zero was classified as a prediction of a miss, and output greater than zero as a prediction of a hit.
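These training rules, together with the activation function above, can be sketched as follows (illustrative; using squared error outside the dead zone is an assumption, since the text only specifies where the error is zero):

```python
import numpy as np

def h(x):
    # Activation used by the networks: h(x) = 1.7159 tanh(2x/3),
    # which gives h(-1) = -1, h(0) = 0, h(1) = 1 (to four decimals).
    return 1.7159 * np.tanh(2.0 * x / 3.0)

def binary_error(output, target):
    """Zero error once the output is past the +/-1 target on the correct
    side; otherwise squared error (an assumed form)."""
    if np.sign(output) == np.sign(target) and abs(output) > 1.0:
        return 0.0
    return (target - output) ** 2

def learning_rate(mse, fan_in):
    # After each training pass: 0.004 times the MSE, divided by fan-in.
    return 0.004 * mse / fan_in

def classify(output):
    # Evaluation rule: negative output predicts a miss, positive a hit.
    return "miss" if output < 0 else "hit"
```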

Weights were initialised to random values in the range +/-0.2, and networks were run until at least one of the following termination conditions was satisfied:

(0) mean squared error on the most recent pass through the training set 10^-4 or less

(1) all classifications on the last 10^4 passes through the training set were correct

(2) all classifications on the most recent pass through the testing set were correct

(3) number of training presentations (size of the training set times number of passes) 4x10^6 or more
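The four conditions can be collected into a single stopping predicate (argument names are illustrative):

```python
def should_stop(train_mse, consecutive_correct_passes, test_all_correct,
                presentations):
    """True if any of termination conditions (0)-(3) above is satisfied.

    train_mse                  -- MSE on the most recent training pass
    consecutive_correct_passes -- training passes in a row with all
                                  classifications correct
    test_all_correct           -- all test-set classifications correct
                                  on the most recent pass
    presentations              -- training-set size times passes so far
    """
    return (train_mse <= 1e-4                     # condition (0)
            or consecutive_correct_passes >= 10_000   # condition (1)
            or test_all_correct                   # condition (2)
            or presentations >= 4_000_000)        # condition (3)
```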

Results