Changes

Tyler Collins · 6b98c36b
--- a/essinit.md
+++ b/essinit.md
+[[_TOC_]]
+
+# Overview
+This script is an attempt to create a standardised import function for the BUCANL lossless pipeline. It accepts raw data files and performs all of the steps necessary to prepare them for the [eeg_pipe_asr_amica pipeline](https://git.sharcnet.ca/bucanl_pipelines/eeg_pipe_asr_amica/wikis/home) (this pipeline). The script is designed to be usable at any level of automation and with any level of familiarity with [batch context](https://git.sharcnet.ca/bucanl_eeglab_extensions/batch_context/wikis/home) configuration scripts.  
+
+Raw files can vary greatly depending on the modality used to collect the data, while the init files going into the pipeline need to be prepared in a particular way for smooth processing during the initial stages and for ICA. The script can currently prepare data from two sources:  
+
+**1. ESS Capsule**  
+To load a capsule, make sure it is at the level 1 stage, and load the study_description.xml file into the [run history batch file GUI](https://git.sharcnet.ca/bucanl_eeglab_extensions/batch_context/wikis/home) under the file upload section. The metadata stored in the capsule will be automatically accessed by the script and used to find all the files the study contains. Using the [file structure](https://git.sharcnet.ca/bucanl_pipelines/eeg_pipe_asr_amica/wikis/directories) designed for this pipeline is best, as the saved files will be directed into the *analysis/data/1_init* folder when they are complete.   
+
+:so: We recommend storing the *Main_Study_Folder* inside *analysis/data/0_raw*, which makes the study_description.xml available at *analysis/data/0_raw/Main_Study_Folder/level_1/study_description.xml*. 
+
+:so: To find out more about ESS capsules go to the [Big EEG Consortium](http://www.bigeeg.org/).
+
+**2. Raw Files**  
+You can also load raw files into the [run history batch file GUI](https://git.sharcnet.ca/bucanl_eeglab_extensions/batch_context/wikis/home) file upload as normal. Using the [file structure](https://git.sharcnet.ca/bucanl_pipelines/eeg_pipe_asr_amica/wikis/directories) for this pipeline is best, as the saved files will be directed into the *analysis/data/1_init* folder when they are complete. This means uploading your raw files from the *analysis/data/0_raw* folder.
+
+:so: As a side note, make sure your EEGLAB workspace is set to the *eeglab_asr_amica* folder so that *analysis* is the first folder on your path.  
+
+This script takes the raw data file through 4 main processes:
+1. Loading the file (LOAD)
+2. Merging other files and renaming events (MERGE)
+3. Warping channel locations to the MNI head (WARP)
+4. Creating events and marks to indicate In/Out Task Time (EVENTS)
+
+The picture below shows how each section develops the file. 
+
+![STUDYBUDDY2](/uploads/196c4c169ab3d191c0920b88f0f55da7/STUDYBUDDY2.png)
+
+
+# Configuration and GUI
+This script will only produce GUIs if it is missing a necessary piece of information. This means that if you carefully fill out the configuration file, you can run the complete pipeline automatically! On the other hand, the script can be run with just the basic configuration setup if you are:
+ * Not comfortable with the configuration files  
+ * Unsure of some of the file information  
+ * Missing clear events to indicate "out of task time"
+
+In that case it will use GUIs to prompt you for the information as needed. This process is more time-consuming but will help you understand what is needed in the configuration.  
+
+:so: You can always run a smaller batch first to sort out these details then run the whole set once the config is complete. Refer to the table below to help plan your script use. 
+
+| Method |          Variables         |   Run Style  |
+|:------:|:-------------------------:|:-----------:|
+|   ESS  |                           |             |
+|    1   |       ESS and Config      |     Auto    |
+|    2   | ESS and Config and Manual | Interactive |
+|  OTHER |                           |             |
+|    1   |           Config          |     Auto    |
+|    2   |     Config and Manual     | Interactive |
+
+# Mandatory Variables
+If you want to run this script fully automatically or on a remote cluster, you need to ensure that each of the mandatory variables described below is specified in either the batch config or the ESS capsule metadata.  
+
+**ESS LoadType - Automatic**   
+ * Mandatory variables are split between the ESS capsule metadata and the config file. Do not worry about the XML variables in this case as they are automatically found by reading the ESS study_description.xml file that you loaded when batching.
+
+**OTHER LoadType - Automatic**   
+ * All of the mandatory variables need to be in the config file.
+
+The table below marks with an **x** the ways in which you can specify each variable, depending on your load type. You will see that all variables can be specified in the Config or the ESS metadata. If those are not filled out, some variables allow you to enter the values later, while others will revert to defaults.
+
+| LOADTYPE         |ESS        |  ESS       |  ESS        |    OTHER    |      OTHER                             |                                              |
+|----------------|:---:|:------:|:------:|:------:|:------:|---------------------------------------------------------------------------------|
+| Variable       |  XML | Config | Manual | Config | Manual | Description                                                                     |
+| loadType       |     |    x   |    x   |    x   |    x   | ESS or OTHER type of study you are loading                                      |
+| modality       |     x  |       |        |    x   |    x   | System used for recording                                                       |
+| oneloc         |  x     |       |      |    x   |    x   | 1 if you are using a single montage location file, 0 for subject specific files |
+| locs_file_name |      x  |       |        |    x   |    x   | Path to montage or naming convention for subject specific files                 |
+| remData   |   x  |       |        |    x   |        | Channel numbers that do not contain EEG data and need to be removed before location file is loaded                                   |
+| remChans      |     x  |      |        |    x   |      | Channel numbers of the Fiducials or other channels that are not in the data                       |
+| ref_loc_file   |       |    x   |        |    x   |        | Warp  reference surface, default is the MNI head if no config is given                               |
+| tMatrix        |      |    x   |    x   |    x   |    x   | Matrix used to warp your location files to MNI head                             |
+| mergefiles       |       |    x   |    x   |    x   |    x   | A list of the names of the files you wish to merge using swap strings. Leave blank for browsing                          |
+| merges      |      |    x   |    x   |    x   |    x   | The total number of files you wish to merge, 1 means just the loaded file with no merge                            |
+| change_srate       |      |    x   |    x   |    x   |    x   | The sampling rate you wish to resample to. 0 or the current rate won't change the data.                         |
+| Onevents       |       |    x   |    x   |    x   |    x   | Events that represent HED tag 'Custom/Marks/RecStart'                           |
+| Offevents      |      |    x   |    x   |    x   |    x   | Events that represent HED tag 'Custom/Marks/RecStop'                            |
+| Inevents       |         |    x   |    x   |    x   |    x   | Events that represent HED tag 'Custom/Marks/InEvent'                            |
+| filerename       |         |    x   |       |    x   |       | Name of the saved file,  using string swaps                           |
+
+:so: Filling in the config file is a great way to standardise the init stage, but the manual option becomes much more advantageous for variables that change between subjects. For example, if a couple of your files need merges but the others do not, simply leave the ```merges``` and ```mergefiles``` fields empty. This will prompt you for each file so that you can adjust it as you need.
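
As a rough illustration of the table above, the following Python sketch (not part of the pipeline) reads `[variable],value` lines like the ones used in the config replace strings and lists which empty variables would lead to a GUI prompt and which would revert to a default for the OTHER load type. The function names, the example values and the parsing rules are assumptions made for this example; the real parsing is done by batch context. Variable names follow the table.

```python
# Hypothetical helper, not part of the pipeline: it only mirrors the OTHER
# columns of the table above.
PROMPTS_IF_EMPTY = [
    "loadType", "modality", "oneloc", "locs_file_name", "tMatrix",
    "mergefiles", "merges", "change_srate", "Onevents", "Offevents", "Inevents",
]
DEFAULTS_IF_EMPTY = ["remData", "remChans", "ref_loc_file", "filerename"]


def parse_replace_strings(text):
    """Turn lines such as '[oneloc],1' into a {variable: value} dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # skip blank lines and anything that is not '[name],value'
        name, _, value = line.partition("],")
        config[name[1:]] = value.strip().strip("'")
    return config


def plan_run(config):
    """List variables that would prompt a GUI or revert to a default."""
    prompts = [v for v in PROMPTS_IF_EMPTY if not config.get(v)]
    defaults = [v for v in DEFAULTS_IF_EMPTY if not config.get(v)]
    # Only one marking method is needed: On/Off pairs OR Inevents.
    if config.get("Inevents"):
        prompts = [v for v in prompts if v not in ("Onevents", "Offevents")]
    elif config.get("Onevents") and config.get("Offevents"):
        prompts.remove("Inevents")
    return prompts, defaults


example = """
[loadType],OTHER
[modality],BIOSEMI
[oneloc],1
[locs_file_name],'analysis/support/misc/BioSemi_BUCANL_7Eyes.sfp'
[remData],[136]
[remChans],[1 2 3 132 133]
[merges],
[mergefiles],''
[change_srate],0
[Inevents],1 2
"""
prompts, defaults = plan_run(parse_replace_strings(example))
print("Will prompt:", prompts)
print("Will use default:", defaults)
```

With this example config, only the warp matrix and the two merge fields would be asked for interactively, which is the mixed "Config and Manual" run style.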
+
+# Loading Variables
+This next section contains a short description of each of the variables and examples of what you might want to put in the config. If you are using the configs that come with this git project, you will have two templates, one for ESS and one for OTHER. 
+
+**loadType**   
+The load type refers to whether you are starting from an ESS capsule or from a list of raw files. Use
+```[loadType],ESS```  or ``` [loadType],OTHER ``` in the config replace string. If you leave the config empty ```''``` the following GUI will be displayed. Simply check the box for your answer. 
+![loadtype](/uploads/8c7f6e20088d9e686c54d127cb5bd86d/loadtype.png)
+
+***
+
+**modality**  
+The modality refers to which system you did your data recording on. The script needs to know this in order to select the right plugin and function to load your data. If you are using ESS then this information is not needed in the config. Currently the following are supported:  
+ * ```[modality],BIOSEMI```   
+ *  ```[modality],BRAINVISION ```  
+ * ```[modality],EEG ```   
+ *  ```[modality],NETSTATION ```   
+
+If you leave the config empty ```''``` the following GUI will be displayed. Simply check the box for your answer. If you check other, the script will attempt a multitude of loading functions based on the file extension, but there is no guarantee that any of these will work for your unique file type.  
+:so: Note that when loading a Netstation file we manually return the reference channel for the sake of warping. 
+
+![modality2](/uploads/5d117fa0aee49c5df436bdb8d1b0d81b/modality2.png)
+
+***
+
+**oneloc**  
+This variable is either a 0 or a 1, based on your location file. ESS capsules are always set to **0**.
+ * 0 means that the file has a unique location file for each subject. The file is located in the same folder as the data file.  
+ * 1 means that the files use a common location montage or generic location file.   
+
+Use ```[oneloc],0```  or ``` [oneloc],1 ``` in the config replace_string. If you leave the config empty ```''``` the following GUI will be displayed. Simply check the box for your answer.  
+![oneloc](/uploads/d1df4787900ffc960d93d3828d1e2366/oneloc.png)
+
+***
+
+**locs_file_name**   
+This variable is a continuation of your **oneloc** answer above. ESS capsules use the built-in metadata and the location file names will be found automatically. You will have to change this if you are using OTHER.
+ * If you chose **oneloc 0**, use a naming convention in the config such as ```'[batch_dfn,.,-1]_locs.elc'``` in order to collect the correct file for each subject. See [swap strings](https://git.sharcnet.ca/bucanl_eeglab_extensions/batch_context/wikis/script-files#swap-string) for more information.  
+
+ * If you chose **oneloc 1**, write the path and file name of the montage file such as ```'analysis/support/misc/BioSemi_BUCANL_7Eyes.sfp'```. This file will be used for all of the subjects.  
+
+If you leave the config empty ```''``` the following GUI will be displayed. Simply type the same thing as above. 
+![locfile](/uploads/d9f43b4a91af12cd236d42ba3f7ed35f/locfile.png)
+
+***
+
+**remData**  
+These are the indices of the channels that do not contain EEG information. This includes unplugged channels but does **not include fiducials**. ESS will by default not load any channels that do not contain EEG. For example, our lab never uses the EXG8 channel of our BioSemi, so there is meaningless data in its EEG.data row. This corresponds to using ```[remData],[136]``` (128 electrodes + 7 eyes + 1 junk) to remove the very last junk channel. This item is one of the few that does not have a backup GUI, as an empty config is an acceptable answer; in that case the default is to remove no channels. 
+
+:so: This feature will depend on your lab's modality. It is good practice to load a raw file and investigate the EEG.data structure.  
+
+***
+
+**filerename**   
+The new name of the file once the script is complete. Use batch string swaps when batching multiple files. If left empty ```''``` the saved file will default to ``` '[batch_dfn,.,-1]_init.set'```, meaning it will keep the same name and exchange the old extension for *_init.set*. This feature is particularly useful with raw files or ESS files that have very long names, where not all of the information is necessary.
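
A minimal Python sketch of what the default does to a file name, assuming that `[batch_dfn,.,-1]` splits the data file name on `.` and drops the last piece, which is how the default is described above. The function and the file name are hypothetical; the real swap strings are resolved by batch context.

```python
def default_init_name(data_file_name):
    """Mimic '[batch_dfn,.,-1]_init.set' for one file name (assumed behaviour)."""
    pieces = data_file_name.split(".")
    # Drop the last '.'-separated piece (the old extension) when there is one.
    stem = ".".join(pieces[:-1]) if len(pieces) > 1 else data_file_name
    return stem + "_init.set"


print(default_init_name("subject01_task.bdf"))  # subject01_task_init.set
```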
+
+***
+
+# Merging Variables
+In order for you to merge files, they need to have the same number of channels and the same sampling rate. After you choose the number of files and which ones they are, this script will check for file consistency. If one or more files have a different number of channels or a different srate, you will receive a **pop-up** to notify you that the files are being edited. If you do not want to make these changes, click cancel to abort the script and readjust your configuration so that it does not include any of the atypical files.
+
+:warning: For channels, all the files will be reduced to have the same number of channels. For example, if one file has 128 while the others have 135, each of the 135-channel files will have its last 7 channels removed. This is common for systems like the BioSemi, depending on whether the session used the EXG eye channels. You will still have the opportunity to adjust your channels further using your location file or the ```remChans``` variable.
+
+:warning: For srate, all the files will be resampled to have the same frequency as the file with the lowest rate. For example, if one file's srate is 512 and the others are 1024, the others will be resampled to half their frequency.
+You will still have the opportunity to adjust your sampling rate further using the ```change_srate``` variable.    
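
The two rules above can be sketched in a few lines of Python. This is an illustration only, under the assumption that each file is summarised by its channel count and sampling rate; `merge_targets` is a hypothetical name and the actual editing is done inside the script.

```python
def merge_targets(files):
    """files: list of (n_channels, srate) tuples, one per file to merge."""
    target_chans = min(n for n, _ in files)   # fewest channels wins
    target_srate = min(s for _, s in files)   # lowest sampling rate wins
    plan = []
    for n_chans, srate in files:
        plan.append({
            "drop_last_channels": n_chans - target_chans,
            "resample": srate != target_srate,
        })
    return target_chans, target_srate, plan


target_chans, target_srate, plan = merge_targets(
    [(128, 512), (135, 1024), (135, 1024)]
)
print(target_chans, target_srate)  # 128 512
print(plan[1])  # {'drop_last_channels': 7, 'resample': True}
```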
+
+**merges**  
+This number indicates the number of files you wish to merge together. Entering 1 in this field means no files will be merged; otherwise use the total number of files you are combining. If you leave this field empty ```''``` then you will get the following GUI prompt for a number.
+![nummerges](/uploads/9885d54325fd95c2c5bc144bbe24419c/nummerges.png)  
+
+***  
+
+**mergefiles**   
+This field can be left empty if you are not merging files. If you are merging files, leave it empty to open a file browser in which you select the additional files to merge with the loaded file. For example, to end up with 8 files combined, use ```[merges],8``` and ```[mergefiles],''``` to browse for the other **7** files. Alternatively, you can use swap strings to generate the names of the files and place them in a cell array. For example, ```[mergefiles],'[batch_dfn,_,-1]_m1.bdf' '[batch_dfn,_,-1]_m2.bdf'```
+ 
+![browsemerge](/uploads/64d029e9d32074d660a98fbc8cc3e885/browsemerge.png)  
+
+***  
+
+**change_srate**    
+This item allows you to resample your data to a different sampling rate. Use ```[change_srate],0``` or your current EEG.srate to leave the file as is. Enter a new number to resample to that frequency. If the field is left empty, a GUI will prompt you for a value.
+![changesrate](/uploads/dc2c66e994f4d26e3abd695118edfb7d/changesrate.png)  
+
+***  
+
+**erename**    
+This variable stands for *Event-Rename* and can be used several different ways:
+
+**1. ESS**   
+This config field can be left blank. All your event codes will be renamed to their designated labels found in the study description.
+![esserename](/uploads/212adfb5d51dec8a5f2b6e8fcb229e94/esserename.png)
+
+**2. Manual**   
+Prompts a GUI where the user can see the old event names listed in a text editor. Each event name appears twice in the list; your job is to change the second instance of each name to the name you would like to use. This GUI will prompt for each of your merge files, in case you have the same event codes but wish to label them differently. If you do not wish to change a name, leave the second instance of the event code as is.
+![erenamegui](/uploads/294fc6f4e5e614a1d2de163e43bf3664/erenamegui.png)
+
+**3. Config File Reference**  
+If you would like to use the config to help rename your events, you need to give it the name of an event renaming file that follows a specific layout.
+Use something like ```[erename],'Eyes_Erename.txt'``` in the config, and make sure your file is on the MATLAB path. :so: Store it in the *analysis/support/misc* folder so it's on the path and you can use it later for reference.  
+
+The event renaming reference file should look like this:  
+:warning:  Note there are only 3 columns! The first number in each row is my text editor displaying the line number!   
+
+![erenamefile1](/uploads/281357a5d0cef623f1168f7d37bce061/erenamefile1.png)
+...
+
+![erenamefile2](/uploads/7779b6e1397d4638770f59364ce26c8a/erenamefile2.png)
+
+The first column is the file number, in reference to your merges, the second column is the existing event code, and the third column is the new event name. In the example above we were only interested in the events of file one, so they are renamed to understandable names. The event codes for the second file are left as is. It does no harm to list extra event codes that do not exist in the data, but make sure you do not leave out any of your event codes, as the name of an omitted code will be set to empty. Separate the columns with spaces.
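
To make the three-column layout concrete, here is a small Python sketch that reads such a listing and looks up new names. It is an illustration only: the function names and the listing contents are made up for this example (the screenshots are not reproduced here), and the script itself does the renaming in EEGLAB.

```python
def parse_erename(text):
    """Return {(file_number, event_code): new_name} from a 3-column listing."""
    mapping = {}
    for line in text.splitlines():
        columns = line.split()
        if not columns:
            continue  # ignore blank lines
        if len(columns) != 3:
            raise ValueError("expected 3 space-separated columns: %r" % line)
        file_number, old_code, new_name = columns
        mapping[(int(file_number), old_code)] = new_name
    return mapping


def rename_event(mapping, file_number, code):
    """Codes missing from the listing come back empty, as described above."""
    return mapping.get((file_number, code), "")


# Made-up listing: file 1 gets readable names, file 2 keeps its codes.
listing = """
1 11 eyes_open
1 12 eyes_open
1 21 eyes_closed
2 11 11
2 21 21
"""
mapping = parse_erename(listing)
print(rename_event(mapping, 1, "12"))        # eyes_open
print(rename_event(mapping, 2, "21"))        # 21
print(repr(rename_event(mapping, 1, "99")))  # ''
```

Giving several codes the same new name, as in the first two rows, is how a range of codes can be combined under one label.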
+
+The next example has a range of codes for each image; for example, *house_up* is codes 11-16. This file will combine these so they all have the same name.
+![erenamefile3](/uploads/59a7191bfba6e931eb9a45398b5f82f3/erenamefile3.png)  
+
+:warning:  If you rename events to InEvent, RecStart or RecStop, they will be considered in the marks EVENTS stage coming up and will influence the marks structure. Although this can be very useful, please do not use these names until you understand how these special marks work from the information below.
+
+***
+
+# Warping Variables
+
+**remChans**  
+The indices listed will be set as *not EEG data* and placed in *EEG.chaninfo.nodatchans* rather than *EEG.chanlocs* when the location file is loaded. ESS capsules denote this information by specifying the channel *Type* as *FID* and will automatically use these when warping, but if you are using the config you will have to specify them. For example, our lab uses the first three channel locations as the LPA, NZ and RPA fiducials, and the CMS and DRL are near the end. Using ```[remChans],[1 2 3 132 133]``` will properly load these.
+
+***
+
+**ref_loc_file**   
+The reference location file is the montage that you want to warp your channel locations to. For this pipeline you will be warping to the MNI head. The config should point to the location file, for example ```[ref_loc_file],analysis/support/misc/standard_1020.elc```. If the config is empty it will default to this file as well. This may need to change in the future, which is why it was made a string_swap variable.
+
+*** 
+
+**tMatrix**  
+Warping and transforming ensures that the location file that you loaded aligns with the standard MNI head model that is used later on in the pipeline for referencing. Both ESS capsules and OTHER sets need to provide this information through the config or through the interactive GUI. Essentially, the pipeline needs a transformation matrix of 9 values to expand, translate and rotate your locations in each dimension so that they match our positions. If you are using a standard transformation matrix for all of your subjects, you can place the matrix in the config, for example ```[tMatrix],[ 0 , -23 , -50 , 0 , 0 , -240.3 , 1100 , 1100 , 1100 ] ```. If you would like to do exact transformations for each file, you will have to go through the GUIs. The first GUI allows you to input a file-specific transformation matrix for that subject if you know it. If you do not know what it is yet, click the manual checkbox to open the EEGLAB co-register GUI.
+![warp](/uploads/3ab26d4bbfe41c2f08b0f6100f2846f2/warp.png)
+Adjust the matrix that is displayed at the bottom until you find the right fit and press  ```[ OK ]```.  
+
+:so:  After manually warping your first file, you will be able to see the transformation matrix that will be used. You can use this matrix to help you get a *head start* ( :smile: ) on the next file, or use it in the config to morph all of the files the same way for a faster approximation.
+![warp4](/uploads/0ffd5834f3356d22aae23e0861fe6ff0/warp4.png)
+
+***
+
+# Event and Marks Variables
+The pipeline that follows this script needs the data to contain specific events that indicate in-task time and out-of-task time. This is because ICA is sensitive to messy task time that could contain movement or non-stationarity, so we only want to submit the essential data to ICA. To do this, the out-of-task time is flagged using our On/Off/In events to [create removal marks](https://git.sharcnet.ca/bucanl_eeglab_extensions/vised_marks/wikis/marks). This system uses the [Vised Marks plugin](https://git.sharcnet.ca/bucanl_eeglab_extensions/vised_marks/wikis/home) to do the data flagging.
+
+:so: If you are building your own ESS capsule or you are lucky these events can be built into the capsule already. Use the HED tags:
+ * Custom/Marks/RecStart
+ * Custom/Marks/RecStop
+ * Custom/Marks/InEvent  
+
+These tags will be identified and used as your On/Off/In events respectively.
+
+**Onevents/Offevents/Inevents**    
+To tell the script where to flag, you need to give it Onevents and Offevents **OR** Inevents. Using this information, **new**, **standardised** events will be created and placed at the same latencies as the events you specify.
+
+* **In events** are events that need to be surrounded by in-task time. Any time that is not within 3 sec of an Inevent is considered out of task. This is ideal for studies with many repeated tasks that are within a few seconds of each other.   
+ * Config Example: A Go/NoGo task with 1 as Go and 2 as NoGo, with one or the other every 2 seconds. This task has many short breaks between the sessions. Inevents would be ideal, as you could specify ```[Inevent], 1 2``` and that would ensure that all of the Go and NoGo tasks are included for ICA, while the breaks and lead-up times would be removed.     
+
+
+* **On/Off events** can also be thought of as *Record On* and *Record Off*. They are specified in pairs to act as boundaries for in task time. Any time on the outside of these events will be marked black and out of task.   
+
+ * Config Example: A study where participants are asked to clear their mind for 5 min. The start of this task is event 11 and the end is 22. In this case Inevents would not work, as they would only gather 12 sec of the task: 6 sec around the start event and 6 sec around the stop event. Instead, if you specify the pair ```[Onevent], 11``` and ```[Offevent], 22```, the whole 5 min is set as in-task recording. Notice how 11 and 22 are a paired set. If 11 was always on but off alternated between 22 and 33, you would specify the following in the config: ```[Onevent], 11 11``` and ```[Offevent], 22 33```. 11 is listed twice, but each time it has a different stop indicator. You also have to be wary of the number of events of each type you have in the file. **For every On there must be an Off**, since the marking function uses intervals.
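
The difference between the two methods can be sketched in Python. This is a simplified illustration, not the Vised Marks implementation: latencies are assumed to be in seconds, the function names are hypothetical, and the On/Off pairing simply matches sorted latencies one to one.

```python
def in_task_from_inevents(latencies, pad=3.0):
    """Merge [t - pad, t + pad] windows around each Inevent latency (seconds)."""
    intervals = []
    for t in sorted(latencies):
        start, stop = max(0.0, t - pad), t + pad
        if intervals and start <= intervals[-1][1]:
            intervals[-1][1] = max(intervals[-1][1], stop)  # overlap: extend
        else:
            intervals.append([start, stop])
    return [tuple(i) for i in intervals]


def in_task_from_on_off(on_latencies, off_latencies):
    """Pair each On latency with an Off latency to form in-task intervals."""
    if len(on_latencies) != len(off_latencies):
        raise ValueError("for every On there must be an Off")
    return list(zip(sorted(on_latencies), sorted(off_latencies)))


# Go/NoGo-style events every 2 s, a break, then more events.
print(in_task_from_inevents([10, 12, 14, 60, 62]))
# [(7.0, 17.0), (57.0, 65.0)]

# A 5 min block bounded by one On and one Off event.
print(in_task_from_on_off([30.0], [330.0]))
# [(30.0, 330.0)]

# Using the start and stop events as Inevents keeps only 12 s of that block.
print(in_task_from_inevents([30.0, 330.0]))
# [(27.0, 33.0), (327.0, 333.0)]
```

Everything outside the returned intervals corresponds to the time that would be flagged as out of task.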
+
+If you do not have a prebuilt ESS and you do not fill out the config file, you will be prompted with the following GUI that asks for the same information.
+Depending on which method you are using, simply type in the event numbers separated by spaces. For example ```22 33```. If your study does not contain events that suit either of these methods, you can manually add *Recording Start* and *Recording Stop* marks. To do this, select the *Manual Mode* checkbox.
+![events](/uploads/38fbc765ee402361dffbc6e0f34c0795/events.png)
+
+If you selected *Manual Mode* then this little GUI will pop up. This GUI simply warns you that you are entering the [Vised-Marks editor](https://git.sharcnet.ca/bucanl_eeglab_extensions/vised_marks/wikis/manual-vised-edit) and that any annotations you add here will be added to the [marks structure](https://git.sharcnet.ca/bucanl_eeglab_extensions/vised_marks/wikis/marks) permanently. The two commands you will be using are:
+* ```a``` to drop a RecStart event at your cursor  
+* ```b``` to drop a RecStop event at your cursor  
+
+See the example below where we placed events before and after each block of tasks.
+![events2](/uploads/ed4a00ef37fd4d59ed7a979d5f7078c3/events2.png)
+
+***
+
+![vised3](/uploads/fcc6034816cd3f29dc38781760c4ea8f/vised3.png)
+
+*** 
+
+After you click *Update Marks Structure*, the events that you dropped into the data will be saved into the events information structure. Vised Marks will then proceed to place *Out of task marks* everywhere outside your boundaries. Below is an example of what the final result will look like if you were to plot the data afterwards. Note that we plotted the whole file at once so you could get a bird's-eye view.
+![events5](/uploads/8e0c5b4c3044b2c85479bfe184a4f9a4/events5.png)
+
+These methods may seem like a lot of work, but they can easily be simplified by using the events that already exist in the experiment. For the example above, if we had used Inevents, we could have listed all of the event types in the experiment in the config file. We would get the same result, as there were no events during the break times. That method would have run automatically and would have saved a bunch of time. Consider your experiment when choosing a marking method.
+
+ESS capsule references on this page are to the *RSVP Target Detection* study found at the Big EEG Consortium [Study Catalog](http://www.studycatalog.org/).
+
+*** 
+
+[ :house: **Return to Main Page**](https://git.sharcnet.ca/bucanl_pipelines/eeg_pipe_asr_amica/wikis/home)
\ No newline at end of file