Commit 185907f0 authored by Roel Vrooman

Update README

parent 0c3380aa
@@ -9,10 +9,9 @@ Process for creating this model can be found below
Path to data: P:\4180000.23\Mouse2human
## Current version (untested):
## Current version (untested and only for ABA data):
The *Operable code functions.ipynb* contains the code to import all dependencies and define the three functions needed to perform the translations
from mouse to human and vice versa.
The *Operable code functions.ipynb* notebook contains the code to import all dependencies and define the three functions needed to perform the translations from mouse to human and vice versa.
These three functions are:
**get_all_data(asset_dir='Data_folder')**
@@ -220,10 +219,22 @@ For comparison we will also try to create the model based on data from the follo
*Ortiz, C., Navarro, J. F., Jurek, A., Märtin, A., Lundeberg, J., & Meletis, K. (2020). Molecular atlas of the adult mouse brain.
Science Advances, 6(26), eabb3446.*
At this point the data is preprocessed by being filtered with the homology lists from Ensembl.
The data was preprocessed by filtering it with the homology lists from Ensembl.
See *Dinos' data preprocessing.ipynb* for code
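A minimal sketch of what such a homology filter could look like, not the code from the notebook. It assumes the expression data and the Ensembl homology list are both available as pandas dataframes; the column names `mouse_gene` and `human_gene` are hypothetical, and restricting to one-to-one homologs is an assumption made here for illustration.

```python
import pandas as pd


def filter_by_homology(expression: pd.DataFrame, homologs: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows whose mouse gene has a human homolog.

    Column names are hypothetical:
      expression: must contain a "mouse_gene" column
      homologs:   must contain "mouse_gene" and "human_gene" columns
    """
    # Assumption: only one-to-one homologs are kept, so any gene that
    # appears in more than one pair is dropped from the homology list.
    one_to_one = homologs[
        ~homologs["mouse_gene"].duplicated(keep=False)
        & ~homologs["human_gene"].duplicated(keep=False)
    ]
    # The inner merge drops every expression row without a homolog
    # and attaches the matching human gene name.
    return expression.merge(one_to_one, on="mouse_gene", how="inner")


# Toy data, for illustration only
expression = pd.DataFrame(
    {"mouse_gene": ["Gad1", "Pvalb", "Xist"], "expression": [1.0, 2.0, 3.0]}
)
homologs = pd.DataFrame(
    {"mouse_gene": ["Gad1", "Pvalb", "Pvalb"], "human_gene": ["GAD1", "PVALB", "PVALB2"]}
)
filtered = filter_by_homology(expression, homologs)
```

If one-to-many homologs should be kept instead, the `duplicated` filter can be removed and the merge used on its own.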
After this, the Dinos data needs to be processed further to make it usable for our ends. To do this I replace the coordinate system with ABA structure names and create a 3D ontology map containing only the ABA structures for which there is expression data in the Dinos dataset. Finally, the expression data is summarized so that we are left with a dataframe containing an average expression value for each available ABA structure for each gene.
See *Creating Dinos x variable and ontology mask.ipynb* for code
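The summarizing step described above can be sketched as follows. This is an illustration under assumptions, not the notebook code: it assumes the data is in long format with one row per measurement, and the column names `structure`, `gene` and `expression` are hypothetical.

```python
import pandas as pd


def mean_expression_per_structure(spots: pd.DataFrame) -> pd.DataFrame:
    """Average expression per ABA structure (rows) and gene (columns).

    Hypothetical input columns: "structure", "gene", "expression".
    Structure/gene combinations without any measurement become NaN.
    """
    return spots.pivot_table(
        index="structure", columns="gene", values="expression", aggfunc="mean"
    )


# Toy data, for illustration only
spots = pd.DataFrame(
    {
        "structure": ["CA1", "CA1", "CA3", "CA3"],
        "gene": ["Gad1", "Gad1", "Gad1", "Pvalb"],
        "expression": [1.0, 3.0, 4.0, 5.0],
    }
)
summary = mean_expression_per_structure(spots)
```

The resulting dataframe has one row per ABA structure that occurs in the data, which is also the set of structures the ontology map would need to contain.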
As a last step I have to make sure the input brain maps are in a form we can use in the model. To do this I preprocess them by reshaping, etc.
See *Processing y input for Dinos model.ipynb* for code
Finally I create a workflow similar to the one for the ABA mouse data using the preprocessed Dinos dataset.
See *Workflow Dinos.ipynb* for code.
### Quality control
Since the ABA mouse data differs in quality per expression map, we have performed a QC step to split the data into different groups. Now, when running the
@@ -232,5 +243,11 @@ ABA mouse dataset. Low contains the full dataset, but also the worst quality, hi
See *QC mouse expression.ipynb* for code
### Expression Coverage
To see how much of the brain each dataset covers, I create a separate coverage map for each dataset. Both ABA datasets (mouse and human) are based on the QC set 'Doubt', which is currently the best-working subset of the data. The Dinos set is the full expression data (filtered by homology and availability in the human ABA data).
See *Creating the expression coverage maps.ipynb* for code.
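One possible way to compute such a coverage map is sketched below; the notebook may do this differently. The sketch assumes a 3D annotation volume of integer structure IDs in which `0` marks background, and a set of structure IDs for which expression data exists. The function names and the background convention are assumptions.

```python
import numpy as np


def coverage_map(annotation: np.ndarray, covered_ids) -> np.ndarray:
    """Boolean volume: True where the voxel's structure has expression data."""
    return np.isin(annotation, list(covered_ids))


def coverage_fraction(annotation: np.ndarray, covered_ids, background: int = 0) -> float:
    """Fraction of non-background voxels that lie in a covered structure."""
    in_brain = annotation != background
    covered = coverage_map(annotation, covered_ids) & in_brain
    return float(covered.sum() / in_brain.sum())


# Toy 2x2x2 annotation volume, for illustration only
annotation = np.array([[[0, 1], [2, 2]], [[3, 3], [0, 1]]])
covered_ids = {1, 3}
mask = coverage_map(annotation, covered_ids)
fraction = coverage_fraction(annotation, covered_ids)
```

The boolean mask can be viewed slice by slice as a coverage map, while the fraction gives a single number for comparing datasets.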