Andrew Stirn
committed on
Commit
·
afabfe4
1
Parent(s):
25688bd
first pass at documentation
app.py
CHANGED
|
@@ -98,7 +98,6 @@ if __name__ == '__main__':
|
|
| 98 |
st.session_state.off_target = None
|
| 99 |
|
| 100 |
# title and documentation
|
| 101 |
-
st.title('TIGER Cas13 Efficacy Prediction')
|
| 102 |
st.markdown(Path('tiger.md').read_text())
|
| 103 |
st.divider()
|
| 104 |
|
|
|
|
| 98 |
st.session_state.off_target = None
|
| 99 |
|
| 100 |
# title and documentation
|
|
|
|
| 101 |
st.markdown(Path('tiger.md').read_text())
|
| 102 |
st.divider()
|
| 103 |
|
tiger.md
CHANGED
|
@@ -1,3 +1,45 @@
|
|
|
|
|
| 1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2 |
|
| 3 |
-
Wessels, H.-H., Stirn, A., Méndez-Mancilla, A., Kim, E. J., Hart, S. K., Knowles, D. A., & Sanjana, N. E. (2023). Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning. Nature Biotechnology. https://doi.org/10.1038/s41587-023-01830-8
|
| 1 |
+
## TIGER Cas13 Efficacy Prediction
|
| 2 |
|
| 3 |
+
Welcome to TIGER!
|
| 4 |
+
This Hugging Face space is an online tool that accompanies our Nature Biotechnology article.
|
| 5 |
+
TIGER's ability to make accurate on- and off-target predictions enables biologists to both design highly effective gRNAs and precisely modulate transcript expression by engineering gRNA mismatches.
|
| 6 |
+
If you use our model, please consider citing us:
|
| 7 |
|
| 8 |
+
> Wessels, H.-H., Stirn, A., Méndez-Mancilla, A., Kim, E. J., Hart, S. K., Knowles, D. A., & Sanjana, N. E. (2023). Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning. Nature Biotechnology. https://doi.org/10.1038/s41587-023-01830-8
|
| 9 |
+
|
| 10 |
+
[//]: # (In our article, TIGER predicts log2 fold-change (LFC) from target sequence, guide sequence, and additional scalar features.)
|
| 11 |
+
|
| 12 |
+
[//]: # (Prior to training, we normalize our survival screen's LFC values on a per-gene basis to discourage TIGER from learning which target transcripts come from more essential genes.)
|
| 13 |
+
|
| 14 |
+
[//]: # (As such, TIGER outputs a normalized LFC estimate.)
|
| 15 |
+
|
| 16 |
+
This tool differs from our manuscript in two ways.
|
| 17 |
+
First, this version of TIGER predicts from the target and guide sequences alone, which marginally reduces performance (Fig. 3c).
|
| 18 |
+
Second, we map TIGER's outputs to the unit interval to make estimates more interpretable: a one corresponds to high gRNA activity and a zero denotes no activity.
|
| 19 |
+
This transformation maps estimates with no detectable Cas13 activity to (0, 0.025] and the most active 2.5% of estimates to [0.975, 1).
|
| 20 |
+
This transformation is monotonically decreasing in LFC, so it reverses the ordering of estimates without otherwise changing it and therefore preserves Spearman, AUROC, and AUPRC performance.
|
| 21 |
+
We label these transformed LFC estimates as `Guide Score` in our prediction tables.
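
As a rough illustration of a mapping with these properties (not the transformation TIGER actually uses, which is not specified here), a decreasing logistic function can be calibrated so that two chosen LFC values land on 0.025 and 0.975. The function name `make_guide_score` and the anchor values `lfc_inactive` and `lfc_active` are hypothetical, and the sketch assumes that more negative LFC means a more active guide.

```python
import math


def make_guide_score(lfc_inactive: float, lfc_active: float):
    """Return a decreasing logistic map from normalized LFC to (0, 1).

    Hypothetical calibration (an assumption, not TIGER's actual transform):
    lfc_inactive maps to 0.025 and lfc_active maps to 0.975, where
    lfc_active < lfc_inactive (more negative LFC = more active guide).
    """
    if not lfc_active < lfc_inactive:
        raise ValueError("lfc_active must be below lfc_inactive")
    logit = math.log(0.975 / 0.025)  # logit(0.975) == -logit(0.025)
    midpoint = (lfc_inactive + lfc_active) / 2
    slope = 2 * logit / (lfc_inactive - lfc_active)

    def guide_score(lfc: float) -> float:
        # Larger (less negative) LFC gives a smaller score.
        return 1 / (1 + math.exp(slope * (lfc - midpoint)))

    return guide_score


# Illustrative anchors only; real values would come from the screen data.
score = make_guide_score(lfc_inactive=0.0, lfc_active=-2.0)
```

Because the map is strictly decreasing, sorting guides by score from high to low gives the same order as sorting them by LFC from low to high, which is why rank-based metrics are unaffected.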
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
### Using TIGER
|
| 25 |
+
|
| 26 |
+
We support two methods for transcript entry:
|
| 27 |
+
- Manual entry of a single transcript
|
| 28 |
+
- Uploading a FASTA file containing one or more transcripts, provided each has a unique ID
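
The unique-ID requirement for FASTA uploads can be checked before any prediction runs. The following is a minimal sketch, not the parser used in `app.py`; the function name `read_fasta` is hypothetical, and it assumes the transcript ID is the first whitespace-delimited token of each header line.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {transcript ID: sequence}, rejecting duplicate IDs."""
    transcripts: dict[str, list[str]] = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an ID")
            current_id = fields[0]  # assumed: ID is the first header token
            if current_id in transcripts:
                raise ValueError(f"duplicate transcript ID: {current_id}")
            transcripts[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data before first header")
        else:
            transcripts[current_id].append(line.upper())
    # Join multi-line sequences into one string per transcript.
    return {tid: "".join(chunks) for tid, chunks in transcripts.items()}
```

Calling `read_fasta` on the uploaded file's text returns one entry per transcript, or raises `ValueError` if two records share an ID.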
|
| 29 |
+
|
| 30 |
+
We currently offer three run modes:
|
| 31 |
+
- We report all on-target gRNAs for each provided transcript. This mode does not support off-target identification due to current computational constraints.
|
| 32 |
+
- We report the top ten most active on-target gRNAs for each provided transcript. This mode allows for the optional identification of off-target effects.
|
| 33 |
+
- We report the top ten most active on-target gRNAs for each provided transcript and their titration candidates (all possible single mismatches). This mode also does not support off-target identification due to current computational constraints.
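
To make "all possible single mismatches" concrete, the sketch below enumerates every sequence that differs from a guide at exactly one position. It is an illustration rather than the code behind the titration mode; the name `single_mismatches` and the four-letter `ACGT` alphabet are assumptions.

```python
NUCLEOTIDES = "ACGT"  # assumed alphabet; swap T for U if working with RNA strings


def single_mismatches(guide: str) -> list[str]:
    """Return every sequence that differs from `guide` at exactly one position."""
    guide = guide.upper()
    variants = []
    for i, original in enumerate(guide):
        for nt in NUCLEOTIDES:
            if nt != original:
                # Substitute one nucleotide and keep the rest unchanged.
                variants.append(guide[:i] + nt + guide[i + 1:])
    return variants


candidates = single_mismatches("ACGT")
```

For a guide of length n drawn from this alphabet, the function returns 3n candidates, so the number of sequences to score grows linearly with guide length.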
|
| 34 |
+
|
| 35 |
+
We use version 19 of GENCODE (protein-coding and lncRNA) to identify off-target candidates.
|
| 36 |
+
|
| 37 |
+
### Feature Roadmap
|
| 38 |
+
|
| 39 |
+
- Off-target scanning speed improvements
|
| 40 |
+
- Off-target scanning for titration mode
|
| 41 |
+
- Allow user to select more than the top ten guides per transcript
|
| 42 |
+
- Incorporate non-sequence (scalar) features (target accessibility, hybridization energies, etc.)
|
| 43 |
+
|
| 44 |
+
To report bugs or to request additional features, please click the "Community" button in the top right corner of this screen and start a new discussion.
|
| 45 |
+
Alternatively, please email [Andrew Stirn](mailto:andrew.stirn@cs.columbia.edu).
|