To obtain the exact sequences we used, please click the UniProt, GENCODE and Ensembl release numbers above. Please note that a small number of these sequences had to be excluded from RNAct due to limitations of our algorithm: they were either too short or too long (proteins ≤50 aa or >14,507 aa, RNAs ≤50 nt or >28,227 nt), or their RNA secondary structure could not be predicted by the ViennaRNA package, which catRAPID relies on internally.
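
If you would like to check in advance whether a sequence falls within these length limits, the thresholds can be expressed as a small filter. This is only a sketch: the cut-off values are the ones quoted above, but the function and constant names are hypothetical and this is not the code used to build RNAct. It also does not cover the third exclusion criterion (failed RNA secondary structure prediction), which can only be determined by running the prediction itself.

```python
# Length limits as quoted in the text; all names here are illustrative only.
PROTEIN_MIN_EXCLUSIVE = 50      # proteins of 50 aa or fewer are excluded
PROTEIN_MAX_INCLUSIVE = 14_507  # proteins longer than 14,507 aa are excluded
RNA_MIN_EXCLUSIVE = 50          # RNAs of 50 nt or fewer are excluded
RNA_MAX_INCLUSIVE = 28_227      # RNAs longer than 28,227 nt are excluded


def protein_length_ok(sequence: str) -> bool:
    """Return True if a protein sequence is within the stated length limits."""
    return PROTEIN_MIN_EXCLUSIVE < len(sequence) <= PROTEIN_MAX_INCLUSIVE


def rna_length_ok(sequence: str) -> bool:
    """Return True if an RNA sequence is within the stated length limits."""
    return RNA_MIN_EXCLUSIVE < len(sequence) <= RNA_MAX_INCLUSIVE
```

For example, `protein_length_ok("M" * 50)` is `False` because a 50 aa protein is at the excluded lower boundary, while a 51 aa protein passes.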

* GENCODE "basic" contains a selected subset of the transcriptome: "The transcripts tagged as 'basic' form part of a subset of representative transcripts for each gene. This subset prioritises full-length protein coding transcripts over partial or non-protein coding transcripts within the same gene, and intends to highlight those transcripts that will be useful to the majority of users." (GENCODE FAQ, no. 4)

