Presynaptic. Over the last decade, there has been significant interest in understanding the role of synaptic genes in neurodevelopmental disorders and epilepsy. While we have made significant progress on conditions such as STXBP1- and SYNGAP1-related disorders, many synaptic genes remain uncharacterized. In a recent publication, we delineated the phenotypic range of the newest member of this group of genes: BSN, which encodes the presynaptic protein Bassoon. Here is how we deciphered the landscape of BSN-related disorders by integrating real-world data and biorepositories.

Figure 1. BSN as a novel gene for neurodevelopmental disorders. (A) Phenotypes of BSN variant carriers by age across a range of biorepositories. Seizures occur largely in childhood but are also present in a subset of adults. (B) A volcano plot comparing the phenotypes of BSN variant carriers to 14,895 individuals who underwent exome sequencing. (C) A schematic overview of Bassoon at the presynapse and variants identified in the BSN gene. Figures adapted from Guzman et al., 2025.
Heidelberg, DKFZ. Bassoon, I don’t know if you remember, but we first met a long time ago. In the fall of 1998, I was listening to a series of neuroscience presentations at the German Cancer Research Center (DKFZ) in search of a topic for my doctoral thesis. I remember a presentation about Piccolo and Bassoon: two novel presynaptic proteins that had recently been characterized. It was not only the creative naming that stuck with me, but also the realization that there was still so much to be understood about the synapse. Piccolo, encoded by the PCLO gene, has since been implicated in a recessive form of pontocerebellar hypoplasia, but the role of Bassoon in brain disorders remained unclear.
Enter Bassoon. This situation changed when we identified de novo variants in BSN in two individuals seen in our neurogenetics clinic. We subsequently found 13 additional individuals with de novo BSN variants through GeneMatcher, which is the traditional way of collecting information about individuals with novel genes identified through diagnostic testing. Our question, however, was somewhat broader than what this database alone could answer. We already knew from other newly identified epilepsy genes that diagnostic testing often does not provide a complete overview of the disease spectrum and may instead skew perceptions of the phenotypic landscape. Therefore, we used the data science tools our team had developed over the last few years to dive into the various biorepositories to which we had access. What had started as a typical gene discovery story suddenly became a lesson in data integration. This story formed the basis of our publication with Stacy Guzman as the first author, and the project was part of Stacy’s recently defended PhD thesis.
Biobanks and EMR. Let me pause my description of BSN and put in a plug for biorepositories. There is a recurring theme that you may have noticed in my recent presentations and blog posts: we don’t use existing data as much as we should. Clinical information on rare disease is not absent; it is present in a variety of resources, databases, and biorepositories that were initially built to provide this information to the clinical and research community. Data on rare disease is not rare. It is ubiquitous, but we need to know how to access it. For example, for conditions such as STXBP1-, SYNGAP1-, or SCN8A-related disorders, we know there are hundreds of patient-years of observational data buried in existing databases, electronic medical record systems, and biorepositories that we can decipher and turn into natural history data to enhance clinical trial readiness. We can use this information to generate real-world knowledge of rare conditions where little information existed previously. But can we also deploy these tools for novel genes like BSN?
The BSN spectrum. It turns out we can indeed use information distributed across biobanks to gain significantly more insight into BSN-related disorders than we would have expected. In fact, more than half of the 29 individuals we included in our study were identified through biobank data. It was only through this data that the full range of phenotypic features of BSN-related disorders emerged. This was critically important, as the disease spectrum associated with BSN variants is somewhat unusual for a synaptic disorder. More than half of all individuals did not have evidence of autism or developmental delay, which sets BSN apart from other synaptic conditions, with the exception of STX1B-related disorders. Roughly 50% of all individuals with BSN variants have epilepsy, but the phenotype appears to be milder in adults. Obesity is an unusually common feature, seen in more than one-third of individuals, reflecting some of the non-CNS functions of Bassoon. However, like other synaptic disorders, BSN can also cause severe disease: through the Birth Defects Biorepository, we identified a single individual with a de novo BSN variant and neonatal encephalopathy. Clearly, BSN has its own personality among synaptic disorders, but how is it different from other genetic neurodevelopmental disorders?
Comparing BSN. We were also able to assess how BSN compares to other neurodevelopmental disorders through a systematic comparison to a large cohort of 14,895 individuals who underwent exome sequencing. We found that febrile seizures and behavioral features were more than five times more common in individuals with BSN variants than in those with other neurodevelopmental disorders. In addition, BSN carriers were phenotypically more alike than we would have expected by chance, emphasizing that there is a subtle, but recognizable, phenotypic signature hidden in the broad spectrum of Bassoon-related disorders. These analyses were performed using systematic phenotype analysis through the Human Phenotype Ontology (HPO), a technique we introduced in our blog posts a few years ago.
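For readers curious how a single point on a volcano plot like the one in Figure 1B can be derived, here is a minimal sketch. It assumes a two-sided Fisher's exact test on a 2x2 table (carriers vs. comparison cohort, with vs. without a given HPO term), which is one common choice for this kind of comparison and not necessarily the exact method used in the publication. The function names and the counts in the example call are hypothetical and purely illustrative; only the cohort sizes of 29 and 14,895 come from the text above.

```python
import sys
from math import comb, log2, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: carriers with / without the phenotype term
    c, d: comparison cohort with / without the phenotype term
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x carriers with the term
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as likely or less likely than the observed one;
    # the small tolerance keeps exact ties despite floating-point rounding.
    p = sum(p_x for p_x in map(prob, range(lo, hi + 1))
            if p_x <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def volcano_point(a, b, c, d):
    """Return (log2 odds ratio, -log10 p-value) for one phenotype term.

    Adds 0.5 to each cell (Haldane-Anscombe correction) so the odds
    ratio stays finite when a cell is zero. This is an assumption of
    this sketch, not a detail taken from the publication.
    """
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    p = fisher_exact_two_sided(a, b, c, d)
    return log2(odds_ratio), -log10(max(p, sys.float_info.min))


# Hypothetical counts for illustration only: 10 of 29 carriers and
# 900 of 14,895 comparison individuals annotated with the same term.
x, y = volcano_point(10, 19, 900, 13995)
print(f"log2(OR) = {x:.2f}, -log10(p) = {y:.2f}")
```

Repeating this for every HPO term and plotting the resulting pairs gives the familiar volcano shape, with enriched terms in the upper right. In a real analysis, the p-values would also need correction for the number of terms tested.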
What you need to know. BSN, coding for the presynaptic protein Bassoon, is a new gene for neurodevelopmental disorders with a broad range of phenotypes, ranging from individuals with epilepsy and typical development to neonatal presentations of seizures and cortical visual impairment. Given this spectrum, genes like BSN require a new approach to data assessment. We explored how critical information about phenotypes can be gleaned from existing biorepositories and found that such an assessment provides a more comprehensive overview than assessing clinically diagnosed individuals alone. Frameworks like this one, which combine data aggregation across biorepositories with real-world data, will be important to define meaningful outcome measures and endpoints for clinical trials in the future.