Intro
Hi, I'm Ryan.
I am a data scientist with a background in neuroscience research and investigations, and I am currently looking for a new role. Capable of working independently and on a team, I have driven several scientific and data science projects to completion. I strive for efficient and clear insights derived from data science and software engineering best practices.
Technologies
Skills
- Data mining (e.g. documents, OSINT, and web scraping)
- Data cleaning and loading
- Exploratory data analysis
- Data visualization
- Analytics and insights
- Technical communications and presentations
Data Science Projects
The following is a collection of projects that I've worked on, mostly in my spare time. Some are meant to be installed and used in other projects, some are works in progress, and some are stand-alone analyses.
General Tools
The following repos were written to be installed and used in other projects to streamline work.
Gordian
Project from Bellingcat's 2nd Hackathon. It was forked from the original project, a web app, and adapted into a module that assists with graph analysis and processing. Still a work in progress.
link to source
DragonGlass
Module for programmatically working with the Obsidian note-taking platform. Allows the creation and mounting of vaults and the writing and deleting of notes.
link to source
Analytics
KOL Mapper
This project was written to mine all of the literature about a particular topic from Google Scholar and then use graph analysis to determine which authors are most central to the field.
link to source
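As a rough illustration of the graph-analysis step described above, the sketch below builds a co-authorship graph from per-paper author lists and ranks authors by degree centrality. It is a minimal sketch under stated assumptions, not the project's actual code: the function names, the choice of degree centrality as the measure, and the author lists are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected co-authorship graph: author -> set of co-authors."""
    graph = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for name in unique:
            graph[name]  # make sure solo authors still appear as nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other authors that each author has published with."""
    n = len(graph)
    if n <= 1:
        return {name: 0.0 for name in graph}
    return {name: len(neighbors) / (n - 1) for name, neighbors in graph.items()}


def rank_authors(papers, top=3):
    """Return the `top` authors by centrality, ties broken alphabetically."""
    scores = degree_centrality(coauthor_graph(papers))
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Hypothetical author lists, one per paper (not real data).
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author D"],
    ["Author D", "Author E"],
]
ranking = rank_authors(papers)
print(ranking)
```

Degree centrality is only one option; betweenness or eigenvector centrality would weight "bridge" authors or well-connected collaborators differently.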
Air Quality
Project that ingested data from swarms of atmospheric sensors in several air quality monitoring networks and combined them with data from the TRAX air quality monitoring project. These data were then processed and used to render dynamic heatmap animations of air quality over time in Salt Lake City, Utah. The heatmaps were generated using an inverse density weighted calculation, and the maps were rendered with an unconventional combination of Mapdeck, Selenium, and image processing in Python.
source currently private

Inverse density weighted heat map of PM2.5 air pollution in Salt Lake City, Utah, during winter (a common time for atmospheric inversions). The visualization was constructed from data collected by a swarm of fixed air sensors throughout the city and from data read in from the elevated rail system. Can you find the copper smelter in the west of the city?
Visualization of PM2.5 air pollution levels, as measured by the TRAX elevated rail system on the Fourth of July. Can you tell when and where the fireworks show takes place?
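The heatmap interpolation can be sketched roughly as follows. This assumes that the "inverse density weighted" calculation refers to inverse distance weighting (IDW), the standard technique for interpolating scattered sensor readings onto a grid; that reading is an assumption, and the function names, the power parameter, and the sensor values are hypothetical rather than taken from the private source.

```python
import math


def idw(x, y, sensors, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) sensor readings.

    Each reading is weighted by 1 / distance**power, so nearby sensors
    dominate the estimate.
    """
    if not sensors:
        raise ValueError("at least one sensor reading is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly on a sensor: use its reading directly
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(sensors, xs, ys, power=2.0):
    """Interpolate onto a grid; rows follow ys, columns follow xs."""
    return [[idw(x, y, sensors, power) for x in xs] for y in ys]


# Hypothetical readings as (x, y, PM2.5) in arbitrary planar units.
sensors = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint = idw(1.0, 0.0, sensors)
grid = idw_grid(sensors, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0])
print(midpoint, grid)
```

The sketch treats coordinates as planar; real latitude/longitude readings would first need to be projected or compared with a geodesic distance.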
Dashboards
Power Generators
Shiny dashboard that allows viewing of all power generators on the USA power grid and some analysis of different power generation technologies.
link to source
link to app
Yeast Genetics
Data were collected from several sources (Kaggle and online scientific resources like Yeastract) and assembled into a Shiny dashboard that allows the visualization of gene expression in different yeast strains and growing conditions.
link to source
link to app
link to data
Data Collection
pipycrawler
Docker image that sets up a containerized headless Selenium scraper on a Raspberry Pi system.
link to source
Pubmed/NCBI Abstract Miner
A script that uses the NCBI API to pull publication information on a whole body of literature.
link to data
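For context, a query against the NCBI API might be assembled along these lines. This is a sketch that assumes the script uses the public Entrez E-utilities endpoints (`esearch.fcgi` to find record IDs, `efetch.fcgi` to retrieve abstracts); the helper names, search term, and IDs are placeholders, and the block only builds request URLs without contacting NCBI.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pubmed", retmax=100):
    """URL that searches a database and returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(ids, db="pubmed"):
    """URL that fetches plain-text abstracts for a list of record IDs."""
    params = {
        "db": db,
        "id": ",".join(str(i) for i in ids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Placeholder search term and record IDs, for illustration only.
search = esearch_url("engrailed cerebellum")
fetch = efetch_url([12345, 67890])
print(search)
print(fetch)
```

A real miner would send these requests with an HTTP client, page through results, and respect NCBI's rate limits.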
COVID Genome Webcrawler
A script that uses the NCBI API to pull genetic sequences of the COVID genome strains.
link
SGD_spider
Text
link
link to data
Yeastract Spider
Scraper for the site Yeastract.com. The data was used for the Yeast Genetics dashboard project above.
link
Ukraine Airspace
Web scraper that pulls flight data from a flight monitoring website using a geographic bounding box around Ukraine. It was set up to collect flight data from Ukrainian airspace, running on a Raspberry Pi, starting 2 days before the Russian invasion, and ran for several months.
source currently private
link to data
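The bounding-box idea above can be sketched as a simple latitude/longitude filter. The box coordinates below are rough, assumed values for illustration and not the ones used by the scraper; the `BoundingBox` class, the record fields, and the sample flights are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        """True if the point lies inside the box (edges included)."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Approximate, illustrative box around Ukraine (assumed values).
UKRAINE_BOX = BoundingBox(south=44.0, west=22.0, north=52.5, east=40.5)


def flights_in_box(flights, box):
    """Keep only the flight records whose position falls inside the box."""
    return [f for f in flights if box.contains(f["lat"], f["lon"])]


# Made-up position reports, for illustration only.
flights = [
    {"callsign": "TEST1", "lat": 50.45, "lon": 30.52},  # near Kyiv
    {"callsign": "TEST2", "lat": 52.23, "lon": 21.01},  # near Warsaw
]
inside = flights_in_box(flights, UKRAINE_BOX)
print([f["callsign"] for f in inside])
```

A rectangular box inevitably includes slices of neighboring airspace, so a stricter analysis would test positions against the actual border polygon.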
Scientific Projects
Transcription factors are proteins used by cells to switch genes on/off, regulating cellular programming (gene expression). I became interested in the molecular logic of gene expression while working as an undergraduate in the laboratory of Michael Rosbash on the expression of circadian genes and this trend carried through my graduate (Lloyd Greene) and postgraduate work (Alexandra Joyner).
Postdoctoral Work
Memorial Sloan Kettering Cancer Center
The engrailed family of homeodomain transcription factors is typically expressed in regions of developing tissue and plays important roles in shaping those structures. Although mutations in engrailed genes were known to disrupt the growth and shape of the developing cerebellum, our group chose a more targeted approach to identify the roles of these genes in specific cell lineages and at specific developmental times. The work led to the discovery that the development of the cerebellar nuclei, a set of deeper structures in the cerebellum that act as a signal integration hub for cerebellar activity, affects the size and shape of the cerebellum as a whole.
The work also exposed me to graph theory for the first time and, together with the genomics component of the project, got me started on my data science journey.
link
Ph.D. Work
Columbia University
GATA-2 is a zinc finger transcription factor that has been shown to be crucial for cell fate specification and proliferation in a number of different tissues and organisms. It was originally identified in a screen as a gene regulated by nerve growth factor (NGF) and was found to be expressed throughout the developing midbrain and sympathetic nervous system. I carried out a number of studies, primarily using shRNA knockdown and in utero electroporation in embryonic rat midbrain, to demonstrate that GATA-2 is required for the differentiation of neurons in the superior colliculus of the brain.
link
Additional papers (on programmed cell death)
RTP801/REDD1 Regulates the Timing of Cortical Neurogenesis and Neuron Migration
link
Sertad1 Plays an Essential Role in Developmental and Pathological Neuron Death
link
Curriculum Vitae
Ryan T. Willett, Ph.D.
Resume Download
Professional Experience
Gryphon Strategies (New York, NY)
Data Scientist
- Responsible for design and production of data integration, workflow automation and CRM systems for due diligence investigations.
- Scoping, design, orchestration, and development of a system to carry out mining of probate court record data from a range of US jurisdictions using optical character recognition (OCR), NLP, entity resolution, and AWS
- Architect of internal ETL/ELT, NLP, security, API, and automatic reporting libraries.
- ETL, analytics, and report preparation for the plaintiffs in the National Prescription Opiate Litigation MDL under direction of the expert witness Lacey Keller (now of MK Analytics) and a class action lawsuit in the health insurance space.
- Built a feature engineering pipeline and risk model for a crime gun tracing platform for a major metropolitan police department
Freelancer (New York, NY)
Freelance Data Scientist and Scientific Consultant
- Litigation support for the National Prescription Opiate Litigation MDL (with Gryphon Strategies, above)
- Built a data ingestion, analytics and visualization pipeline using network graph analysis to identify concerted bot accounts on a popular social network site thought to be involved in a disinformation campaign. A total of > 250,000 accounts were analyzed.
- Collaborated with the Shindell Lab at Duke University to build data visualizations and web applications, showcased at the Climate & Clean Air Coalition Science Policy Dialogue
- Produced a data pipeline to build and render high-resolution x animations from hundreds of stationary and mobile sensors over a multi-year period for presentation to U.S. House of Representatives personnel. Additional animations of these data were included in a documentary with investigators from the University of Utah.
- Carried out ~10 scientific due diligence investigations for an Austria-based venture capital firm specializing in biotech startup investment.
Chameleon Communications (New York, NY)
Scientific Associate
- Composed, edited and verified the scientific accuracy of communication products based on clinical and scientific research data from pharma clients, including commercial materials and research abstracts/manuscripts/posters.
- Provided scientific and business intelligence support for several pharmaceutical brands at various stages of their drug development process
Memorial Sloan Kettering Cancer Center (New York, NY)
Research Fellow
- Researched the assembly of (biological) neural networks and growth of brain structures from stem cells
- Studied the molecular genetics and genomics of the engrailed family of transcription factors during cerebellum development
- Conceived of and led an original research project, resulting in 1 research paper and 2 funded research grants
Columbia University (New York, NY)
Graduate Student and Research Fellow
- Conceived of and led an original research project, resulting in 1 research paper
- Developed novel tools and methodologies for manipulation of gene expression in developing rat brain, leading to 2 additional research papers with collaborators
Education
Columbia University, Ph.D. Pharmacology and Molecular Signaling
Brandeis University, Bachelor of Science in Biology (High Honors) and Biochemistry
NYC Data Science Academy, Certificate in Data Science
Publications
Willett RT, Bayin NS, Lee AS, et al. Cerebellar nuclei excitatory neurons regulate developmental scaling of presynaptic Purkinje cell number and organ growth. Elife. 2019;8:e50617. Published 2019 Nov 19. doi:10.7554/eLife.50617
Willett RT, Greene LA. Gata2 is required for migration and differentiation of retinorecipient neurons in the superior colliculus. J Neurosci. 2011;31(12):4444-4455. doi:10.1523/JNEUROSCI.4616-10.2011
Malagelada C, López-Toledano MA, Willett RT, Jin ZH, Shelanski ML, Greene LA. RTP801/REDD1 regulates the timing of cortical neurogenesis and neuron migration. J Neurosci. 2011;31(9):3186-3196. doi:10.1523/JNEUROSCI.4011-10.2011
Biswas SC, Zhang Y, Iyirhiaro G, et al. Sertad1 plays an essential role in developmental and pathological neuron death. J Neurosci. 2010;30(11):3973-3982. doi:10.1523/JNEUROSCI.6421-09.2010
Contact
Location: New York City Area
Phone: (917) 359-8238
Email: ryan.willett@gmail.com