I am acclimated to and confident with “solving mysteries without any clues,” which at its core is what data science (analysis, engineering, architecture) actually is. There is no such thing as a perfect data set; if it is extracted from the real world, it's messy, incomplete, and clumsy. In scientific research and healthcare, one needs to be comfortable assessing data, challenging assumptions, and drawing conclusions from incomplete, imperfect evidence – especially in science and healthcare, where the recommendations truly matter. At the core, I am a scientist – I generally don't provide recommendations unless I have challenged all approaches and re-approached the data analysis from different angles. I am naturally skeptical, so before I confidently state anything, I have personally run it through the wringer to the degree that I am confident of the results at p < 0.05. Moreover, I have very little ego and would never assume that I am always right – I thrive in an environment that provides smart pushback and challenges the results presented. Constructive criticism is not just welcomed, it's almost demanded, as everyone gets better that way. I would much rather “fail early and fail often” because failing is not “null”; it's an elimination of wrong answers. I have very little issue recognizing and admitting when I am wrong, and I would ask any team I work with to challenge each other, because that improves the overall output of team efforts.
Broadly speaking, this is a new team that will offer fresh approaches and novel analysis techniques on broad datasets, company-wide. Most of my major successes in research and professional settings have come from cross-pollination of established techniques from divergent disciplines. My prediction models for what type of drug zebrafish were on, coded as a spatiotemporal series, were only successful because I was able to find, reach out to, and collaborate with a European group with expertise in movement-pattern analysis. We used the models behind the “EURO” hurricane-trajectory projections on zebrafish swimming paths, and it worked. Before finding that group, I was looking into music theory and mathematics based on waveform signal processing and transformation. Before that, we tried simply throwing our time-series data into standard analysis techniques – like those used in financial forecasting. Neither provided practical significance. When met with dead ends, I am motivated by what I know is possible and am able to walk away, readjust, and re-approach from a different perspective. Again, “no” does not equal “null” – finding out quickly what doesn't work is a faster way to narrow down to what will work. I think this is a basic, realist notion of how human knowledge exploration actually works. We are a trial-and-error-based learning species. No one has the answers 100% right on the first go-round, and if they claim they do, they are not properly tuned in to how real progress is made.
Highlights/Summary of Data Projects
See “Integrating cross-scale analysis in the spatial and temporal domains for classification of behavioral movement” @ https://digitalcommons.library.umaine.edu/cgi/viewcontent.cgi?article=1056&context=josis (this is a portion of my dissertation; the dissertation went deeper than this article)
Published Article on 3D Spatiotemporal Analysis of zebrafish exploration: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017597
In graduate school, we also attached custom movement RFID tags to subjects to analyze their exploration and movement patterns. I led this project from hardware design through software analysis, as well as coordinating team members' daily activities.
All of these projects involved image processing/video analysis and signal integration to distinguish subject from background, followed by feature extraction to find movement patterns with significant predictability of the drug type the fish were exposed to.
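As a minimal illustration of that subject-vs-background step, the sketch below uses simple frame differencing against a median background model on synthetic frames. The blob position, sizes, and threshold are invented for demonstration; the actual pipelines used richer video-analysis methods.

```python
import numpy as np

# Hypothetical sketch: frame differencing against a median background model.
# The frames, blob location, and threshold below are synthetic stand-ins.
rng = np.random.default_rng(2)
background = rng.integers(0, 20, size=(10, 64, 64))  # noisy empty-tank frames
frames = background.copy()
frames[:, 30:34, 30:34] += 100                       # a bright "fish" blob

# Median background model, then threshold the difference to get a subject mask
bg_model = np.median(background, axis=0)
mask = (frames - bg_model) > 50

# Per-frame centroid of the mask -> a trajectory for feature extraction
ys, xs = np.nonzero(mask[0])
centroid = (ys.mean(), xs.mean())
```

Chaining those per-frame centroids across time yields the movement trajectory that downstream feature extraction operates on.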
I have also performed time-series analysis on financial data (oil futures, cryptocurrency) and on streaming environmental data in agriculture. There have been multiple clients here, including myself – but analysis of such granular (0.3 s intervals) time-series data can be quite challenging. One has to know how to wrangle it and then process/visualize it to make sense of it and to allow for data-driven strategic decisions.
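As a rough sketch of what wrangling that 0.3 s interval data can look like, the snippet below downsamples synthetic readings with pandas and flags incomplete windows. The values, dates, and thresholds are stand-ins, not actual client data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for 0.3-second-interval sensor readings (not client data)
idx = pd.date_range("2023-01-01", periods=1000, freq="300ms")
raw = pd.Series(np.random.default_rng(0).normal(22.0, 0.5, len(idx)), index=idx)

# Downsample to 1-minute means and flag under-filled windows before plotting;
# at 0.3 s per sample, a complete minute holds 200 readings
per_minute = raw.resample("1min").agg(["mean", "count"])
per_minute["gap"] = per_minute["count"] < 200
```

Flagging under-filled windows up front keeps dropped-sample gaps from silently skewing the aggregates that feed strategic decisions.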
I have evaluated and used several time-series analysis methodologies over the years – including auto- and cross-correlation, FFT, and wavelet analysis – in R, MATLAB, Python, and RapidMiner.
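For example, a minimal sketch of the FFT and autocorrelation steps on a synthetic signal (NumPy only; the 5 Hz frequency and 100 Hz sampling rate are invented for illustration):

```python
import numpy as np

# Synthetic signal: a 5 Hz oscillation sampled at 100 Hz, plus noise
fs = 100.0
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.default_rng(1).normal(size=t.size)

# FFT: locate the dominant frequency (skip the DC bin)
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(x.size, d=1 / fs)
dominant = freqs[np.argmax(spectrum[1:]) + 1]

# Autocorrelation: the peak near lag 20 samples reflects the 0.2 s period
centered = x - x.mean()
ac = np.correlate(centered, centered, mode="full")[x.size - 1:]
period_lag = np.argmax(ac[10:30]) + 10  # search near the expected period
```

The same two views – spectral and lag-domain – generalize directly to movement traces and streaming sensor feeds.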
As for IoT sensors: in a six-month project, I performed a Google Cloud migration for wearable sensors, capturing the raw data from wearable watches, processing it, prepping it, and serving it for analytics. The client was studying tremors and Parkinson's disease; I also helped with basic feature extraction.
I custom-built air and water sensors on a Raspberry Pi base, which streamed temperature, humidity, VPD, PPFD, etc. over a cellular connection (we were in a dead zone) during a 3-year agricultural-cultivation resource-efficiency research and development effort. A broad summary of some of this R&D can be found @ https://bit.ly/SGIpaper
All of our animal models in the lab required monitoring of temperature, pressure (gait analysis), and proximity (as assessed from video of the subject). Microphones, accelerometers, and gyroscopes all produce straightforward data if the signal is properly acquired and processed – movement through space over time.
I can audit sensors to ensure data acquisition is high quality and viable for scientific research, including automated QA/QC processes built on top of streaming data ingestion from sensors as the data moves through the pipeline. I can perform and automate exploratory data analysis, which then informs feature extraction and, ultimately, the best data components, patterns, or attributes to feed into training machine-learning models.
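A minimal sketch of what one such automated QA/QC check might look like – the class, rules, and thresholds below are hypothetical stand-ins, and real pipelines would layer more checks on top:

```python
from collections import deque

# Hypothetical per-reading QA/QC: range, spike, and stuck-value rules
# applied as records flow through the ingestion pipeline.
class SensorQC:
    def __init__(self, lo, hi, max_jump, stuck_len=5):
        self.lo, self.hi, self.max_jump = lo, hi, max_jump
        self.recent = deque(maxlen=stuck_len)  # rolling window of readings

    def check(self, value):
        flags = []
        if not (self.lo <= value <= self.hi):
            flags.append("out_of_range")       # physically implausible value
        if self.recent and abs(value - self.recent[-1]) > self.max_jump:
            flags.append("spike")              # implausible jump between samples
        self.recent.append(value)
        if len(self.recent) == self.recent.maxlen and len(set(self.recent)) == 1:
            flags.append("stuck")              # sensor repeating one value
        return flags

# Example: a temperature channel bounded to -10..50 with a 5-unit jump limit
qc = SensorQC(lo=-10, hi=50, max_jump=5)
```

Readings that come back with flags can be quarantined for review rather than silently entering downstream analysis.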
You can see some of this work on the last page of my resume, in the technical-skills diagram (also attached here). On my resume, this experience is found under CCV Research, Seinergy, Conscious Cannabis Ventures, Fleurish Farms, SunGrown Zero, Neuroscience Information Framework, and Tulane University.